Changelog#
0.6.3 - 2022-12-20#
Changed#
Fixed#
Fixed a bug where PrivateJoin’s privacy relation would only accept string keys in the d_in. It now accepts any type of key.
0.6.2 - 2022-12-07#
This is a maintenance release which introduces a number of documentation improvements, but has no publicly-visible API changes.
Fixed#
check_java11() now has the correct behavior when Java is not installed.
0.6.1 - 2022-12-05#
Added#
Added approximate DP support to interactive mechanisms.
Added support for Spark 3.1 through 3.3, in addition to existing support for Spark 3.0.
Fixed#
Validation for SparkGroupedDataFrameDomains used to fail with a Spark AnalysisException in some environments. That should no longer happen.
0.6.0 - 2022-11-14#
Added#
Added new PrivateJoinOnKey transformation that works with AddRemoveKeys.
Added inverse CDF methods to noise mechanisms.
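To illustrate what an inverse CDF method computes for a noise mechanism, here is a minimal sketch for Laplace noise. The function name and signature are hypothetical and are not Core's API; the formula is the standard closed-form quantile function of the Laplace distribution.

```python
import math


def laplace_inverse_cdf(p: float, scale: float, loc: float = 0.0) -> float:
    """Return x such that P(X <= x) = p for X ~ Laplace(loc, scale).

    Hypothetical helper for illustration only; not part of Core.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    # Below the median the CDF is 0.5 * exp((x - loc) / scale);
    # at or above it, the CDF is 1 - 0.5 * exp(-(x - loc) / scale).
    # Each branch is inverted separately.
    if p < 0.5:
        return loc + scale * math.log(2.0 * p)
    return loc - scale * math.log(2.0 * (1.0 - p))
```

A typical use is a tail bound on the noise: laplace_inverse_cdf(0.975, scale) is the value that the noise stays below with probability 0.975.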
0.5.1 - 2022-11-03#
Fixed#
Domains and metrics make copies of mutable constructor arguments and return copies of mutable properties.
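The pattern behind this fix can be sketched as follows. ExampleDomain is a hypothetical stand-in, not a class from Core, and the use of copy.deepcopy is an assumption about how such copying might be done.

```python
import copy
from typing import Dict, List


class ExampleDomain:
    """Hypothetical stand-in for a domain with a mutable argument."""

    def __init__(self, schema: Dict[str, List[str]]):
        # Copy on the way in, so later changes to the caller's
        # object do not leak into this instance.
        self._schema = copy.deepcopy(schema)

    @property
    def schema(self) -> Dict[str, List[str]]:
        # Copy on the way out, so callers cannot mutate internal state.
        return copy.deepcopy(self._schema)


original = {"A": ["x", "y"]}
domain = ExampleDomain(original)
original["A"].append("z")   # mutates only the caller's dict
domain.schema["B"] = []     # mutates only a returned copy
```

After both mutations, domain.schema still equals the value passed to the constructor.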
0.5.0 - 2022-10-14#
Changed#
Core no longer depends on the python-flint package, and instead packages libflint and libarb itself. Binary wheels are available, and the source distribution includes scripting to build these dependencies from source.
Fixed#
Equality checks on SparkGroupedDataFrameDomains used to occasionally fail with a Spark AnalysisException in some environments. That should no longer happen.
AddRemoveKeys now allows different names for the key column in each dataframe.
0.4.3 - 2022-09-01#
Core now checks to see if the user is running Java 11 or higher. If they are, Core either sets the appropriate Spark options (if Spark is not yet running) or raises an informative exception (if Spark is running and configured incorrectly).
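The decision described above might be sketched like this. The function name and its arguments are hypothetical, and the sketch assumes that the "appropriate Spark options" are the JVM flag that the Spark documentation lists as required on Java 11 for Apache Arrow; the source does not name the options.

```python
from typing import Dict, Optional

# Assumption: the flag Spark documents for Java 11 with Apache Arrow.
JAVA11_FLAG = "-Dio.netty.tryReflectionSetAccessible=true"
OPTION_KEYS = (
    "spark.driver.extraJavaOptions",
    "spark.executor.extraJavaOptions",
)


def java11_spark_options(
    java_major: int, active_conf: Optional[Dict[str, str]]
) -> Dict[str, str]:
    """Return options to set before Spark starts, or raise if a
    running session is configured incorrectly.

    active_conf is None when Spark is not running yet; otherwise it
    holds the running session's configuration. Hypothetical helper.
    """
    if java_major < 11:
        return {}  # nothing to do on older Java versions
    if active_conf is None:
        # Spark is not running: the options can still be set.
        return {key: JAVA11_FLAG for key in OPTION_KEYS}
    missing = [
        key for key in OPTION_KEYS
        if JAVA11_FLAG not in active_conf.get(key, "")
    ]
    if missing:
        # Spark is already running, so it is too late to fix this.
        raise RuntimeError(
            "Spark is running on Java 11 or higher without "
            f"{JAVA11_FLAG} in: {', '.join(missing)}"
        )
    return {}
```

Returning the options as a dictionary keeps the sketch independent of a Spark installation; a caller could pass each pair to SparkSession.builder.config before creating the session.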
0.4.2 - 2022-08-24#
Changed#
Replaced uses of PySpark DataFrame’s intersect with inner joins. See https://issues.apache.org/jira/browse/SPARK-40181 for background.
0.4.1 - 2022-07-25#
Added#
Added an alternate PRNG for non-Intel architectures that don’t support RDRAND.
Add new metric AddRemoveKeys for multiple tables using IfGroupedBy(X, SymmetricDifference()).
Add new TransformValue base class for wrapping transformations to support AddRemoveKeys.
Add many new transformations using TransformValue: FilterValue, PublicJoinValue, FlatMapValue, MapValue, DropInfsValue, DropNaNsValue, DropNullsValue, ReplaceInfsValue, ReplaceNaNsValue, ReplaceNullsValue, PersistValue, UnpersistValue, SparkActionValue, RenameValue, SelectValue.
Changed#
Fixed bug in ReplaceNulls to not allow replacing values for grouping column in IfGroupedBy.
Changed ReplaceNulls, ReplaceNaNs, and ReplaceInfs to only support specific IfGroupedBy metrics.
0.3.2 - 2022-06-23#
Changed#
Moved IMMUTABLE_TYPES from utils/testing.py to utils/type_utils.py to avoid importing nose when accessing IMMUTABLE_TYPES.
0.3.1 - 2022-06-23#
Changed#
Fixed copy_if_mutable so that it works with containers that can’t be deep-copied.
Reverted change from 0.3.0 “Add checks in ParallelComposition constructor to only permit L1/L2 over SymmetricDifference or AbsoluteDifference.”
Temporarily disabled flaky statistical tests.
0.3.0 - 2022-06-22#
Added#
Added new transformations DropInfs and ReplaceInfs for handling infinities in data.
Added IfGroupedBy(X, SymmetricDifference()) input metric.
Added support for this metric to Filter, Map, FlatMap, PublicJoin, Select, Rename, DropNaNs, DropNulls, DropInfs, ReplaceNulls, ReplaceNaNs, and ReplaceInfs.
Added new truncation transformations for IfGroupedBy(X, SymmetricDifference()): LimitRowsPerGroup, LimitKeysPerGroup.
Added AddUniqueColumn for switching from SymmetricDifference to IfGroupedBy(X, SymmetricDifference()).
Added a topic guide around NaNs, nulls and infinities.
Changed#
Moved truncation transformations used by PrivateJoin to be functions (now in utils/truncation.py).
Change GroupBy and PartitionByKeys to have a use_l2 argument instead of output_metric.
Fixed bug in AddUniqueColumn.
Operations that group on null values are now supported.
Modify CountDistinctGrouped and CountDistinct so they work as expected with null values.
Changed ReplaceNulls, ReplaceNaNs, and ReplaceInfs to only support specific IfGroupedBy metrics.
Fixed bug in ReplaceNulls to not allow replacing values for grouping column in IfGroupedBy.
PrivateJoin has a new parameter for __init__: join_on_nulls. When join_on_nulls is True, the PrivateJoin can join null values between both dataframes.
Changed transformations and measurements to make a copy of mutable constructor arguments.
Add checks in ParallelComposition constructor to only permit L1/L2 over SymmetricDifference or AbsoluteDifference.
Removed#
Removed old examples from examples/. Future examples will be added directly to the documentation.
0.2.0 - 2022-04-12 (internal release)#
Added#
Added SparkDateColumnDescriptor and SparkTimestampColumnDescriptor, enabling support for Spark dates and timestamps.
Added two exception types, InsufficientBudgetError and InactiveAccountantError, to PrivacyAccountants.
Future documentation will include any exceptions defined in this library.
Added cleanup.remove_all_temp_tables() function, which will remove all temporary tables created by Core.
Added new components DropNaNs, DropNulls, ReplaceNulls, and ReplaceNaNs.
0.1.1 - 2022-02-24 (internal release)#
Added#
Added new implementations for SequentialComposition and ParallelComposition.
Added new spark transformations: Persist, Unpersist and SparkAction.
Added PrivacyAccountant.
Installation on Python 3.7.1 through 3.7.3 is now allowed.
Added DecorateQueryable, DecoratedQueryable and create_adaptive_composition components.
Changed#
Fixed a bug where create_quantile_measurement would always be created with PureDP as the output measure.
PySparkTest now runs tmlt.core.utils.cleanup.cleanup() during tearDownClass.
Refactored noise distribution tests.
Remove sorting from GroupedDataFrame.apply_in_pandas and GroupedDataFrame.agg.
Repartition DataFrames output by SparkMeasurement to prevent privacy violation.
Updated repartitioning in SparkMeasurement to use a random column.
Changed quantile implementation to use arblib.
Changed Laplace implementation to use arblib.
Removed#
Removed ExponentialMechanism and PermuteAndFlip components.
Removed AddNoise, AddLaplaceNoise, AddGeometricNoise, and AddDiscreteGaussianNoise from tmlt.core.measurements.pandas.series.
Removed SequentialComposition, ParallelComposition and corresponding Queryables from tmlt.core.measurements.composition.
Removed tmlt.core.transformations.cache.
0.1.0 - 2022-02-14 (internal release)#
Added#
Initial release.