query_expr#

Building blocks of the Tumult Analytics query language. Not for direct use.

Deprecated since version 0.14: This module will be removed in an upcoming release. Import mechanism enums from tmlt.analytics.query_builder instead. QueryExpr will be removed from the Tumult Analytics public API.

Defines the QueryExpr class, which represents expressions in the Tumult Analytics query language. Most users should not construct or deconstruct QueryExpr and its subclasses directly; tmlt.analytics.query_builder.QueryBuilder (to create them) and tmlt.analytics.session.Session (to consume them) provide more user-friendly interfaces.

Classes#

AnalyticsDefault

Default values for each type of column in Tumult Analytics.

AverageMechanism

Possible mechanisms for the average() aggregation.

CountDistinctMechanism

Possible mechanisms for the count_distinct() aggregation.

CountMechanism

Possible mechanisms for the count() aggregation.

DropInfinity

Returns data with rows that contain +inf/-inf dropped.

DropNullAndNan

Returns data with rows that contain null or NaN values dropped.

EnforceConstraint

Enforces a constraint on the data.

Filter

Returns the subset of the rows that satisfy the condition.

FlatMap

Applies a flat map function to each row of a relation.

GetBounds

Returns approximate upper and lower bounds of a column.

GetGroups

Returns groups based on the geometric partition selection for these columns.

GroupByBoundedAverage

Returns bounded average of a column for each combination of groupby domains.

GroupByBoundedSTDEV

Returns bounded stdev of a column for each combination of groupby domains.

GroupByBoundedSum

Returns the bounded sum of a column for each combination of groupby domains.

GroupByBoundedVariance

Returns bounded variance of a column for each combination of groupby domains.

GroupByCount

Returns the count of each combination of the groupby domains.

GroupByCountDistinct

Returns the count of distinct rows in each groupby domain value.

GroupByQuantile

Returns the quantile of a column for each combination of the groupby domains.

JoinPrivate

Returns the join of two private tables.

JoinPublic

Returns the join of a private and public table.

Map

Applies a map function to each row of a relation.

PrivateSource

Loads the private source.

QueryExpr

A query expression, base class for relational operators.

QueryExprVisitor

A base class for implementing visitors for QueryExpr.

Rename

Returns the dataframe with columns renamed.

ReplaceInfinity

Returns data with +inf and -inf expressions replaced by defaults.

ReplaceNullAndNan

Returns data with null and NaN expressions replaced by a default.

Select

Returns a subset of the columns.

StdevMechanism

Possible mechanisms for the stdev() aggregation.

SumMechanism

Possible mechanisms for the sum() aggregation.

SuppressAggregates

Removes all counts that are less than the threshold.

VarianceMechanism

Possible mechanisms for the variance() aggregation.

class AnalyticsDefault#

Default values for each type of column in Tumult Analytics.

INTEGER = 0#

The default value used for integers (0).

DECIMAL = 0.0#

The default value used for floats (0.0).

VARCHAR = ''#

The default value used for VARCHARs (the empty string).

DATE#

The default value used for dates (datetime.date.fromtimestamp(0)).

See fromtimestamp().

TIMESTAMP#

The default value used for timestamps (datetime.datetime.fromtimestamp(0)).

See fromtimestamp().

class AverageMechanism#

Bases: enum.Enum

Possible mechanisms for the average() aggregation.

Currently, the average() aggregation uses an additive noise mechanism to achieve differential privacy.

DEFAULT#

The framework automatically selects an appropriate mechanism. This choice might change over time as additional optimizations are added to the library.

LAPLACE#

Laplace and/or double-sided geometric noise is used, depending on the column type.

GAUSSIAN#

Discrete and/or continuous Gaussian noise is used, depending on the column type. Not compatible with pure DP.

name()#

The name of the Enum member.

value()#

The value of the Enum member.

class CountDistinctMechanism#

Bases: enum.Enum

Possible mechanisms for the count_distinct() aggregation.

Currently, the count_distinct() aggregation uses an additive noise mechanism to achieve differential privacy.

DEFAULT#

The framework automatically selects an appropriate mechanism. This choice might change over time as additional optimizations are added to the library.

LAPLACE#

Double-sided geometric noise is used.

GAUSSIAN#

The discrete Gaussian mechanism is used. Not compatible with pure DP.

name()#

The name of the Enum member.

value()#

The value of the Enum member.

class CountMechanism#

Bases: enum.Enum

Possible mechanisms for the count() aggregation.

Currently, the count() aggregation uses an additive noise mechanism to achieve differential privacy.

DEFAULT#

The framework automatically selects an appropriate mechanism. This choice might change over time as additional optimizations are added to the library.

LAPLACE#

Double-sided geometric noise is used.

GAUSSIAN#

The discrete Gaussian mechanism is used. Not compatible with pure DP.

name()#

The name of the Enum member.

value()#

The value of the Enum member.
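As a rough illustration of what "double-sided geometric noise" means, the sketch below samples integer noise whose probability decays geometrically in both directions and adds it to a count. It is not the library's sampler: the function name, the use of epsilon as the only parameter, and the assumed sensitivity of 1 are illustrative assumptions, and floating-point sampling like this is not suitable for production differential privacy.

```python
import math
import random


def two_sided_geometric(epsilon: float, rng: random.Random) -> int:
    """Sample integer noise with Pr[k] proportional to exp(-epsilon * |k|).

    Hypothetical helper for illustration only; assumes sensitivity 1.
    """
    alpha = math.exp(-epsilon)

    def geometric() -> int:
        # Inverse-CDF sampling of the number of failures before a success,
        # with success probability 1 - alpha. 1 - random() lies in (0, 1].
        u = 1.0 - rng.random()
        return int(math.floor(math.log(u) / math.log(alpha)))

    # The difference of two i.i.d. geometric draws is two-sided geometric.
    return geometric() - geometric()


rng = random.Random(0)  # seeded only to make the sketch reproducible
samples = [two_sided_geometric(1.0, rng) for _ in range(10_000)]
noisy_count = 42 + samples[0]  # an integer count stays an integer
```

The noise is symmetric around zero, so the noisy count is unbiased, and smaller values of epsilon spread the noise more widely.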

class DropInfinity#

Bases: QueryExpr

Returns data with rows that contain +inf/-inf dropped.

child: QueryExpr#

The QueryExpr in which to drop +inf/-inf.

columns: List[str]#

Columns in which to look for infinite values.

If this list is empty, all columns are checked, so a row is dropped if any of its columns contains an infinite value.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any
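The effect of the columns list can be sketched on plain Python rows. This is only an illustration of the semantics described above, not how Tumult Analytics evaluates the expression; the Row alias and the drop_infinity helper are hypothetical names.

```python
import math
from typing import Any, Dict, List

Row = Dict[str, Any]  # hypothetical stand-in for a table row


def drop_infinity(rows: List[Row], columns: List[str]) -> List[Row]:
    """Drop rows holding +inf or -inf in any of the given columns."""
    kept = []
    for row in rows:
        # An empty column list means "check every column".
        to_check = columns or list(row)
        has_inf = any(
            isinstance(row[c], float) and math.isinf(row[c]) for c in to_check
        )
        if not has_inf:
            kept.append(row)
    return kept


rows = [
    {"A": 1.0, "B": 2.0},
    {"A": float("inf"), "B": 3.0},
    {"A": 4.0, "B": float("-inf")},
]
only_a = drop_infinity(rows, ["A"])   # keeps the first and third rows
all_columns = drop_infinity(rows, [])  # keeps only the first row
```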

class DropNullAndNan#

Bases: QueryExpr

Returns data with rows that contain null or NaN values dropped.

Warning

After a DropNullAndNan query has been performed for a column, Tumult Analytics will raise an error if you use a KeySet for that column that contains null values.

child: QueryExpr#

The QueryExpr in which to drop nulls/NaNs.

columns: List[str]#

Columns in which to look for nulls and NaNs.

If this list is empty, all columns are checked, so a row is dropped if any of its columns contains a null or NaN value.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

class EnforceConstraint#

Bases: QueryExpr

Enforces a constraint on the data.

child: QueryExpr#

The QueryExpr to which the constraint will be applied.

constraint: tmlt.analytics.constraints.Constraint#

A constraint to be enforced.

options: Dict[str, Any]#

Options to be used when enforcing the constraint.

Appropriate values here vary depending on the constraint. These options are to support advanced use cases, and generally should not be used.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

class Filter#

Bases: QueryExpr

Returns the subset of the rows that satisfy the condition.

child: QueryExpr#

The QueryExpr to filter.

condition: str#

A SQL expression string specifying the filter to apply to the data.

For example, the string “A > B” matches rows where column A is greater than column B.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any
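To make the "A > B" example concrete, the sketch below evaluates the same condition string with Python's built-in sqlite3 module. SQLite is used purely as a stand-in SQL engine; the dialect that Tumult Analytics accepts may differ, and the table name t is an assumption.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (A INTEGER, B INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 5), (7, 2), (3, 3)])

condition = "A > B"  # the same kind of string a Filter would hold
matching = conn.execute(f"SELECT A, B FROM t WHERE {condition}").fetchall()
conn.close()
```

Only the row where column A exceeds column B is returned; rows where the two are equal do not match.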

class FlatMap#

Bases: QueryExpr

Applies a flat map function to each row of a relation.

child: QueryExpr#

The QueryExpr to apply the flat map on.

f: Callable[[Row], List[Row]]#

The flat map function.

schema_new_columns: tmlt.analytics._schema.Schema#

The expected schema for new columns produced by f.

If schema_new_columns has a grouping_column, this FlatMap produces a column that must eventually be grouped by, and that column must be the only column in the schema.

augment: bool#

Whether to keep the existing columns.

If True, the output schema is the old schema plus schema_new_columns; otherwise only the new columns are kept (schema = schema_new_columns).

max_rows: int | None = None#

The enforced limit on the number of rows produced by each f(row).
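The interaction of f, augment, and max_rows can be sketched on plain dictionaries. This is a minimal illustration under stated assumptions, not the library's implementation: the flat_map and split_tags helpers are hypothetical, and truncating to the first max_rows outputs is an assumption about how the limit is enforced.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]  # hypothetical stand-in for a table row


def flat_map(
    rows: List[Row],
    f: Callable[[Row], List[Row]],
    augment: bool,
    max_rows: Optional[int] = None,
) -> List[Row]:
    """Apply f to each row and flatten the results into one list."""
    out = []
    for row in rows:
        new_rows = f(row)
        if max_rows is not None:
            # Assumption: keep only the first max_rows outputs per input row.
            new_rows = new_rows[:max_rows]
        for new in new_rows:
            # augment=True keeps the existing columns alongside the new ones.
            out.append({**row, **new} if augment else dict(new))
    return out


def split_tags(row: Row) -> List[Row]:
    return [{"tag": t} for t in row["tags"].split(",")]


rows = [{"id": 1, "tags": "a,b,c"}, {"id": 2, "tags": "d"}]
augmented = flat_map(rows, split_tags, augment=True, max_rows=2)
replaced = flat_map(rows, split_tags, augment=False)
```

With max_rows=2, the first input row contributes two output rows instead of three, which bounds how much any single input row can influence the result.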

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

__eq__(other)#

Returns true iff self == other.

This uses the bytecode of self.f and other.f to determine if the two functions are equal.

Parameters:

other (object) –

Return type:

bool
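Comparing functions by bytecode can be sketched with the standard __code__ attribute. The helper below is an assumption about the general approach, not a copy of the library's __eq__; same_bytecode is a hypothetical name.

```python
from typing import Callable


def same_bytecode(f: Callable, g: Callable) -> bool:
    """Compare two plain Python functions by their compiled bytecode."""
    return f.__code__.co_code == g.__code__.co_code


def add_one(row):
    return row + 1


def also_add_one(row):
    return row + 1


def times_one(row):
    return row * 1


# Separately defined functions with the same body compare equal.
identical_bodies = same_bytecode(add_one, also_add_one)
# A different operation produces different bytecode.
different_bodies = same_bytecode(add_one, times_one)
```

A comparison like this looks only at instructions, so it says nothing about whether two differently written functions compute the same result.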

class GetBounds#

Bases: QueryExpr

Returns approximate upper and lower bounds of a column.

child: QueryExpr#

The QueryExpr to get bounds for.

column: str#

The column to get bounds of.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

class GetGroups#

Bases: QueryExpr

Returns groups based on the geometric partition selection for these columns.

child: QueryExpr#

The QueryExpr to get groups for.

columns: List[str] | None = None#

The columns used for geometric partition selection.

If empty or None, all of the columns in the table are used for partition selection.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

class GroupByBoundedAverage#

Bases: QueryExpr

Returns bounded average of a column for each combination of groupby domains.

If the column to be measured contains null, NaN, or positive or negative infinity, those values will be dropped (as if dropped explicitly via DropNullAndNan and DropInfinity) before the average is calculated.

child: QueryExpr#

The QueryExpr to measure.

groupby_keys: tmlt.analytics.keyset.KeySet | List[str]#

The keys to group on, or a list of columns to collect keys from.

measure_column: str#

The column to compute the average over.

low: float#

The lower bound for clamping the measure_column. Should be less than high.

high: float#

The upper bound for clamping the measure_column. Should be greater than low.

output_column: str = 'average'#

The name of the column to store the averages in.

mechanism: AverageMechanism#

Choice of noise mechanism.

By DEFAULT, the framework automatically selects an appropriate mechanism.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

class GroupByBoundedSTDEV#

Bases: QueryExpr

Returns bounded stdev of a column for each combination of groupby domains.

If the column to be measured contains null, NaN, or positive or negative infinity, those values will be dropped (as if dropped explicitly via DropNullAndNan and DropInfinity) before the standard deviation is calculated.

child: QueryExpr#

The QueryExpr to measure.

groupby_keys: tmlt.analytics.keyset.KeySet | List[str]#

The keys to group on, or a list of columns to collect keys from.

measure_column: str#

The column to compute the standard deviation over.

low: float#

The lower bound for clamping the measure_column. Should be less than high.

high: float#

The upper bound for clamping the measure_column. Should be greater than low.

output_column: str = 'stdev'#

The name of the column to store the stdev in.

mechanism: StdevMechanism#

Choice of noise mechanism.

By DEFAULT, the framework automatically selects an appropriate mechanism.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

class GroupByBoundedSum#

Bases: QueryExpr

Returns the bounded sum of a column for each combination of groupby domains.

If the column to be measured contains null, NaN, or positive or negative infinity, those values will be dropped (as if dropped explicitly via DropNullAndNan and DropInfinity) before the sum is calculated.

child: QueryExpr#

The QueryExpr to measure.

groupby_keys: tmlt.analytics.keyset.KeySet | List[str]#

The keys to group on, or a list of columns to collect keys from.

measure_column: str#

The column to compute the sum over.

low: float#

The lower bound for clamping the measure_column. Should be less than high.

high: float#

The upper bound for clamping the measure_column. Should be greater than low.

output_column: str = 'sum'#

The name of the column to store the sums in.

mechanism: SumMechanism#

Choice of noise mechanism.

By DEFAULT, the framework automatically selects an appropriate mechanism.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any
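The roles of low and high, and the dropping of null, NaN, and infinite values, can be sketched without any noise. This shows only the clamping and grouping steps; the differentially private noise that the real aggregation adds is deliberately left out, and clamped_group_sums is a hypothetical helper.

```python
import math
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def clamped_group_sums(
    records: List[Tuple[str, Optional[float]]], low: float, high: float
) -> Dict[str, float]:
    """Sum values per group after dropping bad values and clamping."""
    sums: Dict[str, float] = defaultdict(float)
    for key, value in records:
        # Null, NaN, and infinite values are dropped before aggregation.
        if value is None or math.isnan(value) or math.isinf(value):
            continue
        # Clamp each remaining value into [low, high].
        sums[key] += min(max(value, low), high)
    return dict(sums)


records = [
    ("a", 3.0),
    ("a", 50.0),          # clamped down to 10.0
    ("b", -7.0),          # clamped up to 0.0
    ("b", None),          # dropped
    ("b", float("nan")),  # dropped
]
sums = clamped_group_sums(records, low=0.0, high=10.0)
```

Clamping bounds how much one value can change a group's sum, which is what lets the noise be calibrated to the range between low and high.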

class GroupByBoundedVariance#

Bases: QueryExpr

Returns bounded variance of a column for each combination of groupby domains.

If the column to be measured contains null, NaN, or positive or negative infinity, those values will be dropped (as if dropped explicitly via DropNullAndNan and DropInfinity) before the variance is calculated.

child: QueryExpr#

The QueryExpr to measure.

groupby_keys: tmlt.analytics.keyset.KeySet | List[str]#

The keys to group on, or a list of columns to collect keys from.

measure_column: str#

The column to compute the variance over.

low: float#

The lower bound for clamping the measure_column. Should be less than high.

high: float#

The upper bound for clamping the measure_column. Should be greater than low.

output_column: str = 'variance'#

The name of the column to store the variances in.

mechanism: VarianceMechanism#

Choice of noise mechanism.

By DEFAULT, the framework automatically selects an appropriate mechanism.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

class GroupByCount#

Bases: QueryExpr

Returns the count of each combination of the groupby domains.

child: QueryExpr#

The QueryExpr to measure.

groupby_keys: tmlt.analytics.keyset.KeySet | List[str]#

The keys to group on, or a list of columns to collect keys from.

output_column: str = 'count'#

The name of the column to store the counts in.

mechanism: CountMechanism#

Choice of noise mechanism.

By DEFAULT, the framework automatically selects an appropriate mechanism.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

class GroupByCountDistinct#

Bases: QueryExpr

Returns the count of distinct rows in each groupby domain value.

child: QueryExpr#

The QueryExpr to measure.

groupby_keys: tmlt.analytics.keyset.KeySet | List[str]#

The keys to group on, or a list of columns to collect keys from.

columns_to_count: List[str] | None = None#

The columns that are compared when determining if two rows are distinct.

If empty, all distinct rows are counted.

output_column: str = 'count_distinct'#

The name of the column to store the distinct counts in.

mechanism: CountDistinctMechanism#

Choice of noise mechanism.

By DEFAULT, the framework automatically selects an appropriate mechanism.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any
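How columns_to_count changes the result can be sketched with exact (non-noisy) counts on plain dictionaries. The helper name and the row representation are assumptions for illustration; the real aggregation also adds noise.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]  # hypothetical stand-in for a table row


def count_distinct_by_group(
    rows: List[Row], group_column: str, columns_to_count: Optional[List[str]] = None
) -> Dict[Any, int]:
    """Count distinct rows per group, comparing only the chosen columns."""
    seen = defaultdict(set)
    for row in rows:
        # With no columns given, whole rows are compared.
        cols = columns_to_count or sorted(row)
        seen[row[group_column]].add(tuple(row[c] for c in cols))
    return {key: len(values) for key, values in seen.items()}


rows = [
    {"g": "x", "u": 1, "v": "p"},
    {"g": "x", "u": 1, "v": "q"},
    {"g": "x", "u": 1, "v": "p"},
    {"g": "y", "u": 2, "v": "p"},
]
distinct_u = count_distinct_by_group(rows, "g", ["u"])
distinct_rows = count_distinct_by_group(rows, "g")
```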

class GroupByQuantile#

Bases: QueryExpr

Returns the quantile of a column for each combination of the groupby domains.

If the column to be measured contains null, NaN, or positive or negative infinity, those values will be dropped (as if dropped explicitly via DropNullAndNan and DropInfinity) before the quantile is calculated.

child: QueryExpr#

The QueryExpr to measure.

groupby_keys: tmlt.analytics.keyset.KeySet | List[str]#

The keys to group on, or a list of columns to collect keys from.

measure_column: str#

The column to compute the quantile over.

quantile: float#

The quantile to compute (between 0 and 1).

low: float#

The lower bound for clamping the measure_column. Should be less than high.

high: float#

The upper bound for clamping the measure_column. Should be greater than low.

output_column: str = 'quantile'#

The name of the column to store the quantiles in.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

class JoinPrivate#

Bases: QueryExpr

Returns the join of two private tables.

Before performing the join, each table is truncated based on the corresponding TruncationStrategy. For a more detailed overview of JoinPrivate’s behavior, see join_private().

child: QueryExpr#

The QueryExpr to join with the right operand.

right_operand_expr: QueryExpr#

The QueryExpr for the private source to join with.

truncation_strategy_left: tmlt.analytics.truncation_strategy.TruncationStrategy.Type | None = None#

Truncation strategy to be used for the left table.

truncation_strategy_right: tmlt.analytics.truncation_strategy.TruncationStrategy.Type | None = None#

Truncation strategy to be used for the right table.

join_columns: List[str] | None = None#

The columns used for joining the tables, or None to use all common columns.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

class JoinPublic#

Bases: QueryExpr

Returns the join of a private and public table.

child: QueryExpr#

The QueryExpr to join with public_table.

public_table: pyspark.sql.DataFrame | str#

A DataFrame or public source to join with.

join_columns: List[str] | None = None#

The columns used for joining the tables, or None to use all common columns.

how: str = 'inner'#

The type of join to perform. Must be either “inner” or “left”.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

__eq__(other)#

Returns true iff self == other.

For the purposes of this equality operation, two dataframes are equal if they contain the same data, in any order.

Calling this on a JoinPublic that includes a very large dataframe could take a long time or consume a lot of resources, and is not recommended.

Parameters:

other (object) –

Return type:

bool
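Order-insensitive equality can be sketched by comparing rows as multisets. This is one plausible reading of "the same data, in any order", not the library's actual check; in particular, treating duplicate rows as significant is an assumption, and same_rows_any_order is a hypothetical helper.

```python
from collections import Counter
from typing import Hashable, List, Tuple


def same_rows_any_order(
    left: List[Tuple[Hashable, ...]], right: List[Tuple[Hashable, ...]]
) -> bool:
    """Return True if both tables hold the same rows, ignoring order."""
    # Counter compares rows as a multiset, so duplicates still matter.
    return Counter(left) == Counter(right)


a = [(1, "x"), (2, "y"), (2, "y")]
b = [(2, "y"), (1, "x"), (2, "y")]
c = [(1, "x"), (2, "y")]

reordered_equal = same_rows_any_order(a, b)
missing_duplicate = same_rows_any_order(a, c)
```

A check like this has to examine every row, which is consistent with the warning above about the cost for very large dataframes.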

class Map#

Bases: QueryExpr

Applies a map function to each row of a relation.

child: QueryExpr#

The QueryExpr to apply the map on.

f: Callable[[Row], Row]#

The map function.

schema_new_columns: tmlt.analytics._schema.Schema#

The expected schema for new columns produced by f.

augment: bool#

Whether to keep the existing columns.

If True, schema = old schema + schema_new_columns, otherwise only keeps the new columns (schema = schema_new_columns).

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

__eq__(other)#

Returns true iff self == other.

This uses the bytecode of self.f and other.f to determine if the two functions are equal.

Parameters:

other (object) –

Return type:

bool

class PrivateSource#

Bases: QueryExpr

Loads the private source.

source_id: str#

The ID for the private source to load.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

class QueryExpr#

Bases: abc.ABC

A query expression, base class for relational operators.

In most cases, QueryExpr should not be manipulated directly, but rather created using tmlt.analytics.query_builder.QueryBuilder and then consumed by tmlt.analytics.session.Session. While they can be created and modified directly, this is an advanced usage and is not recommended for typical users.

QueryExpr objects are organized in a tree, where each node is an operator that returns a relation.

abstract accept(visitor)#

Dispatch methods on a visitor based on the QueryExpr type.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any
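The tree structure and the accept/visit dispatch can be sketched with a pair of stand-in classes. Every class below (Expr, Visitor, Source, FilterExpr, Describe) is a hypothetical simplification written for this illustration and is not part of tmlt.analytics.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


class Expr(ABC):
    """Stand-in for a query expression node."""

    @abstractmethod
    def accept(self, visitor: "Visitor") -> Any:
        ...


class Visitor(ABC):
    """Stand-in for a visitor with one method per node type."""

    @abstractmethod
    def visit_source(self, expr: "Source") -> Any:
        ...

    @abstractmethod
    def visit_filter(self, expr: "FilterExpr") -> Any:
        ...


@dataclass
class Source(Expr):
    source_id: str

    def accept(self, visitor: Visitor) -> Any:
        # Each node dispatches to the visitor method for its own type.
        return visitor.visit_source(self)


@dataclass
class FilterExpr(Expr):
    child: Expr
    condition: str

    def accept(self, visitor: Visitor) -> Any:
        return visitor.visit_filter(self)


class Describe(Visitor):
    """Render an expression tree as a string by recursing into children."""

    def visit_source(self, expr: Source) -> str:
        return f"source({expr.source_id})"

    def visit_filter(self, expr: FilterExpr) -> str:
        return f"filter({expr.child.accept(self)}, {expr.condition!r})"


tree = FilterExpr(child=Source("my_table"), condition="A > B")
description = tree.accept(Describe())
```

Keeping the traversal logic in the visitor means new operations over the tree can be added without changing the node classes.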

class QueryExprVisitor#

Bases: abc.ABC

A base class for implementing visitors for QueryExpr.

Methods#

visit_private_source()

Visit a PrivateSource.

visit_rename()

Visit a Rename.

visit_filter()

Visit a Filter.

visit_select()

Visit a Select.

visit_map()

Visit a Map.

visit_flat_map()

Visit a FlatMap.

visit_join_private()

Visit a JoinPrivate.

visit_join_public()

Visit a JoinPublic.

visit_replace_null_and_nan()

Visit a ReplaceNullAndNan.

visit_replace_infinity()

Visit a ReplaceInfinity.

visit_drop_null_and_nan()

Visit a DropNullAndNan.

visit_drop_infinity()

Visit a DropInfinity.

visit_enforce_constraint()

Visit an EnforceConstraint.

visit_get_groups()

Visit a GetGroups.

visit_get_bounds()

Visit a GetBounds.

visit_groupby_count()

Visit a GroupByCount.

visit_groupby_count_distinct()

Visit a GroupByCountDistinct.

visit_groupby_quantile()

Visit a GroupByQuantile.

visit_groupby_bounded_sum()

Visit a GroupByBoundedSum.

visit_groupby_bounded_average()

Visit a GroupByBoundedAverage.

visit_groupby_bounded_variance()

Visit a GroupByBoundedVariance.

visit_groupby_bounded_stdev()

Visit a GroupByBoundedSTDEV.

visit_suppress_aggregates()

Visit a SuppressAggregates.

abstract visit_private_source(expr)#

Visit a PrivateSource.

Parameters:

expr (PrivateSource) –

Return type:

Any

abstract visit_rename(expr)#

Visit a Rename.

Parameters:

expr (Rename) –

Return type:

Any

abstract visit_filter(expr)#

Visit a Filter.

Parameters:

expr (Filter) –

Return type:

Any

abstract visit_select(expr)#

Visit a Select.

Parameters:

expr (Select) –

Return type:

Any

abstract visit_map(expr)#

Visit a Map.

Parameters:

expr (Map) –

Return type:

Any

abstract visit_flat_map(expr)#

Visit a FlatMap.

Parameters:

expr (FlatMap) –

Return type:

Any

abstract visit_join_private(expr)#

Visit a JoinPrivate.

Parameters:

expr (JoinPrivate) –

Return type:

Any

abstract visit_join_public(expr)#

Visit a JoinPublic.

Parameters:

expr (JoinPublic) –

Return type:

Any

abstract visit_replace_null_and_nan(expr)#

Visit a ReplaceNullAndNan.

Parameters:

expr (ReplaceNullAndNan) –

Return type:

Any

abstract visit_replace_infinity(expr)#

Visit a ReplaceInfinity.

Parameters:

expr (ReplaceInfinity) –

Return type:

Any

abstract visit_drop_null_and_nan(expr)#

Visit a DropNullAndNan.

Parameters:

expr (DropNullAndNan) –

Return type:

Any

abstract visit_drop_infinity(expr)#

Visit a DropInfinity.

Parameters:

expr (DropInfinity) –

Return type:

Any

abstract visit_enforce_constraint(expr)#

Visit an EnforceConstraint.

Parameters:

expr (EnforceConstraint) –

Return type:

Any

abstract visit_get_groups(expr)#

Visit a GetGroups.

Parameters:

expr (GetGroups) –

Return type:

Any

abstract visit_get_bounds(expr)#

Visit a GetBounds.

Parameters:

expr (GetBounds) –

Return type:

Any

abstract visit_groupby_count(expr)#

Visit a GroupByCount.

Parameters:

expr (GroupByCount) –

Return type:

Any

abstract visit_groupby_count_distinct(expr)#

Visit a GroupByCountDistinct.

Parameters:

expr (GroupByCountDistinct) –

Return type:

Any

abstract visit_groupby_quantile(expr)#

Visit a GroupByQuantile.

Parameters:

expr (GroupByQuantile) –

Return type:

Any

abstract visit_groupby_bounded_sum(expr)#

Visit a GroupByBoundedSum.

Parameters:

expr (GroupByBoundedSum) –

Return type:

Any

abstract visit_groupby_bounded_average(expr)#

Visit a GroupByBoundedAverage.

Parameters:

expr (GroupByBoundedAverage) –

Return type:

Any

abstract visit_groupby_bounded_variance(expr)#

Visit a GroupByBoundedVariance.

Parameters:

expr (GroupByBoundedVariance) –

Return type:

Any

abstract visit_groupby_bounded_stdev(expr)#

Visit a GroupByBoundedSTDEV.

Parameters:

expr (GroupByBoundedSTDEV) –

Return type:

Any

abstract visit_suppress_aggregates(expr)#

Visit a SuppressAggregates.

Parameters:

expr (SuppressAggregates) –

Return type:

Any

class Rename#

Bases: QueryExpr

Returns the dataframe with columns renamed.

child: QueryExpr#

The QueryExpr to apply Rename to.

column_mapper: Dict[str, str]#

The mapping of old column names to new column names.

This mapping can contain all column names or just a subset. If it contains a subset of columns, it will only rename those columns and keep the other column names the same.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any
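A partial column_mapper can be sketched on a single row represented as a dictionary. The rename_columns helper is a hypothetical name used only to illustrate the behavior described above.

```python
from typing import Any, Dict


def rename_columns(row: Dict[str, Any], column_mapper: Dict[str, str]) -> Dict[str, Any]:
    """Rename the mapped columns and leave all other names unchanged."""
    return {column_mapper.get(name, name): value for name, value in row.items()}


# Only "A" appears in the mapping, so "B" and "C" keep their names.
renamed = rename_columns({"A": 1, "B": 2, "C": 3}, {"A": "alpha"})
```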

class ReplaceInfinity#

Bases: QueryExpr

Returns data with +inf and -inf expressions replaced by defaults.

child: QueryExpr#

The QueryExpr to replace +inf and -inf values in.

replace_with: Dict[str, Tuple[float, float]]#

New values to replace with, by column. The first value for each column will be used to replace -infinity, and the second value will be used to replace +infinity.

If this dictionary is empty, all columns of type DECIMAL will be changed, with infinite values replaced with a default value (see the AnalyticsDefault class variables).

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any
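The ordering of each replace_with tuple can be sketched on a single row. This illustrates only the explicit per-column case, not the empty-dictionary default; replace_infinity is a hypothetical helper rather than the library's code.

```python
import math
from typing import Any, Dict, Tuple


def replace_infinity(
    row: Dict[str, Any], replace_with: Dict[str, Tuple[float, float]]
) -> Dict[str, Any]:
    """Replace -inf with the first tuple value and +inf with the second."""
    out = dict(row)
    for column, (for_negative, for_positive) in replace_with.items():
        value = out[column]
        if isinstance(value, float) and math.isinf(value):
            out[column] = for_negative if value < 0 else for_positive
    return out


row = {"A": float("-inf"), "B": float("inf"), "C": 1.5}
replaced = replace_infinity(row, {"A": (-100.0, 100.0), "B": (-1.0, 1.0)})
```

Column C is not listed in replace_with and holds a finite value, so it passes through unchanged.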

class ReplaceNullAndNan#

Bases: QueryExpr

Returns data with null and NaN expressions replaced by a default.

Warning

After a ReplaceNullAndNan query has been performed for a column, Tumult Analytics will raise an error if you use a KeySet for that column that contains null values.

child: QueryExpr#

The QueryExpr to replace null/NaN values in.

replace_with: Mapping[str, int | float | str | datetime.date | datetime.datetime]#

New values to replace with, by column.

If this dictionary is empty, all columns will be changed, with values replaced by a default value for each column’s type (see the AnalyticsDefault class variables).

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

class Select#

Bases: QueryExpr

Returns a subset of the columns.

child: QueryExpr#

The QueryExpr to apply the select on.

columns: List[str]#

The columns to select.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any

class StdevMechanism#

Bases: enum.Enum

Possible mechanisms for the stdev() aggregation.

Currently, the stdev() aggregation uses an additive noise mechanism to achieve differential privacy.

DEFAULT#

The framework automatically selects an appropriate mechanism. This choice might change over time as additional optimizations are added to the library.

LAPLACE#

Laplace and/or double-sided geometric noise is used, depending on the column type.

GAUSSIAN#

Discrete and/or continuous Gaussian noise is used, depending on the column type. Not compatible with pure DP.

name()#

The name of the Enum member.

value()#

The value of the Enum member.

class SumMechanism#

Bases: enum.Enum

Possible mechanisms for the sum() aggregation.

Currently, the sum() aggregation uses an additive noise mechanism to achieve differential privacy.

DEFAULT#

The framework automatically selects an appropriate mechanism. This choice might change over time as additional optimizations are added to the library.

LAPLACE#

Laplace and/or double-sided geometric noise is used, depending on the column type.

GAUSSIAN#

Discrete and/or continuous Gaussian noise is used, depending on the column type. Not compatible with pure DP.

name()#

The name of the Enum member.

value()#

The value of the Enum member.

class SuppressAggregates#

Bases: QueryExpr

Removes all counts that are less than the threshold.

child: QueryExpr#

The aggregate on which to suppress small counts.

Currently, only GroupByCount is supported.

column: str#

The name of the column to suppress.

threshold: float#

The suppression threshold. All counts less than this value are suppressed.

accept(visitor)#

Visit this QueryExpr with visitor.

Parameters:

visitor (QueryExprVisitor) –

Return type:

Any
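The threshold rule can be sketched on a dictionary of already-computed counts. The helper name and the input values are assumptions for illustration; in the library the child expression produces the counts.

```python
from typing import Dict


def suppress_small_counts(counts: Dict[str, float], threshold: float) -> Dict[str, float]:
    """Keep only the groups whose count is at least the threshold."""
    return {group: count for group, count in counts.items() if count >= threshold}


noisy_counts = {"a": 12.0, "b": 3.0, "c": 5.0}
kept = suppress_small_counts(noisy_counts, threshold=5)
```

A count equal to the threshold is kept, since only counts strictly less than the threshold are suppressed.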

class VarianceMechanism#

Bases: enum.Enum

Possible mechanisms for the variance() aggregation.

Currently, the variance() aggregation uses an additive noise mechanism to achieve differential privacy.

DEFAULT#

The framework automatically selects an appropriate mechanism. This choice might change over time as additional optimizations are added to the library.

LAPLACE#

Laplace and/or double-sided geometric noise is used, depending on the column type.

GAUSSIAN#

Discrete and/or continuous Gaussian noise is used, depending on the column type. Not compatible with pure DP.

name()#

The name of the Enum member.

value()#

The value of the Enum member.