_query_expr_compiler
Defines QueryExprCompiler for compiling query expressions.
Classes
QueryExprCompiler – Compiles a list of query expressions to a single measurement object.
- class QueryExprCompiler(output_measure=PureDP())
Compiles a list of query expressions to a single measurement object.
Requires that each query is a groupby-aggregation on a sequence of transformations on a PrivateSource or PrivateView. If there is a PrivateView, the stability of the view is handled when the noise scale is calculated.
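As a rough illustration of how a view's stability can enter the noise-scale calculation, the sketch below uses the standard Laplace calibration (scale = sensitivity / epsilon) and multiplies the sensitivity by the stability. The function name, its parameters, and the purely multiplicative treatment are assumptions made for illustration; this is not the compiler's actual code.

```python
def laplace_noise_scale(sensitivity: float, stability: float, epsilon: float) -> float:
    """Laplace scale for a query whose input passed through a stable view.

    Hypothetical helper: a view with stability `c` can turn one changed input
    row into up to `c` changed rows, so the effective sensitivity is assumed
    to be `sensitivity * c`.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if sensitivity < 0 or stability < 0:
        raise ValueError("sensitivity and stability must be non-negative")
    return sensitivity * stability / epsilon


# A count (sensitivity 1) directly on the source versus on a 3-stable view.
scale_on_source = laplace_noise_scale(sensitivity=1.0, stability=1, epsilon=0.5)
scale_on_view = laplace_noise_scale(sensitivity=1.0, stability=3, epsilon=0.5)
```

Under these assumptions, the same query needs three times as much noise on the view as on the source.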
A QueryExprCompiler object compiles a list of QueryExpr objects into a single Measurement object (based on the privacy framework). The Measurement object can be run with a private data source to obtain DP answers to supplied queries.
Supported QueryExprs:
- Parameters
output_measure (Union[tmlt.core.measures.PureDP, tmlt.core.measures.ApproxDP, tmlt.core.measures.RhoZCDP]) – Distance measure for measurement’s output.
- __init__(output_measure=PureDP())
Constructor.
- Parameters
output_measure (Union[PureDP, ApproxDP, RhoZCDP] (default: PureDP())) – Distance measure for measurement’s output.
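The choice of output measure affects how noise is calibrated. The sketch below shows two textbook calibrations: the Laplace mechanism for pure DP and the Gaussian mechanism for rho-zCDP. The function and the string labels are hypothetical stand-ins, not part of tmlt.analytics or tmlt.core, and ApproxDP is left out on purpose.

```python
import math


def noise_parameter(output_measure: str, sensitivity: float, budget: float) -> float:
    """Return a noise parameter for a query under the given privacy definition.

    Hypothetical helper. `output_measure` is a plain string label here, not a
    tmlt.core measure object.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    if output_measure == "PureDP":
        # Laplace mechanism: scale b = L1 sensitivity / epsilon gives epsilon-DP.
        return sensitivity / budget
    if output_measure == "RhoZCDP":
        # Gaussian mechanism: sigma = L2 sensitivity / sqrt(2 * rho) gives rho-zCDP.
        return sensitivity / math.sqrt(2.0 * budget)
    raise ValueError(f"unsupported output measure: {output_measure}")


laplace_scale = noise_parameter("PureDP", sensitivity=1.0, budget=0.5)
gaussian_sigma = noise_parameter("RhoZCDP", sensitivity=1.0, budget=0.5)
```

The two budgets are not comparable numerically: an epsilon of 0.5 and a rho of 0.5 are different privacy guarantees.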
- property mechanism
Return the Core noise mechanism.
- Return type
tmlt.core.measurements.aggregations.NoiseMechanism
- property output_measure
Return the distance measure for the measurement’s output.
- Return type
Union[tmlt.core.measures.PureDP, tmlt.core.measures.ApproxDP, tmlt.core.measures.RhoZCDP]
- static query_schema(query, catalog)
Return the schema created by a given query.
- Parameters
query (tmlt.analytics.query_expr.QueryExpr) –
catalog (tmlt.analytics._catalog.Catalog) –
- Return type
- __call__(queries, privacy_budget, stability, input_domain, input_metric, public_sources, catalog, table_constraints)
Returns a compiled DP measurement and its noise information.
- Parameters
queries (Sequence[tmlt.analytics.query_expr.QueryExpr]) – Queries representing measurements to compile.
privacy_budget (tmlt.analytics.privacy_budget.PrivacyBudget) – The total privacy budget for answering the queries.
stability (Any) – The stability of the input to compiled query.
input_domain (tmlt.core.domains.collections.DictDomain) – The input domain of the compiled query.
input_metric (tmlt.core.metrics.DictMetric) – The input metric of the compiled query.
public_sources (Dict[str, pyspark.sql.DataFrame]) – Public data sources for the queries.
catalog (tmlt.analytics._catalog.Catalog) – The catalog, used only for query validation.
table_constraints (Dict[tmlt.analytics._table_identifier.Identifier, List[tmlt.analytics.constraints.Constraint]]) – A mapping of tables to the existing constraints on them.
- Return type
Tuple[tmlt.core.measurements.base.Measurement, tmlt.analytics._noise_info.NoiseInfo]
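To make the idea of compiling several queries into one measurement concrete, here is a toy sketch in plain Python. It assumes pure DP, counting queries with sensitivity 1, and an even split of the total budget across queries (sequential composition). All names are hypothetical; the real compiler works with QueryExpr, Measurement, and NoiseInfo objects and may allocate the budget differently.

```python
import random
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Rows = List[Dict[str, float]]
Query = Callable[[Rows], float]  # hypothetical stand-in for a QueryExpr


def compile_queries(
    queries: Sequence[Query],
    epsilon: float,
    sensitivity: float = 1.0,
    seed: Optional[int] = None,
) -> Tuple[Callable[[Rows], List[float]], List[Dict[str, object]]]:
    """Combine queries into one noisy measurement plus per-query noise info."""
    if not queries:
        raise ValueError("at least one query is required")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # Assumption: each query gets an equal share of the total budget.
    per_query_epsilon = epsilon / len(queries)
    scale = sensitivity / per_query_epsilon
    rng = random.Random(seed)

    def laplace() -> float:
        # The difference of two exponentials with mean `scale` is Laplace(0, scale).
        return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

    def measurement(rows: Rows) -> List[float]:
        return [query(rows) + laplace() for query in queries]

    noise_info = [{"mechanism": "laplace", "scale": scale} for _ in queries]
    return measurement, noise_info


rows = [{"age": 30.0}, {"age": 41.0}, {"age": 25.0}]
count_all: Query = lambda data: float(len(data))
count_over_28: Query = lambda data: float(sum(1 for r in data if r["age"] > 28))

measurement, noise_info = compile_queries([count_all, count_over_28], epsilon=1.0, seed=0)
answers = measurement(rows)  # one noisy answer per query
```

Returning the noise description alongside the measurement mirrors the shape of the return value documented above, although the real NoiseInfo type carries more structure than this dictionary.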
- build_transformation(query, input_domain, input_metric, public_sources, catalog, table_constraints)
Returns a transformation, a table reference, and a list of constraints for the query.
Supported QueryExprs:
- Parameters
query (tmlt.analytics.query_expr.QueryExpr) – A query representing a transformation to compile.
input_domain (tmlt.core.domains.collections.DictDomain) – The input domain of the compiled query.
input_metric (tmlt.core.metrics.DictMetric) – The input metric of the compiled query.
public_sources (Dict[str, pyspark.sql.DataFrame]) – Public data sources for the queries.
catalog (tmlt.analytics._catalog.Catalog) – The catalog, used only for query validation.
table_constraints (Dict[tmlt.analytics._table_identifier.Identifier, List[tmlt.analytics.constraints.Constraint]]) – A mapping of tables to the existing constraints on them.
- Return type
Tuple[tmlt.core.transformations.base.Transformation, tmlt.analytics._table_reference.TableReference, List[tmlt.analytics.constraints.Constraint]]
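The following toy sketch shows one way a chain of transformations could be built while tracking a reference to the resulting table. It assumes stabilities multiply under composition, which is the standard rule for chained stable transformations. The Step class, the path-style reference, and the returned stability are hypothetical. The sketch does not model constraints, domains, or metrics, and it is not the library's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Rows = List[Dict[str, int]]


@dataclass(frozen=True)
class Step:
    """Hypothetical stand-in for one transformation in a query."""

    name: str
    apply: Callable[[Rows], Rows]
    stability: int  # max output rows affected by one changed input row


def build_chain(steps: Sequence[Step], source: str) -> Tuple[Callable[[Rows], Rows], str, int]:
    """Compose steps into one transformation, a reference, and a total stability."""

    def transformation(rows: Rows) -> Rows:
        for step in steps:
            rows = step.apply(rows)
        return rows

    # Composing a c1-stable step with a c2-stable step is (c1 * c2)-stable.
    stability = 1
    for step in steps:
        stability *= step.stability

    reference = "/".join([source] + [step.name for step in steps])
    return transformation, reference, stability


adults = Step("filter_adults", lambda rows: [r for r in rows if r["age"] >= 18], 1)
doubled = Step("flat_map_x2", lambda rows: [r for r in rows for _ in range(2)], 2)

transformation, reference, stability = build_chain([adults, doubled], source="people")
output = transformation([{"age": 30}, {"age": 12}])
```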