_base_transformation_visitor

Defines a base class for visitors that build transformations from query expressions.

Classes

BaseTransformationVisitor

A base visitor to create a transformation from a query expression.

class BaseTransformationVisitor(input_domain, input_metric, mechanism, public_sources, table_constraints)

Bases: tmlt.analytics.query_expr.QueryExprVisitor

A base visitor to create a transformation from a query expression.
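The visitor idea behind this class can be sketched in plain Python. This is a minimal illustration only: `ToyVisitor`, `ToyPrivateSource`, `ToyRename`, and `DescribeVisitor` are hypothetical stand-ins, not the real `tmlt.analytics` classes, and the toy visitor returns a string where the real one would return an Output.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict


class ToyVisitor(ABC):
    """Hypothetical visitor interface: one visit_* method per expression type."""

    @abstractmethod
    def visit_private_source(self, expr: "ToyPrivateSource") -> Any:
        ...

    @abstractmethod
    def visit_rename(self, expr: "ToyRename") -> Any:
        ...


@dataclass(frozen=True)
class ToyPrivateSource:
    """Hypothetical leaf expression: reads a named private table."""

    source_id: str

    def accept(self, visitor: ToyVisitor) -> Any:
        # Each expression dispatches to the visitor method for its own type.
        return visitor.visit_private_source(self)


@dataclass(frozen=True)
class ToyRename:
    """Hypothetical expression wrapping a child expression."""

    child: Any
    column_mapper: Dict[str, str]

    def accept(self, visitor: ToyVisitor) -> Any:
        return visitor.visit_rename(self)


class DescribeVisitor(ToyVisitor):
    """Toy visitor that builds a description instead of a transformation."""

    def visit_private_source(self, expr: ToyPrivateSource) -> str:
        return f"read({expr.source_id})"

    def visit_rename(self, expr: ToyRename) -> str:
        # Visit the child first, then wrap its result.
        inner = expr.child.accept(self)
        return f"rename({inner}, {expr.column_mapper})"


query = ToyRename(ToyPrivateSource("customers"), {"zip": "zip_code"})
description = query.accept(DescribeVisitor())
print(description)
```

Each expression forwards to the matching `visit_*` method, so supporting a new expression type means adding one method to the visitor rather than editing a central dispatch function.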

Classes

Output

A container for the outputs of the visitor.

Methods

validate_transformation()

Ensure that a query’s transformation is valid on a given catalog.

inner_metric()

Get the inner metric used by this TransformationVisitor.

visit_private_source()

Create a transformation from a PrivateSource query expression.

visit_rename()

Create a transformation from a Rename query expression.

visit_filter()

Create a transformation from a Filter query expression.

visit_select()

Create a transformation from a Select query expression.

visit_map()

Create a transformation from a Map query expression.

build_flat_map()

Build a Transformation for a FlatMap query expression with grouping=False.

build_grouping_flat_map()

Build a Transformation for a FlatMap query expression with grouping=True.

visit_flat_map()

Create a transformation from a FlatMap query expression.

build_private_join_transformation()

Build a Transformation for a private join.

visit_join_private()

Create a transformation from a JoinPrivate query expression.

visit_join_public()

Create a transformation from a JoinPublic query expression.

visit_replace_null_and_nan()

Create a transformation from a ReplaceNullAndNan query expression.

visit_replace_infinity()

Create a transformation from a ReplaceInfinity query expression.

visit_drop_infinity()

Create a transformation from a DropInfinity query expression.

visit_drop_null_and_nan()

Create a transformation from a DropNullAndNan query expression.

visit_enforce_constraint()

Create a transformation from an EnforceConstraint query expression.

visit_get_groups()

Visit a GetGroups query expression (raises an error).

visit_groupby_count()

Visit a GroupByCount query expression (raises an error).

visit_groupby_count_distinct()

Visit a GroupByCountDistinct query expression (raises an error).

visit_groupby_quantile()

Visit a GroupByQuantile query expression (raises an error).

visit_groupby_bounded_sum()

Visit a GroupByBoundedSum query expression (raises an error).

visit_groupby_bounded_average()

Visit a GroupByBoundedAverage query expression (raises an error).

visit_groupby_bounded_variance()

Visit a GroupByBoundedVariance query expression (raises an error).

visit_groupby_bounded_stdev()

Visit a GroupByBoundedSTDEV query expression (raises an error).

class Output

Bases: NamedTuple

A container for the outputs of the visitor.
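Because Output derives from NamedTuple, its values can be read by field name or unpacked like an ordinary tuple. The sketch below shows that behavior with a hypothetical `ToyOutput`. The field names are assumptions for illustration only, since this page does not list the fields of the real class.

```python
from typing import Any, List, NamedTuple


class ToyOutput(NamedTuple):
    """Hypothetical container; the real Output fields are not listed here."""

    transformation: Any  # assumed field: whatever the visitor built
    notes: List[str]     # assumed field: extra information about the result


out = ToyOutput(transformation="identity", notes=["no constraints"])

# Access by name, or unpack positionally like any tuple.
by_name = out.transformation
transformation, notes = out
```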

__init__(input_domain, input_metric, mechanism, public_sources, table_constraints)

Constructor for a TransformationVisitor.

Parameters
  • input_domain (DictDomain) – The input domain that the transformation should have.

  • input_metric (DictMetric) – The input metric that the transformation should have.

  • mechanism (NoiseMechanism) – The noise mechanism (only used for FlatMaps).

  • public_sources (Dict[str, DataFrame]) – Public sources to use for JoinPublic queries.

  • table_constraints (Dict[Identifier, List[Constraint]]) – A mapping of tables to the existing constraints on them.

validate_transformation(query, transformation, reference, catalog)

Ensure that a query’s transformation is valid on a given catalog.

inner_metric()

Get the inner metric used by this TransformationVisitor.

Return type

Union[tmlt.core.metrics.SumOf, tmlt.core.metrics.RootSumOfSquared]
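As their names suggest, the two possible return types combine per-group distances in different ways: by summing them, or by taking the square root of the sum of their squares. The sketch below shows only that arithmetic. The helper functions `sum_of` and `root_sum_of_squared` are hypothetical and are not the Tumult Core metric classes.

```python
import math
from typing import Sequence


def sum_of(distances: Sequence[float]) -> float:
    """Combine per-group distances by adding them up."""
    return sum(distances)


def root_sum_of_squared(distances: Sequence[float]) -> float:
    """Combine per-group distances as the root of the sum of squares."""
    return math.sqrt(sum(d * d for d in distances))


per_group = [3, 4]
total = sum_of(per_group)                   # 3 + 4
root_total = root_sum_of_squared(per_group)  # sqrt(9 + 16)
```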

visit_private_source(expr)

Create a transformation from a PrivateSource query expression.

Return type

Output

visit_rename(expr)

Create a transformation from a Rename query expression.

Parameters

expr (tmlt.analytics.query_expr.Rename) –

Return type

Output

visit_filter(expr)

Create a transformation from a Filter query expression.

Parameters

expr (tmlt.analytics.query_expr.Filter) –

Return type

Output

visit_select(expr)

Create a transformation from a Select query expression.

Parameters

expr (tmlt.analytics.query_expr.Select) –

Return type

Output

visit_map(expr)

Create a transformation from a Map query expression.

Parameters

expr (tmlt.analytics.query_expr.Map) –

Return type

Output

build_flat_map(input_metric, row_transformer, max_rows)

Build a Transformation for a FlatMap query expression with grouping=False.

Parameters
  • input_metric (Union[tmlt.core.metrics.IfGroupedBy, tmlt.core.metrics.SymmetricDifference]) –

  • row_transformer (tmlt.core.transformations.spark_transformations.map.RowToRowsTransformation) –

  • max_rows (Optional[int]) –

Return type

tmlt.core.transformations.base.Transformation
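To make the roles of row_transformer and max_rows concrete, here is a minimal plain-Python sketch of a flat map. It assumes that max_rows caps how many output rows each input row may produce and that None means no cap. That reading is an assumption based on the Optional[int] type, not something this page states. `toy_flat_map` and `split_tags` are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List, Optional

Row = Dict[str, object]


def toy_flat_map(
    rows: Iterable[Row],
    row_to_rows: Callable[[Row], List[Row]],
    max_rows: Optional[int],
) -> List[Row]:
    """Map each input row to several rows, optionally capping the count."""
    out: List[Row] = []
    for row in rows:
        produced = list(row_to_rows(row))
        if max_rows is not None:
            # Assumed behavior: keep at most max_rows outputs per input row.
            produced = produced[:max_rows]
        out.extend(produced)
    return out


def split_tags(row: Row) -> List[Row]:
    """Hypothetical row transformer: one output row per comma-separated tag."""
    return [{"user": row["user"], "tag": t} for t in str(row["tags"]).split(",")]


rows = [{"user": "a", "tags": "x,y,z"}, {"user": "b", "tags": "x"}]
limited = toy_flat_map(rows, split_tags, max_rows=2)
unlimited = toy_flat_map(rows, split_tags, max_rows=None)
```

With the cap of 2, user "a" contributes two rows instead of three, which bounds how much a single input row can affect the output.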

build_grouping_flat_map(inner_metric, row_transformer, max_rows)

Build a Transformation for a FlatMap query expression with grouping=True.

Parameters
  • inner_metric (Union[tmlt.core.metrics.SumOf, tmlt.core.metrics.RootSumOfSquared]) –

  • row_transformer (tmlt.core.transformations.spark_transformations.map.RowToRowsTransformation) –

  • max_rows (int) –

Return type

tmlt.core.transformations.base.Transformation

visit_flat_map(expr)

Create a transformation from a FlatMap query expression.

Parameters

expr (tmlt.analytics.query_expr.FlatMap) –

Return type

Output

build_private_join_transformation(input_domain, left_key, right_key, left_truncation_strategy, right_truncation_strategy, left_truncation_threshold, right_truncation_threshold, join_cols=None, join_on_nulls=False)

Build a Transformation for a private join.

Parameters
  • input_domain (tmlt.core.domains.collections.DictDomain) –

  • left_key (Any) –

  • right_key (Any) –

  • left_truncation_strategy (tmlt.core.transformations.spark_transformations.join.TruncationStrategy) –

  • right_truncation_strategy (tmlt.core.transformations.spark_transformations.join.TruncationStrategy) –

  • left_truncation_threshold (int) –

  • right_truncation_threshold (int) –

  • join_cols (Union[List[str], None]) –

  • join_on_nulls (bool) –

Return type

tmlt.core.transformations.base.Transformation
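The truncation parameters suggest that each side of a private join is limited before the tables are joined. The sketch below illustrates one possible reading in plain Python: keep at most a threshold number of rows per join key on each side, then perform an inner join. `truncate_per_key` and `toy_private_join` are hypothetical helpers, and the truncation rule shown is an assumption rather than the behavior of any particular TruncationStrategy.

```python
from collections import defaultdict
from typing import Dict, List

Row = Dict[str, object]


def truncate_per_key(rows: List[Row], key_col: str, threshold: int) -> List[Row]:
    """Keep at most `threshold` rows for each value of `key_col`."""
    kept: List[Row] = []
    seen: Dict[object, int] = defaultdict(int)
    for row in rows:
        key = row[key_col]
        if seen[key] < threshold:
            kept.append(row)
            seen[key] += 1
    return kept


def toy_private_join(
    left: List[Row],
    right: List[Row],
    join_col: str,
    left_threshold: int,
    right_threshold: int,
) -> List[Row]:
    """Truncate both sides, then inner-join them on `join_col`."""
    left = truncate_per_key(left, join_col, left_threshold)
    right = truncate_per_key(right, join_col, right_threshold)
    by_key: Dict[object, List[Row]] = defaultdict(list)
    for r in right:
        by_key[r[join_col]].append(r)
    return [{**l, **r} for l in left for r in by_key.get(l[join_col], [])]


left = [
    {"id": 1, "visit": "mon"},
    {"id": 1, "visit": "tue"},
    {"id": 2, "visit": "wed"},
]
right = [
    {"id": 1, "plan": "basic"},
    {"id": 1, "plan": "pro"},
    {"id": 3, "plan": "basic"},
]
joined = toy_private_join(left, right, "id", left_threshold=1, right_threshold=1)
```

Without truncation, key 1 would produce four joined rows. With both thresholds set to 1 it produces one, so the output size per key stays bounded.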

visit_join_private(expr)

Create a transformation from a JoinPrivate query expression.

Parameters

expr (tmlt.analytics.query_expr.JoinPrivate) –

Return type

Output

visit_join_public(expr)

Create a transformation from a JoinPublic query expression.

Parameters

expr (tmlt.analytics.query_expr.JoinPublic) –

Return type

Output

visit_replace_null_and_nan(expr)

Create a transformation from a ReplaceNullAndNan query expression.

Parameters

expr (tmlt.analytics.query_expr.ReplaceNullAndNan) –

Return type

Output
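For intuition about what a null-and-NaN replacement does to the data, here is a minimal sketch over plain Python rows. It assumes the replacement is specified as a mapping from column name to substitute value. `toy_replace_null_and_nan` and `is_null_or_nan` are hypothetical helpers and do not reflect how the transformation is actually built.

```python
import math
from typing import Dict, List

Row = Dict[str, object]


def is_null_or_nan(value: object) -> bool:
    """True for None and for float NaN; False for everything else."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def toy_replace_null_and_nan(rows: List[Row], replace_with: Dict[str, object]) -> List[Row]:
    """Replace null/NaN values in the listed columns, leaving other columns alone."""
    return [
        {
            col: replace_with[col] if col in replace_with and is_null_or_nan(val) else val
            for col, val in row.items()
        }
        for row in rows
    ]


rows = [
    {"name": None, "score": float("nan")},
    {"name": "ada", "score": 9.5},
]
cleaned = toy_replace_null_and_nan(rows, {"name": "unknown", "score": 0.0})
```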

visit_replace_infinity(expr)

Create a transformation from a ReplaceInfinity query expression.

Parameters

expr (tmlt.analytics.query_expr.ReplaceInfinity) –

Return type

Output

visit_drop_infinity(expr)

Create a transformation from a DropInfinity query expression.

Parameters

expr (tmlt.analytics.query_expr.DropInfinity) –

Return type

Output

visit_drop_null_and_nan(expr)

Create a transformation from a DropNullAndNan query expression.

Parameters

expr (tmlt.analytics.query_expr.DropNullAndNan) –

Return type

Output

visit_enforce_constraint(expr)

Create a transformation from an EnforceConstraint query expression.

Parameters

expr (tmlt.analytics.query_expr.EnforceConstraint) –

Return type

Output

abstract visit_get_groups(expr)

Visit a GetGroups query expression (raises an error).

Parameters

expr (tmlt.analytics.query_expr.GetGroups) –

Return type

Any

abstract visit_groupby_count(expr)

Visit a GroupByCount query expression (raises an error).

Parameters

expr (tmlt.analytics.query_expr.GroupByCount) –

Return type

Any

abstract visit_groupby_count_distinct(expr)

Visit a GroupByCountDistinct query expression (raises an error).

Parameters

expr (tmlt.analytics.query_expr.GroupByCountDistinct) –

Return type

Any

abstract visit_groupby_quantile(expr)

Visit a GroupByQuantile query expression (raises an error).

Parameters

expr (tmlt.analytics.query_expr.GroupByQuantile) –

Return type

Any

abstract visit_groupby_bounded_sum(expr)

Visit a GroupByBoundedSum query expression (raises an error).

Parameters

expr (tmlt.analytics.query_expr.GroupByBoundedSum) –

Return type

Any

abstract visit_groupby_bounded_average(expr)

Visit a GroupByBoundedAverage query expression (raises an error).

Parameters

expr (tmlt.analytics.query_expr.GroupByBoundedAverage) –

Return type

Any

abstract visit_groupby_bounded_variance(expr)

Visit a GroupByBoundedVariance query expression (raises an error).

Parameters

expr (tmlt.analytics.query_expr.GroupByBoundedVariance) –

Return type

Any

abstract visit_groupby_bounded_stdev(expr)

Visit a GroupByBoundedSTDEV query expression (raises an error).

Parameters

expr (tmlt.analytics.query_expr.GroupByBoundedSTDEV) –

Return type

Any
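The aggregation methods above are documented as raising an error, which fits a visitor that only builds transformations. The pattern can be sketched as follows. `ToyTransformationVisitor` is hypothetical, expressions are plain strings for brevity, and the choice of NotImplementedError is an assumption because this page does not name the exception type.

```python
class ToyTransformationVisitor:
    """Hypothetical visitor that handles transformations but not aggregations."""

    def visit_filter(self, expr: str) -> str:
        # Transformation-style expressions are handled normally.
        return f"filter[{expr}]"

    def visit_groupby_count(self, expr: str) -> None:
        # Aggregations are out of scope for this visitor, so fail loudly.
        raise NotImplementedError(
            f"ToyTransformationVisitor cannot handle aggregation {expr!r}"
        )


visitor = ToyTransformationVisitor()
filtered = visitor.visit_filter("age > 18")

try:
    visitor.visit_groupby_count("count by state")
    raised = False
except NotImplementedError:
    raised = True
```

Failing explicitly in these methods surfaces a misrouted aggregation query immediately instead of letting it produce a meaningless result.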