_neighboring_relation#

Module containing supported variants of neighboring relations.

Classes#

NeighboringRelation

Base class for a NeighboringRelation.

AddRemoveRows

A relation of tables differing by a limited number of rows.

AddRemoveRowsAcrossGroups

A relation of tables differing by a limited number of groups.

AddRemoveKeys

A relation of tables differing by a certain number of keys.

Conjunction

A conjunction composed of other neighboring relations.

NeighboringRelationVisitor

A base class for implementing visitors for NeighboringRelation.

class NeighboringRelation#

Bases: abc.ABC

Base class for a NeighboringRelation.

abstract validate_input(dfs)#

Does nothing if the input is valid; otherwise raises an informative exception.

Used only for top-level validation.

Exception types and common reasons:
  • TypeError: Input dictionary does not map table names to Spark DataFrames.

  • ValueError: Input dictionary contains an invalid number of items or contains invalid values.

  • KeyError: A table named in the relation doesn’t exist in the input dictionary.

Parameters

dfs (Dict[str, pyspark.sql.DataFrame]) –

Return type

bool

abstract accept(visitor)#

Returns the result of visiting this relation with the given visitor.

Parameters

visitor (NeighboringRelationVisitor) –

Return type

Any

class AddRemoveRows#

Bases: NeighboringRelation

A relation of tables differing by a limited number of rows.

Two tables are considered neighbors under this relation if they differ by at most n rows.
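The condition can be checked on plain row multisets: count every row that appears in one table but not the other (with multiplicity) and compare the total against n. The following self-contained sketch illustrates the semantics; the function name and row representation are illustrative, not the library's internal check:

```python
from collections import Counter

def differ_by_at_most_n_rows(rows_a, rows_b, n):
    """Return True if the two row multisets differ by at most n rows.

    Counts rows present in one table but not the other, with
    multiplicity, via Counter (multiset) subtraction.
    """
    a, b = Counter(rows_a), Counter(rows_b)
    num_differing = sum(((a - b) + (b - a)).values())
    return num_differing <= n

# Two tables differing by exactly one row:
t1 = [("alice", 5), ("bob", 3)]
t2 = [("alice", 5), ("bob", 3), ("carol", 7)]
```

Here `t1` and `t2` are neighbors under AddRemoveRows with n=1, but not with n=0.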

table :str#

The name of the table in this relation.

n :int#

The maximum number of rows which may differ for two instances of the table to be neighbors.

validate_input(dfs)#

Does nothing if the input is valid; otherwise raises an informative exception.

Used only for top-level validation.

Parameters

dfs (Dict[str, pyspark.sql.DataFrame]) –

Return type

bool

accept(visitor)#

Visit this NeighboringRelation with a Visitor.

Parameters

visitor (NeighboringRelationVisitor) –

Return type

Any

class AddRemoveRowsAcrossGroups#

Bases: NeighboringRelation

A relation of tables differing by a limited number of groups.

Two tables are considered neighbors under this relation if they differ by at most max_groups groups, with each differing group differing by no more than per_group rows.
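To make the two-level condition concrete, the sketch below partitions rows by the grouping column and then applies the per-group row count. Row tuples and the function name are illustrative assumptions, not the library's internal check:

```python
from collections import Counter, defaultdict

def differ_by_groups(rows_a, rows_b, group_index, max_groups, per_group):
    """Check the AddRemoveRowsAcrossGroups condition on plain row lists.

    Rows are tuples; group_index selects the grouping column. Tables are
    neighbors if at most max_groups groups differ, and each differing
    group differs by no more than per_group rows.
    """
    by_group_a, by_group_b = defaultdict(Counter), defaultdict(Counter)
    for row in rows_a:
        by_group_a[row[group_index]][row] += 1
    for row in rows_b:
        by_group_b[row[group_index]][row] += 1

    num_differing_groups = 0
    for key in set(by_group_a) | set(by_group_b):
        a, b = by_group_a[key], by_group_b[key]
        delta = sum(((a - b) + (b - a)).values())  # differing rows in group
        if delta > per_group:
            return False
        if delta > 0:
            num_differing_groups += 1
    return num_differing_groups <= max_groups

# Group "g1" differs by one row; group "g2" is unchanged.
t1 = [("g1", 1), ("g1", 2), ("g2", 3)]
t2 = [("g1", 1), ("g2", 3)]
```

With max_groups=1 and per_group=1, `t1` and `t2` are neighbors; with max_groups=0 they are not.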

table :str#

The name of the table in this relation.

grouping_column :str#

The column that must be grouped over for the privacy guarantee to hold.

max_groups :int#

The maximum number of groups which may differ for two instances of the table to be neighbors.

per_group :int#

The max number of rows in any single group that may differ for two instances of the table to be neighbors.

validate_input(dfs)#

Does nothing if the input is valid; otherwise raises an informative exception.

Used only for top-level validation.

Parameters

dfs (Dict[str, pyspark.sql.DataFrame]) –

Return type

bool

accept(visitor)#

Visit this NeighboringRelation with a Visitor.

Parameters

visitor (NeighboringRelationVisitor) –

Return type

Any

class AddRemoveKeys#

Bases: NeighboringRelation

A relation of tables differing by a certain number of keys.

Two tables are considered neighbors under this relation if they differ only by the addition or removal of all rows carrying at most max_keys distinct values in the indicated key columns.

Note that AddRemoveKeys is a neighboring relation that covers multiple tables.
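For a single table, the condition means that every key whose rows change must be wholly present in one table and wholly absent from the other; partial changes to a key's rows are not allowed. A self-contained sketch (names and row tuples are illustrative, not the library's internal check):

```python
from collections import Counter

def differ_by_keys(rows_a, rows_b, key_index, max_keys):
    """Check the AddRemoveKeys condition for one table of the relation.

    Neighbors may differ only by adding/removing *all* rows carrying up
    to max_keys distinct key values; rows for shared keys must match.
    """
    a, b = Counter(rows_a), Counter(rows_b)
    # Keys appearing in any row that differs between the two tables.
    changed_keys = {row[key_index] for row in ((a - b) + (b - a))}
    for key in changed_keys:
        in_a = any(r[key_index] == key for r in a)
        in_b = any(r[key_index] == key for r in b)
        if in_a and in_b:  # key partially changed, not added/removed whole
            return False
    return len(changed_keys) <= max_keys

t1 = [("u1", 1), ("u1", 2), ("u2", 3)]
t2 = [("u2", 3)]            # all of u1's rows removed: neighbors
t3 = [("u1", 1), ("u2", 3)]  # only part of u1's rows removed: not neighbors
```

`t1` and `t2` are neighbors with max_keys=1, while `t1` and `t3` are not, since key "u1" still has rows in both.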

id_space :str#

The identifier space protected in the relation.

table_to_key_column :Dict[str, str]#

A dictionary mapping table names to key columns.

max_keys :int#

The maximum number of keys which may differ for two instances of the table to be neighbors.

validate_input(dfs)#

Does nothing if the input is valid; otherwise raises an informative exception.

Used only for top-level validation.

Parameters

dfs (Dict[str, pyspark.sql.DataFrame]) –

Return type

bool

accept(visitor)#

Visit this NeighboringRelation with a Visitor.

Parameters

visitor (NeighboringRelationVisitor) –

Return type

Any

class Conjunction(*children)#

Bases: NeighboringRelation

A conjunction composed of other neighboring relations.

children :List[NeighboringRelation]#

Other neighboring relations composing the Conjunction. Children can be provided either as a single list or as separate arguments; if more than one list is provided, only the first is used.

__init__(*children)#

Constructor.

Return type

None

validate_input(dfs)#

Does nothing if the input is valid; otherwise raises an informative exception.

Parameters

dfs (Dict[str, pyspark.sql.DataFrame]) –

Return type

bool

accept(visitor)#

Visit this NeighboringRelation with a Visitor.

Parameters

visitor (NeighboringRelationVisitor) –

Return type

Any

class NeighboringRelationVisitor#

Bases: abc.ABC

A base class for implementing visitors for NeighboringRelation.
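A concrete visitor implements one visit_* method per relation type, and each relation's accept dispatches to the matching method. The self-contained sketch below re-creates that double-dispatch pattern in plain Python; the class bodies are illustrative stand-ins, not the library's implementations:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, List

class NeighboringRelation(ABC):
    @abstractmethod
    def accept(self, visitor) -> Any: ...

@dataclass
class AddRemoveRows(NeighboringRelation):
    table: str
    n: int
    def accept(self, visitor):
        return visitor.visit_add_remove_rows(self)

@dataclass
class Conjunction(NeighboringRelation):
    # Simplified: takes a list directly rather than *children.
    children: List[NeighboringRelation]
    def accept(self, visitor):
        return visitor.visit_conjunction(self)

class Describer:
    """A concrete visitor that renders a relation as a string."""
    def visit_add_remove_rows(self, relation):
        return f"{relation.table}: +/- {relation.n} rows"
    def visit_conjunction(self, relation):
        # Recurse into children through their own accept methods.
        return " AND ".join(c.accept(self) for c in relation.children)

rel = Conjunction([AddRemoveRows("orders", 1), AddRemoveRows("users", 2)])
print(rel.accept(Describer()))  # orders: +/- 1 rows AND users: +/- 2 rows
```

Because dispatch happens through accept, new behaviors (validation, rendering, translation to a core measure) can be added as new visitors without modifying the relation classes themselves.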

abstract visit_add_remove_rows(relation)#

Visit an AddRemoveRows.

Parameters

relation (AddRemoveRows) –

Return type

Any

abstract visit_add_remove_rows_across_groups(relation)#

Visit an AddRemoveRowsAcrossGroups.

Parameters

relation (AddRemoveRowsAcrossGroups) –

Return type

Any

abstract visit_add_remove_keys(relation)#

Visit an AddRemoveKeys.

Parameters

relation (AddRemoveKeys) –

Return type

Any

abstract visit_conjunction(relation)#

Visit a Conjunction.

Parameters

relation (Conjunction) –

Return type

Any