protected_change#
Types for programmatically specifying what changes in input tables are protected.
Classes#
ProtectedChange | A description of the largest change in a dataset that is protected under DP.
AddMaxRows | Protect the addition or removal of any set of max_rows rows.
AddOneRow | A shorthand for the common case of AddMaxRows with max_rows = 1.
AddMaxRowsInMaxGroups | Protect the addition or removal of rows across a finite number of groups.
AddRowsWithID | Protect the addition or removal of rows with a specific identifier.
- class ProtectedChange#
Bases: abc.ABC
A description of the largest change in a dataset that is protected under DP.
A ProtectedChange describes, for a particular table, the largest change that can be made to that table while still being indistinguishable under Tumult Analytics’ DP guarantee. The appropriate protected change is the one corresponding to the largest possible change to the table when adding or removing a unit of protection, e.g. a person. For more information, see the privacy promise topic guide.
- class AddMaxRows#
Bases: ProtectedChange
Protect the addition or removal of any set of max_rows rows.
This ProtectedChange is a generalization of the standard “add/remove one row” DP guarantee, hiding the addition or removal of any set of at most max_rows rows from a table.
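To make this guarantee concrete, the sketch below enumerates every table reachable by removing at most max_rows rows and checks how far a plain row count can move. It is a plain-Python illustration only, not Tumult Analytics code: the helper names `neighbors_by_removal` and `max_count_change` and the sample table are hypothetical.

```python
from itertools import combinations


def neighbors_by_removal(rows, max_rows):
    """Yield every table obtained by removing between 1 and max_rows rows."""
    for k in range(1, max_rows + 1):
        for removed in combinations(range(len(rows)), k):
            removed_set = set(removed)
            yield [r for i, r in enumerate(rows) if i not in removed_set]


def max_count_change(rows, max_rows):
    """Largest change in the row count over all neighboring tables."""
    return max(
        abs(len(rows) - len(neighbor))
        for neighbor in neighbors_by_removal(rows, max_rows)
    )


# Hypothetical five-row table; the row contents do not matter for a count.
table = ["a", "b", "c", "d", "e"]
worst_case = max_count_change(table, max_rows=2)
```

Under these assumptions the count moves by at most max_rows, which is the quantity the added noise has to hide.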
- class AddOneRow#
Bases: AddMaxRows
A shorthand for the common case of AddMaxRows with max_rows = 1.
- max_rows = 1#
The maximum number of rows that may be added or removed.
- class AddMaxRowsInMaxGroups#
Bases: ProtectedChange
Protect the addition or removal of rows across a finite number of groups.
AddMaxRowsInMaxGroups provides a similar guarantee to AddMaxRows, but it uses some additional information to apply less noise in some cases. That information is about groups: collections of rows which share the same value in a particular column. That column would typically be some kind of categorical value, for example a state where a person lives or has lived. Instead of specifying a maximum total number of rows that may be added or removed, AddMaxRowsInMaxGroups limits the number of rows that may be added or removed in any particular group, as well as the maximum total number of groups that may be affected. If these limits are meant to correspond to the maximum contribution of a specific entity to the dataset, that must be enforced before the data is passed to Tumult Analytics.
AddMaxRowsInMaxGroups is intended for advanced use cases, and its use should be considered carefully. Note that it only provides improved accuracy when used with zCDP; with pure DP, it is equivalent to using AddMaxRows with the same total number of rows to be added/removed.
The most common case where AddMaxRowsInMaxGroups is useful is for dealing with datasets that have already undergone some type of preprocessing before being turned over to an analyst. Where possible, it is preferred to do such processing inside of Tumult Analytics instead, as it allows specifying a simpler protected change (e.g. AddRowsWithID) and relying on Analytics’ privacy tracking to handle the complex parts of the analysis.
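The difference between zCDP and pure DP noted above can be illustrated with a small sensitivity calculation for a count grouped by the same column. This is a sketch under stated assumptions, not a description of Tumult Analytics internals: it assumes pure DP noise is calibrated to L1 sensitivity and zCDP noise to L2 sensitivity (the usual Laplace and Gaussian setups), and the function names are hypothetical.

```python
import math


def l1_sensitivity(max_groups, max_rows_per_group):
    """L1 change of per-group counts: every affected group moves by the max."""
    return max_groups * max_rows_per_group


def l2_sensitivity(max_groups, max_rows_per_group):
    """L2 change of per-group counts: Euclidean norm over affected groups."""
    return max_rows_per_group * math.sqrt(max_groups)


max_groups, max_rows_per_group = 4, 3

# The comparable total-row bound, as used with a plain row limit.
total_rows = max_groups * max_rows_per_group

# With only a total row bound, all rows may fall in a single group,
# so both norms of the change equal the total.
l1_total_only = total_rows
l2_total_only = total_rows

l1_grouped = l1_sensitivity(max_groups, max_rows_per_group)
l2_grouped = l2_sensitivity(max_groups, max_rows_per_group)
```

In this sketch the L1 sensitivities match (12 in both cases), while the L2 sensitivity drops from 12 to 6, which is consistent with the accuracy gain appearing only under zCDP.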
- class AddRowsWithID#
Bases: ProtectedChange
Protect the addition or removal of rows with a specific identifier.
Instead of limiting the number of rows that may be added or removed, AddRowsWithID hides the addition or removal of all rows with the same value in the specified column.
The ID column must be a string, integer (or long), or date; it cannot be a float or a timestamp.
- id_space: str = 'default_id_space'#
The identifier space of the rows that may be added or removed. If not specified, a default will be assigned when using this protected change with Session.from_dataframe().
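To illustrate what AddRowsWithID protects, the sketch below builds the neighboring table obtained by dropping every row that shares one identifier. It is plain Python rather than Tumult Analytics code: the `remove_id` helper, the `user_id` column, and the sample rows are all hypothetical.

```python
from collections import Counter


def remove_id(rows, id_column, id_value):
    """Return the table with every row carrying the given identifier removed."""
    return [row for row in rows if row[id_column] != id_value]


# Hypothetical table keyed by a string ID column.
visits = [
    {"user_id": "u1", "page": "home"},
    {"user_id": "u1", "page": "cart"},
    {"user_id": "u1", "page": "pay"},
    {"user_id": "u2", "page": "home"},
]

# The neighboring table differs by all of u1's rows, however many there are.
neighbor = remove_id(visits, "user_id", "u1")
rows_removed = len(visits) - len(neighbor)

# Number of rows each identifier contributes to the original table.
rows_per_id = Counter(row["user_id"] for row in visits)
```

Unlike a row-count limit, the number of rows that differ between the two tables here depends on the data: it equals however many rows the identifier contributes.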