Specifying privacy guarantees#
The Session is the main object used to specify formal privacy guarantees on sensitive data. Users specify privacy guarantees at Session initialization time, using one protected change per sensitive table, and an overall privacy budget. Together, these define the formal guarantee that the Session then enforces.
Once the Session is initialized, it ensures that all future interactions with it satisfy the specified privacy guarantee. In particular, queries evaluated using evaluate() cannot consume more than the specified privacy budget.
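To make the budget-enforcement behavior concrete, here is a minimal, library-independent sketch. It is a toy model and not the Session implementation: `ToySession`, `BudgetExceededError`, and the per-query `epsilon` argument are hypothetical names chosen for illustration, and the sketch assumes a single pure-DP epsilon with simple additive accounting.

```python
class BudgetExceededError(Exception):
    """Raised when a query asks for more budget than remains."""


class ToySession:
    """Toy stand-in for a session that tracks one epsilon budget.

    Hypothetical illustration only; not the real Session class.
    """

    def __init__(self, total_epsilon: float) -> None:
        if total_epsilon <= 0:
            raise ValueError("total_epsilon must be positive")
        self.remaining_epsilon = total_epsilon

    def evaluate(self, query_name: str, epsilon: float) -> str:
        """Charge `epsilon` against the budget, or refuse the query."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining_epsilon:
            # Refuse before doing any work, so the guarantee is never exceeded.
            raise BudgetExceededError(
                f"{query_name} needs {epsilon}, only "
                f"{self.remaining_epsilon} remains"
            )
        self.remaining_epsilon -= epsilon
        return f"evaluated {query_name} with epsilon={epsilon}"


session = ToySession(total_epsilon=1.0)
session.evaluate("count", epsilon=0.5)
session.evaluate("sum", epsilon=0.5)
# At this point the budget is exhausted; any further call to
# session.evaluate(...) raises BudgetExceededError.
```

In this sketch, the budget check happens before the query runs, which mirrors the idea that the guarantee is fixed at initialization and every later interaction is checked against it.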
A simple introduction to Session initialization and use can be found in the first and second tutorials. More details on the exact privacy promise provided by the Session can be found in the Privacy promise topic guide.
Session#
The Session is the fundamental abstraction used to enforce formal privacy guarantees on sensitive data.
Allows differentially private query evaluation on sensitive data.
Initializing the Session#
Sessions can be initialized using the from_dataframe() method, or using a Builder.
Initializes a DP session from a Spark dataframe.
Builder for |
Protected changes#
Each private table in a Session needs a protected change, which describes the maximal change in a table that will be protected by the privacy guarantees.
Base class describing the change in a dataset that is protected under DP.
Protects the addition or removal of a single row.
Protects the addition or removal of any set of |
Protects the addition or removal of rows across a finite number of groups.
Protects the addition or removal of rows with a specific identifier.
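The choice of protected change determines how much noise a query needs. The sketch below illustrates this for a simple count, under stated assumptions: the protected change is the addition or removal of up to `max_rows` rows (a hypothetical parameter name here), the budget is a pure-DP epsilon, and noise comes from the textbook Laplace mechanism. None of the function names belong to the library, and the floating-point sampler is for illustration only.

```python
import random


def count_sensitivity(max_rows: int) -> int:
    """Adding or removing up to `max_rows` rows changes a count by at most
    `max_rows`."""
    if max_rows < 1:
        raise ValueError("max_rows must be at least 1")
    return max_rows


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Scale of Laplace noise for the Laplace mechanism: sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def noisy_count(true_count: int, max_rows: int, epsilon: float,
                rng: random.Random) -> float:
    """Return the count plus Laplace noise calibrated to the protected change.

    Toy sampler: naive floating-point sampling is not suitable for production.
    """
    scale = laplace_scale(count_sensitivity(max_rows), epsilon)
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


rng = random.Random(0)
one_row = noisy_count(1000, max_rows=1, epsilon=0.5, rng=rng)
five_rows = noisy_count(1000, max_rows=5, epsilon=0.5, rng=rng)
```

Protecting a larger change at the same epsilon multiplies the noise scale by the same factor, which is why the protected change should be as small as the data model allows.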
Privacy budgets#
Finally, the Session must be initialized with a privacy budget, which quantifies the maximum privacy loss of a differentially private program. There are different kinds of privacy budgets, depending on which variant of differential privacy is used for this quantification.
Base class for specifying the maximal privacy loss of a Session or a query.
A privacy budget under pure differential privacy.
A privacy budget under approximate differential privacy.
A privacy budget under rho-zero-concentrated differential privacy.
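The three variants are related by standard conversion results from the zCDP literature (Bun and Steinke, 2016), which can help when comparing budgets expressed in different variants. The sketch below implements two of them. The function names are hypothetical and are not part of the Session API; this is a plain-Python illustration of the formulas, not a description of how the library converts budgets internally.

```python
import math


def pure_dp_to_zcdp(epsilon: float) -> float:
    """Any epsilon-DP mechanism also satisfies rho-zCDP with rho = epsilon^2 / 2."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return 0.5 * epsilon ** 2


def zcdp_to_approx_dp(rho: float, delta: float) -> float:
    """Any rho-zCDP mechanism also satisfies (epsilon, delta)-DP with
    epsilon = rho + 2 * sqrt(rho * ln(1 / delta)), for any 0 < delta < 1."""
    if rho <= 0:
        raise ValueError("rho must be positive")
    if not 0 < delta < 1:
        raise ValueError("delta must be strictly between 0 and 1")
    return rho + 2.0 * math.sqrt(rho * math.log(1.0 / delta))


rho = pure_dp_to_zcdp(1.0)
epsilon_at_delta = zcdp_to_approx_dp(rho, delta=1e-6)
```

Both conversions go in one direction only: they give a guarantee that is implied by the original one, and they are not generally tight.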
Inspecting Session state#
The Session provides multiple properties and methods allowing users to inspect its state.
Returns the IDs of the private sources.
Returns the IDs of the public sources.
Returns a dictionary of public source DataFrames.
Returns the remaining privacy_budget left in the session.
Describes this session, or one of its tables, or the result of a query.
Inspecting specific sources#
The schema and properties of each table in a Session can be inspected using the following methods.
Returns the schema for any data source.
Returns the column types for any data source.
Returns an optional column that must be grouped by in this query.
Returns the ID column of a table, if it has one.
Returns the ID space of a table, if it has one.
Evaluating queries with the Session#
Once a Session is initialized, users can build queries and evaluate them using the relevant Session methods.