Tuning programs#
Note
The features described on this page are only available in a paid version of the Tumult Platform. To learn more, please contact us at info@tmlt.io.
Parameter tuning and optimization in Tumult is done with the SessionProgramTuner class, an abstract base class that defines the interface for tuning SessionPrograms.
To tune a specific program, users should subclass SessionProgramTuner, passing their SessionProgram as the program class argument.
>>> from pyspark.sql import DataFrame, functions as sf
>>> # The Tumult classes used below (SessionProgram, SessionProgramTuner, ...) are assumed to be imported already.
>>> class Program(SessionProgram):
...     class ProtectedInputs:
...         protected_df: DataFrame
...     class Outputs:
...         b_sum: DataFrame
...     class Parameters:
...         low: int
...         high: int
...     def session_interaction(self, session: Session):
...         # Clamping bounds for the sum come from the program parameters.
...         low = self.parameters["low"]
...         high = self.parameters["high"]
...         a_values = KeySet.from_dict({"a": ["x", "y"]})
...         sum_query = QueryBuilder("protected_df").groupby(a_values).sum("b", low, high)
...         b_sum = session.evaluate(sum_query, self.privacy_budget)
...         return {"b_sum": b_sum}
>>> class Tuner(SessionProgramTuner, program=Program):
...     # Custom metric: RMSE between the DP and baseline values of b_sum.
...     @joined_output_metric(name="root_mean_squared_error", output="b_sum", join_columns=["a"])
...     @staticmethod
...     def compute_rmse(joined_output: DataFrame):
...         err = sf.col("b_sum_dp") - sf.col("b_sum_baseline")
...         rmse = joined_output.agg(sf.sqrt(sf.avg(sf.pow(err, sf.lit(2)))).alias("rmse"))
...         return rmse.collect()[0]["rmse"]
...
...     # Built-in metric, reported under the name "mre".
...     metrics = [
...         MedianRelativeError(
...             output="b_sum",
...             measure_column="b_sum",
...             name="mre",
...             join_columns=["a"],
...         )
...     ]
Just like a SessionProgram, once a subclass of SessionProgramTuner is defined, it can be instantiated using the automatically-generated builder for that class. Unlike with a SessionProgram, you can pass Tunable objects to the builder methods instead of concrete values.
>>> protected_df = spark.createDataFrame([("x", 2), ("y", 4)], ["a", "b"])
>>> tuner = (
...     Tuner.Builder()
...     .with_privacy_budget(Tunable("budget"))
...     .with_private_dataframe("protected_df", protected_df, AddOneRow())
...     .with_parameter("low", 0)
...     .with_parameter("high", Tunable("high"))
...     .build()
... )
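One way to picture what a Tunable does is as a named placeholder that is swapped for a concrete value at run time. The sketch below is a plain-Python illustration of that idea only; the `Placeholder` class and `resolve` function are hypothetical and are not part of the Tumult API or a description of how it is implemented.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Placeholder:
    """Hypothetical stand-in for a named tunable input (not Tumult's Tunable)."""

    name: str


def resolve(settings: Dict[str, Any], values: Dict[str, Any]) -> Dict[str, Any]:
    """Replace every Placeholder in `settings` with its value from `values`."""
    resolved = {}
    for key, setting in settings.items():
        if isinstance(setting, Placeholder):
            if setting.name not in values:
                raise KeyError(f"no value supplied for tunable {setting.name!r}")
            resolved[key] = values[setting.name]
        else:
            resolved[key] = setting  # fixed values pass through unchanged
    return resolved


# Mirrors the builder call above: "low" is fixed, "high" is left open.
settings = {"low": 0, "high": Placeholder("high")}
concrete = resolve(settings, {"high": 1})
print(concrete)
```

In this sketch, a missing value raises a KeyError rather than silently falling back to a default, so an incomplete set of values is caught before anything runs.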
The run() method runs the program and returns the outputs of the DP and baseline programs.
>>> outputs = tuner.run({"budget": PureDPBudget(1), "high": 1})
The error_report() method on the tuner runs the program to get the DP and baseline outputs, and also computes the metrics defined in the Tuner class.
>>> tuner.error_report({"budget": PureDPBudget(1), "high": 1}).show()
Error report ran with budget PureDPBudget(epsilon=1) and the following tunable parameters:
budget: PureDPBudget(epsilon=1)
high: 1
and the following additional parameters:
low: 0
Metric results:
+---------+-------------------------+-------------------------------------------------------+
| Value | Metric | Description |
+=========+=========================+=======================================================+
| 0.5 | mre | Median relative error for column b_sum of table b_sum |
+---------+-------------------------+-------------------------------------------------------+
| 3.16 | root_mean_squared_error | User-defined metric (no description) |
+---------+-------------------------+-------------------------------------------------------+
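To make the two metrics in the report more concrete, here is a small plain-Python sketch of how a root mean squared error and a median relative error could be computed from per-group DP and baseline values. The numbers are hypothetical and are not the values behind the report above, and the sketch assumes that relative error means |dp - baseline| / |baseline|.

```python
import math
import statistics

# Hypothetical per-group results, keyed by the join column "a".
baseline = {"x": 2.0, "y": 4.0}
dp = {"x": 3.0, "y": 2.0}

# Root mean squared error: square the per-group errors, average, take the root.
errors = [dp[key] - baseline[key] for key in baseline]
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

# Median relative error: per-group |dp - baseline| / |baseline|, then the median.
relative_errors = [abs(dp[key] - baseline[key]) / abs(baseline[key]) for key in baseline]
median_relative_error = statistics.median(relative_errors)

print(round(rmse, 2), median_relative_error)
```

Because DP outputs are randomized, metric values like these will generally differ from one run to the next.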
Another illustrated example of how to use a SessionProgramTuner to tune parameters can be found in the Tuning parameters tutorial.
Defining a SessionProgramTuner#
Classes and methods that can be used or subclassed to define a SessionProgramTuner that can be used to measure error and tune a specific SessionProgram.
Base class to define tuners to evaluate and optimize DP programs.
Defining baselines#
Baselines can be specified using the baseline_options class variable, or the @baseline decorator.
Configuration for how baseline outputs are computed.
Decorator to define a custom baseline in a
Defining views#
Views can be specified using the views class variable, or the @view decorator.
Defining metrics#
Metrics can be specified using the metrics class variable, or by defining a custom method with a metric decorator like @metric, @single_output_metric, or @joined_output_metric. More information about metrics can be found in the API reference page about metrics.
A list of metrics to compute in each
Initializing a SessionProgramTuner#
User-defined subclasses of SessionProgramTuner can be instantiated with the automatically-generated Builder.
Each parameter can be specified either with a fixed value or with a Tunable, which allows error to be measured for different values of that parameter.
The builder for a specific subclass of SessionProgramTuner.
Named placeholder for a single input to a
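When several inputs are left tunable, a common approach is to measure error over every combination of candidate values. The sketch below builds such a grid in plain Python; the `tunable_grid` helper and the `budget_epsilon` key are hypothetical names chosen for illustration, not part of the Tumult API.

```python
from itertools import product
from typing import Any, Dict, Iterator, List


def tunable_grid(candidates: Dict[str, List[Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one {name: value} dict per combination of candidate values."""
    names = list(candidates)
    for combo in product(*(candidates[name] for name in names)):
        yield dict(zip(names, combo))


# Hypothetical candidate values for two tunable inputs.
candidates = {"budget_epsilon": [0.5, 1.0], "high": [1, 5, 10]}
grid = list(tunable_grid(candidates))

for values in grid:
    print(values)
```

Each dictionary in the grid could then be translated into the form that error_report() accepts, for example by wrapping the epsilon value in a PureDPBudget as in the example at the top of this page.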
Inspecting a SessionProgramTuner#
Methods to get information about an instance of SessionProgramTuner.
A subclass of
Return all baselines defined in the class.
Returns a list of tunable inputs associated with this tuner.
Returns the program.
Using a SessionProgramTuner#
Methods to use a SessionProgramTuner to compute outputs and generate error reports, and related classes.
A parameter value associated with a human-readable name.
Computes all outputs for a single run.
Computes a single error report.
Runs an error report for each set of values for the
The results of a single run of the DP program and the baselines.
A protected input that was used for an
An unprotected input that was used for an
Output of a single error report run.
Output of an error report run across multiple input combinations.
NoPrivacySession#
To compute baselines, the SessionProgramTuner relies on the NoPrivacySession, a class with the same interface as a Session, but which can run queries without any privacy guarantees.
Users should generally not use the NoPrivacySession directly.
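The relationship between a baseline and a DP result can be sketched in plain Python: the same clamped group-by sum is computed once exactly and once with Laplace noise added. This is an illustration only. The helper names are hypothetical, the noise sampler is not a secure DP mechanism, and nothing here describes how Session or NoPrivacySession are implemented.

```python
import random
from typing import Dict, List, Tuple

Rows = List[Tuple[str, float]]


def clamped_group_sums(rows: Rows, keys: List[str], low: float, high: float) -> Dict[str, float]:
    """Exact per-key sums after clamping each value to [low, high]."""
    sums = {key: 0.0 for key in keys}
    for key, value in rows:
        if key in sums:
            sums[key] += min(max(value, low), high)
    return sums


def laplace_noise(scale: float, rng: random.Random) -> float:
    """The difference of two exponentials with mean `scale` is Laplace(0, scale)."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_group_sums(
    rows: Rows, keys: List[str], low: float, high: float, epsilon: float, rng: random.Random
) -> Dict[str, float]:
    """The same sums with Laplace noise of scale sensitivity / epsilon."""
    # Adding or removing one row changes one sum by at most max(|low|, |high|).
    sensitivity = max(abs(low), abs(high))
    exact = clamped_group_sums(rows, keys, low, high)
    return {key: value + laplace_noise(sensitivity / epsilon, rng) for key, value in exact.items()}


rows = [("x", 2), ("y", 4)]
baseline = clamped_group_sums(rows, ["x", "y"], low=0, high=1)
dp = noisy_group_sums(rows, ["x", "y"], low=0, high=1, epsilon=1.0, rng=random.Random(0))
print(baseline, dp)
```

Both functions take the same inputs and return results of the same shape, which is what makes it possible to compare a DP output against its baseline group by group.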