Tuning programs

Note

The features described on this page are only available in a paid version of the Tumult Platform. If you would like to hear more, please contact us at info@tmlt.io.

Parameter tuning and optimization in Tumult are handled by the SessionProgramTuner class, an abstract base class that defines the interface for tuning SessionPrograms.

To tune a specific program, subclass SessionProgramTuner and pass the SessionProgram subclass as the program class argument.

>>> class Program(SessionProgram):
...     class ProtectedInputs:
...         protected_df: DataFrame
...     class Outputs:
...         b_sum: DataFrame
...     class Parameters:
...         low: int
...         high: int
...     def session_interaction(self, session: Session):
...         low = self.parameters["low"]
...         high = self.parameters["high"]
...         a_values = KeySet.from_dict({"a": ["x", "y"]})
...         sum_query = QueryBuilder("protected_df").groupby(a_values).sum("b", low, high)
...         b_sum = session.evaluate(sum_query, self.privacy_budget)
...         return {"b_sum": b_sum}
>>> class Tuner(SessionProgramTuner, program=Program):
...     @joined_output_metric(name="root_mean_squared_error", output="b_sum", join_columns=["a"])
...     @staticmethod
...     def compute_rmse(joined_output: DataFrame):
...         err = sf.col("b_sum_dp") - sf.col("b_sum_baseline")
...         rmse = joined_output.agg(sf.sqrt(sf.avg(sf.pow(err, sf.lit(2)))).alias("rmse"))
...         return rmse.collect()[0]["rmse"]
...
...     metrics = [
...         MedianRelativeError(
...             output="b_sum",
...             measure_column="b_sum",
...             name="mre",
...             join_columns=["a"],
...         )
...     ]

As with a SessionProgram, once a subclass of SessionProgramTuner is defined, it can be instantiated using the automatically generated builder for that class. Unlike a SessionProgram builder, the tuner's builder methods also accept Tunable objects in place of concrete values.

>>> protected_df = spark.createDataFrame([("x", 2), ("y", 4)], ["a", "b"])
>>> tuner = (
...     Tuner.Builder()
...     .with_privacy_budget(Tunable("budget"))
...     .with_private_dataframe("protected_df", protected_df, AddOneRow())
...     .with_parameter("low", 0)
...     .with_parameter("high", Tunable("high"))
...     .build()
... )

The run() method runs the program with the given values for the tunables and returns the outputs of the DP and baseline programs.

>>> outputs = tuner.run({"budget": PureDPBudget(1), "high": 1})

The error_report() method runs the program in the same way and returns the DP and baseline outputs together with the results of the metrics defined in the Tuner class.

>>> tuner.error_report({"budget": PureDPBudget(1), "high": 1}).show()
Error report ran with budget PureDPBudget(epsilon=1) and the following tunable parameters:
budget: PureDPBudget(epsilon=1)
high: 1
and the following additional parameters:
low: 0

Metric results:
+---------+-------------------------+-------------------------------------------------------+
|   Value | Metric                  | Description                                           |
+=========+=========================+=======================================================+
|    0.5  | mre                     | Median relative error for column b_sum of table b_sum |
+---------+-------------------------+-------------------------------------------------------+
|    3.16 | root_mean_squared_error | User-defined metric (no description)                  |
+---------+-------------------------+-------------------------------------------------------+
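
To make the two metric values in the report easier to interpret, the following sketch recomputes a root mean squared error and a median relative error in plain Python, without Spark or Tumult. The input values are hypothetical and do not come from a real run, and relative error is assumed here to be |DP - baseline| / |baseline|; the metrics reference is the authority on the exact definition used by MedianRelativeError.

```python
import math
import statistics

# Hypothetical grouped sums keyed by the join column "a"; not from a real run.
baseline = {"x": 10.0, "y": 20.0}
dp = {"x": 13.0, "y": 16.0}

# Join the two outputs on their shared keys, as a joined-output metric would.
keys = sorted(baseline)
errors = [dp[key] - baseline[key] for key in keys]

# Root mean squared error: square root of the mean of the squared errors.
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

# Median relative error, ASSUMING relative error = |dp - baseline| / |baseline|.
relative_errors = [abs(dp[key] - baseline[key]) / abs(baseline[key]) for key in keys]
mre = statistics.median(relative_errors)

print(f"rmse={rmse:.2f} mre={mre:.2f}")
```

Because the DP output is noisy, repeated runs of a real error report would generally give different metric values each time.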

Another illustrated example of how to use a SessionProgramTuner to tune parameters can be found in the Tuning parameters tutorial.

Defining a SessionProgramTuner

Classes and methods used to define a SessionProgramTuner that measures error for, and tunes, a specific SessionProgram.

SessionProgramTuner

Base class to define tuners to evaluate and optimize DP programs.

Defining baselines

Baselines can be specified using the baseline_options class variable, or the @baseline decorator.

`SessionProgramTuner.baseline_options`	Configuration for how baseline outputs are computed.
`baseline`(name)	Decorator to define a custom baseline in a `SessionProgramTuner`.

Defining views

Views can be specified using the views class variable, or the @view decorator.

`SessionProgramTuner.views`	A list of `View` on output tables.
`View`(name, func)	Wrapper to allow users to define a view of the output table.
`view`(name)	Views of the output table to be used across metrics in place of program outputs.

Defining metrics

Metrics can be specified using the metrics class variable, or by defining a custom method with a metric decorator like @metric, @single_output_metric, or @joined_output_metric. More information about metrics can be found in the API reference page about metrics.

SessionProgramTuner.metrics

A list of metrics to compute in each error_report.

Initializing a SessionProgramTuner

User-defined subclasses of SessionProgramTuner can be instantiated with the automatically generated Builder. Each parameter can be given either a fixed value or a Tunable placeholder, which allows error to be measured for different values of that parameter.

`SessionProgramTuner.Builder`()	The builder for a specific subclass of SessionProgramTuner.
`Tunable`(name)	Named placeholder for a single input to a `SessionProgramTuner.Builder`.
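
The placeholder idea can be illustrated in plain Python. In the sketch below, `Placeholder` and `resolve` are hypothetical names introduced only for illustration; they are not part of Tumult and say nothing about how Tunable is implemented. The sketch only shows how named placeholders can later be swapped for concrete values while fixed values pass through unchanged.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Placeholder:
    """Hypothetical stand-in for a named placeholder; NOT the Tumult Tunable class."""

    name: str


def resolve(inputs: Dict[str, Any], values: Dict[str, Any]) -> Dict[str, Any]:
    """Replace each placeholder with the value supplied for its name; keep fixed values."""
    resolved = {}
    for key, value in inputs.items():
        if isinstance(value, Placeholder):
            if value.name not in values:
                raise KeyError(f"no value supplied for tunable {value.name!r}")
            resolved[key] = values[value.name]
        else:
            resolved[key] = value
    return resolved


# Mirrors the builder example on this page: "low" is fixed, "high" is left open.
builder_inputs = {"low": 0, "high": Placeholder("high")}
concrete = resolve(builder_inputs, {"high": 1})
print(concrete)
```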

Inspecting a SessionProgramTuner

Methods to get information about an instance of SessionProgramTuner.

`SessionProgramTuner.program`	A subclass of `SessionProgram` to be tuned.
`SessionProgramTuner.get_baselines`()	Returns all baselines defined in the class.
`SessionProgramTuner.tunables`	Returns a list of tunable inputs associated with this tuner.
`SessionProgramTuner.get_concrete_program`()	Returns the program.

Using a SessionProgramTuner

Methods to use a SessionProgramTuner to compute outputs and generate error reports, and related classes.

`NamedValue`(value, name)	A parameter value associated with a human-readable name.
`SessionProgramTuner.run`([tunable_values])	Computes all outputs for a single run.
`SessionProgramTuner.error_report`([spec])	Computes a single error report.
`SessionProgramTuner.multi_error_report`(...)	Runs an error report for each set of values for the `Tunable`s.

`RunOutputs`	The results of a single run of the DP program and the baselines.
`ProtectedInput`	A protected input that was used for an `ErrorReport`.
`UnprotectedInput`	An unprotected input that was used for an `ErrorReport`.
`ErrorReport`	Output of a single error report run.
`MultiErrorReport`	Output of an error report run across multiple input combinations.
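
Running one error report per set of tunable values amounts to evaluating every combination of candidate values. The sketch below shows that expansion in plain Python; `expand_grid` is a hypothetical helper, budgets are written as bare epsilon numbers rather than PureDPBudget objects, and the actual argument format accepted by multi_error_report is not shown on this page, so this is only an illustration of the idea.

```python
from itertools import product
from typing import Any, Dict, List


def expand_grid(candidates: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Return one dict of tunable values per combination of the candidate lists."""
    names = list(candidates)
    value_lists = [candidates[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Hypothetical candidate values for the "budget" and "high" tunables.
grid = expand_grid({"budget": [1, 2], "high": [1, 5, 10]})
for point in grid:
    print(point)
```

Each resulting dictionary has the same shape as the argument passed to run() and error_report() in the examples on this page, with one entry per tunable name.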

NoPrivacySession

To compute baselines, the SessionProgramTuner relies on the NoPrivacySession, a class that has the same interface as a Session but runs queries without any privacy guarantees. Users should generally not use the NoPrivacySession directly.

NoPrivacySession

Session-like class to evaluate queries without privacy guarantees.
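
As a rough picture of what a noise-free baseline looks like for the grouped sum in the example program, the sketch below computes the exact per-group sum in plain Python. It assumes the sum query clamps each value to [low, high] before adding; `clamped_group_sum` is a hypothetical helper and does not reflect how NoPrivacySession is implemented.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def clamped_group_sum(
    rows: Iterable[Tuple[str, float]], low: float, high: float
) -> Dict[str, float]:
    """Exact per-group sum after clamping each value to [low, high]; no noise added."""
    totals: Dict[str, float] = defaultdict(float)
    for key, value in rows:
        totals[key] += min(max(value, low), high)
    return dict(totals)


rows = [("x", 2), ("y", 4)]  # same rows as protected_df in the example on this page

# With high=1 both values are clamped down to 1, so the baseline itself is biased.
tight = clamped_group_sum(rows, low=0, high=1)
# With high=5 nothing is clamped and the baseline equals the true sums.
loose = clamped_group_sum(rows, low=0, high=5)
print(tight, loose)
```

Under this assumption, a tight clamping bound makes both the DP and baseline outputs differ from the true sums, which is one reason a bound such as high is worth tuning.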