_tuner#

Interface for tuning SessionPrograms.

Functions#

baseline()

Decorator to define a custom baseline method for SessionProgramTuner.

metric()

Decorator to define a custom metric method for SessionProgramTuner.

view()

Decorator to define a view of the output table that metrics can use in place of program outputs.

baseline(name)#

Decorator to define a custom baseline method for SessionProgramTuner.

To use the "default" baseline in addition to this custom baseline, you must separately specify "default": NoPrivacySession.Options() in the baseline_options class variable.

Parameters

name (str) – A name for the custom baseline.

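>>> from typing import Dict
>>> from pyspark.sql import DataFrame
>>> from tmlt.analytics.no_privacy_session import NoPrivacySession
>>> from tmlt.analytics.program import SessionProgram
>>> from tmlt.analytics.session import Session
>>> from tmlt.analytics.tuner import SessionProgramTuner, baseline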
>>> class Program(SessionProgram):
...     class ProtectedInputs:
...         protected_df: DataFrame
...     class UnprotectedInputs:
...         unprotected_df: DataFrame
...     class Outputs:
...         output_df: DataFrame
...     def session_interaction(self, session: Session):
...         ...
>>> class Tuner(SessionProgramTuner, program=Program):
...     @baseline("custom_baseline")
...     def custom_baseline(
...         protected_inputs: Dict[str, DataFrame],
...     ) -> Dict[str, DataFrame]:
...         ...
...     @baseline("another_custom_baseline")
...     def another_custom_baseline(
...         self,
...         protected_inputs: Dict[str, DataFrame],
...         unprotected_inputs: Dict[str, DataFrame],
...     ) -> Dict[str, DataFrame]:
...         # If the program has unprotected inputs or parameters, the custom
...         # baseline method can take them as an argument.
...         ...
...     baseline_options = {
...         "default": NoPrivacySession.Options()
...     }  # This is required to keep the default baseline

metric(name, output, description=None, baselines=None)#

Decorator to define a custom metric method for SessionProgramTuner.

Alternatively, you can use CustomSingleOutputMetric directly.

To use built-in metrics in addition to this custom metric, list them separately in the metrics class variable.

Parameters
  • name (str) – A name for the metric.

  • output (str) – The output to compute the metric for.

  • description (Optional[str]) – A description of the metric.

  • baselines (Optional[Union[str, List[str]]]) – The name(s) of the baseline(s) used for the error report. If None, all baselines defined on the tuner class are used, both custom baselines and those in baseline_options; if the tuner class defines none, the default baseline is used. If a string, only that baseline is used. If a list, only those baselines are used.
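The selection rules for the baselines parameter can be sketched in plain Python. This is an illustration of the rules as described, not the library's implementation; the helper name resolve_baselines and its configured argument are hypothetical.

```python
from typing import List, Optional, Union


def resolve_baselines(
    baselines: Optional[Union[str, List[str]]],
    configured: List[str],
) -> List[str]:
    """Return the baseline names a metric would be computed against.

    `configured` is a hypothetical list of every baseline name defined on
    the tuner class (custom baselines plus the keys of baseline_options).
    """
    if baselines is None:
        # No explicit choice: use everything configured, or fall back to
        # the default baseline when nothing is configured.
        return list(configured) if configured else ["default"]
    if isinstance(baselines, str):
        # A single name selects only that baseline.
        return [baselines]
    # A list selects exactly the named baselines.
    return list(baselines)
```

Under these assumptions, a metric declared with baselines="custom_baseline" would be reported against that baseline only, even if the tuner class also defines others.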

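>>> from pyspark.sql import DataFrame
>>> from tmlt.analytics.metrics import AbsoluteError
>>> from tmlt.analytics.program import SessionProgram
>>> from tmlt.analytics.session import Session
>>> from tmlt.analytics.tuner import SessionProgramTuner, metric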
>>> class Program(SessionProgram):
...     class ProtectedInputs:
...         protected_df: DataFrame
...     class UnprotectedInputs:
...         unprotected_df: DataFrame
...     class Outputs:
...         output_df: DataFrame
...     def session_interaction(self, session: Session):
...         ...  # Should return a dictionary of outputs keyed by "output_df"
>>> class Tuner(SessionProgramTuner, program=Program):
...     @metric(name="custom_metric", output="output_df")
...     def custom_metric(
...         dp_outputs: DataFrame, baseline_outputs: DataFrame
...     ):
...         # If the program has unprotected inputs and/or parameters, the custom
...         # metric method can take them as an argument.
...         ...
...     metrics = [
...         AbsoluteError(output="output_df", column="Y"),
...     ]  # This is required to use the built-in metrics

view(name)#

Decorator to define a view of the output table that metrics can use in place of program outputs.

Parameters

name (str) – A name for the output view.

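>>> from typing import Dict
>>> from pyspark.sql import DataFrame
>>> from tmlt.analytics.metrics import RelativeError
>>> from tmlt.analytics.program import SessionProgram
>>> from tmlt.analytics.session import Session
>>> from tmlt.analytics.tuner import SessionProgramTuner, view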
>>> class Program(SessionProgram):
...     class ProtectedInputs:
...         protected_df: DataFrame
...     class UnprotectedInputs:
...         unprotected_df: DataFrame
...     class Outputs:
...         output_df: DataFrame
...     def session_interaction(self, session: Session):
...         ...
>>> class Tuner(SessionProgramTuner, program=Program):
...     @view("output_view")
...     def custom_view1(
...         outputs: Dict[str, DataFrame],
...     ) -> DataFrame:
...         ...
...     @view("another_output_view")
...     def custom_view2(
...         self,
...         outputs: Dict[str, DataFrame],
...         unprotected_inputs: Dict[str, DataFrame],
...     ) -> DataFrame:
...         # If the program has unprotected inputs or parameters, the view method
...         # can take them as an argument.
...         ...
...     metrics = [
...         RelativeError("output_view", column="a_sum"),
...     ]  # The view can be used in place of an output when defining a metric

Classes#

Tunable

Named placeholder for a single input to a Builder.

TunablePrivateDataFrame

A private dataframe and its protected change.

TunableDataFrameMixin

Add tunable private and public dataframe support to a builder.

TunablePrivacyBudgetMixin

Add support for tunable privacy budgets to a builder.

SessionProgramTuner

Base class for defining an object to tune inputs to a SessionProgram.

class Tunable#

Named placeholder for a single input to a Builder.

Note

This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.

When a Tunable is passed to a Builder, it is replaced with the concrete value for the tunable parameter when SessionPrograms are built inside methods like error_report() and multi_error_report().
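The substitution described above can be pictured with a small stand-alone sketch. It does not use the library: the Placeholder class and the substitute helper are hypothetical stand-ins that only mimic the idea of swapping a named placeholder for a concrete value at run time.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Placeholder:
    """Hypothetical stand-in for Tunable: it only carries a name."""

    name: str


def substitute(
    builder_args: Dict[str, Any], tunable_values: Dict[str, Any]
) -> Dict[str, Any]:
    """Replace each Placeholder with the value registered under its name."""
    resolved = {}
    for key, value in builder_args.items():
        if isinstance(value, Placeholder):
            if value.name not in tunable_values:
                raise KeyError(f"No value provided for tunable '{value.name}'")
            resolved[key] = tunable_values[value.name]
        else:
            # Concrete arguments pass through unchanged.
            resolved[key] = value
    return resolved


# Hypothetical builder arguments: one placeholder, one concrete value.
args = {"privacy_budget": Placeholder("budget"), "source_id": "protected_df"}
resolved_args = substitute(args, {"budget": 1.0})
```

In this sketch, a placeholder with no matching entry raises an error, mirroring the requirement that every Tunable used to build a tuner has a value for each run.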

name :str#

Name of the tunable parameter.

class TunablePrivateDataFrame#

Bases: NamedTuple

A private dataframe and its protected change.

One or both of the dataframe and protected_change can be a Tunable.

class TunableDataFrameMixin#

Add tunable private and public dataframe support to a builder.

__init__()#

Constructor.

with_private_dataframe(source_id, dataframe, protected_change)#

Add a tunable private dataframe to the builder.

Parameters
  • source_id –

  • dataframe –

  • protected_change –

with_public_dataframe(source_id, dataframe)#

Add a tunable public dataframe to the builder.

Parameters
  • source_id –

  • dataframe –

with_id_space(id_space)#

Adds an identifier space.

This defines a space of identifiers that map 1-to-1 to the identifiers being protected by a table with the AddRowsWithID protected change. Any table with such a protected change must be a member of some identifier space.

Parameters

id_space (str) –

class TunablePrivacyBudgetMixin#

Add support for tunable privacy budgets to a builder.

__init__()#

Constructor.

with_privacy_budget(privacy_budget)#

Set the privacy budget for the object being built.

Parameters

privacy_budget (Union[tmlt.analytics.privacy_budget.PrivacyBudget, Tunable]) –

class SessionProgramTuner(builder)#

Base class for defining an object to tune inputs to a SessionProgram.

Note

This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.

SessionProgramTuners should not be directly constructed. Instead, users should create a subclass of SessionProgramTuner, then construct their SessionProgramTuner using the auto-generated Builder attribute of the subclass.

Parameters

builder (SessionProgramTuner.Builder) –

class Builder#

The builder for a specific subclass of SessionProgramTuner.

with_private_dataframe(source_id, dataframe, protected_change)#

Add a tunable private dataframe to the builder.

Parameters
  • source_id –

  • dataframe –

  • protected_change –

Return type

SessionProgramTuner

with_public_dataframe(source_id, dataframe)#

Add a tunable public dataframe to the builder.

Parameters
  • source_id –

  • dataframe –

Return type

SessionProgramTuner

with_parameter(name, value)#

Set the value of a parameter.

Parameters
  • name (str) –

  • value (Any) –

build()#

Returns an instance of the matching SessionProgramTuner subtype.

Return type

SessionProgramTuner

with_id_space(id_space)#

Adds an identifier space.

This defines a space of identifiers that map 1-to-1 to the identifiers being protected by a table with the AddRowsWithID protected change. Any table with such a protected change must be a member of some identifier space.

Parameters

id_space (str) –

with_privacy_budget(privacy_budget)#

Set the privacy budget for the object being built.

Parameters

privacy_budget (Union[tmlt.analytics.privacy_budget.PrivacyBudget, Tunable]) –

baseline_options :Optional[Union[Dict[str, tmlt.analytics.no_privacy_session.NoPrivacySession.Options], tmlt.analytics.no_privacy_session.NoPrivacySession.Options]]#

Configuration for how baseline outputs are computed.

By default, a SessionProgramTuner computes both the DP outputs and the baseline outputs for a SessionProgram to compute metrics. The baseline outputs are computed by calling the session_interaction() method with a NoPrivacySession. The baseline_options attribute allows you to override the default options for the NoPrivacySession used to compute the baseline. You can also specify multiple configurations to compute the baselines with different options. When multiple baseline configurations are specified, the metrics are computed with respect to each of the baseline configurations (unless specified otherwise in the metric definitions).

To override the default baseline options (see Options), you can set this to an Options object.

If you want to specify multiple baseline configurations, you can set this to a dictionary mapping baseline names to Options.
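The three accepted shapes of baseline_options (unset, a single Options object, or a dictionary of named Options) can be thought of as normalizing to one dictionary of named baselines. The sketch below is an assumption-laden illustration, not the library's code: the Options class is a stand-in for NoPrivacySession.Options, its label field is invented for the example, and registering a single Options object under the name "default" is an assumption. Custom baselines defined with the baseline decorator are ignored here.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass(frozen=True)
class Options:
    """Hypothetical stand-in for NoPrivacySession.Options."""

    label: str = "library defaults"


def normalize_baseline_options(
    baseline_options: Optional[Union[Options, Dict[str, Options]]],
) -> Dict[str, Options]:
    """Map every accepted form to a dictionary of baseline name -> Options."""
    if baseline_options is None:
        # Nothing configured: a single default baseline with default options.
        return {"default": Options()}
    if isinstance(baseline_options, Options):
        # One Options object overrides the options of the default baseline.
        return {"default": baseline_options}
    # A dictionary already names each baseline configuration.
    return dict(baseline_options)
```

With a dictionary of two entries, metrics would be computed against both baselines unless a metric restricts itself through its baselines argument.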

metrics :Optional[List[tmlt.analytics.metrics.Metric]]#

A list of metrics to compute in each error_report.

program :Type[tmlt.analytics.program.SessionProgram]#

A subclass of SessionProgram to be tuned.

__init__(builder)#

Constructor.

Warning

This constructor is not intended to be used directly. Use the automatically generated builder instead. It can be accessed using the Builder attribute of the subclass.

Parameters

builder (tmlt.analytics.tuner._tuner.SessionProgramTuner.Builder) –

property tunables#

Returns a list of tunable inputs associated with this tuner.

Return type

List[Tunable]

outputs(tunable_values=None)#

Computes all outputs for a single run.

Parameters

tunable_values (Optional[Dict[str, Any]]) – A dictionary mapping names of Tunables to concrete values to use for this run. Every Tunable used in building this tuner must have a value in this dictionary. This can be None only if no Tunables were used.

Return type

Tuple[Dict[str, pyspark.sql.DataFrame], Dict[str, Dict[str, pyspark.sql.DataFrame]]]
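The nested return type is easier to read with a concrete shape in mind. The sketch below assumes, based on the type alone, that the first element maps output names to DP outputs and the second maps baseline names to their own output dictionaries; this ordering is an assumption, and lists of row dictionaries stand in for Spark DataFrames. The fake_outputs function and all values in it are invented for illustration.

```python
from typing import Dict, List, Tuple

Table = List[dict]  # stand-in for pyspark.sql.DataFrame


def fake_outputs() -> Tuple[Dict[str, Table], Dict[str, Dict[str, Table]]]:
    """Return made-up data with the same nesting as the documented type."""
    dp_outputs = {"output_df": [{"a_sum": 10.4}]}
    baseline_outputs = {
        "default": {"output_df": [{"a_sum": 10.0}]},
        "custom_baseline": {"output_df": [{"a_sum": 9.0}]},
    }
    return dp_outputs, baseline_outputs


dp_outputs, baseline_outputs = fake_outputs()

# Compare the DP output with each baseline's version of the same output.
errors = {}
for baseline_name, outputs in baseline_outputs.items():
    for output_name, table in outputs.items():
        dp_value = dp_outputs[output_name][0]["a_sum"]
        errors[(baseline_name, output_name)] = abs(dp_value - table[0]["a_sum"])
```

The outer key of the second dictionary is the baseline name, so adding a baseline adds one more inner dictionary rather than more keys to the first.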

error_report(tunable_values=None)#

Computes DP outputs, baseline outputs, and metrics for a single run.

Parameters

tunable_values (Optional[Dict[str, Any]]) – A dictionary mapping names of Tunables to concrete values to use for this error report. Every Tunable used in building this tuner must have a value in this dictionary. This can be None only if no Tunables were used.

Return type

tmlt.analytics.tuner._error_report.ErrorReport

multi_error_report(tunable_values_list)#

Runs the error_report for each set of values for the Tunables.

Parameters

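tunable_values_list (List[Dict[str, Any]]) – A list of dictionaries, each mapping names of Tunables to concrete values to use for one error report.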

Return type

tmlt.analytics.tuner._error_report.MultiErrorReport
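One way to prepare the argument for multi_error_report is to expand a few candidate values per tunable into every combination. The helper below is a plain-Python sketch; tunable_grid and the tunable names "budget" and "threshold" are hypothetical and not part of the library.

```python
from itertools import product
from typing import Any, Dict, List


def tunable_grid(choices: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Expand per-tunable candidate values into one dict per combination."""
    names = list(choices)
    value_lists = [choices[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Two candidate values for each of two hypothetical tunables: 4 combinations.
tunable_values_list = tunable_grid(
    {"budget": [0.5, 1.0], "threshold": [10, 100]}
)
```

Each dictionary in the resulting list has the same form as the tunable_values argument of error_report, so the list could be passed as tunable_values_list for a tuner built with tunables of those names.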