_base#
Base classes for metrics.
Classes#
Metric | A generic metric.
MetricOutput | An output of a Metric with additional metadata.
SingleBaselineMetric | Base class for metrics computed from DP outputs and a single baseline’s outputs.
MultiBaselineMetric | Base class for metrics computed from DP outputs and multiple baseline outputs.
JoinedOutputMetric | Base class for metrics computed from join between single DP and baseline output.
ScalarMetric | Base class for metrics computed from outputs containing only one value.
- class Metric(name, description, baselines)#
Bases: abc.ABC
A generic metric.
Note
This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.
- __init__(name, description, baselines)#
Constructor.
- Parameters
baselines (Union[str, List[str], None]) – The name of the baseline program(s) used for the error report. If None, use all baselines specified as custom baselines and baseline options on the tuner class. If no baselines are specified on the tuner class, use the default baseline. If a string, use only that baseline. If a list, use only those baselines.
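The selection rules for baselines can be sketched in plain Python. This is only an illustration of the documented behavior, not the library's code: the helper `resolve_baselines`, its `tuner_baselines` argument, and the baseline name `"default"` are all assumptions made for the example.

```python
from typing import List, Union


def resolve_baselines(
    baselines: Union[str, List[str], None],
    tuner_baselines: List[str],
    default_baseline: str = "default",  # assumed name, for illustration only
) -> List[str]:
    """Hypothetical helper mirroring the documented rules for `baselines`."""
    if baselines is None:
        # Use every baseline configured on the tuner; if there are none,
        # fall back to the default baseline.
        return list(tuner_baselines) if tuner_baselines else [default_baseline]
    if isinstance(baselines, str):
        # A single name selects only that baseline.
        return [baselines]
    # A list selects exactly the named baselines.
    return list(baselines)


all_from_tuner = resolve_baselines(None, ["default", "clamped"])
fallback = resolve_baselines(None, [])
single = resolve_baselines("clamped", ["default", "clamped"])
several = resolve_baselines(["default", "clamped"], ["default", "clamped", "other"])
```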
- property baselines#
Returns the baselines used for the metric.
- abstract format(value)#
Converts value to human-readable format.
- Parameters
value (Any) –
- __call__(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes the given metric on the given DP and baseline outputs.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) – The differentially private outputs of the program.
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) – The outputs of the baseline programs.
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) – Optional public dataframes used in error computation.
program_parameters (Optional[Dict[str, Any]]) – Optional program specific parameters used in error computation.
- Return type
- class MetricOutput#
An output of a Metric with additional metadata.
Note
💡 This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.
- name :str#
The name of the metric.
- description :str#
The description of the metric.
- baseline :Union[str, List[str]]#
The name of the baseline program(s) used for the error report.
- metric :Metric#
The metric that was used.
- value :Any#
The value of the metric applied to the program outputs.
- format()#
Returns a string representation of this object.
- class SingleBaselineMetric(name, description, baselines)#
Base class for metrics computed from DP outputs and a single baseline’s outputs.
Note
This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.
Subclasses of SingleBaselineMetric define a compute_for_baseline method that maps the DP outputs and one baseline’s outputs to the metric value.
- __init__(name, description, baselines)#
Constructor.
- Parameters
baselines (Union[str, List[str], None]) – The name of the baseline program(s) used for the error report. If None, use all baselines specified as custom baselines and baseline options on the tuner class. If no baselines are specified on the tuner class, use the default baseline. If a string, use only that baseline. If a list, use only those baselines.
- property baselines#
Returns the baselines used for the metric.
- abstract format(value)#
Converts value to human-readable format.
- Parameters
value (Any) –
- __call__(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes the given metric on the given DP and baseline outputs.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) – The differentially private outputs of the program.
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) – The outputs of the baseline programs.
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) – Optional public dataframes used in error computation.
program_parameters (Optional[Dict[str, Any]]) – Optional program specific parameters used in error computation.
- Return type
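The subclassing pattern described for SingleBaselineMetric can be illustrated with a toy analogue that does not depend on Tumult Analytics or Spark. Everything here is an assumption made for the sketch: `ToySingleBaselineMetric`, `RowCountDifference`, the `evaluate` method, and the use of lists of dicts in place of `pyspark.sql.DataFrame`. In particular, the real return type of `__call__` is not stated above, so the dict returned by `evaluate` is a guess at a convenient shape.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

Table = List[Dict[str, Any]]  # stand-in for pyspark.sql.DataFrame


class ToySingleBaselineMetric(ABC):
    """Toy analogue: subclasses supply the per-baseline computation."""

    def __init__(self, name: str, description: str, baselines: List[str]):
        self.name = name
        self.description = description
        self.baselines = baselines

    @abstractmethod
    def compute_for_baseline(
        self, dp_outputs: Dict[str, Table], baseline_outputs: Dict[str, Table]
    ) -> Any:
        """Map the DP outputs and one baseline's outputs to a metric value."""

    @abstractmethod
    def format(self, value: Any) -> str:
        """Convert a metric value to a human-readable string."""

    def evaluate(
        self,
        dp_outputs: Dict[str, Table],
        all_baseline_outputs: Dict[str, Dict[str, Table]],
    ) -> Dict[str, Any]:
        # Assumption: one value per selected baseline, keyed by baseline name.
        return {
            baseline: self.compute_for_baseline(
                dp_outputs, all_baseline_outputs[baseline]
            )
            for baseline in self.baselines
        }


class RowCountDifference(ToySingleBaselineMetric):
    """Number of rows in a DP output minus the number in the baseline output."""

    def __init__(self, output: str, baselines: List[str]):
        super().__init__("row_count_diff", "DP rows minus baseline rows", baselines)
        self.output = output

    def compute_for_baseline(self, dp_outputs, baseline_outputs):
        return len(dp_outputs[self.output]) - len(baseline_outputs[self.output])

    def format(self, value):
        return f"{value:+d} rows"


dp = {"counts": [{"zip": "A", "n": 12}, {"zip": "B", "n": 7}]}
baseline_outputs = {
    "default": {
        "counts": [{"zip": "A", "n": 10}, {"zip": "B", "n": 8}, {"zip": "C", "n": 1}]
    }
}
metric = RowCountDifference("counts", ["default"])
results = metric.evaluate(dp, baseline_outputs)
formatted = metric.format(results["default"])
```

The base class owns the loop over baselines, so a subclass only has to describe how one DP/baseline pair is compared.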
- class MultiBaselineMetric(name, description, baselines)#
Base class for metrics computed from DP outputs and multiple baseline outputs.
Note
This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.
Subclasses of MultiBaselineMetric define a compute_for_multiple_baselines method that maps the DP outputs and a collection of outputs from several baselines to the metric value.
- __init__(name, description, baselines)#
Constructor.
- Parameters
baselines (Union[str, List[str], None]) – The name of the baseline program(s) used for the error report. If None, use all baselines specified as custom baselines and baseline options on the tuner class. If no baselines are specified on the tuner class, use the default baseline. If a string, use only that baseline. If a list, use only those baselines.
- compute(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes the given metric on the given DP and baseline outputs.
The baseline_outputs will already be filtered to only include the baselines that the metric is supposed to use.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) – The differentially private outputs of the program.
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) – The outputs of the baseline programs, after filtering to only include the baselines that the metric is supposed to use.
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) – Optional public dataframes used in error computation.
program_parameters (Optional[Dict[str, Any]]) – Optional program specific parameters used in error computation.
- Return type
- property baselines#
Returns the baselines used for the metric.
- abstract format(value)#
Converts value to human-readable format.
- Parameters
value (Any) –
- __call__(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes the given metric on the given DP and baseline outputs.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) – The differentially private outputs of the program.
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) – The outputs of the baseline programs.
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) – Optional public dataframes used in error computation.
program_parameters (Optional[Dict[str, Any]]) – Optional program specific parameters used in error computation.
- Return type
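For MultiBaselineMetric, the description of compute says the baseline outputs are filtered to the selected baselines before the metric sees them. A minimal sketch of that flow follows, using lists of dicts in place of Spark DataFrames. The helper `filter_baselines`, the output name `"total"`, the column `"value"`, and the choice of metric (smallest absolute gap to any baseline) are hypothetical and chosen only to make the example concrete.

```python
from typing import Any, Dict, List

Table = List[Dict[str, Any]]  # stand-in for pyspark.sql.DataFrame


def filter_baselines(
    baseline_outputs: Dict[str, Dict[str, Table]], selected: List[str]
) -> Dict[str, Dict[str, Table]]:
    """Keep only the outputs of the baselines this metric is meant to use."""
    return {
        name: outputs
        for name, outputs in baseline_outputs.items()
        if name in selected
    }


def compute_for_multiple_baselines(
    dp_outputs: Dict[str, Table], baseline_outputs: Dict[str, Dict[str, Table]]
) -> float:
    """Toy metric: smallest absolute gap between the DP total and any baseline."""
    dp_total = dp_outputs["total"][0]["value"]
    return min(
        abs(dp_total - outputs["total"][0]["value"])
        for outputs in baseline_outputs.values()
    )


dp = {"total": [{"value": 103}]}
all_baselines = {
    "default": {"total": [{"value": 100}]},
    "clamped": {"total": [{"value": 98}]},
    "unused": {"total": [{"value": 103}]},
}

# Only "default" and "clamped" are selected, so "unused" never reaches the metric.
filtered = filter_baselines(all_baselines, ["default", "clamped"])
gap = compute_for_multiple_baselines(dp, filtered)
```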
- class JoinedOutputMetric(output, join_columns, name, description, baselines)#
Bases: SingleBaselineMetric, abc.ABC
Base class for metrics computed from join between single DP and baseline output.
Note
This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.
Subclasses of JoinedOutputMetric define a compute_on_joined_output method that takes a single dataframe and returns the metric value. The dataframe is the result of an outer join between the DP and baseline output tables with the given name, on the given list of join columns.
- Parameters
- __init__(output, join_columns, name, description, baselines)#
Constructor.
- Parameters
baselines (Union[str, List[str], None]) – The name of the baseline program(s) used for the error report. If None, use all baselines specified as custom baselines and baseline options on the tuner class. If no baselines are specified on the tuner class, use the default baseline. If a string, use only that baseline. If a list, use only those baselines.
- check_join_key_uniqueness(joined_output)#
Check if the join keys uniquely identify rows in the joined DataFrame.
- Parameters
joined_output (pyspark.sql.DataFrame) –
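The outer join and the uniqueness check can be pictured with a small plain-Python model. This sketch uses lists of dicts rather than Spark DataFrames, and the function names, the `_dp` / `_baseline` column suffixes, and the single `value_column` argument are assumptions for illustration; the library's actual column naming is not shown in this page.

```python
from collections import Counter, defaultdict
from itertools import product
from typing import Any, Dict, List

Row = Dict[str, Any]


def outer_join(
    dp_rows: List[Row],
    baseline_rows: List[Row],
    join_columns: List[str],
    value_column: str,
) -> List[Row]:
    """Outer-join two tables on join_columns, keeping one value column per side."""

    def group(rows: List[Row]) -> Dict[tuple, List[Row]]:
        groups = defaultdict(list)
        for row in rows:
            groups[tuple(row[c] for c in join_columns)].append(row)
        return groups

    dp_groups, baseline_groups = group(dp_rows), group(baseline_rows)
    joined = []
    for key in sorted(set(dp_groups) | set(baseline_groups)):
        # A key missing on one side contributes an empty row, so that side's
        # value becomes None, as in an outer join.
        dp_side = dp_groups.get(key, [{}])
        baseline_side = baseline_groups.get(key, [{}])
        for dp_row, baseline_row in product(dp_side, baseline_side):
            joined.append(
                {
                    **dict(zip(join_columns, key)),
                    f"{value_column}_dp": dp_row.get(value_column),
                    f"{value_column}_baseline": baseline_row.get(value_column),
                }
            )
    return joined


def join_keys_are_unique(joined: List[Row], join_columns: List[str]) -> bool:
    """True if every combination of join-column values appears exactly once."""
    counts = Counter(tuple(row[c] for c in join_columns) for row in joined)
    return all(count == 1 for count in counts.values())


dp_rows = [{"zip": "A", "n": 12}, {"zip": "B", "n": 7}]
baseline_rows = [{"zip": "A", "n": 10}, {"zip": "C", "n": 3}]
joined = outer_join(dp_rows, baseline_rows, ["zip"], "n")
unique = join_keys_are_unique(joined, ["zip"])

# A duplicated key on one side yields duplicated keys in the joined table.
duplicated = outer_join(dp_rows, baseline_rows + [{"zip": "A", "n": 11}], ["zip"], "n")
unique_after_duplicate = join_keys_are_unique(duplicated, ["zip"])
```

Rows present on only one side survive the join with a missing value on the other, which is why a metric computed on the joined table has to decide how to treat missing entries.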
- compute_for_baseline(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes metric value.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) –
baseline_outputs (Dict[str, pyspark.sql.DataFrame]) –
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) –
program_parameters (Optional[Dict[str, Any]]) –
- property baselines#
Returns the baselines used for the metric.
- abstract format(value)#
Converts value to human-readable format.
- Parameters
value (Any) –
- __call__(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes the given metric on the given DP and baseline outputs.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) – The differentially private outputs of the program.
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) – The outputs of the baseline programs.
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) – Optional public dataframes used in error computation.
program_parameters (Optional[Dict[str, Any]]) – Optional program specific parameters used in error computation.
- Return type
- class ScalarMetric(output, name, description, column=None, baselines=None)#
Bases: SingleBaselineMetric, abc.ABC
Base class for metrics computed from outputs containing only one value.
Note
This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.
Subclasses of ScalarMetric define a compute_on_scalar method that takes two values, one from the DP output and one from the baseline output, each read from the given column of the given output, and returns a metric value. The given output must contain a single row in both the DP and baseline outputs.
- Parameters
- __init__(output, name, description, column=None, baselines=None)#
Constructor.
- Parameters
column (Optional[str], default: None) – The column to take the value from. If the given output has only one column, this argument may be omitted.
baselines (Union[str, List[str], None], default: None) – The name of the baseline program(s) used for the error report. If None, use all baselines specified as custom baselines and baseline options on the tuner class. If no baselines are specified on the tuner class, use the default baseline. If a string, use only that baseline. If a list, use only those baselines.
- compute_for_baseline(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Returns the metric value given the DP outputs and the baseline outputs.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) –
baseline_outputs (Dict[str, pyspark.sql.DataFrame]) –
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) –
program_parameters (Optional[Dict[str, Any]]) –
- Return type
Any
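The scalar case can be sketched as two steps: pull the single value out of each single-row output, then compare the two values. The code below is a plain-Python model under stated assumptions: tables are lists of dicts rather than Spark DataFrames, `extract_scalar` is a hypothetical helper, raising `ValueError` on malformed input is a guess at reasonable behavior, and absolute error is just one possible `compute_on_scalar`.

```python
from typing import Any, Dict, List, Optional

Table = List[Dict[str, Any]]  # stand-in for pyspark.sql.DataFrame


def extract_scalar(table: Table, column: Optional[str] = None) -> Any:
    """Return the single value held by a one-row table."""
    if len(table) != 1:
        raise ValueError("The output must contain exactly one row.")
    row = table[0]
    if column is None:
        # The column may be omitted only when there is exactly one to choose from.
        if len(row) != 1:
            raise ValueError("Specify a column when the output has several.")
        column = next(iter(row))
    return row[column]


def compute_on_scalar(dp_value: float, baseline_value: float) -> float:
    """Toy metric: absolute error between the DP and baseline values."""
    return abs(dp_value - baseline_value)


dp_outputs = {"total": [{"sum": 1042.0}]}
baseline_outputs = {"total": [{"sum": 1000.0}]}

dp_value = extract_scalar(dp_outputs["total"])
baseline_value = extract_scalar(baseline_outputs["total"], column="sum")
error = compute_on_scalar(dp_value, baseline_value)
```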
- property baselines#
Returns the baselines used for the metric.
- abstract format(value)#
Converts value to human-readable format.
- Parameters
value (Any) –
- __call__(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes the given metric on the given DP and baseline outputs.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) – The differentially private outputs of the program.
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) – The outputs of the baseline programs.
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) – Optional public dataframes used in error computation.
program_parameters (Optional[Dict[str, Any]]) – Optional program specific parameters used in error computation.
- Return type