SingleOutputMetric

from tmlt.tune import SingleOutputMetric
class tmlt.tune.SingleOutputMetric(name, func, description=None, baseline=None, output=None, grouping_columns=None, measure_column=None, empty_value=None)

Bases: Metric

A metric computed from a single output table, defined using a function.

The function, func, must accept the following parameters:

  • dp_output: The chosen DP output DataFrame.

  • baseline_output: The chosen baseline output DataFrame.

It may also have the following optional parameters:

  • result_column_name: If the function returns a DataFrame, the metric results should be in a column with this name.

  • unprotected_inputs: A dictionary containing the program’s unprotected inputs.

  • parameters: A dictionary containing the program’s parameters.

If the metric does not have grouping columns, the function must return a numeric value, a boolean, or a string. If the metric has grouping columns, then it must return a DataFrame. This DataFrame should contain the grouping columns, and exactly one additional column containing the metric value for each group. This column’s type should be numeric, boolean, or string.
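As a minimal sketch of a metric function that uses one of the optional parameters, the function below takes parameters in addition to the two required arguments and returns a boolean. The function name and the "tolerance" key are hypothetical and chosen only for illustration. The function relies on nothing but the count() method of the DataFrames it receives.

```python
from typing import Any, Mapping


def sizes_within_tolerance(
    dp_output: Any,  # the chosen DP output DataFrame
    baseline_output: Any,  # the chosen baseline output DataFrame
    parameters: Mapping[str, Any],  # the program's parameters
) -> bool:
    """Check whether the DP and baseline row counts are close enough."""
    # "tolerance" is a hypothetical parameter name, not one defined by the library.
    tolerance = parameters["tolerance"]
    difference = abs(baseline_output.count() - dp_output.count())
    return difference <= tolerance
```

Because this sketch assumes a metric without grouping columns, a plain boolean is an acceptable return value. A metric with grouping columns would instead return a DataFrame, as described above.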

Example

>>> import pandas as pd
>>> from pyspark.sql import DataFrame, SparkSession
>>> spark = SparkSession.builder.getOrCreate()
>>> # One DP output and one baseline output, each with a single row.
>>> dp_df = spark.createDataFrame(pd.DataFrame({"A": [5]}))
>>> dp_outputs = {"O": dp_df}
>>> baseline_df = spark.createDataFrame(pd.DataFrame({"A": [5]}))
>>> baseline_outputs = {"default": {"O": baseline_df}}
>>> def size_difference(dp_output: DataFrame, baseline_output: DataFrame):
...     # Positive when the baseline has more rows than the DP output.
...     return baseline_output.count() - dp_output.count()
>>> metric = SingleOutputMetric(
...     func=size_difference,
...     name="Output size difference",
...     description="Difference in number of rows.",
... )
>>> result = metric(dp_outputs, baseline_outputs).value
>>> result
0

property baseline: str | None

The name of the baseline specified in the constructor (if any).

property output: str | None

The name of the output specified in the constructor (if any).

get_baseline(baseline_outputs)

Returns the name of the single baseline this metric will be applied to.

Return type:

str

get_output(outputs)

Returns the name of the single output the metric will be applied to.

Return type:

str
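In the class-level example, neither an output name nor a baseline name is passed to the constructor, and each dictionary holds exactly one entry, so the choice is unambiguous. The sketch below shows one way such a selection could work. resolve_single_name is a hypothetical helper, and raising an error when several candidates exist but no name was given is an assumption, not confirmed library behavior.

```python
from typing import Any, Mapping, Optional


def resolve_single_name(
    available: Mapping[str, Any], requested: Optional[str] = None
) -> str:
    """Pick the one name a metric applies to (hypothetical helper)."""
    if requested is not None:
        # A name given up front must actually be present.
        if requested not in available:
            raise ValueError(f"{requested!r} is not among {sorted(available)}")
        return requested
    if len(available) != 1:
        # Assumed behavior: refuse to guess when the choice is ambiguous.
        raise ValueError(
            f"Expected exactly one candidate, found {len(available)}; "
            "specify a name explicitly."
        )
    return next(iter(available))
```

The same helper would serve for both outputs and baselines, since each is a dictionary keyed by name.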

get_column_name_from_baselines(baseline_outputs)

Returns the result column name for a given set of baseline outputs.

required_func_parameters()

Returns the required parameters to the metric function.

get_parameter_values(dp_outputs, baseline_outputs, unprotected_inputs, parameters)

Returns values for the function’s parameters.

Return type:

Dict[str, Any]
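One plausible way to assemble such a dictionary is to inspect the metric function's signature and pass only the optional arguments it declares. The sketch below illustrates that idea with the standard inspect module. build_call_kwargs is a hypothetical helper, not the library's implementation, and result_column_name is left out for brevity.

```python
import inspect
from typing import Any, Callable, Dict, Optional


def build_call_kwargs(
    func: Callable[..., Any],
    dp_output: Any,
    baseline_output: Any,
    unprotected_inputs: Optional[Dict[str, Any]] = None,
    parameters: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for func (hypothetical helper)."""
    # The two required parameters are always supplied.
    kwargs: Dict[str, Any] = {
        "dp_output": dp_output,
        "baseline_output": baseline_output,
    }
    # Optional parameters are supplied only if the function declares them.
    optional = {
        "unprotected_inputs": unprotected_inputs,
        "parameters": parameters,
    }
    declared = inspect.signature(func).parameters
    for name, value in optional.items():
        if name in declared:
            kwargs[name] = value
    return kwargs
```

With a dictionary built this way, the metric function can be invoked as func(**kwargs), and a function that declares only dp_output and baseline_output never receives arguments it does not expect.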

metric_function_inputs_empty(function_params)

Determines if the inputs to the metric function are empty.

Return type:

bool

__call__(dp_outputs, baseline_outputs, unprotected_inputs=None, parameters=None)

Computes the given metric on the given DP and baseline outputs.

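Parameters:

  • dp_outputs: A dictionary mapping output names to the program's DP output DataFrames.

  • baseline_outputs: A dictionary mapping baseline names to dictionaries of that baseline's output DataFrames.

  • unprotected_inputs: An optional dictionary containing the program's unprotected inputs.

  • parameters: An optional dictionary containing the program's parameters.

Return type: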

SingleOutputMetricResult