_base#
Base classes for metrics.
Classes#
- Metric: A generic metric.
- MetricResult: An output of a Metric with additional metadata.
- MetricResultDataframe: A table version of a metric’s results.
- SingleBaselineMetric: Base class for metrics computed from DP outputs and a single baseline’s outputs.
- MultiBaselineMetric: Base class for metrics computed from DP outputs and multiple baseline outputs.
- JoinedOutputMetric: Base class for metrics computed from a join between a single DP output and a baseline output.
- GroupedMetric: Base class for metrics that can be computed on each group in a joined output.
- MeasureColumnMetric: Base class for metrics that are computed on a single measure column.
- ScalarMetric: Base class for metrics computed from outputs containing only one value.
- class Metric(name, description, baselines)#
Bases: abc.ABC
A generic metric.
Note
This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.
- __init__(name, description, baselines)#
Constructor.
- Parameters
baselines (str | List[str] | None) – The name of the baseline program(s) used for the error report. If None, use all baselines specified as custom baselines and baseline options on the tuner class; if no baselines are specified on the tuner class, use the default baseline. If a string, use only that baseline. If a list, use only those baselines.
- property baselines#
Returns the baselines used for the metric.
- abstract format(value)#
Converts value to human-readable format.
- Parameters
value (Any) –
- abstract format_as_table_row(result)#
Return a table row summarizing the metric result.
- Parameters
result (tmlt.analytics.metrics.MetricResult) –
- Return type
- format_as_dataframe(result)#
Returns the results of this metric formatted as a dataframe.
- Parameters
result (tmlt.analytics.metrics.MetricResult) –
- Return type
- abstract check_compatibility_with_data(dp_outputs, baseline_outputs)#
Check that the outputs have all the structure the metric expects.
Should throw a ValueError if the metric is not compatible.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) –
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) –
- __call__(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes the given metric on the given DP and baseline outputs.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) – The differentially private outputs of the program.
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) – The outputs of the baseline programs.
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) – Optional public dataframes used in error computation.
program_parameters (Optional[Dict[str, Any]]) – Optional program specific parameters used in error computation.
- Return type
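To illustrate the shapes a metric is called with, here is a hypothetical sketch (not part of the library): pandas DataFrames stand in for pyspark.sql.DataFrame so the example is self-contained, and all table, column, and baseline names are made up.

```python
import pandas as pd

# dp_outputs: one DataFrame per named program output.
dp_outputs = {"counts": pd.DataFrame({"age": [20, 30], "count": [102, 97]})}

# baseline_outputs: a nested dict, keyed first by baseline name, then by
# output name (matching the Dict[str, Dict[str, DataFrame]] type above).
baseline_outputs = {
    "default": {"counts": pd.DataFrame({"age": [20, 30], "count": [100, 95]})}
}

# A metric compares corresponding tables; here, total-count absolute error.
dp_total = dp_outputs["counts"]["count"].sum()
baseline_total = baseline_outputs["default"]["counts"]["count"].sum()
error = abs(int(dp_total) - int(baseline_total))
```

The nesting matters: DP outputs are keyed only by output name, while baseline outputs carry an extra level for the baseline name.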
- class MetricResult#
An output of a Metric with additional metadata.
Note
This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.
- name: str#
The name of the metric.
- description: str#
The description of the metric.
- baseline: Union[str, List[str]]#
The name of the baseline program(s) used for the error report.
- metric: Metric#
The metric that was used.
- value: Any#
The value of the metric applied to the program outputs.
- format_as_table_row()#
Return a table row summarizing the metric result.
- Return type
- format_as_dataframe()#
Returns the results of this metric formatted as a dataframe.
- Return type
- class MetricResultDataframe#
Bases: NamedTuple
A table version of a metric’s results.
- df: pandas.DataFrame#
The results, formatted as a dataframe.
- value_column: str#
The name of the column containing the metric value.
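A hypothetical sketch of how a NamedTuple of this shape is used (the field names follow the documentation above; the data and column names are made up):

```python
from typing import NamedTuple
import pandas as pd

class MetricResultDataframe(NamedTuple):
    df: pd.DataFrame    # the results, formatted as a dataframe
    value_column: str   # name of the column holding the metric value

result = MetricResultDataframe(
    df=pd.DataFrame({"age": [20, 30], "abs_error": [2.0, 2.0]}),
    value_column="abs_error",
)

# The value_column field lets callers pull metric values without knowing
# the metric-specific column name in advance.
values = result.df[result.value_column].tolist()
```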
- class SingleBaselineMetric(name, description, baselines)#
Base class for metrics computed from DP outputs and a single baseline’s outputs.
Note
This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.
Subclasses of SingleBaselineMetric define a compute_for_baseline method that maps DP outputs and one baseline’s outputs to the metric value.
- __init__(name, description, baselines)#
Constructor.
- Parameters
baselines (str | List[str] | None) – The name of the baseline program(s) used for the error report. If None, use all baselines specified as custom baselines and baseline options on the tuner class; if no baselines are specified on the tuner class, use the default baseline. If a string, use only that baseline. If a list, use only those baselines.
- abstract check_compatibility_with_outputs(outputs, output_name)#
Check that a particular output is compatible with the metric.
Should throw a ValueError if the metric is not compatible.
- Parameters
outputs (Dict[str, pyspark.sql.DataFrame]) –
output_name (str) –
- check_compatibility_with_data(dp_outputs, baseline_outputs)#
Check that the outputs have all the structure the metric expects.
Should throw a ValueError if the metric is not compatible.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) –
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) –
- property baselines#
Returns the baselines used for the metric.
- abstract format(value)#
Converts value to human-readable format.
- Parameters
value (Any) –
- abstract format_as_table_row(result)#
Return a table row summarizing the metric result.
- Parameters
result (tmlt.analytics.metrics.MetricResult) –
- Return type
- format_as_dataframe(result)#
Returns the results of this metric formatted as a dataframe.
- Parameters
result (tmlt.analytics.metrics.MetricResult) –
- Return type
- __call__(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes the given metric on the given DP and baseline outputs.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) – The differentially private outputs of the program.
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) – The outputs of the baseline programs.
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) – Optional public dataframes used in error computation.
program_parameters (Optional[Dict[str, Any]]) – Optional program specific parameters used in error computation.
- Return type
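The kind of logic a compute_for_baseline implementation might contain can be sketched as follows. This is a hypothetical illustration, not the library’s own code: pandas replaces Spark, and the output name, columns, and data are invented.

```python
import pandas as pd

def compute_for_baseline(dp_outputs, baseline_outputs, output_name="counts"):
    """Absolute difference in row counts between the DP and baseline table."""
    return abs(len(dp_outputs[output_name]) - len(baseline_outputs[output_name]))

# One baseline's outputs are passed directly, without the extra nesting level
# used for multi-baseline metrics.
dp = {"counts": pd.DataFrame({"age": [20, 30, 40], "count": [5, 6, 7]})}
baseline = {"counts": pd.DataFrame({"age": [20, 30], "count": [5, 6]})}
diff = compute_for_baseline(dp, baseline)
```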
- class MultiBaselineMetric(name, description, baselines)#
Base class for metrics computed from DP outputs and multiple baseline outputs.
Note
This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.
Subclasses of MultiBaselineMetric define a compute_for_multiple_baselines method that maps DP outputs and a collection of outputs from several baselines to the metric value.
- __init__(name, description, baselines)#
Constructor.
- Parameters
baselines (str | List[str] | None) – The name of the baseline program(s) used for the error report. If None, use all baselines specified as custom baselines and baseline options on the tuner class; if no baselines are specified on the tuner class, use the default baseline. If a string, use only that baseline. If a list, use only those baselines.
- compute(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes the given metric on the given DP and baseline outputs.
The baseline_outputs will already be filtered to only include the baselines that the metric is supposed to use.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) – The differentially private outputs of the program.
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) – The outputs of the baseline programs, after filtering to only include the baselines that the metric is supposed to use.
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) – Optional public dataframes used in error computation.
program_parameters (Optional[Dict[str, Any]]) – Optional program specific parameters used in error computation.
- Return type
- property baselines#
Returns the baselines used for the metric.
- abstract format(value)#
Converts value to human-readable format.
- Parameters
value (Any) –
- abstract format_as_table_row(result)#
Return a table row summarizing the metric result.
- Parameters
result (tmlt.analytics.metrics.MetricResult) –
- Return type
- format_as_dataframe(result)#
Returns the results of this metric formatted as a dataframe.
- Parameters
result (tmlt.analytics.metrics.MetricResult) –
- Return type
- abstract check_compatibility_with_data(dp_outputs, baseline_outputs)#
Check that the outputs have all the structure the metric expects.
Should throw a ValueError if the metric is not compatible.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) –
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) –
- __call__(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes the given metric on the given DP and baseline outputs.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) – The differentially private outputs of the program.
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) – The outputs of the baseline programs.
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) – Optional public dataframes used in error computation.
program_parameters (Optional[Dict[str, Any]]) – Optional program specific parameters used in error computation.
- Return type
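A hypothetical sketch of a compute_for_multiple_baselines-style computation: the metric receives a dict of outputs per baseline (already filtered to the baselines it uses) and can aggregate across them. pandas again stands in for Spark; all names and values are invented.

```python
import pandas as pd

def compute_for_multiple_baselines(dp_outputs, baseline_outputs, output_name="total"):
    """Per-baseline absolute error on a single-cell output table."""
    dp_value = dp_outputs[output_name].iloc[0, 0]
    return {
        name: abs(dp_value - outputs[output_name].iloc[0, 0])
        for name, outputs in baseline_outputs.items()
    }

dp = {"total": pd.DataFrame({"count": [103]})}
baselines = {
    "default": {"total": pd.DataFrame({"count": [100]})},
    "clamped": {"total": pd.DataFrame({"count": [101]})},
}
errors = compute_for_multiple_baselines(dp, baselines)
```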
- class JoinedOutputMetric(output, join_columns, name, description, baselines, join_how='inner', dropna_columns=None, indicator_column_name=None)#
Bases: SingleBaselineMetric, abc.ABC
Base class for metrics computed from join between single DP and baseline output.
Note
This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.
Subclasses of JoinedOutputMetric define a compute_on_joined_output method which takes a single dataframe, the result of joining the DP and baseline output tables with the given name on the given list of columns, and returns the metric value. The joined table is produced by joining the DP and baseline tables (inner join by default) on the given join columns.
- __init__(output, join_columns, name, description, baselines, join_how='inner', dropna_columns=None, indicator_column_name=None)#
Constructor.
- Parameters
baselines (str | List[str] | None) – The name of the baseline program(s) used for the error report. If None, use all baselines specified as custom baselines and baseline options on the tuner class; if no baselines are specified on the tuner class, use the default baseline. If a string, use only that baseline. If a list, use only those baselines.
join_how (str, default: 'inner') – The type of join to perform. Must be one of “left”, “right”, “inner”, or “outer”. Defaults to “inner”.
dropna_columns (List[str] | None, default: None) – If specified, rows with nulls in these columns will be dropped.
indicator_column_name (str | None, default: None) – If specified, a column with this name is added to the joined data containing “dp”, “baseline”, or “both” to indicate where the values in each row came from.
- check_compatibility_with_outputs(outputs, output_name)#
Check that a particular set of outputs is compatible with the metric.
Should throw a ValueError if the metric is not compatible.
- Parameters
outputs (Dict[str, pyspark.sql.DataFrame]) –
output_name (str) –
- check_join_key_uniqueness(joined_output)#
Check if the join keys uniquely identify rows in the joined DataFrame.
- Parameters
joined_output (pyspark.sql.DataFrame) –
- compute_for_baseline(baseline_name, dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes metric value.
- Parameters
baseline_name (str) –
dp_outputs (Dict[str, pyspark.sql.DataFrame]) –
baseline_outputs (Dict[str, pyspark.sql.DataFrame]) –
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) –
program_parameters (Optional[Dict[str, Any]]) –
- format_as_table_row(result)#
Return a table row summarizing the metric result.
- Parameters
result (MetricResult) –
- Return type
- check_compatibility_with_data(dp_outputs, baseline_outputs)#
Check that the outputs have all the structure the metric expects.
Should throw a ValueError if the metric is not compatible.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) –
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) –
- property baselines#
Returns the baselines used for the metric.
- abstract format(value)#
Converts value to human-readable format.
- Parameters
value (Any) –
- format_as_dataframe(result)#
Returns the results of this metric formatted as a dataframe.
- Parameters
result (tmlt.analytics.metrics.MetricResult) –
- Return type
- __call__(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes the given metric on the given DP and baseline outputs.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) – The differentially private outputs of the program.
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) – The outputs of the baseline programs.
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) – Optional public dataframes used in error computation.
program_parameters (Optional[Dict[str, Any]]) – Optional program specific parameters used in error computation.
- Return type
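The join and indicator behavior described above can be sketched with pandas (a hypothetical stand-in for the Spark implementation; column names and data are invented, and an outer join is used so that all three indicator labels appear):

```python
import pandas as pd

dp_df = pd.DataFrame({"age": [20, 30, 40], "count_dp": [10, 21, 5]})
baseline_df = pd.DataFrame({"age": [20, 30, 50], "count_baseline": [11, 20, 3]})

# Join DP and baseline tables on the join columns; pandas' merge indicator
# records which side each row came from.
joined = dp_df.merge(
    baseline_df, on=["age"], how="outer", indicator=True
).sort_values("age", ignore_index=True)

# Translate pandas' left_only/right_only/both labels into the documented ones.
joined["source"] = joined["_merge"].astype(str).map(
    {"left_only": "dp", "right_only": "baseline", "both": "both"}
)
joined = joined.drop(columns="_merge")
```

With the default inner join, only rows labeled “both” would survive, which is why the indicator column is most informative with left, right, or outer joins.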
- class GroupedMetric(output, join_columns, name, description, baselines, grouping_columns=None, join_how='inner', dropna_columns=None, indicator_column_name=None)#
Bases: JoinedOutputMetric, abc.ABC
Base class for metrics that can be computed on each group in a joined output.
Note
This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.
Subclasses of GroupedMetric define a compute_on_grouped_output method which takes a single grouped dataframe, the result of joining the DP and baseline output tables with the given name on the given list of columns and grouping by the grouping columns, and returns the metric value. The joined table is produced by an inner join between the DP and baseline tables on the given join columns.
- __init__(output, join_columns, name, description, baselines, grouping_columns=None, join_how='inner', dropna_columns=None, indicator_column_name=None)#
Constructor.
- Parameters
baselines (str | List[str] | None) – The name of the baseline program(s) used for the error report. If None, use all baselines specified as custom baselines and baseline options on the tuner class; if no baselines are specified on the tuner class, use the default baseline. If a string, use only that baseline. If a list, use only those baselines.
grouping_columns (List[str] | None, default: None) – A set of columns that will be used to group the DP and baseline outputs. The error metric is calculated for each group and returned in a table. If grouping_columns is None, the metric is calculated over the whole output and returned as a single number.
join_how (str, default: 'inner') – The type of join to perform. Must be one of “left”, “right”, “inner”, or “outer”. Defaults to “inner”.
dropna_columns (List[str] | None, default: None) – If specified, rows with nulls in these columns will be dropped.
indicator_column_name (str | None, default: None) – If specified, a column with this name is added to the joined data containing “dp”, “baseline”, or “both” to indicate where the values in each row came from.
- check_compatibility_with_outputs(outputs, output_name)#
Check that a particular set of outputs is compatible with the metric.
Should throw a ValueError if the metric is not compatible.
- Parameters
outputs (Dict[str, pyspark.sql.DataFrame]) –
output_name (str) –
- abstract compute_on_grouped_output(grouped_output, baseline_name, unprotected_inputs=None, program_parameters=None)#
Computes metric value from the joined, grouped DP and baseline output.
If grouping columns are empty, the grouped output will have one group that is the entire dataset.
- Parameters
grouped_output (pyspark.sql.GroupedData) –
baseline_name (str) –
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) –
program_parameters (Optional[Dict[str, Any]]) –
- format_as_table_row(result)#
Return a table row summarizing the metric result.
- Parameters
result (tmlt.analytics.metrics.MetricResult) –
- Return type
- format_as_dataframe(result)#
Returns the results of this metric formatted as a dataframe.
- Parameters
result (tmlt.analytics.metrics.MetricResult) –
- Return type
- check_join_key_uniqueness(joined_output)#
Check if the join keys uniquely identify rows in the joined DataFrame.
- Parameters
joined_output (pyspark.sql.DataFrame) –
- compute_for_baseline(baseline_name, dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes metric value.
- Parameters
baseline_name (str) –
dp_outputs (Dict[str, pyspark.sql.DataFrame]) –
baseline_outputs (Dict[str, pyspark.sql.DataFrame]) –
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) –
program_parameters (Optional[Dict[str, Any]]) –
- check_compatibility_with_data(dp_outputs, baseline_outputs)#
Check that the outputs have all the structure the metric expects.
Should throw a ValueError if the metric is not compatible.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) –
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) –
- property baselines#
Returns the baselines used for the metric.
- abstract format(value)#
Converts value to human-readable format.
- Parameters
value (Any) –
- __call__(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes the given metric on the given DP and baseline outputs.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) – The differentially private outputs of the program.
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) – The outputs of the baseline programs.
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) – Optional public dataframes used in error computation.
program_parameters (Optional[Dict[str, Any]]) – Optional program specific parameters used in error computation.
- Return type
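A per-group computation of the kind compute_on_grouped_output performs can be sketched as follows; this is a hypothetical illustration in which a pandas groupby stands in for pyspark.sql.GroupedData, and the grouping column, value columns, and data are invented.

```python
import pandas as pd

# A joined DP/baseline table, as produced by the join described above.
joined = pd.DataFrame({
    "state": ["CA", "CA", "NY", "NY"],
    "count_dp": [10, 20, 7, 9],
    "count_baseline": [11, 19, 7, 12],
})

# One error value per group, returned as a table (as the docs describe when
# grouping columns are set).
per_group = (
    joined.assign(abs_error=(joined["count_dp"] - joined["count_baseline"]).abs())
    .groupby("state", as_index=False)["abs_error"]
    .mean()
)
```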
- class MeasureColumnMetric(output, join_columns, measure_column, name, description, baselines, grouping_columns=None, join_how='inner', dropna_columns=None)#
Bases: GroupedMetric, abc.ABC
Base class for metrics that are computed on a single measure column.
Note
This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.
- __init__(output, join_columns, measure_column, name, description, baselines, grouping_columns=None, join_how='inner', dropna_columns=None)#
Constructor.
- Parameters
measure_column (str) – The column the measure will be calculated on.
baselines (str | List[str] | None) – The name of the baseline program(s) used for the error report. If None, use all baselines specified as custom baselines and baseline options on the tuner class; if no baselines are specified on the tuner class, use the default baseline. If a string, use only that baseline. If a list, use only those baselines.
grouping_columns (List[str] | None, default: None) – A set of columns that will be used to group the DP and baseline outputs. The error metric is calculated for each group and returned in a table. If grouping_columns is None, the metric is calculated over the whole output and returned as a single number.
join_how (str, default: 'inner') – The type of join to perform. Must be one of “left”, “right”, “inner”, or “outer”. Defaults to “inner”.
dropna_columns (List[str] | None, default: None) – If specified, rows with nulls in these columns will be dropped.
- check_compatibility_with_outputs(outputs, output_name)#
Check that a particular set of outputs is compatible with the metric.
Should throw a ValueError if the metric is not compatible.
- Parameters
outputs (Dict[str, pyspark.sql.DataFrame]) –
output_name (str) –
- format_as_table_row(result)#
Return a table row summarizing the metric result.
- Parameters
result (MetricResult) –
- Return type
- format_as_dataframe(result)#
Returns the results of this metric formatted as a dataframe.
- Parameters
result (tmlt.analytics.metrics.MetricResult) –
- Return type
- abstract compute_on_grouped_output(grouped_output, baseline_name, unprotected_inputs=None, program_parameters=None)#
Computes metric value from the joined, grouped DP and baseline output.
If grouping columns are empty, the grouped output will have one group that is the entire dataset.
- Parameters
grouped_output (pyspark.sql.GroupedData) –
baseline_name (str) –
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) –
program_parameters (Optional[Dict[str, Any]]) –
- check_join_key_uniqueness(joined_output)#
Check if the join keys uniquely identify rows in the joined DataFrame.
- Parameters
joined_output (pyspark.sql.DataFrame) –
- compute_for_baseline(baseline_name, dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes metric value.
- Parameters
baseline_name (str) –
dp_outputs (Dict[str, pyspark.sql.DataFrame]) –
baseline_outputs (Dict[str, pyspark.sql.DataFrame]) –
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) –
program_parameters (Optional[Dict[str, Any]]) –
- check_compatibility_with_data(dp_outputs, baseline_outputs)#
Check that the outputs have all the structure the metric expects.
Should throw a ValueError if the metric is not compatible.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) –
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) –
- property baselines#
Returns the baselines used for the metric.
- abstract format(value)#
Converts value to human-readable format.
- Parameters
value (Any) –
- __call__(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes the given metric on the given DP and baseline outputs.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) – The differentially private outputs of the program.
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) – The outputs of the baseline programs.
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) – Optional public dataframes used in error computation.
program_parameters (Optional[Dict[str, Any]]) – Optional program specific parameters used in error computation.
- Return type
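A measure-column computation of this kind can be sketched as, for example, the median relative error of a DP column against its baseline counterpart. This is a hypothetical illustration (pandas in place of Spark; the column names and data are invented, and median relative error is just one plausible choice of measure).

```python
import pandas as pd

# A joined DP/baseline table; the metric only looks at one measure column
# and its baseline counterpart.
joined = pd.DataFrame({
    "count_dp": [98.0, 51.0, 10.0],
    "count_baseline": [100.0, 50.0, 8.0],
})

relative_error = (
    (joined["count_dp"] - joined["count_baseline"]).abs() / joined["count_baseline"]
)
median_relative_error = relative_error.median()
```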
- class ScalarMetric(output, name, description, column=None, baselines=None)#
Bases: SingleBaselineMetric, abc.ABC
Base class for metrics computed from outputs containing only one value.
Note
This is only available on a paid version of Tumult Analytics. If you would like to hear more, please contact us at info@tmlt.io.
Subclasses of ScalarMetric define a compute_on_scalar method which takes two values, each taken from the given column of the given output (one from the DP results and one from the baseline results), and returns a metric value. The given output must contain a single row in both the DP and baseline outputs.
- __init__(output, name, description, column=None, baselines=None)#
Constructor.
- Parameters
column (str | None, default: None) – The column to take the value from. If the given output has only one column, this argument may be omitted.
baselines (str | List[str] | None, default: None) – The name of the baseline program(s) used for the error report. If None, use all baselines specified as custom baselines and baseline options on the tuner class; if no baselines are specified on the tuner class, use the default baseline. If a string, use only that baseline. If a list, use only those baselines.
- check_compatibility_with_outputs(outputs, output_name)#
Check that a particular set of outputs is compatible with the metric.
Should throw a ValueError if the metric is not compatible.
- Parameters
outputs (Dict[str, pyspark.sql.DataFrame]) –
output_name (str) –
- compute_for_baseline(baseline_name, dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Returns the metric value given the DP outputs and the baseline outputs.
- Parameters
baseline_name (str) –
dp_outputs (Dict[str, pyspark.sql.DataFrame]) –
baseline_outputs (Dict[str, pyspark.sql.DataFrame]) –
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) –
program_parameters (Optional[Dict[str, Any]]) –
- Return type
Any
- check_compatibility_with_data(dp_outputs, baseline_outputs)#
Check that the outputs have all the structure the metric expects.
Should throw a ValueError if the metric is not compatible.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) –
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) –
- property baselines#
Returns the baselines used for the metric.
- abstract format(value)#
Converts value to human-readable format.
- Parameters
value (Any) –
- abstract format_as_table_row(result)#
Return a table row summarizing the metric result.
- Parameters
result (tmlt.analytics.metrics.MetricResult) –
- Return type
- format_as_dataframe(result)#
Returns the results of this metric formatted as a dataframe.
- Parameters
result (tmlt.analytics.metrics.MetricResult) –
- Return type
- __call__(dp_outputs, baseline_outputs, unprotected_inputs=None, program_parameters=None)#
Computes the given metric on the given DP and baseline outputs.
- Parameters
dp_outputs (Dict[str, pyspark.sql.DataFrame]) – The differentially private outputs of the program.
baseline_outputs (Dict[str, Dict[str, pyspark.sql.DataFrame]]) – The outputs of the baseline programs.
unprotected_inputs (Optional[Dict[str, pyspark.sql.DataFrame]]) – Optional public dataframes used in error computation.
program_parameters (Optional[Dict[str, Any]]) – Optional program specific parameters used in error computation.
- Return type
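The scalar comparison described above can be sketched as follows; this is a hypothetical illustration (pandas in place of Spark, invented column name and values), showing a single value being taken from a one-row table on each side and compared.

```python
import pandas as pd

def scalar_error(dp_df, baseline_df, column):
    """Absolute difference between the single values in the given column."""
    dp_value = dp_df[column].iloc[0]
    baseline_value = baseline_df[column].iloc[0]
    return abs(dp_value - baseline_value)

# Both outputs contain exactly one row, as ScalarMetric requires.
dp = pd.DataFrame({"median_age": [34.7]})
baseline = pd.DataFrame({"median_age": [35.0]})
err = scalar_error(dp, baseline, "median_age")
```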