noise_mechanisms#

Measurements for adding noise to individual numbers.

Classes#

`AddLaplaceNoise`	Add Laplace noise to a number.
`AddGeometricNoise`	Add Geometric noise to a number.
`AddDiscreteGaussianNoise`	Add discrete Gaussian noise to a number.

class AddLaplaceNoise(input_domain, scale)#

Bases: tmlt.core.measurements.base.Measurement

Add Laplace noise to a number.

Parameters

input_domain (Union[tmlt.core.domains.numpy_domains.NumpyIntegerDomain, tmlt.core.domains.numpy_domains.NumpyFloatDomain]) –
scale (tmlt.core.utils.exact_number.ExactNumberInput) –

__init__(input_domain, scale)#

Constructor.

Parameters

input_domain (Union[NumpyIntegerDomain, NumpyFloatDomain]) – Input domain.
scale (Union[ExactNumber, float, int, str, Fraction, Expr]) – Noise scale.

property input_domain(self)#

Return input domain for the measurement.

Return type: tmlt.core.domains.numpy_domains.NumpyDomain

property scale(self)#

Returns the noise scale.

Return type: tmlt.core.utils.exact_number.ExactNumber

property output_type(self)#

Return the output data type after being used as a UDF.

Return type: pyspark.sql.types.DataType

privacy_function(self, d_in)#

Returns the smallest d_out satisfied by the measurement.

The returned d_out is \(\frac{d_{in}}{b}\) (\(\infty\) if \(b = 0\)).

where:

\(d_{in}\) is the input argument “d_in”
\(b\) is the property “scale”

Parameters: d_in (tmlt.core.utils.exact_number.ExactNumberInput) – Distance between inputs under input_metric.
Return type: tmlt.core.utils.exact_number.ExactNumber
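
The formula above can be checked with a small standalone sketch. It does not use the library: `laplace_privacy_function` is a hypothetical helper, and Python's `Fraction` stands in for `ExactNumber` so that the division stays exact.

```python
from fractions import Fraction
import math


def laplace_privacy_function(d_in, scale):
    """Hypothetical helper (not the library API): d_out = d_in / b."""
    d_in = Fraction(d_in)
    scale = Fraction(scale)
    if scale == 0:
        # No noise is added, so no finite d_out is satisfied.
        return math.inf
    return d_in / scale


# Inputs at distance 1 with noise scale 2.
print(laplace_privacy_function(1, 2))
```

Halving the scale doubles the returned d_out, which is why larger scales correspond to stronger privacy guarantees.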

__call__(self, val)#

Returns the value with Laplace noise added.

The added Laplace noise has the probability density function

\(f(x) = \frac{1}{2 b} e ^ {\frac{-\mid x \mid}{b}}\)

where:

\(x\) is a real number
\(b\) is the property “scale”

Parameters: val (Union[numpy.int32, numpy.int64, numpy.float32, numpy.float64]) – Value to add Laplace noise to.
Return type: float
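
To make the density concrete, here is a minimal standard-library sketch that evaluates it and draws samples. The names `laplace_pdf` and `sample_laplace` are hypothetical, and the sampler relies on the fact that the difference of two independent exponential variables with mean \(b\) follows a Laplace distribution with scale \(b\). It only illustrates the distribution; plain floating-point sampling like this is not suitable for real privacy guarantees and is not necessarily how the library samples.

```python
import math
import random


def laplace_pdf(x, b):
    """Density f(x) = exp(-|x| / b) / (2 b), for b > 0."""
    return math.exp(-abs(x) / b) / (2 * b)


def sample_laplace(b, rng):
    """Illustrative sampler: difference of two Exp(mean=b) draws."""
    return rng.expovariate(1 / b) - rng.expovariate(1 / b)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
noisy_value = 10 + sample_laplace(2.0, rng)
print(noisy_value)
```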

property input_metric(self)#

Distance metric on input domain.

Return type: tmlt.core.metrics.Metric

property output_measure(self)#

Distance measure on output.

Return type: tmlt.core.measures.Measure

property is_interactive(self)#

Returns true iff the measurement is interactive.

Return type: bool

privacy_relation(self, d_in, d_out)#

Return True if close inputs produce close outputs.

See the privacy and stability tutorial for more information.

Parameters

d_in (Any) – Distance between inputs under input_metric.
d_out (Any) – Distance between outputs under output_measure.

Return type: bool

class AddGeometricNoise(alpha)#

Bases: tmlt.core.measurements.base.Measurement

Add Geometric noise to a number.

Parameters: alpha (tmlt.core.utils.exact_number.ExactNumberInput) –

__init__(alpha)#

Constructor.

Parameters: alpha (Union[ExactNumber, float, int, str, Fraction, Expr]) – Noise scale.

property input_domain(self)#

Return input domain for the measurement.

Return type: tmlt.core.domains.numpy_domains.NumpyIntegerDomain

property output_type(self)#

Return the output data type after being used as a UDF.

Return type: pyspark.sql.types.DataType

property alpha(self)#

Returns the noise scale.

Return type: tmlt.core.utils.exact_number.ExactNumber

privacy_function(self, d_in)#

Returns the smallest d_out satisfied by the measurement.

The returned d_out is \(\frac{d_{in}}{\alpha}\) (\(\infty\) if \(\alpha = 0\)).

where:

\(d_{in}\) is the input argument “d_in”
\(\alpha\) is alpha

Parameters: d_in (tmlt.core.utils.exact_number.ExactNumberInput) – Distance between inputs under input_metric.
Return type: tmlt.core.utils.exact_number.ExactNumber

__call__(self, value)#

Returns the value with double-sided geometric noise added.

The added noise has the probability mass function

\[f(k)= \frac {e^{1 / \alpha} - 1} {e^{1 / \alpha} + 1} \cdot e^{\frac{-\mid k \mid}{\alpha}}\]

where:

\(k\) is an integer
\(\alpha\) is alpha

A double-sided geometric distribution is the distribution of the difference between two independent, identically distributed geometric random variables. It can be sampled by drawing two values from a geometric distribution and taking their difference.

See section 4.1 in [BV18], remark 2 in this paper, or scipy.stats.geom for more information. (Note that the parameter \(p\) used in scipy.stats.geom is related to \(\alpha\) through \(p = 1 - e^{-1 / \alpha}\)).

Parameters: value (Union[numpy.int32, numpy.int64]) – Value to add geometric noise to.
Return type: int
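
The sampling recipe and the mass function above can be compared with a short standard-library sketch. All function names here are hypothetical, and the geometric sampler uses inverse-transform sampling with \(p = 1 - e^{-1 / \alpha}\) on support \(\{0, 1, 2, \dots\}\); the shift relative to scipy.stats.geom (support starting at 1) cancels when the difference is taken. This is an illustration under those assumptions, not the library's sampler.

```python
import math
import random


def double_geometric_pmf(k, alpha):
    """f(k) = (e^(1/alpha) - 1) / (e^(1/alpha) + 1) * e^(-|k| / alpha)."""
    e = math.exp(1 / alpha)
    return (e - 1) / (e + 1) * math.exp(-abs(k) / alpha)


def sample_geometric(alpha, rng):
    """Geometric draw on {0, 1, 2, ...} with p = 1 - e^(-1/alpha)."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is defined
    return math.floor(-alpha * math.log(u))


def sample_double_geometric(alpha, rng):
    """Difference of two independent geometric draws."""
    return sample_geometric(alpha, rng) - sample_geometric(alpha, rng)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
draws = [sample_double_geometric(2.0, rng) for _ in range(20000)]
empirical_zero = draws.count(0) / len(draws)
print(empirical_zero, double_geometric_pmf(0, 2.0))
```

The two printed numbers should be close: the empirical frequency of zero noise approaches the mass the formula assigns to \(k = 0\).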

property input_metric(self)#

Distance metric on input domain.

Return type: tmlt.core.metrics.Metric

property output_measure(self)#

Distance measure on output.

Return type: tmlt.core.measures.Measure

property is_interactive(self)#

Returns true iff the measurement is interactive.

Return type: bool

privacy_relation(self, d_in, d_out)#

Return True if close inputs produce close outputs.

See the privacy and stability tutorial for more information.

Parameters

d_in (Any) – Distance between inputs under input_metric.
d_out (Any) – Distance between outputs under output_measure.

Return type: bool

class AddDiscreteGaussianNoise(sigma_squared)#

Bases: tmlt.core.measurements.base.Measurement

Add discrete Gaussian noise to a number.

Parameters: sigma_squared (tmlt.core.utils.exact_number.ExactNumberInput) –

__init__(sigma_squared)#

Constructor.

Parameters: sigma_squared (Union[ExactNumber, float, int, str, Fraction, Expr]) – Noise scale. This is the variance of the discrete Gaussian distribution to be used for sampling noise.

property input_domain(self)#

Return input domain for the measurement.

Return type: tmlt.core.domains.numpy_domains.NumpyIntegerDomain

property output_type(self)#

Return the output data type after being used as a UDF.

Return type: pyspark.sql.types.DataType

property sigma_squared(self)#

Returns the noise scale.

Return type: tmlt.core.utils.exact_number.ExactNumber

privacy_function(self, d_in)#

Returns the smallest d_out satisfied by the measurement.

The returned d_out is \(\frac{d_{in}^2}{2 \cdot \sigma^2}\) (\(\infty\) if \(\sigma^2 = 0\)).

where:

\(d_{in}\) is the input argument “d_in”
\(\sigma^2\) is sigma_squared

See Proposition 1.6 in this paper.

Parameters: d_in (tmlt.core.utils.exact_number.ExactNumberInput) – Distance between inputs under input_metric.
Return type: tmlt.core.utils.exact_number.ExactNumber
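
Unlike the Laplace and geometric cases, this d_out grows quadratically in d_in. The standalone sketch below shows that behaviour; `discrete_gaussian_privacy_function` is a hypothetical helper rather than the library method, with `Fraction` standing in for `ExactNumber`.

```python
from fractions import Fraction
import math


def discrete_gaussian_privacy_function(d_in, sigma_squared):
    """Hypothetical helper: d_out = d_in^2 / (2 * sigma^2)."""
    d_in = Fraction(d_in)
    sigma_squared = Fraction(sigma_squared)
    if sigma_squared == 0:
        # No noise is added, so no finite d_out is satisfied.
        return math.inf
    return d_in ** 2 / (2 * sigma_squared)


# Doubling d_in multiplies the result by four.
for d_in in (1, 2, 4):
    print(d_in, discrete_gaussian_privacy_function(d_in, 8))
```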

__call__(self, value)#

Adds discrete Gaussian noise with specified scale.

The added noise has the probability mass function

\[f(k) = \frac {e^{-k^2/2\sigma^2}} { \sum_{n\in \mathbb{Z}} e^{-n^2/2\sigma^2} }\]

where:

\(k\) is an integer
\(\sigma^2\) is sigma_squared

See [CKS20] for more information. The formula above is based on Definition 1.

Parameters: value (Union[numpy.int32, numpy.int64]) – Value to add discrete Gaussian noise to.
Return type: int
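
As a numerical illustration of the mass function, the sketch below evaluates it with the infinite normalizing sum truncated to a finite window. The name `discrete_gaussian_pmf` and the `tail` cutoff are assumptions made for this example; the terms outside the window are negligible for the values used here. It evaluates probabilities only and does not sample noise.

```python
import math


def discrete_gaussian_pmf(k, sigma_squared, tail=None):
    """Approximate f(k) by truncating the normalizing sum to [-tail, tail]."""
    if tail is None:
        # Roughly 20 standard deviations; terms beyond this underflow to ~0.
        tail = int(20 * math.sqrt(sigma_squared)) + 20

    def weight(n):
        return math.exp(-(n * n) / (2 * sigma_squared))

    normalizer = sum(weight(n) for n in range(-tail, tail + 1))
    return weight(k) / normalizer


# The mass is symmetric about 0 and decays with |k|.
for k in (-2, -1, 0, 1, 2):
    print(k, discrete_gaussian_pmf(k, 4))
```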

property input_metric(self)#

Distance metric on input domain.

Return type: tmlt.core.metrics.Metric

property output_measure(self)#

Distance measure on output.

Return type: tmlt.core.measures.Measure

property is_interactive(self)#

Returns true iff the measurement is interactive.

Return type: bool

privacy_relation(self, d_in, d_out)#

Return True if close inputs produce close outputs.

See the privacy and stability tutorial for more information.

Parameters

d_in (Any) – Distance between inputs under input_metric.
d_out (Any) – Distance between outputs under output_measure.

Return type: bool
