noise_mechanisms#

Measurements for adding noise to individual numbers.

Classes#

AddLaplaceNoise

Add Laplace noise to a number.

AddGeometricNoise

Add Geometric noise to a number.

AddDiscreteGaussianNoise

Add discrete Gaussian noise to a number.

class AddLaplaceNoise(input_domain, scale)#

Bases: tmlt.core.measurements.base.Measurement

Add Laplace noise to a number.

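Parameters
  • input_domain (tmlt.core.domains.numpy_domains.NumpyDomain) – Domain of the input value.

  • scale (tmlt.core.utils.exact_number.ExactNumberInput) – Noise scale.
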
__init__(input_domain, scale)#

Constructor.

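Parameters
  • input_domain (tmlt.core.domains.numpy_domains.NumpyDomain) – Domain of the input value.

  • scale (tmlt.core.utils.exact_number.ExactNumberInput) – Noise scale.
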
property input_domain(self)#

Return input domain for the measurement.

Return type

tmlt.core.domains.numpy_domains.NumpyDomain

property scale(self)#

Returns the noise scale.

Return type

tmlt.core.utils.exact_number.ExactNumber

property output_type(self)#

Return the output data type after being used as a UDF.

Return type

pyspark.sql.types.DataType

privacy_function(self, d_in)#

Returns the smallest d_out satisfied by the measurement.

The returned d_out is \(\frac{d_{in}}{b}\) (\(\infty\) if \(b = 0\)).

where:

  • \(d_{in}\) is the input argument “d_in”

  • \(b\) is the property “scale”
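As a minimal sketch of the formula above (not the library's implementation), the mapping from d_in to d_out can be written with exact rational arithmetic. The helper name `laplace_privacy_function` and the use of `fractions.Fraction` in place of ExactNumber are assumptions made for illustration.

```python
from fractions import Fraction
import math


def laplace_privacy_function(d_in, scale):
    """Hypothetical helper: return d_in / scale, or infinity when scale is 0."""
    d_in = Fraction(d_in)
    scale = Fraction(scale)
    if d_in < 0 or scale < 0:
        raise ValueError("d_in and scale must be non-negative")
    if scale == 0:
        # With no noise, the formula above gives an infinite d_out.
        return math.inf
    return d_in / scale


print(laplace_privacy_function(1, 2))      # 1/2
print(laplace_privacy_function("3/2", 3))  # 1/2
print(laplace_privacy_function(1, 0))      # inf
```

Doubling the scale halves the returned d_out, so a larger scale corresponds to a stronger guarantee for the same d_in.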

Parameters

d_in (tmlt.core.utils.exact_number.ExactNumberInput) – Distance between inputs under input_metric.

Return type

tmlt.core.utils.exact_number.ExactNumber

__call__(self, val)#

Returns the value with Laplace noise added.

The added Laplace noise has the probability density function

\(f(x) = \frac{1}{2 b} e ^ {\frac{-\mid x \mid}{b}}\)

where:

  • \(x\) is a real number

  • \(b\) is the property “scale”
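The density above can be illustrated with a short standard-library sketch. It assumes \(b > 0\) and uses the fact that the difference of two independent exponential variables with mean \(b\) follows this Laplace distribution. The names `laplace_pdf`, `sample_laplace`, and `add_laplace_noise` are hypothetical, and naive floating-point sampling like this is generally not considered safe for production differential privacy, so treat it as illustrative only.

```python
import math
import random


def laplace_pdf(x, b):
    """Density f(x) = exp(-|x| / b) / (2 b), for b > 0."""
    return math.exp(-abs(x) / b) / (2 * b)


def sample_laplace(b, rng):
    """Draw one sample as the difference of two exponentials with mean b."""
    return rng.expovariate(1 / b) - rng.expovariate(1 / b)


def add_laplace_noise(val, b, rng):
    """Hypothetical stand-in for __call__: return val plus Laplace noise."""
    return float(val) + sample_laplace(b, rng)


rng = random.Random(0)  # seeded only to make the sketch reproducible
print(laplace_pdf(0, 2.0))             # density at the mode, 1 / (2 b)
print(add_laplace_noise(10, 2.0, rng))
```

The mean absolute value of the noise equals \(b\), which gives a quick sanity check on any sampler for this distribution.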

Parameters

val (Union[numpy.int32, numpy.int64, numpy.float32, numpy.float64]) – Value to add Laplace noise to.

Return type

float

property input_metric(self)#

Distance metric on input domain.

Return type

tmlt.core.metrics.Metric

property output_measure(self)#

Distance measure on output.

Return type

tmlt.core.measures.Measure

property is_interactive(self)#

Returns true iff the measurement is interactive.

Return type

bool

privacy_relation(self, d_in, d_out)#

Return True if close inputs produce close outputs.

See the privacy and stability tutorial for more information.

Parameters
  • d_in (Any) – Distance between inputs under input_metric.

  • d_out (Any) – Distance between outputs under output_measure.

Return type

bool

class AddGeometricNoise(alpha)#

Bases: tmlt.core.measurements.base.Measurement

Add Geometric noise to a number.

Parameters

alpha (tmlt.core.utils.exact_number.ExactNumberInput) – Noise scale.

__init__(alpha)#

Constructor.

Parameters

alpha (ExactNumber | float | int | str | Fraction | Expr) – Noise scale.

property input_domain(self)#

Return input domain for the measurement.

Return type

tmlt.core.domains.numpy_domains.NumpyIntegerDomain

property output_type(self)#

Return the output data type after being used as a UDF.

Return type

pyspark.sql.types.DataType

property alpha(self)#

Returns the noise scale.

Return type

tmlt.core.utils.exact_number.ExactNumber

privacy_function(self, d_in)#

Returns the smallest d_out satisfied by the measurement.

The returned d_out is \(\frac{d_{in}}{\alpha}\) (\(\infty\) if \(\alpha = 0\)).

where:

  • \(d_{in}\) is the input argument “d_in”

  • \(\alpha\) is alpha
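One way to see where \(\frac{d_{in}}{\alpha}\) comes from is to check numerically how much the output probabilities can differ between two inputs that are d_in apart. The sketch below uses the probability mass function documented under `__call__`. It assumes that d_in is an absolute difference between integer inputs and that d_out is read as a pure differential privacy epsilon, since this section does not name the metric or measure. The helper names are hypothetical.

```python
import math


def geometric_pmf(k, alpha):
    """PMF of the double-sided geometric distribution with scale alpha > 0."""
    r = math.exp(1 / alpha)
    return (r - 1) / (r + 1) * math.exp(-abs(k) / alpha)


def max_log_ratio(d_in, alpha, outputs=range(-50, 51)):
    """Largest log of P[0 + noise = y] / P[d_in + noise = y] over the outputs."""
    return max(
        math.log(geometric_pmf(y, alpha) / geometric_pmf(y - d_in, alpha))
        for y in outputs
    )


alpha, d_in = 2.0, 3
# The two printed values should agree up to floating-point error.
print(max_log_ratio(d_in, alpha), d_in / alpha)
```

The log-ratio equals \(\frac{\mid y - d_{in} \mid - \mid y \mid}{\alpha}\), which the triangle inequality bounds by \(\frac{d_{in}}{\alpha}\).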

Parameters

d_in (tmlt.core.utils.exact_number.ExactNumberInput) – Distance between inputs under input_metric.

Return type

tmlt.core.utils.exact_number.ExactNumber

__call__(self, value)#

Returns the value with double-sided geometric noise added.

The added noise has the probability mass function

\[f(k)= \frac {e^{1 / \alpha} - 1} {e^{1 / \alpha} + 1} \cdot e^{\frac{-\mid k \mid}{\alpha}}\]

where:

  • \(k\) is an integer

  • \(\alpha\) is alpha

A double-sided geometric distribution is the distribution of the difference between two independent, identically distributed geometric random variables. It can be sampled by drawing two values from a geometric distribution and taking their difference.

See section 4.1 in [BV18], remark 2 in this paper, or scipy.stats.geom for more information. (Note that the parameter \(p\) used in scipy.stats.geom is related to \(\alpha\) through \(p = 1 - e^{-1 / \alpha}\)).
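The difference-of-two-geometrics construction described above can be sketched with the standard library. This is an illustration under stated assumptions, not the library's sampler: the helper names are hypothetical, \(\alpha > 0\) is assumed, and the geometric variable here counts failures before the first success (support starting at 0, whereas scipy.stats.geom starts at 1). The shift cancels when the difference is taken.

```python
import math
import random


def geometric_pmf(k, alpha):
    """PMF of the double-sided geometric distribution with scale alpha > 0."""
    r = math.exp(1 / alpha)
    return (r - 1) / (r + 1) * math.exp(-abs(k) / alpha)


def sample_geometric(alpha, rng):
    """Geometric sample with success probability p = 1 - exp(-1 / alpha).

    Uses inverse-transform sampling: since log(1 - p) = -1 / alpha,
    floor(log(u) / log(1 - p)) simplifies to floor(-alpha * log(u)).
    """
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return math.floor(-alpha * math.log(u))


def sample_double_geometric(alpha, rng):
    """Difference of two independent geometric samples."""
    return sample_geometric(alpha, rng) - sample_geometric(alpha, rng)


rng = random.Random(0)  # seeded only to make the sketch reproducible
alpha = 2.0
draws = [sample_double_geometric(alpha, rng) for _ in range(50000)]
for k in (-1, 0, 1):
    # Empirical frequency next to the exact probability mass.
    print(k, draws.count(k) / len(draws), geometric_pmf(k, alpha))
```

The empirical frequencies should approach the exact mass function as the number of draws grows.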

Parameters

value (Union[numpy.int32, numpy.int64]) – Value to add geometric noise to.

Return type

int

property input_metric(self)#

Distance metric on input domain.

Return type

tmlt.core.metrics.Metric

property output_measure(self)#

Distance measure on output.

Return type

tmlt.core.measures.Measure

property is_interactive(self)#

Returns true iff the measurement is interactive.

Return type

bool

privacy_relation(self, d_in, d_out)#

Return True if close inputs produce close outputs.

See the privacy and stability tutorial for more information.

Parameters
  • d_in (Any) – Distance between inputs under input_metric.

  • d_out (Any) – Distance between outputs under output_measure.

Return type

bool

class AddDiscreteGaussianNoise(sigma_squared)#

Bases: tmlt.core.measurements.base.Measurement

Add discrete Gaussian noise to a number.

Parameters

sigma_squared (tmlt.core.utils.exact_number.ExactNumberInput) – Noise scale. This is the variance of the discrete Gaussian distribution used for sampling noise.

__init__(sigma_squared)#

Constructor.

Parameters

sigma_squared (ExactNumber | float | int | str | Fraction | Expr) – Noise scale. This is the variance of the discrete Gaussian distribution to be used for sampling noise.

property input_domain(self)#

Return input domain for the measurement.

Return type

tmlt.core.domains.numpy_domains.NumpyIntegerDomain

property output_type(self)#

Return the output data type after being used as a UDF.

Return type

pyspark.sql.types.DataType

property sigma_squared(self)#

Returns the noise scale.

Return type

tmlt.core.utils.exact_number.ExactNumber

privacy_function(self, d_in)#

Returns the smallest d_out satisfied by the measurement.

The returned d_out is \(\frac{d_{in}^2}{2 \cdot \sigma^2}\) (\(\infty\) if \(\sigma^2 = 0\)).

where:

  • \(d_{in}\) is the input argument “d_in”

  • \(\sigma^2\) is sigma_squared

See Proposition 1.6 in this paper.
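The formula above can be sketched with exact rational arithmetic. The helper name `discrete_gaussian_privacy_function` and the use of `fractions.Fraction` in place of ExactNumber are assumptions for illustration. The quadratic dependence on d_in suggests that d_out is a zero-concentrated differential privacy parameter, but that reading is also an assumption, since this section does not name the output measure.

```python
from fractions import Fraction
import math


def discrete_gaussian_privacy_function(d_in, sigma_squared):
    """Hypothetical helper: return d_in^2 / (2 * sigma_squared), or infinity."""
    d_in = Fraction(d_in)
    sigma_squared = Fraction(sigma_squared)
    if d_in < 0 or sigma_squared < 0:
        raise ValueError("d_in and sigma_squared must be non-negative")
    if sigma_squared == 0:
        # With no noise, the formula above gives an infinite d_out.
        return math.inf
    return d_in**2 / (2 * sigma_squared)


print(discrete_gaussian_privacy_function(1, 2))      # 1/4
print(discrete_gaussian_privacy_function(3, "9/2"))  # 1
print(discrete_gaussian_privacy_function(1, 0))      # inf
```

Unlike the Laplace and geometric cases, doubling d_in here quadruples the returned d_out.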

Parameters

d_in (tmlt.core.utils.exact_number.ExactNumberInput) – Distance between inputs under input_metric.

Return type

tmlt.core.utils.exact_number.ExactNumber

__call__(self, value)#

Adds discrete Gaussian noise with specified scale.

The added noise has the probability mass function

\[f(k) = \frac {e^{-k^2/(2\sigma^2)}} { \sum_{n\in \mathbb{Z}} e^{-n^2/(2\sigma^2)} }\]

where:

  • \(k\) is an integer

  • \(\sigma^2\) is sigma_squared

See [CKS20] for more information. The formula above is based on Definition 1.
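To make the mass function concrete, the sketch below evaluates the discrete Gaussian weights \(e^{-k^2/(2\sigma^2)}\) and normalizes them over a truncated range of integers. The helper name `discrete_gaussian_pmf` and the truncation rule are assumptions for illustration, and \(\sigma^2 > 0\) is assumed. This only evaluates probabilities; it is not a sampler and not the library's implementation.

```python
import math


def discrete_gaussian_pmf(k, sigma_squared, tail=None):
    """Approximate P[noise = k] by truncating the normalizing sum."""
    if tail is None:
        # Roughly 20 standard deviations; terms beyond this are negligible.
        tail = int(20 * math.sqrt(sigma_squared)) + 20
    norm = sum(
        math.exp(-n * n / (2 * sigma_squared)) for n in range(-tail, tail + 1)
    )
    return math.exp(-k * k / (2 * sigma_squared)) / norm


sigma_squared = 4.0
support = range(-60, 61)
total = sum(discrete_gaussian_pmf(k, sigma_squared) for k in support)
variance = sum(k * k * discrete_gaussian_pmf(k, sigma_squared) for k in support)
# total should be close to 1 and variance close to sigma_squared.
print(total, variance)
```

The negative sign in the exponent is what makes the normalizing sum converge: the weights decay as \(\mid k \mid\) grows.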

Parameters

value (Union[numpy.int32, numpy.int64]) – Value to add discrete Gaussian noise to.

Return type

int

property input_metric(self)#

Distance metric on input domain.

Return type

tmlt.core.metrics.Metric

property output_measure(self)#

Distance measure on output.

Return type

tmlt.core.measures.Measure

property is_interactive(self)#

Returns true iff the measurement is interactive.

Return type

bool

privacy_relation(self, d_in, d_out)#

Return True if close inputs produce close outputs.

See the privacy and stability tutorial for more information.

Parameters
  • d_in (Any) – Distance between inputs under input_metric.

  • d_out (Any) – Distance between outputs under output_measure.

Return type

bool