noise_mechanisms#

Measurements for adding noise to individual numbers.

Classes#

AddLaplaceNoise

Add Laplace noise to a number.

AddGeometricNoise

Add Geometric noise to a number.

AddDiscreteGaussianNoise

Add discrete Gaussian noise to a number.

AddGaussianNoise

Add Gaussian noise to a number.

class AddLaplaceNoise(input_domain, scale)#

Bases: tmlt.core.measurements.base.Measurement

Add Laplace noise to a number.

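Parameters:
  • input_domain (tmlt.core.domains.numpy_domains.NumpyDomain) – Domain of the input.

  • scale (tmlt.core.utils.exact_number.ExactNumberInput) – Noise scale.

property input_domain: tmlt.core.domains.numpy_domains.NumpyDomain#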

Return input domain for the measurement.

Return type:

tmlt.core.domains.numpy_domains.NumpyDomain

property scale: tmlt.core.utils.exact_number.ExactNumber#

Returns the noise scale.

Return type:

tmlt.core.utils.exact_number.ExactNumber

property output_type: pyspark.sql.types.DataType#

Return the output data type after being used as a UDF.

Return type:

pyspark.sql.types.DataType

property input_metric: tmlt.core.metrics.Metric#

Distance metric on input domain.

Return type:

tmlt.core.metrics.Metric

property output_measure: tmlt.core.measures.Measure#

Distance measure on output.

Return type:

tmlt.core.measures.Measure

property is_interactive: bool#

Returns true iff the measurement is interactive.

Return type:

bool

__init__(input_domain, scale)#

Constructor.

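Parameters:
  • input_domain (tmlt.core.domains.numpy_domains.NumpyDomain) – Domain of the input.

  • scale (Union[ExactNumber, float, int, str, Fraction, Expr]) – Noise scale.

privacy_function(d_in)#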

Returns the smallest d_out satisfied by the measurement.

The returned d_out is \(\frac{d_{in}}{b}\) (\(\infty\) if \(b = 0\)).

where:

  • \(d_{in}\) is the input argument “d_in”

  • \(b\) is the property “scale”

Parameters:

d_in (tmlt.core.utils.exact_number.ExactNumberInput) – Distance between inputs under input_metric.

Return type:

tmlt.core.utils.exact_number.ExactNumber
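As a rough illustration of the formula above, the following sketch computes \(\frac{d_{in}}{b}\) with exact rational arithmetic. It does not call tmlt.core: the helper name laplace_privacy_function is hypothetical, and Python's fractions.Fraction stands in for ExactNumber.

```python
from fractions import Fraction
import math


def laplace_privacy_function(d_in, scale):
    """Hypothetical helper: d_out = d_in / scale, or infinity when scale is 0."""
    d_in, scale = Fraction(d_in), Fraction(scale)
    if scale == 0:
        return math.inf
    return d_in / scale


print(laplace_privacy_function(1, 2))      # 1/2
print(laplace_privacy_function("3/2", 3))  # 1/2
print(laplace_privacy_function(1, 0))      # inf
```

Doubling the scale halves the returned d_out, which is the usual trade-off between noise and privacy loss.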

__call__(val)#

Returns the value with Laplace noise added.

The added Laplace noise has the probability density function

\(f(x) = \frac{1}{2 b} e ^ {\frac{-\mid x \mid}{b}}\)

where:

  • \(x\) is a real number

  • \(b\) is the property “scale”

Parameters:

val (Union[numpy.int32, numpy.int64, numpy.float32, numpy.float64, float, int]) – Value to add Laplace noise to.

Return type:

float
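To make the density above concrete, here is a minimal sketch in plain Python that evaluates it and draws one noisy value. It is not the library's implementation: laplace_pdf and add_laplace_noise are hypothetical names, and the sampler relies on the standard fact that the difference of two independent exponential variables with mean \(b\) is Laplace-distributed with scale \(b\). Floating-point sampling like this is for illustration only.

```python
import math
import random


def laplace_pdf(x, scale):
    """Density f(x) = exp(-|x| / b) / (2 b) for scale b > 0."""
    return math.exp(-abs(x) / scale) / (2 * scale)


def add_laplace_noise(value, scale, rng):
    """Hypothetical sampler: value plus the difference of two exponentials."""
    if scale == 0:
        # Assumption: a scale of 0 means no noise is added.
        return float(value)
    rate = 1.0 / scale  # expovariate takes the rate, i.e. 1 / mean
    return float(value) + rng.expovariate(rate) - rng.expovariate(rate)


rng = random.Random(0)
noisy = add_laplace_noise(10, 2.0, rng)
print(laplace_pdf(0, 2.0), noisy)
```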

classmethod inverse_cdf(scale, probability)#

Inverse CDF function.

Given a probability, returns a point x such that the noise generated by this measurement will be <= x with that probability.

Parameters:
  • scale (float) – The noise scale.

  • probability (float) – The probability that noise generated by this class should fall below the threshold it returns. Must be in [0, 1].

Return type:

float
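For a zero-centered Laplace distribution the inverse CDF has a closed form, sketched below. This is an illustration of the math only, under the assumption that the noise is centered at 0; the function name laplace_inverse_cdf is hypothetical and the library's classmethod may be implemented differently.

```python
import math


def laplace_inverse_cdf(scale, probability):
    """Hypothetical helper: x such that P(noise <= x) equals probability."""
    if not 0 <= probability <= 1:
        raise ValueError("probability must be in [0, 1]")
    if probability == 0:
        return -math.inf
    if probability == 1:
        return math.inf
    if probability < 0.5:
        # Left tail: F(x) = exp(x / b) / 2
        return scale * math.log(2 * probability)
    # Right tail: F(x) = 1 - exp(-x / b) / 2
    return -scale * math.log(2 * (1 - probability))


# Upper end of a central 95% interval for noise with scale 2.
print(laplace_inverse_cdf(2.0, 0.975))
```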

privacy_relation(d_in, d_out)#

Return True if close inputs produce close outputs.

See the privacy and stability tutorial for more information.

Parameters:
  • d_in (Any) – Distance between inputs under input_metric.

  • d_out (Any) – Distance between outputs under output_measure.

Return type:

bool

class AddGeometricNoise(alpha)#

Bases: tmlt.core.measurements.base.Measurement

Add Geometric noise to a number.

Parameters:

alpha (tmlt.core.utils.exact_number.ExactNumberInput) – Noise scale.

property input_domain: tmlt.core.domains.numpy_domains.NumpyIntegerDomain#

Return input domain for the measurement.

Return type:

tmlt.core.domains.numpy_domains.NumpyIntegerDomain

property output_type: pyspark.sql.types.DataType#

Return the output data type after being used as a UDF.

Return type:

pyspark.sql.types.DataType

property alpha: tmlt.core.utils.exact_number.ExactNumber#

Returns the noise scale.

Return type:

tmlt.core.utils.exact_number.ExactNumber

property input_metric: tmlt.core.metrics.Metric#

Distance metric on input domain.

Return type:

tmlt.core.metrics.Metric

property output_measure: tmlt.core.measures.Measure#

Distance measure on output.

Return type:

tmlt.core.measures.Measure

property is_interactive: bool#

Returns true iff the measurement is interactive.

Return type:

bool

__init__(alpha)#

Constructor.

Parameters:

alpha (Union[ExactNumber, float, int, str, Fraction, Expr]) – Noise scale.

privacy_function(d_in)#

Returns the smallest d_out satisfied by the measurement.

The returned d_out is \(\frac{d_{in}}{\alpha}\) (\(\infty\) if \(\alpha = 0\)).

where:

  • \(d_{in}\) is the input argument “d_in”

  • \(\alpha\) is alpha

Parameters:

d_in (tmlt.core.utils.exact_number.ExactNumberInput) – Distance between inputs under input_metric.

Return type:

tmlt.core.utils.exact_number.ExactNumber

__call__(value)#

Returns the value with double-sided geometric noise added.

The added noise has the probability mass function

\[f(k)= \frac {e^{1 / \alpha} - 1} {e^{1 / \alpha} + 1} \cdot e^{\frac{-\mid k \mid}{\alpha}}\]

where:

  • \(k\) is an integer

  • \(\alpha\) is alpha

A double-sided geometric distribution is the distribution of the difference between two independent, identically distributed geometric random variables. (It can be sampled by drawing two values from a geometric distribution and taking their difference.)

See section 4.1 in [BV18], remark 2 in this paper, or scipy.stats.geom for more information. (Note that the parameter \(p\) used in scipy.stats.geom is related to \(\alpha\) through \(p = 1 - e^{-1 / \alpha}\)).

Parameters:

value (Union[numpy.int32, numpy.int64, float, int]) – Value to add geometric noise to.

Return type:

int
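The probability mass function and the "difference of two geometric draws" construction can be sketched in plain Python as follows. The function names are hypothetical and this is not the library's sampler; it uses ordinary floating-point arithmetic and is meant only to illustrate the formula, with \(p = 1 - e^{-1 / \alpha}\) as noted above.

```python
import math
import random


def two_sided_geometric_pmf(k, alpha):
    """f(k) = (e^(1/alpha) - 1) / (e^(1/alpha) + 1) * e^(-|k| / alpha)."""
    e = math.exp(1 / alpha)
    return (e - 1) / (e + 1) * math.exp(-abs(k) / alpha)


def sample_two_sided_geometric(alpha, rng):
    """Difference of two geometric draws with p = 1 - exp(-1 / alpha)."""

    def geometric():
        u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
        # Inverse-transform sampling; log(1 - p) = -1 / alpha.
        return math.floor(-alpha * math.log(u))

    return geometric() - geometric()


alpha = 2.0
# The tail beyond |k| = 200 is negligible for alpha = 2.
total = sum(two_sided_geometric_pmf(k, alpha) for k in range(-200, 201))
rng = random.Random(0)
draws = [sample_two_sided_geometric(alpha, rng) for _ in range(5)]
print(total, draws)
```

The sum of the mass function over a wide integer range should be very close to 1, which is a quick sanity check on the normalizing constant.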

classmethod inverse_cdf(alpha, probability)#

Inverse CDF function.

Given a probability, returns a point x such that the noise generated by this measurement will be <= x with that probability.

Parameters:
  • alpha (float) – The noise scale.

  • probability (float) – The probability that noise generated by this class should fall below the threshold it returns. Must be in [0, 1].

Return type:

float

privacy_relation(d_in, d_out)#

Return True if close inputs produce close outputs.

See the privacy and stability tutorial for more information.

Parameters:
  • d_in (Any) – Distance between inputs under input_metric.

  • d_out (Any) – Distance between outputs under output_measure.

Return type:

bool

class AddDiscreteGaussianNoise(sigma_squared)#

Bases: tmlt.core.measurements.base.Measurement

Add discrete Gaussian noise to a number.

Parameters:

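sigma_squared (tmlt.core.utils.exact_number.ExactNumberInput) – Noise scale. This is the variance of the discrete Gaussian distribution to be used for sampling noise.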

property input_domain: tmlt.core.domains.numpy_domains.NumpyIntegerDomain#

Return input domain for the measurement.

Return type:

tmlt.core.domains.numpy_domains.NumpyIntegerDomain

property output_type: pyspark.sql.types.DataType#

Return the output data type after being used as a UDF.

Return type:

pyspark.sql.types.DataType

property sigma_squared: tmlt.core.utils.exact_number.ExactNumber#

Returns the noise scale.

Return type:

tmlt.core.utils.exact_number.ExactNumber

property input_metric: tmlt.core.metrics.Metric#

Distance metric on input domain.

Return type:

tmlt.core.metrics.Metric

property output_measure: tmlt.core.measures.Measure#

Distance measure on output.

Return type:

tmlt.core.measures.Measure

property is_interactive: bool#

Returns true iff the measurement is interactive.

Return type:

bool

__init__(sigma_squared)#

Constructor.

Parameters:

sigma_squared (Union[ExactNumber, float, int, str, Fraction, Expr]) – Noise scale. This is the variance of the discrete Gaussian distribution to be used for sampling noise.

privacy_function(d_in)#

Returns the smallest d_out satisfied by the measurement.

The returned d_out is \(\frac{d_{in}^2}{2 \cdot \sigma^2}\) (\(\infty\) if \(\sigma^2 = 0\)).

where:

  • \(d_{in}\) is the input argument “d_in”

  • \(\sigma^2\) is sigma_squared

See Proposition 1.6 in this paper.

Parameters:

d_in (tmlt.core.utils.exact_number.ExactNumberInput) – Distance between inputs under input_metric.

Return type:

tmlt.core.utils.exact_number.ExactNumber
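The formula above can be evaluated exactly with rational arithmetic, as in this small sketch. It does not use tmlt.core; gaussian_privacy_function is a hypothetical name and fractions.Fraction stands in for ExactNumber.

```python
from fractions import Fraction
import math


def gaussian_privacy_function(d_in, sigma_squared):
    """Hypothetical helper: d_out = d_in^2 / (2 * sigma_squared)."""
    d_in, sigma_squared = Fraction(d_in), Fraction(sigma_squared)
    if sigma_squared == 0:
        return math.inf
    return d_in ** 2 / (2 * sigma_squared)


print(gaussian_privacy_function(1, 2))  # 1/4
print(gaussian_privacy_function(2, 2))  # 1
print(gaussian_privacy_function(1, 0))  # inf
```

Because d_in enters squared, doubling the input distance quadruples the returned d_out.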

__call__(value)#

Adds discrete Gaussian noise with specified scale.

The added noise has the probability mass function

\[f(k) = \frac {e^{-k^2/2\sigma^2}} { \sum_{n\in \mathbb{Z}} e^{-n^2/2\sigma^2} }\]

where:

  • \(k\) is an integer

  • \(\sigma^2\) is sigma_squared

See [CKS20] for more information. The formula above is based on Definition 1.

Parameters:

value (Union[numpy.int32, numpy.int64, float, int]) – Value to add discrete Gaussian noise to.

Return type:

int
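As an illustration of the mass function, the sketch below evaluates it numerically, with the normalizer summing \(e^{-n^2/2\sigma^2}\) over a truncated range of integers. The name discrete_gaussian_pmf and the truncation rule are assumptions made for this example; it is not an exact sampler such as the one described in [CKS20].

```python
import math


def discrete_gaussian_pmf(k, sigma_squared, tail=None):
    """Approximate f(k) by truncating the normalizing sum to |n| <= tail."""
    if tail is None:
        # Assumption: 20 standard deviations (plus slack) is a wide enough window.
        tail = int(20 * math.sqrt(sigma_squared)) + 20
    normalizer = sum(
        math.exp(-(n * n) / (2 * sigma_squared)) for n in range(-tail, tail + 1)
    )
    return math.exp(-(k * k) / (2 * sigma_squared)) / normalizer


sigma_squared = 4.0
masses = [discrete_gaussian_pmf(k, sigma_squared) for k in range(-60, 61)]
print(sum(masses), discrete_gaussian_pmf(0, sigma_squared))
```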

classmethod inverse_cdf(sigma_squared, probability)#

Inverse CDF function.

Given a probability, returns a point x such that the noise generated by this measurement will be <= x with that probability.

Parameters:
  • sigma_squared (float) – The noise scale.

  • probability (float) – The probability that noise generated by this class should fall below the threshold it returns. Must be in [0, 1].

Return type:

float

privacy_relation(d_in, d_out)#

Return True if close inputs produce close outputs.

See the privacy and stability tutorial for more information.

Parameters:
  • d_in (Any) – Distance between inputs under input_metric.

  • d_out (Any) – Distance between outputs under output_measure.

Return type:

bool

class AddGaussianNoise(input_domain, sigma_squared)#

Bases: tmlt.core.measurements.base.Measurement

Add Gaussian noise to a number.

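Parameters:
  • input_domain (tmlt.core.domains.numpy_domains.NumpyDomain) – Domain of the input.

  • sigma_squared (tmlt.core.utils.exact_number.ExactNumberInput) – Noise scale. This is the variance of the Gaussian distribution to be used for sampling noise.

property input_domain: tmlt.core.domains.numpy_domains.NumpyDomain#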

Return input domain for the measurement.

Return type:

tmlt.core.domains.numpy_domains.NumpyDomain

property output_type: pyspark.sql.types.DataType#

Return the output data type after being used as a UDF.

Return type:

pyspark.sql.types.DataType

property sigma_squared: tmlt.core.utils.exact_number.ExactNumber#

Returns the noise scale.

Return type:

tmlt.core.utils.exact_number.ExactNumber

property input_metric: tmlt.core.metrics.Metric#

Distance metric on input domain.

Return type:

tmlt.core.metrics.Metric

property output_measure: tmlt.core.measures.Measure#

Distance measure on output.

Return type:

tmlt.core.measures.Measure

property is_interactive: bool#

Returns true iff the measurement is interactive.

Return type:

bool

__init__(input_domain, sigma_squared)#

Constructor.

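Parameters:
  • input_domain (tmlt.core.domains.numpy_domains.NumpyDomain) – Domain of the input.

  • sigma_squared (Union[ExactNumber, float, int, str, Fraction, Expr]) – Noise scale. This is the variance of the Gaussian distribution to be used for sampling noise.

privacy_function(d_in)#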

Returns the smallest d_out satisfied by the measurement.

The returned d_out is \(\frac{d_{in}^2}{2 \cdot \sigma^2}\) (\(\infty\) if \(\sigma^2 = 0\)).

where:

  • \(d_{in}\) is the input argument “d_in”

  • \(\sigma^2\) is sigma_squared

See Proposition 1.6 in this paper.

Parameters:

d_in (tmlt.core.utils.exact_number.ExactNumberInput) – Distance between inputs under input_metric.

Return type:

tmlt.core.utils.exact_number.ExactNumber

__call__(value)#

Adds Gaussian noise with specified scale.

The added noise has the probability density function

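\[f(x) = \frac {e^{-(x-\mu)^2/2\sigma^2}} {\sqrt{2\pi}\sigma}\]

where:

  • \(x\) is a real number

  • \(\mu\) is the mean of the noise, which is 0

  • \(\sigma^2\) is sigma_squared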

If \(\sigma^2 = \infty\), then the value returned is \(\infty\) with probability 1/2 and \(-\infty\) with probability 1/2.

Parameters:

value (Union[numpy.int32, numpy.int64, numpy.float32, numpy.float64, float, int]) – Value to add Gaussian noise to.

Return type:

float
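The behavior described above, including the infinite-variance case, can be sketched with the standard library. This is an assumption-laden illustration rather than the library's implementation: add_gaussian_noise is a hypothetical name, the noise is taken to be zero-mean, and random.gauss is an ordinary floating-point sampler.

```python
import math
import random


def add_gaussian_noise(value, sigma_squared, rng):
    """Hypothetical sampler: value plus zero-mean Gaussian noise."""
    if math.isinf(sigma_squared):
        # Infinite variance: +inf or -inf, each with probability 1/2.
        return math.inf if rng.random() < 0.5 else -math.inf
    # random.gauss takes the standard deviation, not the variance.
    return float(value) + rng.gauss(0.0, math.sqrt(sigma_squared))


rng = random.Random(0)
print(add_gaussian_noise(10, 4.0, rng))
print(add_gaussian_noise(10, math.inf, rng))
```

Note that sigma_squared is a variance, so the sampler takes its square root before passing it on as a standard deviation.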

privacy_relation(d_in, d_out)#

Return True if close inputs produce close outputs.

See the privacy and stability tutorial for more information.

Parameters:
  • d_in (Any) – Distance between inputs under input_metric.

  • d_out (Any) – Distance between outputs under output_measure.

Return type:

bool