noise_mechanisms#
Measurements for adding noise to individual numbers.
Classes#
AddLaplaceNoise | Add Laplace noise to a number.
AddGeometricNoise | Add Geometric noise to a number.
AddDiscreteGaussianNoise | Add discrete Gaussian noise to a number.
AddGaussianNoise | Add Gaussian noise to a number.
- class AddLaplaceNoise(input_domain, scale)#
Bases:
tmlt.core.measurements.base.Measurement
Add Laplace noise to a number.
- Parameters:
input_domain (Union[tmlt.core.domains.numpy_domains.NumpyIntegerDomain, tmlt.core.domains.numpy_domains.NumpyFloatDomain])
scale (tmlt.core.utils.exact_number.ExactNumberInput)
- property input_domain: tmlt.core.domains.numpy_domains.NumpyDomain#
Return input domain for the measurement.
- Return type:
tmlt.core.domains.numpy_domains.NumpyDomain
- property scale: tmlt.core.utils.exact_number.ExactNumber#
Returns the noise scale.
- Return type:
tmlt.core.utils.exact_number.ExactNumber
- property output_type: pyspark.sql.types.DataType#
Return the output data type after being used as a UDF.
- Return type:
pyspark.sql.types.DataType
- property input_metric: tmlt.core.metrics.Metric#
Distance metric on input domain.
- Return type:
tmlt.core.metrics.Metric
- property output_measure: tmlt.core.measures.Measure#
Distance measure on output.
- Return type:
tmlt.core.measures.Measure
- __init__(input_domain, scale)#
Constructor.
- Parameters:
input_domain (Union[NumpyIntegerDomain, NumpyFloatDomain]) – Input domain.
scale (Union[ExactNumber, float, int, str, Fraction, Expr]) – Noise scale.
- privacy_function(d_in)#
Returns the smallest d_out satisfied by the measurement.
The returned d_out is \(\frac{d_{in}}{b}\) (\(\infty\) if \(b = 0\)),
where:
\(d_{in}\) is the input argument “d_in”
\(b\) is the property “scale”
- Parameters:
d_in (tmlt.core.utils.exact_number.ExactNumberInput) – Distance between inputs under input_metric.
- Return type:
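The formula above can be sketched in plain Python. This is a minimal illustration rather than the library's implementation: the helper name `laplace_privacy_function` is hypothetical, and `fractions.Fraction` stands in for the library's exact-number type.

```python
import math
from fractions import Fraction


def laplace_privacy_function(d_in, scale):
    """Hypothetical helper: smallest d_out for Laplace noise, d_in / scale.

    Returns infinity when scale is 0, as described in the text above.
    """
    d_in = Fraction(d_in)
    scale = Fraction(scale)
    if d_in < 0 or scale < 0:
        raise ValueError("d_in and scale must be non-negative")
    if scale == 0:
        return math.inf
    return d_in / scale


# A sensitivity of 1 with scale 1/2 gives d_out = 2.
print(laplace_privacy_function(1, "1/2"))
```

Exact rational arithmetic is used here so that the computed d_out is not affected by floating-point rounding.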
- __call__(val)#
Returns the value with Laplace noise added.
The added Laplace noise has the probability density function
\(f(x) = \frac{1}{2 b} e ^ {\frac{-\mid x \mid}{b}}\)
where:
\(x\) is a real number
\(b\) is the property “scale”
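As an illustration of this density, the sketch below evaluates it and draws samples using only the standard library. The function names are hypothetical, and the sampling method (the difference of two independent exponential variates, which follows a Laplace distribution) is an assumption chosen for brevity; the library may sample differently, for example to avoid floating-point vulnerabilities.

```python
import math
import random


def laplace_pdf(x, scale):
    """Density f(x) = exp(-|x| / b) / (2 b) with b = scale (hypothetical helper)."""
    return math.exp(-abs(x) / scale) / (2 * scale)


def add_laplace_noise(value, scale, rng):
    """Return value plus Laplace(scale) noise (illustrative, not the library's sampler).

    The difference of two independent Exponential(rate = 1 / scale)
    variates has a Laplace distribution with the given scale.
    """
    rate = 1 / scale
    return value + rng.expovariate(rate) - rng.expovariate(rate)


rng = random.Random(0)
print(add_laplace_noise(10.0, 2.0, rng))
```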
- classmethod inverse_cdf(scale, probability)#
Inverse CDF function.
Given a probability, returns a point x in the range of the noise such that the noise generated by this measurement is <= x with that probability.
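For the continuous Laplace distribution, the inverse CDF has a closed form, sketched below. The function name is hypothetical, and the sketch assumes a continuous (float) output; how the library treats integer domains or exact arithmetic is not covered here.

```python
import math


def laplace_inverse_cdf(scale, probability):
    """Hypothetical helper: x such that P(noise <= x) = probability.

    Uses x = -b * sgn(p - 1/2) * ln(1 - 2 * |p - 1/2|) with b = scale.
    """
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    centered = probability - 0.5
    return -scale * math.copysign(1.0, centered) * math.log(1 - 2 * abs(centered))


# The median of the noise is 0, and the quantiles are symmetric around it.
print(laplace_inverse_cdf(1.0, 0.5))
print(laplace_inverse_cdf(1.0, 0.75), laplace_inverse_cdf(1.0, 0.25))
```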
- privacy_relation(d_in, d_out)#
Return True if close inputs produce close outputs.
See the privacy and stability tutorial for more information.
- Parameters:
d_in (Any) – Distance between inputs under input_metric.
d_out (Any) – Distance between outputs under output_measure.
- Return type:
bool
- class AddGeometricNoise(alpha)#
Bases:
tmlt.core.measurements.base.Measurement
Add Geometric noise to a number.
- Parameters:
alpha (tmlt.core.utils.exact_number.ExactNumberInput)
- property input_domain: tmlt.core.domains.numpy_domains.NumpyIntegerDomain#
Return input domain for the measurement.
- Return type:
tmlt.core.domains.numpy_domains.NumpyIntegerDomain
- property output_type: pyspark.sql.types.DataType#
Return the output data type after being used as a UDF.
- Return type:
pyspark.sql.types.DataType
- property alpha: tmlt.core.utils.exact_number.ExactNumber#
Returns the noise scale.
- Return type:
tmlt.core.utils.exact_number.ExactNumber
- property input_metric: tmlt.core.metrics.Metric#
Distance metric on input domain.
- Return type:
tmlt.core.metrics.Metric
- property output_measure: tmlt.core.measures.Measure#
Distance measure on output.
- Return type:
tmlt.core.measures.Measure
- __init__(alpha)#
Constructor.
- privacy_function(d_in)#
Returns the smallest d_out satisfied by the measurement.
The returned d_out is \(\frac{d_{in}}{\alpha}\) (\(\infty\) if \(\alpha = 0\)),
where:
\(d_{in}\) is the input argument “d_in”
\(\alpha\) is alpha
- Parameters:
d_in (tmlt.core.utils.exact_number.ExactNumberInput) – Distance between inputs under input_metric.
- Return type:
- __call__(value)#
Returns the value with double-sided geometric noise added.
The added noise has the probability mass function
\[f(k)= \frac {e^{1 / \alpha} - 1} {e^{1 / \alpha} + 1} \cdot e^{\frac{-\mid k \mid}{\alpha}}\]
where:
\(k\) is an integer
\(\alpha\) is alpha
A double-sided geometric distribution is the distribution of the difference between two independent, identically distributed geometric random variables. It can be sampled by drawing two values from a geometric distribution and taking their difference.
See section 4.1 in [BV18], remark 2 in this paper, or scipy.stats.geom for more information. (Note that the parameter \(p\) used in scipy.stats.geom is related to \(\alpha\) through \(p = 1 - e^{-1 / \alpha}\).)
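The sampling recipe described above (the difference of two geometric draws with \(p = 1 - e^{-1 / \alpha}\)) can be sketched with the standard library alone. All function names here are hypothetical, and the inverse-transform geometric sampler is an assumption made for illustration; it is not necessarily how the library draws its noise.

```python
import math
import random


def double_sided_geometric_pmf(k, alpha):
    """PMF from the text: (e^(1/a) - 1) / (e^(1/a) + 1) * e^(-|k| / a)."""
    e = math.exp(1 / alpha)
    return (e - 1) / (e + 1) * math.exp(-abs(k) / alpha)


def sample_geometric(p, rng):
    """Number of Bernoulli(p) trials up to and including the first success."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
    return max(1, math.ceil(math.log(u) / math.log1p(-p)))


def sample_double_sided_geometric(alpha, rng):
    """Difference of two independent geometric draws (illustrative sampler)."""
    p = 1 - math.exp(-1 / alpha)
    return sample_geometric(p, rng) - sample_geometric(p, rng)


rng = random.Random(0)
print([sample_double_sided_geometric(2.0, rng) for _ in range(5)])
```

Comparing empirical frequencies from the sampler against `double_sided_geometric_pmf` is a quick way to check that the two descriptions agree.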
- classmethod inverse_cdf(alpha, probability)#
Inverse CDF function.
Given a probability, returns a point x in the range of the noise such that the noise generated by this measurement is <= x with that probability.
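Because the noise is integer-valued, a given probability generally cannot be hit exactly. The sketch below assumes one common convention, returning the smallest integer x whose CDF is at least the requested probability; this convention and the function names are assumptions, not the library's documented behavior. The CDF used is the closed form obtained by summing the PMF given above, with \(q = e^{-1 / \alpha}\).

```python
import math


def double_sided_geometric_cdf(k, alpha):
    """P(noise <= k) for integer k, from summing the PMF (hypothetical helper)."""
    q = math.exp(-1 / alpha)
    if k < 0:
        return q ** (-k) / (1 + q)
    return 1 - q ** (k + 1) / (1 + q)


def double_sided_geometric_inverse_cdf(alpha, probability):
    """Smallest integer x with CDF(x) >= probability (assumed convention)."""
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    x = 0
    # Walk right until the CDF reaches the target, then left while it still does.
    while double_sided_geometric_cdf(x, alpha) < probability:
        x += 1
    while double_sided_geometric_cdf(x - 1, alpha) >= probability:
        x -= 1
    return x


print(double_sided_geometric_inverse_cdf(1.0, 0.9))
```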
- privacy_relation(d_in, d_out)#
Return True if close inputs produce close outputs.
See the privacy and stability tutorial for more information.
- Parameters:
d_in (Any) – Distance between inputs under input_metric.
d_out (Any) – Distance between outputs under output_measure.
- Return type:
bool
- class AddDiscreteGaussianNoise(sigma_squared)#
Bases:
tmlt.core.measurements.base.Measurement
Add discrete Gaussian noise to a number.
- Parameters:
sigma_squared (tmlt.core.utils.exact_number.ExactNumberInput)
- property input_domain: tmlt.core.domains.numpy_domains.NumpyIntegerDomain#
Return input domain for the measurement.
- Return type:
tmlt.core.domains.numpy_domains.NumpyIntegerDomain
- property output_type: pyspark.sql.types.DataType#
Return the output data type after being used as a UDF.
- Return type:
pyspark.sql.types.DataType
- property sigma_squared: tmlt.core.utils.exact_number.ExactNumber#
Returns the noise scale.
- Return type:
tmlt.core.utils.exact_number.ExactNumber
- property input_metric: tmlt.core.metrics.Metric#
Distance metric on input domain.
- Return type:
tmlt.core.metrics.Metric
- property output_measure: tmlt.core.measures.Measure#
Distance measure on output.
- Return type:
tmlt.core.measures.Measure
- __init__(sigma_squared)#
Constructor.
- privacy_function(d_in)#
Returns the smallest d_out satisfied by the measurement.
The returned d_out is \(\frac{d_{in}^2}{2 \cdot \sigma^2}\) (\(\infty\) if \(\sigma^2 = 0\)),
where:
\(d_{in}\) is the input argument “d_in”
\(\sigma^2\) is sigma_squared
See Proposition 1.6 in this paper.
- Parameters:
d_in (tmlt.core.utils.exact_number.ExactNumberInput) – Distance between inputs under input_metric.
- Return type:
- __call__(value)#
Adds discrete Gaussian noise with specified scale.
The added noise has the probability mass function
\[f(k) = \frac {e^{-k^2/2\sigma^2}} { \sum_{n\in \mathbb{Z}} e^{-n^2/2\sigma^2} }\]
where:
\(k\) is an integer
\(\sigma^2\) is sigma_squared
See [CKS20] for more information. The formula above is based on Definition 1.
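To make the normalization concrete, the sketch below evaluates this mass function numerically by truncating the infinite sum to a finite window of integers. The helper name and the window width are assumptions chosen for illustration; this is not the library's sampler, and it only approximates the normalizing constant.

```python
import math


def discrete_gaussian_pmf_table(sigma_squared, tail_width=None):
    """Approximate the discrete Gaussian PMF on a finite window (hypothetical helper).

    Each integer k gets weight exp(-k^2 / (2 * sigma_squared)); the weights
    are divided by their sum over the window, which approximates the sum
    over all integers because the tails decay very quickly.
    """
    if sigma_squared <= 0:
        raise ValueError("sigma_squared must be positive")
    if tail_width is None:
        # About 20 standard deviations: the omitted mass is negligible.
        tail_width = int(20 * math.sqrt(sigma_squared)) + 20
    support = range(-tail_width, tail_width + 1)
    weights = {k: math.exp(-k * k / (2 * sigma_squared)) for k in support}
    normalizer = sum(weights.values())
    return {k: w / normalizer for k, w in weights.items()}


table = discrete_gaussian_pmf_table(4)
print(table[0], table[1], table[-1])
```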
- classmethod inverse_cdf(sigma_squared, probability)#
Inverse CDF function.
Given a probability, returns a point x in the range of the noise such that the noise generated by this measurement is <= x with that probability.
- privacy_relation(d_in, d_out)#
Return True if close inputs produce close outputs.
See the privacy and stability tutorial for more information.
- Parameters:
d_in (Any) – Distance between inputs under input_metric.
d_out (Any) – Distance between outputs under output_measure.
- Return type:
bool
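Since privacy_function is described as returning the smallest d_out satisfied by the measurement, one way to picture the relation is as a comparison against that value. The sketch below assumes exactly that; the function names are hypothetical, `fractions.Fraction` stands in for the library's exact-number type, and the library's own validation of d_in and d_out is not reproduced.

```python
import math
from fractions import Fraction


def smallest_d_out(d_in, sigma_squared):
    """d_in^2 / (2 * sigma_squared), or infinity when sigma_squared is 0."""
    d_in = Fraction(d_in)
    sigma_squared = Fraction(sigma_squared)
    if sigma_squared == 0:
        return math.inf
    return d_in ** 2 / (2 * sigma_squared)


def privacy_relation_sketch(d_in, d_out, sigma_squared):
    """True if d_out is at least the smallest d_out for d_in (assumed semantics)."""
    return smallest_d_out(d_in, sigma_squared) <= d_out


# With d_in = 1 and sigma_squared = 4, the smallest d_out is 1/8.
print(smallest_d_out(1, 4))
print(privacy_relation_sketch(1, Fraction(1, 8), 4))
print(privacy_relation_sketch(1, Fraction(1, 10), 4))
```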
- class AddGaussianNoise(input_domain, sigma_squared)#
Bases:
tmlt.core.measurements.base.Measurement
Add Gaussian noise to a number.
- Parameters:
input_domain (Union[tmlt.core.domains.numpy_domains.NumpyIntegerDomain, tmlt.core.domains.numpy_domains.NumpyFloatDomain])
sigma_squared (tmlt.core.utils.exact_number.ExactNumberInput)
- property input_domain: tmlt.core.domains.numpy_domains.NumpyDomain#
Return input domain for the measurement.
- Return type:
tmlt.core.domains.numpy_domains.NumpyDomain
- property output_type: pyspark.sql.types.DataType#
Return the output data type after being used as a UDF.
- Return type:
pyspark.sql.types.DataType
- property sigma_squared: tmlt.core.utils.exact_number.ExactNumber#
Returns the noise scale.
- Return type:
tmlt.core.utils.exact_number.ExactNumber
- property input_metric: tmlt.core.metrics.Metric#
Distance metric on input domain.
- Return type:
tmlt.core.metrics.Metric
- property output_measure: tmlt.core.measures.Measure#
Distance measure on output.
- Return type:
tmlt.core.measures.Measure
- __init__(input_domain, sigma_squared)#
Constructor.
- Parameters:
input_domain (Union[NumpyIntegerDomain, NumpyFloatDomain]) – Domain of the input.
sigma_squared (Union[ExactNumber, float, int, str, Fraction, Expr]) – Noise scale. This is the variance of the Gaussian distribution to be used for sampling noise.
- privacy_function(d_in)#
Returns the smallest d_out satisfied by the measurement.
The returned d_out is \(\frac{d_{in}^2}{2 \cdot \sigma^2}\) (\(\infty\) if \(\sigma^2 = 0\)),
where:
\(d_{in}\) is the input argument “d_in”
\(\sigma^2\) is sigma_squared
See Proposition 1.6 in this paper.
- Parameters:
d_in (tmlt.core.utils.exact_number.ExactNumberInput) – Distance between inputs under input_metric.
- Return type:
- __call__(value)#
Adds Gaussian noise with specified scale.
The added noise has the probability density function
\[f(x) = \frac {e^{-(x-\mu)^2/2\sigma^2}} {\sqrt{2\pi}\sigma}\]
If \(\sigma^2 = \infty\), then the value returned is \(\infty\) with probability 1/2 and \(-\infty\) with probability 1/2.
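The behavior described above, including the infinite-variance case, can be sketched as follows. The function name is hypothetical and `random.gauss` is a stand-in sampler; the library's actual sampling procedure and its handling of floating-point issues may differ. Note that the parameter is the variance, so the standard deviation passed to the sampler is its square root.

```python
import math
import random


def add_gaussian_noise(value, sigma_squared, rng):
    """Return value plus zero-mean Gaussian noise with variance sigma_squared.

    Illustrative only: when sigma_squared is infinite, the result is
    +infinity or -infinity, each with probability 1/2.
    """
    if sigma_squared < 0:
        raise ValueError("sigma_squared must be non-negative")
    if math.isinf(sigma_squared):
        return math.inf if rng.random() < 0.5 else -math.inf
    return value + rng.gauss(0.0, math.sqrt(sigma_squared))


rng = random.Random(0)
print(add_gaussian_noise(5.0, 9.0, rng))
print(add_gaussian_noise(5.0, math.inf, rng))
```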
- privacy_relation(d_in, d_out)#
Return True if close inputs produce close outputs.
See the privacy and stability tutorial for more information.
- Parameters:
d_in (Any) – Distance between inputs under input_metric.
d_out (Any) – Distance between outputs under output_measure.
- Return type:
bool