testing#
Utilities for testing.
Functions#
- assert_dataframe_equal – Assert that two DataFrames are equal, ignoring the order of rows.
- pandas_to_spark_dataframe – Convert a Pandas dataframe into a Spark dataframe in a more general way.
- get_all_props – Returns all properties and fields of a component.
- assert_property_immutability – Raises error if property is mutable.
- create_mock_transformation – Returns a mocked Transformation with the given properties.
- create_mock_queryable – Returns a mocked Queryable.
- create_mock_measurement – Returns a mocked Measurement with the given properties.
- parametrize – Parametrize a test using Case.
- run_test_using_ks_test – Runs the given KSTestCase.
- run_test_using_chi_squared_test – Runs the given ChiSquaredTestCase.
- get_values_summing_to_loc – Returns a list of n values that sum to loc.
- get_sampler – Returns a sampler function.
- get_noise_scales – Get noise scale per output column for an aggregation.
- Returns probability mass/density functions for different noise mechanisms.
- assert_dataframe_equal(actual, expected)#
Assert that two DataFrames are equal, ignoring the order of rows.
If both inputs are Pandas DataFrames, this method uses pandas.testing.assert_frame_equal() with check_dtype=False, so two DataFrames that differ only in the type of a column are considered equal.
If both inputs are Spark DataFrames, then on PySpark 3.5 and above this method uses pyspark.testing.assertDataFrameEqual() with its default options (which correctly compares NaNs and nulls). On older versions of Spark, this method falls back on comparing the dataframes in Pandas in all cases, and null and NaN values may be considered equal to one another.
If one input is a Spark DataFrame and the other is a Pandas DataFrame, the Spark DataFrame is converted to Pandas before comparison.
- Parameters:
actual (Union[pyspark.sql.DataFrame, pandas.DataFrame])
expected (Union[pyspark.sql.DataFrame, pandas.DataFrame])
- Return type:
None
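For instance, a minimal usage sketch (the import path and the local SparkSession setup are assumptions, not taken from this page):
>>> import pandas as pd
>>> from pyspark.sql import SparkSession
>>> from tmlt.core.utils.testing import assert_dataframe_equal  # import path assumed
>>> spark = SparkSession.builder.getOrCreate()
>>> # Same rows in a different order; the Spark frame is converted to Pandas
>>> # and compared ignoring row order, so this assertion passes.
>>> actual = spark.createDataFrame(pd.DataFrame({"A": [2, 1], "B": [20, 10]}))
>>> expected = pd.DataFrame({"A": [1, 2], "B": [10, 20]})
>>> assert_dataframe_equal(actual, expected)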
- pandas_to_spark_dataframe(spark, df, domain)#
Convert a Pandas dataframe into a Spark dataframe in a more general way.
This function avoids some edge cases that spark.createDataFrame(pandas_df) doesn’t handle correctly, mostly surrounding dataframes with no rows or no columns. Note that domain must be a SparkDataFrameDomain; the less-restrictive type annotation makes it easier to use based on the input/output domain taken from a transformation that is known to only allow SparkDataFrameDomains.
- Parameters:
spark (pyspark.sql.SparkSession)
df (pandas.DataFrame)
domain (tmlt.core.domains.base.Domain)
- Return type:
pyspark.sql.DataFrame
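As a sketch of typical usage; the domain construction and import paths below are assumptions based on the type references above, not taken from this page:
>>> import pandas as pd
>>> from pyspark.sql import SparkSession
>>> from tmlt.core.domains.spark_domains import (
...     SparkDataFrameDomain,
...     SparkIntegerColumnDescriptor,
... )
>>> from tmlt.core.utils.testing import pandas_to_spark_dataframe  # import path assumed
>>> spark = SparkSession.builder.getOrCreate()
>>> domain = SparkDataFrameDomain({"A": SparkIntegerColumnDescriptor()})
>>> # A dataframe with no rows: one of the edge cases that
>>> # spark.createDataFrame handles poorly.
>>> empty = pd.DataFrame({"A": pd.Series([], dtype="int64")})
>>> sdf = pandas_to_spark_dataframe(spark, empty, domain)
>>> sdf.count()
0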
- get_all_props(Component)#
Returns all properties and fields of a component.
- assert_property_immutability(component, prop_name)#
Raises error if property is mutable.
- Parameters:
component (Any) – Privacy framework component whose attribute is to be checked.
prop_name (str) – Name of property to be checked.
- Return type:
None
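These two helpers are typically used together in component tests. The sketch below assumes that get_all_props yields property names as strings; MyComponent and component are hypothetical stand-ins for the class and instance under test:
>>> from tmlt.core.utils.testing import (  # import path assumed
...     assert_property_immutability,
...     get_all_props,
... )
>>> # ``MyComponent`` and ``component`` are hypothetical placeholders.
>>> def test_properties_are_immutable(component):
...     for prop_name in get_all_props(MyComponent):  # assumed to yield names
...         assert_property_immutability(component, prop_name)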
- create_mock_transformation(input_domain=NumpyIntegerDomain(), input_metric=AbsoluteDifference(), output_domain=NumpyIntegerDomain(), output_metric=AbsoluteDifference(), return_value=0, stability_function_implemented=False, stability_function_return_value=ExactNumber(1), stability_relation_return_value=True)#
Returns a mocked Transformation with the given properties.
- Parameters:
input_domain (tmlt.core.domains.base.Domain) – Input domain for the mock.
input_metric (tmlt.core.metrics.Metric) – Input metric for the mock.
output_domain (tmlt.core.domains.base.Domain) – Output domain for the mock.
output_metric (tmlt.core.metrics.Metric) – Output metric for the mock.
return_value (Any) – Return value for the Transformation’s __call__.
stability_function_implemented (bool) – If False, raises a NotImplementedError with the message “TEST” when the stability function is called.
stability_function_return_value (Any) – Return value for the Transformation’s stability function.
stability_relation_return_value (bool) – Return value for the Transformation’s stability relation.
- Return type:
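For example, a test might build a mock whose stability function is implemented and returns a fixed value. This is a sketch using only the parameters documented above; the import path is assumed, and the expected results are inferred from the parameter descriptions:
>>> from tmlt.core.utils.exact_number import ExactNumber
>>> from tmlt.core.utils.testing import create_mock_transformation  # import path assumed
>>> mock = create_mock_transformation(
...     return_value=42,
...     stability_function_implemented=True,
...     stability_function_return_value=ExactNumber(2),
... )
>>> mock(0)  # __call__ returns return_value regardless of input
42
>>> mock.stability_function(ExactNumber(1)) == ExactNumber(2)
True
>>> mock.stability_relation(ExactNumber(1), ExactNumber(2))
True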
- create_mock_queryable(return_value=0)#
Returns a mocked Queryable.
- Parameters:
return_value (Any) – Return value for the Queryable’s __call__.
- Return type:
- create_mock_measurement(input_domain=NumpyIntegerDomain(), input_metric=AbsoluteDifference(), output_measure=PureDP(), is_interactive=False, return_value=np.int64(0), privacy_function_implemented=False, privacy_function_return_value=ExactNumber(1), privacy_relation_return_value=True)#
Returns a mocked Measurement with the given properties.
- Parameters:
input_domain (tmlt.core.domains.base.Domain) – Input domain for the mock.
input_metric (tmlt.core.metrics.Metric) – Input metric for the mock.
output_measure (tmlt.core.measures.Measure) – Output measure for the mock.
is_interactive (bool) – Whether the mock should be interactive.
return_value (Any) – Return value for the Measurement’s __call__.
privacy_function_implemented (bool) – If False, raises a NotImplementedError with the message “TEST” when the privacy function is called.
privacy_function_return_value (Any) – Return value for the Measurement’s privacy function.
privacy_relation_return_value (bool) – Return value for the Measurement’s privacy relation.
- Return type:
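A companion sketch for measurements, again using only the documented parameters (import path assumed; behavior inferred from the descriptions above):
>>> from tmlt.core.measures import RhoZCDP
>>> from tmlt.core.utils.testing import create_mock_measurement  # import path assumed
>>> measurement = create_mock_measurement(
...     output_measure=RhoZCDP(),
...     return_value=7,
...     privacy_function_implemented=False,
... )
>>> measurement(object())  # __call__ returns return_value
7
>>> measurement.privacy_relation(1, 1)  # privacy_relation_return_value
True
>>> measurement.privacy_function(1)  # unimplemented, per the flag above
Traceback (most recent call last):
...
NotImplementedError: TEST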
- parametrize(*cases, **kwargs)#
Parametrize a test using Case.
Provides a wrapper around pytest.mark.parametrize to allow passing a collection of instances of Case. The argument list provided to the test function is the union of all arguments provided to the Cases parametrized over; if a Case does not specify a particular argument, None is passed. As an example:
>>> @parametrize(
...     Case()(x=5, y=3, z=2),
...     Case("custom")(x=1),
...     Case("large", marks=pytest.mark.slow)(x=500, y=10, z=50),
... )
... def test_func(x, y, z):
...     print(x, y, z)
This parametrization would produce 3 concrete tests:
- One with parameters x=5, y=3, z=2, with no marks and the automatically-generated name from pytest (5-3-2).
- One with parameters x=1, y=None, z=None and the custom name custom.
- One with parameters x=500, y=10, z=50, the custom name large, and the slow pytest mark.
In addition to the series of Cases, parametrize() accepts keyword arguments to be passed on to pytest.mark.parametrize, allowing the use of options like indirect. The argnames, argvalues, and ids options may not be passed this way, as their values are generated by parametrize().
- Parameters:
cases (Union[Case, _NestedCases]) – A collection of test cases.
kwargs (Any) – Keyword arguments to be passed through to pytest.mark.parametrize.
- Return type:
Callable
- run_test_using_ks_test(case, p_threshold, noise_scale_fudge_factor)#
Runs the given KSTestCase.
- Parameters:
case (KSTestCase)
p_threshold (float)
noise_scale_fudge_factor (float)
- Return type:
None
- run_test_using_chi_squared_test(case, p_threshold, noise_scale_fudge_factor)#
Runs the given ChiSquaredTestCase.
- Parameters:
case (ChiSquaredTestCase)
p_threshold (float)
noise_scale_fudge_factor (float)
- Return type:
None
- get_values_summing_to_loc(loc: int, n: int) → List[int]#
- get_values_summing_to_loc(loc: float, n: int) → List[float]
Returns a list of n values that sum to loc.
- Parameters:
loc – Value that the returned list sums to. If this is a float, a list of floats will be returned; otherwise this must be an int, and a list of ints will be returned.
n – Desired list size.
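A brief sketch of the documented contract; the particular values returned are unspecified, so only the length and the sum are checked, and the import path is assumed:
>>> from tmlt.core.utils.testing import get_values_summing_to_loc  # import path assumed
>>> values = get_values_summing_to_loc(10, n=4)
>>> len(values) == 4 and sum(values) == 10
True
>>> float_values = get_values_summing_to_loc(2.5, n=3)
>>> abs(sum(float_values) - 2.5) < 1e-9
True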
- get_sampler(measurement, dataset, post_processor, iterations=1)#
Returns a sampler function.
A sampler function takes no arguments and produces a numpy array containing samples obtained by performing groupby-agg on the given dataset.
- Parameters:
measurement (tmlt.core.measurements.base.Measurement) – Measurement to sample from.
dataset (FixedGroupDataSet) – FixedGroupDataSet object containing DataFrame to perform measurement on.
post_processor (Callable[[pyspark.sql.DataFrame], pyspark.sql.DataFrame]) – Function to process measurement’s output DataFrame and select relevant columns.
iterations (int) – Number of iterations of groupby-agg.
- Return type:
Callable[[], Dict[str, numpy.ndarray]]
- get_noise_scales(agg, budget, dataset, noise_mechanism)#
Get noise scale per output column for an aggregation.
- Parameters:
agg (str)
budget (tmlt.core.utils.exact_number.ExactNumberInput)
dataset (FixedGroupDataSet)
noise_mechanism (tmlt.core.measurements.aggregations.NoiseMechanism)
- Return type:
Classes#
- FakeAggregate – Dummy Pandas Series aggregation for testing purposes.
- PySparkTest – Create a pyspark testing base class for all tests.
- TestComponent – Helper class for component tests.
- Case – A test case, for use with parametrize().
- FixedGroupDataSet – Encapsulates a Spark DataFrame with a specified number of identical groups.
- KSTestCase – Test case for run_test_using_ks_test().
- ChiSquaredTestCase – Test case for run_test_using_chi_squared_test().
- class FakeAggregate#
Bases:
tmlt.core.measurements.pandas_measurements.dataframe.Aggregate
Dummy Pandas Series aggregation for testing purposes.
- property input_domain: tmlt.core.domains.pandas_domains.PandasDataFrameDomain#
Return input domain for the measurement.
- property output_schema: pyspark.sql.types.StructType#
Return the output schema.
- Return type:
pyspark.sql.types.StructType
- property input_metric: tmlt.core.metrics.Metric#
Distance metric on input domain.
- Return type:
tmlt.core.metrics.Metric
- property output_measure: tmlt.core.measures.Measure#
Distance measure on output.
- Return type:
tmlt.core.measures.Measure
- __init__()#
Constructor.
- Return type:
None
- privacy_relation(_, __)#
Always returns False, for testing purposes.
- Parameters:
_ (tmlt.core.utils.exact_number.ExactNumberInput)
__ (tmlt.core.utils.exact_number.ExactNumberInput)
- Return type:
bool
- __call__(data)#
Perform dummy measurement.
- Parameters:
data (pandas.DataFrame)
- Return type:
- privacy_function(d_in)#
Returns the smallest d_out satisfied by the measurement.
See the privacy and stability tutorial for more information.
- Parameters:
d_in (Any) – Distance between inputs under input_metric.
- Raises:
NotImplementedError – If not overridden.
- Return type:
Any
- class PySparkTest(methodName='runTest')#
Bases:
unittest.TestCase
Create a pyspark testing base class for all tests.
All the unit test methods in the same test class can share or reuse the same spark context.
- property spark: pyspark.sql.SparkSession#
Returns the spark session.
- Return type:
pyspark.sql.SparkSession
- __init__(methodName='runTest')#
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- classmethod suppress_py4j_logging()#
Remove noise in the logs irrelevant to testing.
- Return type:
None
- classmethod setUpClass()#
Sets up the SparkSession.
- Return type:
None
- classmethod tearDownClass()#
Tears down the SparkSession.
- Return type:
None
- classmethod assert_frame_equal_with_sort(first_df, second_df, sort_columns=None, **kwargs)#
Asserts that the two data frames are equal.
Wrapper around pandas test function. Both dataframes are sorted since the ordering in Spark is not guaranteed.
- Parameters:
first_df (pandas.DataFrame) – First dataframe to compare.
second_df (pandas.DataFrame) – Second dataframe to compare.
sort_columns (Optional[Sequence[str]]) – Names of columns to sort on. By default, sorts by all columns.
**kwargs (Any) – Keyword arguments that will be passed to assert_frame_equal().
- Return type:
None
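A typical test class might look like the following sketch; the class and test names are illustrative, and the import path is assumed:
>>> import pandas as pd
>>> from tmlt.core.utils.testing import PySparkTest  # import path assumed
>>> class TestMyComponent(PySparkTest):
...     def test_roundtrip(self):
...         sdf = self.spark.createDataFrame(pd.DataFrame({"A": [1, 2]}))
...         # The helper sorts both frames, so row order does not matter.
...         self.assert_frame_equal_with_sort(
...             sdf.toPandas(), pd.DataFrame({"A": [2, 1]})
...         )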
- class TestComponent(methodName='runTest')#
Bases:
PySparkTest
Helper class for component tests.
- property spark: pyspark.sql.SparkSession#
Returns the spark session.
- Return type:
pyspark.sql.SparkSession
- __init__(methodName='runTest')#
Create an instance of the class that will use the named test method when executed. Raises a ValueError if the instance does not have a method with the specified name.
- setUp()#
Common setup for all component tests.
- Return type:
None
- classmethod suppress_py4j_logging()#
Remove noise in the logs irrelevant to testing.
- Return type:
None
- classmethod setUpClass()#
Sets up the SparkSession.
- Return type:
None
- classmethod tearDownClass()#
Tears down the SparkSession.
- Return type:
None
- classmethod assert_frame_equal_with_sort(first_df, second_df, sort_columns=None, **kwargs)#
Asserts that the two data frames are equal.
Wrapper around pandas test function. Both dataframes are sorted since the ordering in Spark is not guaranteed.
- Parameters:
first_df (pandas.DataFrame) – First dataframe to compare.
second_df (pandas.DataFrame) – Second dataframe to compare.
sort_columns (Optional[Sequence[str]]) – Names of columns to sort on. By default, sorts by all columns.
**kwargs (Any) – Keyword arguments that will be passed to assert_frame_equal().
- Return type:
None
- class Case(id=None, **kwargs)#
A test case, for use with parametrize().
Each instance of Case corresponds to a single pytest.param. The Case constructor arguments are passed to pytest.param as keyword arguments, while those passed to Case.__call__() are used as arguments to the test function. Some examples:
>>> # Simplest case -- a single parameter using the default name generated
>>> # by pytest
>>> _ = Case()(x=1)
>>> # Multiple parameters
>>> _ = Case()(x=1, y=2, z=3)
>>> # Passing a custom name for the test
>>> _ = Case("dict")(d={1:2, 3:4})
>>> # Using pytest marks
>>> _ = Case(marks=pytest.mark.xfail)(x=1)
For usage information, see parametrize().
- Parameters:
id (Optional[str])
kwargs (Any)
- property args: Dict[str, Any]#
The arguments passed to the test function in this test case.
- Return type:
Dict[str, Any]
- property kwargs: Dict[str, Any]#
The keyword arguments passed to pytest.param for this test case.
- Return type:
Dict[str, Any]
- __init__(id=None, **kwargs)#
Constructor.
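Assuming Case.__call__() returns the case itself, as the chained-call examples above suggest, the args property can be inspected like this sketch (import path assumed):
>>> from tmlt.core.utils.testing import Case  # import path assumed
>>> case = Case("custom")(x=1, y=2)
>>> case.args == {"x": 1, "y": 2}
True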
- class FixedGroupDataSet#
Encapsulates a Spark DataFrame with a specified number of identical groups.
The DataFrame contains columns A and B – column ‘A’ corresponds to group index and column ‘B’ corresponds to the measure column (to be aggregated).
- property domain: tmlt.core.domains.spark_domains.SparkDataFrameDomain#
Return dataframe domain.
- Return type:
tmlt.core.domains.spark_domains.SparkDataFrameDomain
- property lower: tmlt.core.utils.exact_number.ExactNumber#
Returns a lower bound on the values in B.
- Return type:
tmlt.core.utils.exact_number.ExactNumber
- property upper: tmlt.core.utils.exact_number.ExactNumber#
Returns an upper bound on the values in B.
- Return type:
tmlt.core.utils.exact_number.ExactNumber
- groupby(noise_mechanism)#
Returns appropriate GroupBy transformation.
- Parameters:
noise_mechanism (tmlt.core.measurements.aggregations.NoiseMechanism)
- Return type:
tmlt.core.transformations.spark_transformations.groupby.GroupBy
- get_dataframe()#
Returns dataframe.
- Return type:
pyspark.sql.DataFrame
- class KSTestCase(sampler=None, locations=None, scales=None, cdfs=None)#
Test case for run_test_using_ks_test().
- Parameters:
- __init__(sampler=None, locations=None, scales=None, cdfs=None)#
Constructor.
- class ChiSquaredTestCase(sampler=None, locations=None, scales=None, cmfs=None, pmfs=None)#
Test case for run_test_using_chi_squared_test().
- Parameters:
- __init__(sampler=None, locations=None, scales=None, cmfs=None, pmfs=None)#
Constructor.