Tumult Analytics documentation

Tumult Analytics is a Python library for computing aggregate queries on tabular data using differential privacy.

Tumult Analytics is…

  • robust: it is built and maintained by a team of differential privacy experts, and runs in production at institutions like the U.S. Census Bureau.

  • scalable: it runs on Spark, so it can scale to very large datasets.

  • easy to use: its interface will seem familiar to anyone with prior experience with tools like SQL or PySpark.

  • feature-rich: it supports a large and ever-growing list of aggregation functions, data transformation operators, and privacy definitions.

New users probably want to start with the Installation instructions. Alternatively, this Colab notebook demonstrates basic features of the library without requiring any installation.

No prior expertise in differential privacy is needed to use Tumult Analytics. Users who still wish to learn more about the fundamentals of differential privacy can consult this blog post series or this longer introduction.
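For readers curious about what "using differential privacy" means for an aggregate query, the following is a minimal, standard-library sketch of a noisy count based on the Laplace mechanism. It is not the Tumult Analytics API and not how the library is implemented; the function names (`laplace_noise`, `noisy_count`) and the sample `members` data are hypothetical, chosen only for illustration, and naive floating-point noise like this is not suitable for production use.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential variables with rate
    1/scale follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_count(rows, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a count of matching rows with Laplace noise added.

    Assumption: adding or removing one row changes the count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: one dictionary per person.
members = [{"age": age} for age in range(18, 90)]
rng = random.Random()
result = noisy_count(members, lambda row: row["age"] >= 65, epsilon=1.0, rng=rng)
print(f"Noisy count of members aged 65 or older: {result:.1f}")
```

Smaller values of `epsilon` add more noise and give stronger privacy; larger values give more accurate answers. A dedicated library takes care of details this sketch ignores, such as tracking the total privacy budget across queries.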



Tutorials

Tutorials are where new users can learn the basics of the library. No prior knowledge of differential privacy is required!


Topic guides

Topic guides dive deeper into specific aspects of the library, and explain in more detail how it works behind the scenes.


API reference

The API reference contains a detailed description of the packages, classes, and methods available in Tumult Analytics. It assumes that you have an understanding of the key concepts.


Additional resources

Additional resources include the changelog, which describes notable changes to the library, as well as contact and license information.

Documentation License

This documentation is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.