Tumult Analytics documentation
Tumult Analytics is a Python library for computing aggregate queries on tabular data using differential privacy.
Tumult Analytics is…
… easy to use: its interface will seem familiar to anyone with prior experience with tools like SQL or PySpark.
… feature-rich: it supports a large and ever-growing list of aggregation functions, data transformation operators, and privacy definitions.
… robust: it is built and maintained by a team of differential privacy experts, and runs in production at institutions like the U.S. Census Bureau.
… scalable: it runs on Spark, so it can scale to very large datasets.
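To make "aggregate queries using differential privacy" concrete, here is a minimal sketch of a differentially private count, written with the Python standard library only. It does not use the Tumult Analytics API: the function names `laplace_noise` and `noisy_count`, the sample rows, and the choice of the Laplace mechanism are all assumptions made for illustration. It is not suitable for production use, since naive floating-point noise sampling is known to be vulnerable to attacks.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables, each with
    mean `scale`, follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(rows, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a count of matching rows with Laplace noise added.

    Assumption: neighboring datasets differ by adding or removing one row,
    so the count has sensitivity 1 and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical tabular data: one dict per row.
rows = [
    {"name": "a", "age": 34},
    {"name": "b", "age": 17},
    {"name": "c", "age": 52},
    {"name": "d", "age": 41},
]

rng = random.Random()  # unseeded, so each run gives a different answer
adults = noisy_count(rows, lambda row: row["age"] >= 18, epsilon=1.0, rng=rng)
print(f"Noisy number of adults: {adults:.2f}")
```

A smaller `epsilon` gives stronger privacy and a noisier answer; a larger one gives a more accurate answer with weaker privacy. A library such as Tumult Analytics handles this trade-off, along with budget tracking and safe noise sampling, on the user's behalf.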
For new users, this Colab notebook demonstrates basic features of the library without requiring a local installation. To explore further or work on larger datasets, a good starting point is the installation instructions.
The Tumult Analytics documentation introduces all of the concepts necessary to get started producing differentially private results. Users who wish to learn more about the fundamentals of differential privacy can consult this blog post series or this longer introduction.
Additional resources include the changelog, which describes notable changes to the library, as well as license information.
The best place to ask questions, file feature requests, or give feedback about Tumult Analytics is our Slack server. We also use it for announcements of new releases and feature additions.
This documentation is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.