Tumult Analytics documentation

Tumult Analytics is a Python library for computing aggregate queries on tabular data using differential privacy.

Tumult Analytics is…

… easy to use: its interface will seem familiar to anyone with prior experience with tools like SQL or PySpark.
… feature-rich: it supports a large and ever-growing list of aggregation functions, data transformation operators, and privacy definitions.
… robust: it is built and maintained by a team of differential privacy experts, and runs in production at institutions like the U.S. Census Bureau.
… scalable: it runs on Spark, so it can scale to very large datasets.

For new users, this Colab notebook demonstrates basic features of the library without requiring a local installation. To explore further, start with the installation instructions, then follow our tutorial series.

If you have any questions, feedback, or feature requests, please let us know on Slack!

Tutorials

Learn the basics of how to use the library. No prior knowledge of differential privacy is required!

Topic guides

Dive deeper into specific aspects of the library, and understand in more detail how it works behind the scenes.

Deployment

Find step-by-step instructions on how to deploy and troubleshoot Tumult Analytics in a variety of environments in the how-to guides.

API reference

Browse detailed documentation of all packages, classes, and methods in Tumult Analytics.

The Tumult Analytics documentation introduces all of the concepts necessary to get started producing differentially private results. Users who wish to learn more about the fundamentals of differential privacy can consult this blog post series or this longer introduction.
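For readers new to the idea, the following is a minimal sketch of a differentially private count using the Laplace mechanism. It is written in plain Python and does not use the Tumult Analytics API; the function name `noisy_count` and the sample data are hypothetical, and naive floating-point sampling like this is for illustration only, not for production use.

```python
import random


def noisy_count(records, epsilon, rng=None):
    """Return len(records) plus Laplace noise of scale 1/epsilon.

    Illustrative sketch only: this is not the Tumult Analytics API.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    # Adding or removing one record changes a count by at most 1,
    # so the noise scale is 1 / epsilon.
    scale = 1.0 / epsilon
    # The difference of two independent exponential samples with the
    # same rate follows a Laplace distribution centered at 0.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return len(records) + noise


# Hypothetical data, used only to show the call.
members = ["alice", "bob", "carol", "dave"]
print(noisy_count(members, epsilon=1.0))
```

Smaller values of `epsilon` add more noise and give stronger privacy; larger values give results closer to the true count.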

Additional resources

Contact us

The best place to ask questions, file feature requests, or give feedback about Tumult Analytics is our Slack server. We also use it for announcements of new releases and feature additions.

Cite us

If you use Tumult Analytics in a scientific publication, please cite the published software, its whitepaper, or both. Both citations are given below; in the software citation, replace the version with the one you are using.

@software{tumultanalyticssoftware,
    author = {Tumult Labs},
    title = {Tumult {{Analytics}}},
    month = dec,
    year = 2022,
    version = {latest},
    url = {https://tmlt.dev}
}

@article{tumultanalyticswhitepaper,
    title={Tumult {{Analytics}}: a robust, easy-to-use, scalable, and expressive framework for differential privacy},
    author={Berghel, Skye and Bohannon, Philip and Desfontaines, Damien and Estes, Charles and Haney, Sam and Hartman, Luke and Hay, Michael and Machanavajjhala, Ashwin and Magerlein, Tom and Miklau, Gerome and Pai, Amritha and Sexton, William and Shrestha, Ruchit},
    journal={arXiv preprint arXiv:2212.04133},
    month = dec,
    year={2022}
}

License

This documentation is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

The Tumult Analytics source code is licensed under the Apache License, version 2.0 (Apache-2.0).