API Reference

Tumult Analytics is a differentially private analytics library from Tumult Labs.

The library is divided into a number of modules that let users define and run complex differentially private queries. In a typical workflow, users instantiate PySpark and read in their data; create a Session object with private datasets, their corresponding privacy protections, and a privacy budget; define differentially private queries using a QueryBuilder object; and then evaluate these queries with evaluate().
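The workflow described above can be sketched as follows. This is a minimal illustration rather than a definitive recipe: the CSV path, the table name "members", and the epsilon values are hypothetical, and the choice of AddOneRow and PureDPBudget is an assumption made for the example. The imports sit inside the function so that the file can be loaded even where PySpark or Tumult Analytics is not installed.

```python
def run_private_count(csv_path, total_epsilon=1.0, query_epsilon=0.5):
    """Return a differentially private row count for a CSV file (sketch)."""
    # Local imports: only needed when the function is actually called.
    from pyspark.sql import SparkSession
    from tmlt.analytics.privacy_budget import PureDPBudget
    from tmlt.analytics.protected_change import AddOneRow
    from tmlt.analytics.query_builder import QueryBuilder
    from tmlt.analytics.session import Session

    # 1. Instantiate PySpark and read in the data.
    spark = SparkSession.builder.getOrCreate()
    df = spark.read.csv(csv_path, header=True, inferSchema=True)

    # 2. Create a Session with the private table, its protected change,
    #    and the total privacy budget.
    session = Session.from_dataframe(
        privacy_budget=PureDPBudget(epsilon=total_epsilon),
        source_id="members",  # hypothetical table name
        dataframe=df,
        protected_change=AddOneRow(),  # assumed: protect one row per person
    )

    # 3. Define a query with a QueryBuilder.
    query = QueryBuilder("members").count()

    # 4. Evaluate it, spending part of the total budget.
    #    The result is a Spark DataFrame holding the noisy count.
    return session.evaluate(
        query, privacy_budget=PureDPBudget(epsilon=query_epsilon)
    )
```

Calling `run_private_count("members.csv").show()` would print the noisy count, assuming such a file exists. Because the result is randomized, repeated runs are expected to give slightly different values.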

For specifying privacy guarantees:

  • session defines the Session, an interactive interface for evaluating differentially private queries.

  • privacy_budget contains types for representing privacy budgets.

  • protected_change contains types for expressing what changes in input tables are protected by differential privacy.

For defining queries:

  • query_builder provides an interface for constructing differentially private queries from basic query operations.

  • keyset defines KeySet, a type for specifying the keys used in group-by operations.

  • constraints contains types for representing constraints used when evaluating queries on data with the AddRowsWithID protected change.

  • truncation_strategy and binning_spec provide types that are used to define certain types of queries.
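As a rough illustration of how the query-definition modules fit together, the sketch below builds a group-by count whose groups are fixed in advance by a KeySet. The table name, column name, and key values are hypothetical, and the imports are kept inside the function so the file loads without Tumult Analytics installed.

```python
def build_grouped_count(table="members"):
    """Build (but do not evaluate) a grouped count query (sketch)."""
    # Local imports: only needed when the function is actually called.
    from tmlt.analytics.keyset import KeySet
    from tmlt.analytics.query_builder import QueryBuilder

    # The KeySet lists the group-by keys up front, so the set of output
    # groups does not depend on the private data.
    keys = KeySet.from_dict(
        {"education_level": ["high-school", "bachelors", "masters"]}  # hypothetical
    )

    # One noisy count per key listed above.
    return QueryBuilder(table).groupby(keys).count()
```

The returned query does nothing on its own; it would be passed to a Session's evaluate() together with a privacy budget.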

For evaluating error (only applicable to Analytics Pro):

  • program provides an interface for defining a particular differentially private computation as a reusable program object.

  • tuner provides an interface for evaluating and tuning the error of a program.

  • metrics defines a collection of metrics which can be used as part of the tuning process.

Modules

binning_spec

A BinningSpec defines a binning operation on a column.

config

Configuration for Tumult Analytics.

constraints

Defines Constraint types.

keyset

A KeySet specifies a list of values for one or more columns.

metrics

This is only applicable to Analytics Pro. Metrics to measure the quality of program outputs.

no_privacy_session

This is only applicable to Analytics Pro. Interactive query evaluation without any privacy guarantees.

privacy_budget

Classes for specifying privacy budgets.

program

This is only applicable to Analytics Pro. SessionProgram and SessionProgram.Builder interfaces.

protected_change

Types for programmatically specifying what changes in input tables are protected.

query_builder

An API for building differentially private queries from basic operations.

session

Interactive query evaluation using a differential privacy framework.

synthetics

Tumult Synthetics is a differentially private synthetic data generator built on Tumult Analytics.

truncation_strategy

Defines strategies for performing truncation in private joins.

tuner

This is only applicable to Analytics Pro. Interface for tuning SessionPrograms.

utils

Utility functions.

Exceptions

exception AnalyticsInternalError(message)

Bases: AssertionError

Generic error to raise for internal analytics errors.

Parameters:

message (str)
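Because the exception derives from AssertionError, a handler for AssertionError also catches it unless a more specific handler comes first. The sketch below shows that ordering using a local stand-in class that only mirrors the documented signature and base class; it is not the library's implementation. The helper names describe_failure and failing_action are hypothetical.

```python
class AnalyticsInternalError(AssertionError):
    """Local stand-in mirroring the documented signature and base class."""

    def __init__(self, message: str):
        super().__init__(message)


def describe_failure(action) -> str:
    """Run action() and report which kind of error, if any, it raised."""
    try:
        action()
    except AnalyticsInternalError as err:
        # The specific handler must come before the AssertionError handler.
        return f"internal error: {err}"
    except AssertionError as err:
        return f"assertion failed: {err}"
    return "ok"


def failing_action():
    # Hypothetical failure used only to exercise the handlers.
    raise AnalyticsInternalError("unexpected state")


print(describe_failure(failing_action))
```

With the library installed, the stand-in class would be replaced by an import of the real exception.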

[Figure: the basic workflow for most Analytics operations.]