QueryBuilder

from tmlt.analytics import QueryBuilder
class tmlt.analytics.QueryBuilder(source_id)

Bases: object

High-level interface for specifying DP queries.

Each QueryBuilder instance corresponds to applying a transformation. Together the instances form a graph, which can be traversed from the root to any given node.
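
The idea of a graph of builder objects can be illustrated with a small, self-contained sketch. This is not how tmlt.analytics is implemented; the `Step` class, its `then` method, and `path_from_root` are hypothetical names invented for illustration. It only shows how each chained call can produce a new node that remembers its parent, so that the path from the root to any node can be recovered.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    """One node in a hypothetical builder graph (not the tmlt.analytics internals)."""

    description: str
    parent: Optional["Step"] = None

    def then(self, description: str) -> "Step":
        # Each call creates a new node that points back at the node it was built from.
        return Step(description, parent=self)

    def path_from_root(self) -> List[str]:
        # Walk the parent links up to the root, then reverse to get root-to-node order.
        node: Optional[Step] = self
        steps: List[str] = []
        while node is not None:
            steps.append(node.description)
            node = node.parent
        return list(reversed(steps))


root = Step("source: my_private_data")
filtered = root.then("filter: A > 0")
selected = filtered.then("select: A, B")
other_branch = root.then("rename: A -> a")  # a second branch from the same root

print(selected.path_from_root())
# ['source: my_private_data', 'filter: A > 0', 'select: A, B']
print(other_branch.path_from_root())
# ['source: my_private_data', 'rename: A -> a']
```

Because the nodes are immutable, branching from `root` twice leaves the first branch untouched, which is one reason a builder might be modelled this way.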

Example

>>> from tmlt.analytics import AddOneRow, PureDPBudget, QueryBuilder, Session
>>> my_private_data.toPandas()
   A  B  X
0  0  1  0
1  1  0  1
2  1  2  1
>>> budget = PureDPBudget(float("inf"))
>>> sess = Session.from_dataframe(
...     privacy_budget=budget,
...     source_id="my_private_data",
...     dataframe=my_private_data,
...     protected_change=AddOneRow(),
... )
>>> # Building a query
>>> query = QueryBuilder("my_private_data").count()
>>> # Answering the query with infinite privacy budget
>>> answer = sess.evaluate(
...     query,
...     PureDPBudget(float("inf"))
... )
>>> answer.toPandas()
   count
0      3
__init__(source_id)

Constructor.

Parameters:

source_id (str) – The source id used in the query_expr.

Methods

QueryBuilder.average

Returns an average query ready to be evaluated.

QueryBuilder.bin_column

Creates a new column by assigning the values in a given column to bins.

QueryBuilder.count

Returns a count query ready to be evaluated.

QueryBuilder.count_distinct

Returns a count_distinct query ready to be evaluated.

QueryBuilder.drop_infinity

Removes rows containing infinite values.

QueryBuilder.drop_null_and_nan

Removes rows containing null or NaN values.

QueryBuilder.enforce

Enforces a Constraint on the table.

QueryBuilder.filter

Filters rows matching a condition.

QueryBuilder.flat_map

Applies a mapping function to each row, returning zero or more rows.

QueryBuilder.flat_map_by_id

Applies a transformation to each group of records sharing an ID.

QueryBuilder.get_bounds

Returns a query that gets approximate upper and lower bounds for a column.

QueryBuilder.get_groups

Returns a query that gets combinations of values in the listed columns.

QueryBuilder.groupby

Groups the query by the given set of keys, returning a GroupedQueryBuilder.

QueryBuilder.histogram

Returns a count query containing the frequency of values in the specified column.

QueryBuilder.join_private

Joins the table with another QueryBuilder.

QueryBuilder.join_public

Joins the table with a DataFrame or a public source.

QueryBuilder.map

Applies a mapping function to each row.

QueryBuilder.max

Returns a quantile query requesting a maximum value, ready to be evaluated.

QueryBuilder.median

Returns a quantile query requesting a median value, ready to be evaluated.

QueryBuilder.min

Returns a quantile query requesting a minimum value, ready to be evaluated.

QueryBuilder.quantile

Returns a quantile query ready to be evaluated.

QueryBuilder.rename

Renames one or more columns in the table.

QueryBuilder.replace_infinity

Replaces +inf and -inf values in specified columns.

QueryBuilder.replace_null_and_nan

Replaces null and NaN values in specified columns.

QueryBuilder.select

Selects the specified columns, dropping the others.

QueryBuilder.stdev

Returns a standard deviation query ready to be evaluated.

QueryBuilder.sum

Returns a sum query ready to be evaluated.

QueryBuilder.variance

Returns a variance query ready to be evaluated.
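
The max, median, and min entries above are all described as quantile queries. The sketch below illustrates that relationship on plain, non-private data, assuming they correspond to the quantiles 1, 0.5, and 0 (this page does not state the values). The `nearest_rank_quantile` helper is a hypothetical name, and the nearest-rank rule is just one common quantile definition; the library's differentially private mechanism is not shown here.

```python
import math
from typing import Sequence


def nearest_rank_quantile(values: Sequence[float], q: float) -> float:
    """Non-private nearest-rank quantile (illustration only); q must be in [0, 1]."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(values)
    # Nearest-rank rule: take the ceil(q * n)-th smallest value,
    # clamped so that q = 0 selects the smallest element.
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


column_b = [1, 0, 2]  # column B from the example table at the top of this page

print(nearest_rank_quantile(column_b, 0.0))  # 0, the minimum
print(nearest_rank_quantile(column_b, 0.5))  # 1, the median
print(nearest_rank_quantile(column_b, 1.0))  # 2, the maximum
```

A differentially private answer to the same question would be noisy, so it generally would not match these exact values.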