QueryBuilder.drop_infinity

from tmlt.analytics import QueryBuilder
QueryBuilder.drop_infinity(columns)

Remove rows that contain positive or negative infinity in the specified columns.

Example

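>>> # Assumes these names are exported from the top-level package, like QueryBuilder
>>> from tmlt.analytics import AddOneRow, KeySet, PureDPBudget, Session
>>> my_private_data.toPandas()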
    A  B    X
0  a1  1  1.1
1  a1  2  0.0
2  a2  2  inf
>>> budget = PureDPBudget(float("inf"))
>>> sess = Session.from_dataframe(
...     privacy_budget=budget,
...     source_id="my_private_data",
...     dataframe=my_private_data,
...     protected_change=AddOneRow(),
... )
>>> # Count query on the original data
>>> query = (
...     QueryBuilder("my_private_data")
...     .groupby(KeySet.from_dict({"A": ["a1", "a2"]}))
...     .count()
... )
>>> # Answering the query with infinite privacy budget
>>> answer = sess.evaluate(
...     query,
...     PureDPBudget(float("inf"))
... )
>>> answer.sort("A").toPandas()
    A  count
0  a1      2
1  a2      1
>>> # Building a query with a drop_infinity transformation
>>> query = (
...     QueryBuilder("my_private_data")
...     .drop_infinity(columns=["X"])
...     .groupby(KeySet.from_dict({"A": ["a1", "a2"]}))
...     .count()
... )
>>> # Answering the query with infinite privacy budget
>>> answer = sess.evaluate(
...     query,
...     PureDPBudget(float("inf"))
... )
>>> answer.sort("A").toPandas()
    A  count
0  a1      2
1  a2      0

Parameters:

columns (Optional[List[str]]) – The columns to check for positive and negative infinity. If None or an empty list, all columns are checked, so a row is dropped if any of its columns contains an infinite value.

Return type:

QueryBuilder
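
The row-dropping rule described for the columns parameter can be illustrated with a plain-Python sketch. This is not the library's implementation: drop_infinity_rows is a hypothetical helper that works on a list of dictionaries rather than a Spark DataFrame, and it only mirrors the documented behavior (check the listed columns, or every column when the list is None or empty).

```python
import math
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def drop_infinity_rows(rows: List[Row], columns: Optional[List[str]] = None) -> List[Row]:
    """Hypothetical helper: keep only rows with no +inf/-inf in the checked columns."""

    def is_infinite(value: Any) -> bool:
        # Only floats can hold infinity; strings and ints are never infinite.
        return isinstance(value, float) and math.isinf(value)

    kept = []
    for row in rows:
        # None or an empty list means "check every column in the row".
        checked = columns if columns else list(row.keys())
        if not any(is_infinite(row[name]) for name in checked):
            kept.append(row)
    return kept


# Same data as the example above.
rows = [
    {"A": "a1", "B": 1, "X": 1.1},
    {"A": "a1", "B": 2, "X": 0.0},
    {"A": "a2", "B": 2, "X": float("inf")},
]

only_x = drop_infinity_rows(rows, columns=["X"])   # drops the a2 row
all_columns = drop_infinity_rows(rows, columns=[])  # also drops the a2 row
only_b = drop_infinity_rows(rows, columns=["B"])   # keeps every row
```

In this sketch, restricting the check to column B leaves the a2 row in place, because its infinite value sits in X. That matches the documented rule that only the listed columns are examined.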