QueryBuilder.replace_null_and_nan

from tmlt.analytics import QueryBuilder
QueryBuilder.replace_null_and_nan(replace_with=None)

Replaces null and NaN values in specified columns.

Note

Null values cannot be replaced in the ID column of a table initialized with an AddRowsWithID ProtectedChange, nor in a column generated by a flat_map() with the grouping parameter set to True.

Warning

If null values are replaced in a column, Analytics raises an error when a KeySet containing a null value for that column is used.

Example

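>>> from tmlt.analytics import AddOneRow, KeySet, PureDPBudget, Session
>>> # my_private_data is an existing Spark DataFrame
>>> my_private_data.toPandas()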
      A    B    X
0  None  0.0  0.0
1     1  NaN  1.1
2     2  2.0  NaN
>>> budget = PureDPBudget(float("inf"))
>>> sess = Session.from_dataframe(
...     privacy_budget=budget,
...     source_id="my_private_data",
...     dataframe=my_private_data,
...     protected_change=AddOneRow(),
... )
>>> # Building a query with a replace_null_and_nan transformation
>>> query = (
...     QueryBuilder("my_private_data")
...     .replace_null_and_nan(
...         replace_with={
...             "A": "new_value",
...             "B": 1234,
...             "X": 56.78,
...         },
...     )
...     .groupby(KeySet.from_dict({"A": ["new_value", "1", "2"]}))
...     .count()
... )
>>> # Answering the query with infinite privacy budget
>>> answer = sess.evaluate(
...     query,
...     PureDPBudget(float("inf"))
... )
>>> answer.sort("A").toPandas()
           A  count
0          1      1
1          2      1
2  new_value      1

Parameters:

replace_with (Optional[Mapping[str, Union[int, float, str, date, datetime]]]) – A dictionary mapping column names to values used to replace null and NaN values. If None (or empty), all columns will have null and NaN values replaced with Analytics defaults; see AnalyticsDefault.

Return type:

QueryBuilder
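
The replacement behavior for an explicit replace_with mapping can be sketched in plain Python. This is only an illustrative model under stated assumptions, not how Analytics implements the transformation: replace_null_and_nan_rows is a hypothetical helper, rows are modeled as dictionaries, None stands in for null, and columns absent from the mapping are left unchanged. The default substitution used when replace_with is None or empty is not modeled here.

```python
import math
from typing import Any, Dict, List, Mapping


def replace_null_and_nan_rows(
    rows: List[Dict[str, Any]], replace_with: Mapping[str, Any]
) -> List[Dict[str, Any]]:
    """Hypothetical helper: a plain-Python model of the replacement, not library code."""

    def is_missing(value: Any) -> bool:
        # None models a null value; NaN can only occur in float values.
        return value is None or (isinstance(value, float) and math.isnan(value))

    return [
        {
            # Replace only missing values in columns named in the mapping.
            column: (
                replace_with[column]
                if column in replace_with and is_missing(value)
                else value
            )
            for column, value in row.items()
        }
        for row in rows
    ]


# Rows loosely mirroring the table shown in the example above.
rows = [
    {"A": None, "B": 0.0, "X": 0.0},
    {"A": "1", "B": float("nan"), "X": 1.1},
    {"A": "2", "B": 2.0, "X": float("nan")},
]
cleaned = replace_null_and_nan_rows(
    rows, {"A": "new_value", "B": 1234, "X": 56.78}
)
```

After this step, column A holds only the values "new_value", "1", and "2", which is why the KeySet in the example lists "new_value" rather than a null key.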