QueryBuilder.quantile
from tmlt.analytics import QueryBuilder
- QueryBuilder.quantile(column, quantile, low, high, name=None)
Returns a quantile query ready to be evaluated.
Note
If the column being measured contains NaN or null values, a drop_null_and_nan() query will be performed first. If the column being measured contains infinite values, a drop_infinity() query will be performed first.
Example
>>> my_private_data.toPandas()
   A  B  X
0  0  1  0
1  1  0  1
2  1  2  1
>>> budget = PureDPBudget(float("inf"))
>>> sess = Session.from_dataframe(
...     privacy_budget=budget,
...     source_id="my_private_data",
...     dataframe=my_private_data,
...     protected_change=AddOneRow(),
... )
>>> # Building a quantile query
>>> query = (
...     QueryBuilder("my_private_data")
...     .quantile(column="B", quantile=0.6, low=0, high=2)
... )
>>> # Answering the query with infinite privacy budget
>>> answer = sess.evaluate(
...     query,
...     PureDPBudget(float("inf"))
... )
>>> answer.toPandas()
   B_quantile(0.6)
0         1.331107
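The preprocessing described in the note can be pictured with a plain-Python sketch. The helper name `drop_unusable_values` is hypothetical and is not part of tmlt.analytics; it only mimics, on a single list of values, the effect of applying drop_null_and_nan() and then drop_infinity() before the quantile is computed.

```python
import math


def drop_unusable_values(values):
    """Hypothetical helper: keep only finite, non-null numbers."""
    kept = []
    for v in values:
        if v is None:        # null value
            continue
        if math.isnan(v):    # NaN value
            continue
        if math.isinf(v):    # positive or negative infinity
            continue
        kept.append(v)
    return kept


column_b = [1.0, None, float("nan"), 0.0, float("inf"), 2.0]
print(drop_unusable_values(column_b))  # [1.0, 0.0, 2.0]
```

Only the rows that survive this filtering would contribute to the quantile.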
- Parameters:
column (str) – The column to compute the quantile over.
quantile (float) – A number between 0 and 1 specifying the quantile to compute. For example, 0.5 would compute the median.
low (float) – The lower bound for clamping.
high (float) – The upper bound for clamping. Must be such that low is less than high.
name (Optional[str]) – The name to give the resulting aggregation column. Defaults to f"{column}_quantile({quantile})".
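To make the roles of quantile, low, high, and name concrete, here is a minimal non-private sketch in plain Python. The functions `clamp`, `default_name`, and `exact_clamped_quantile` are illustrative names, not library APIs, and the nearest-rank rule used to pick the quantile is an assumption of this sketch. The library's own result generally differs from this exact value, as the 1.331107 in the example shows.

```python
def clamp(value, low, high):
    """Force value into the closed interval [low, high]."""
    return max(low, min(high, value))


def default_name(column, quantile):
    """Default output column name, as documented for the name parameter."""
    return f"{column}_quantile({quantile})"


def exact_clamped_quantile(values, quantile, low, high):
    """Illustrative only: clamp every value, sort, then pick by nearest rank."""
    if not 0 <= quantile <= 1:
        raise ValueError("quantile must be between 0 and 1")
    if not low < high:
        raise ValueError("low must be less than high")
    clamped = sorted(clamp(v, low, high) for v in values)
    if not clamped:
        raise ValueError("no values to summarize")
    # Nearest-rank index, capped so quantile=1 selects the largest value.
    index = min(len(clamped) - 1, int(quantile * len(clamped)))
    return clamped[index]


column_b = [1, 0, 2]  # column B from the example data
print(default_name("B", 0.6))                       # B_quantile(0.6)
print(exact_clamped_quantile(column_b, 0.6, 0, 2))  # 1
print(exact_clamped_quantile([5, -3, 1], 0.5, 0, 2))  # 1 (5 -> 2, -3 -> 0)
```

Values outside [low, high] are not dropped; they are moved to the nearest bound, so overly tight bounds distort the result while overly loose bounds typically cost accuracy in the private computation.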
- Return type:
from tmlt.analytics import GroupedQueryBuilder
- GroupedQueryBuilder.quantile(column, quantile, low, high, name=None)
Returns a Query that computes a quantile within each group.
Note
If the column being measured contains NaN or null values, a drop_null_and_nan() query will be performed first. If the column being measured contains infinite values, a drop_infinity() query will be performed first.
Example
>>> my_private_data.toPandas()
   A  B  X
0  0  1  0
1  1  0  1
2  1  2  1
>>> budget = PureDPBudget(float("inf"))
>>> sess = Session.from_dataframe(
...     privacy_budget=budget,
...     source_id="my_private_data",
...     dataframe=my_private_data,
...     protected_change=AddOneRow(),
... )
>>> # Building a groupby quantile query
>>> query = (
...     QueryBuilder("my_private_data")
...     .groupby(KeySet.from_dict({"A": ["0", "1"]}))
...     .quantile(column="B", quantile=0.6, low=0, high=2)
... )
>>> # Answering the query with infinite privacy budget
>>> answer = sess.evaluate(
...     query,
...     PureDPBudget(float("inf"))
... )
>>> answer.sort("A").toPandas()
   A  B_quantile(0.6)
0  0         1.331107
1  1         1.331107
- Parameters:
column (str) – The column to compute the quantile over.
quantile (float) – A number between 0 and 1 specifying the quantile to compute. For example, 0.5 would compute the median.
low (float) – The lower bound for clamping.
high (float) – The upper bound for clamping. Must be such that low is less than high.
name (Optional[str]) – The name to give the resulting aggregation column. Defaults to f"{column}_quantile({quantile})".
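The grouped variant applies the same idea once per group key. The sketch below is a non-private illustration in plain Python; `grouped_exact_quantile` is a hypothetical name, the nearest-rank rule is an assumption, and returning None for a key with no rows is a choice made only for this sketch, not a statement about what the library returns in that case.

```python
from collections import defaultdict


def grouped_exact_quantile(rows, group_column, value_column, keys,
                           quantile, low, high):
    """Illustrative only: clamped nearest-rank quantile for each key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_column]].append(row[value_column])

    result = {}
    for key in keys:
        # Clamp each value into [low, high] before ranking.
        values = sorted(max(low, min(high, v)) for v in groups.get(key, []))
        if not values:
            result[key] = None  # sketch-only placeholder for empty groups
            continue
        index = min(len(values) - 1, int(quantile * len(values)))
        result[key] = values[index]
    return result


# Columns A and B from the example data, with A stored as strings
# to match the keys passed to KeySet.from_dict in the example.
rows = [
    {"A": "0", "B": 1},
    {"A": "1", "B": 0},
    {"A": "1", "B": 2},
]
print(grouped_exact_quantile(rows, "A", "B", ["0", "1"], 0.6, 0, 2))
# {'0': 1, '1': 2}
```

The output has one entry per requested key, mirroring the one-row-per-group shape of the answer in the example.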
- Return type: