QueryBuilder.bin_column
from tmlt.analytics import QueryBuilder
- QueryBuilder.bin_column(column, spec, name=None)
Creates a new column by assigning the values in a given column to bins.
An illustrated example can be found in the Simple transformations tutorial.
Example
>>> my_private_data.toPandas()
   age  income
0   11       0
1   17       6
2   30      54
3   18      14
4   59     126
5   48     163
6   76     151
7   91      18
8   48      97
9   53      85
>>> from tmlt.analytics import BinningSpec
>>> sess = Session.from_dataframe(
...     PureDPBudget(float("inf")),
...     source_id="private_data",
...     dataframe=my_private_data,
...     protected_change=AddOneRow(),
... )
>>> age_binspec = BinningSpec(
...     [0, 18, 65, 100], include_both_endpoints=False
... )
>>> income_tax_rate_binspec = BinningSpec(
...     [0, 10, 40, 86, 165], names=[10, 12, 22, 24]
... )
>>> keys = KeySet.from_dict(
...     {
...         "age_binned": age_binspec.bins(),
...         "marginal_tax_rate": income_tax_rate_binspec.bins()
...     }
... )
>>> query = (
...     QueryBuilder("private_data")
...     .bin_column("age", age_binspec)
...     .bin_column(
...         "income", income_tax_rate_binspec, name="marginal_tax_rate"
...     )
...     .groupby(keys).count()
... )
>>> answer = sess.evaluate(query, PureDPBudget(float("inf")))
>>> answer.sort("age_binned", "marginal_tax_rate").toPandas()
   age_binned  marginal_tax_rate  count
0     (0, 18]                 10      2
1     (0, 18]                 12      1
2     (0, 18]                 22      0
3     (0, 18]                 24      0
4    (18, 65]                 10      0
5    (18, 65]                 12      0
6    (18, 65]                 22      2
7    (18, 65]                 24      3
8   (65, 100]                 10      0
9   (65, 100]                 12      1
10  (65, 100]                 22      0
11  (65, 100]                 24      1
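The bin assignments in the example can be checked by hand with a small pure-Python sketch. This is not the library's implementation: `assign_bin` is a hypothetical helper written for illustration, and it assumes what the example output suggests, namely that bins are right-closed, that `include_both_endpoints` only affects whether the first bin contains its left edge, and (as an assumption of this sketch) that values outside every bin map to None.

```python
from bisect import bisect_left
from collections import Counter


def assign_bin(value, edges, include_both_endpoints=True):
    """Return the index of the bin containing value, or None if no bin does.

    Hypothetical helper, not part of tmlt.analytics. Bins are right-closed:
    (edges[i], edges[i + 1]]. When include_both_endpoints is True, the first
    bin also contains edges[0].
    """
    if value < edges[0] or value > edges[-1]:
        return None  # assumption of this sketch: out-of-range maps to None
    if value == edges[0]:
        return 0 if include_both_endpoints else None
    # bisect_left finds the first edge >= value; the bin index is one less.
    return bisect_left(edges, value) - 1


# Data and bin edges copied from the example above.
ages = [11, 17, 30, 18, 59, 48, 76, 91, 48, 53]
incomes = [0, 6, 54, 14, 126, 163, 151, 18, 97, 85]
age_edges = [0, 18, 65, 100]
income_edges = [0, 10, 40, 86, 165]
tax_rates = [10, 12, 22, 24]  # bin names for the income bins

# Count rows per (age bin index, marginal tax rate) pair, without noise.
counts = Counter(
    (
        assign_bin(age, age_edges, include_both_endpoints=False),
        tax_rates[assign_bin(income, income_edges)],
    )
    for age, income in zip(ages, incomes)
)
```

Age bin indices 0, 1, and 2 correspond to (0, 18], (18, 65], and (65, 100]. The Counter only holds combinations that occur in the data; in the example, the zero-count rows appear because the KeySet lists every combination of bins.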
- Parameters:
column (str) – Name of the column used to assign bins.
spec (BinningSpec) – A BinningSpec that defines the binning operation to be performed.
name (Optional[str]) – The name of the column that will be created. If None (the default), the new column's name is the input column name with _binned appended to it.
- Return type: