AdaptiveMarginals
from tmlt.synthetics import AdaptiveMarginals
- class tmlt.synthetics.AdaptiveMarginals(total_by, total_count_budget_fraction, cases, weight=1, count_column='count')
Bases: MeasurementStrategy
An adaptive workload with multiple cases based on the total noisy count.
In particular, this workload specifies different sets of marginals to compute for different subsets of the data based on the size of the subset.
For example, in a dataset of website visits for a month for a set of websites, we might want to compute finer granularity marginals for websites that get a large number of visits while only computing coarser marginals for websites that get relatively fewer visits. We could do this using an AdaptiveMarginals as follows:

>>> adaptive_counts = AdaptiveMarginals(
...     total_by=["website_id"],
...     total_count_budget_fraction=0.15,
...     cases=[
...         AdaptiveMarginals.Case(
...             threshold=1000,
...             marginals=[
...                 Count(["website_id", "day"]),
...                 Count(["website_id", "day", "hour"]),
...             ],
...         ),
...         AdaptiveMarginals.Case(
...             threshold=500,
...             marginals=[
...                 Count(["website_id", "day"]),
...             ],
...         ),
...         AdaptiveMarginals.Case(
...             default=True,
...             marginals=[
...                 Count(["website_id", "week"]),
...             ],
...         ),
...     ],
... )

The adaptive strategy defined in the example above does the following:
1. Computes the total (noisy) count of visits for each website.
2. Divides the websites into three cases:
   - Case 1: Websites with more than 1000 visits. Computes the count of visits per day, and per hour within each day.
   - Case 2: Websites with between 500 and 1000 visits. Computes the count of visits per day.
   - Default case: Websites with fewer than 500 visits. Computes the count of visits per week.
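
The case selection described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: CaseSketch and select_case are hypothetical names, the noisy totals are made-up illustrative values, and the strict "greater than" comparison at the threshold boundary is an assumption based on the wording "more than 1000 visits".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class CaseSketch:
    """Hypothetical stand-in for AdaptiveMarginals.Case (not the library class)."""

    marginals: List[Sequence[str]]
    threshold: Optional[float] = None
    default: bool = False


def select_case(noisy_total: float, cases: List[CaseSketch]) -> CaseSketch:
    """Pick the case with the highest threshold that the noisy total exceeds.

    Falls back to the default case when no threshold is exceeded.
    Assumption: the comparison is strict (noisy_total > threshold).
    """
    thresholded = [c for c in cases if not c.default]
    # Check the largest thresholds first so the finest-grained case wins.
    for case in sorted(thresholded, key=lambda c: c.threshold, reverse=True):
        if noisy_total > case.threshold:
            return case
    for case in cases:
        if case.default:
            return case
    raise ValueError("no case matched and no default case was given")


# Mirrors the three cases from the example above, with marginals
# written as plain lists of column names.
cases = [
    CaseSketch(
        threshold=1000,
        marginals=[["website_id", "day"], ["website_id", "day", "hour"]],
    ),
    CaseSketch(threshold=500, marginals=[["website_id", "day"]]),
    CaseSketch(default=True, marginals=[["website_id", "week"]]),
]

# Illustrative noisy totals per website (invented values).
noisy_totals: Dict[str, float] = {"site_a": 1840.0, "site_b": 730.0, "site_c": 12.0}

chosen = {site: select_case(total, cases) for site, total in noisy_totals.items()}
for site, case in chosen.items():
    print(site, case.marginals)
```

Because the decision is made on the noisy total rather than the true count, a website whose true count sits near a threshold may land in either neighbouring case.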
- class Case(marginals, threshold=None, default=False)
Bases: object
A case in the adaptive workload.
- compute(session, source_id, budget, keysets, clamping_bounds)
Compute marginals adaptively based on the total noisy count.
- Return type: