Session.evaluate
from tmlt.analytics import Session
- Session.evaluate(query_expr, privacy_budget)
Answers a query within the given privacy budget and returns a Spark dataframe.
The type of privacy budget you pass must match the type your Session was initialized with. For example, you cannot evaluate a query using a RhoZCDPBudget if the Session was initialized with a PureDPBudget, or vice versa.
Example
>>> sess.private_sources
['my_private_data']
>>> sess.get_column_types("my_private_data")
{'A': ColumnType.VARCHAR, 'B': ColumnType.INTEGER, 'X': ColumnType.INTEGER}
>>> sess.remaining_privacy_budget
PureDPBudget(epsilon=1)
>>> # Evaluate Queries
>>> filter_query = QueryBuilder("my_private_data").filter("A > 0")
>>> count_query = filter_query.groupby(KeySet.from_dict({"X": [0, 1]})).count()
>>> count_answer = sess.evaluate(
...     query_expr=count_query,
...     privacy_budget=PureDPBudget(0.5),
... )
>>> sum_query = filter_query.sum(column="B", low=0, high=1)
>>> sum_answer = sess.evaluate(
...     query_expr=sum_query,
...     privacy_budget=PureDPBudget(0.5),
... )
>>> count_answer # TODO(#798): Seed randomness and change to toPandas()
DataFrame[X: bigint, count: bigint]
>>> sum_answer # TODO(#798): Seed randomness and change to toPandas()
DataFrame[B_sum: bigint]
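The example above spends a budget of epsilon=1 across two queries of 0.5 each, and the description requires the budget type to match the Session's. The following is a minimal plain-Python sketch of that bookkeeping, not the library's implementation. ToyPureDPBudget, ToyRhoZCDPBudget, and ToySession are hypothetical stand-ins invented for illustration, and the exception types raised here are assumptions; the real Session may signal these errors differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyPureDPBudget:
    """Hypothetical stand-in for a pure-DP budget (not the tmlt.analytics class)."""
    epsilon: float


@dataclass(frozen=True)
class ToyRhoZCDPBudget:
    """Hypothetical stand-in for a zCDP budget (not the tmlt.analytics class)."""
    rho: float


def _value(budget):
    # Read the single numeric parameter of either toy budget type.
    return budget.epsilon if isinstance(budget, ToyPureDPBudget) else budget.rho


class ToySession:
    """Hypothetical model of a Session's budget bookkeeping only.

    It answers no queries; it just tracks how much budget is left.
    """

    def __init__(self, budget):
        self.remaining = budget

    def evaluate(self, privacy_budget):
        # Rule from the text: the budget type must match the Session's type.
        # Raising TypeError is an assumption made for this sketch.
        if type(privacy_budget) is not type(self.remaining):
            raise TypeError("budget type must match the Session's budget type")
        requested = _value(privacy_budget)
        available = _value(self.remaining)
        # Assumption: a query cannot spend more than what remains.
        if requested > available:
            raise ValueError("requested budget exceeds the remaining budget")
        # Assumption: spent budget is subtracted from the remaining budget.
        self.remaining = type(self.remaining)(available - requested)


sess = ToySession(ToyPureDPBudget(1.0))
sess.evaluate(ToyPureDPBudget(0.5))  # models the count query
sess.evaluate(ToyPureDPBudget(0.5))  # models the sum query
print(sess.remaining)  # ToyPureDPBudget(epsilon=0.0)
```

In this model, a third call with any positive ToyPureDPBudget would raise ValueError because nothing is left, and a call with a ToyRhoZCDPBudget would raise TypeError regardless of the amount.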
- Parameters:
query_expr (Query) – One query expression to answer.
privacy_budget (PrivacyBudget) – The privacy budget used for the query.
- Return type: