_schema#
Schema management for private and public tables.
The schema represents the column types of the underlying table. This allows the data to be converted between representations, such as Analytics, Spark, and python types.
Functions#
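column_type_to_py_type | Converts a ColumnType to a python type.
analytics_to_py_types | Returns the mapping from column names to supported python types.
analytics_to_spark_schema | Convert an Analytics schema to a Spark schema.
analytics_to_spark_columns_descriptor | Convert a schema in Analytics representation to a Spark columns descriptor.
spark_schema_to_analytics_columns | Convert Spark schema to Analytics columns.
spark_dataframe_domain_to_analytics_columns | Convert a Spark dataframe domain to Analytics columns.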
- column_type_to_py_type(column_type)#
Converts a ColumnType to a python type.
- Parameters
column_type (ColumnType) –
- Return type
- analytics_to_py_types(analytics_schema)#
Returns the mapping from column names to supported python types.
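As a rough illustration of the two functions above, the sketch below uses a stand-in enum rather than the library's own `ColumnType`. The pairing of each column type with a python type (`int`, `float`, `str`, `datetime.date`, `datetime.datetime`) is an assumption made for this example, and the helpers only mirror the documented names; they are not the library's implementation.

```python
import datetime
from enum import Enum
from typing import Dict, Mapping


class ColumnType(Enum):
    """Stand-in for the documented enum, not the library class."""

    INTEGER = "INTEGER"
    DECIMAL = "DECIMAL"
    VARCHAR = "VARCHAR"
    DATE = "DATE"
    TIMESTAMP = "TIMESTAMP"


# Assumed pairing of column types with python types (not confirmed by the docs).
_PY_TYPES: Dict[ColumnType, type] = {
    ColumnType.INTEGER: int,
    ColumnType.DECIMAL: float,
    ColumnType.VARCHAR: str,
    ColumnType.DATE: datetime.date,
    ColumnType.TIMESTAMP: datetime.datetime,
}


def column_type_to_py_type(column_type: ColumnType) -> type:
    """Look up the python type for a single column type."""
    return _PY_TYPES[column_type]


def analytics_to_py_types(column_types: Mapping[str, ColumnType]) -> Dict[str, type]:
    """Map each column name to the python type of its column type."""
    return {name: column_type_to_py_type(ct) for name, ct in column_types.items()}


py_types = analytics_to_py_types(
    {"age": ColumnType.INTEGER, "name": ColumnType.VARCHAR}
)
```

Here the second helper takes a plain mapping of column names to column types for brevity, whereas the documented function takes an Analytics schema.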
- analytics_to_spark_schema(analytics_schema)#
Convert an Analytics schema to a Spark schema.
- Parameters
analytics_schema (Schema) –
- Return type
- analytics_to_spark_columns_descriptor(analytics_schema)#
Convert a schema in Analytics representation to a Spark columns descriptor.
- Parameters
analytics_schema (Schema) –
- Return type
tmlt.core.domains.spark_domains.SparkColumnsDescriptor
- spark_schema_to_analytics_columns(spark_schema)#
Convert Spark schema to Analytics columns.
- Parameters
spark_schema (pyspark.sql.types.StructType) –
- Return type
Dict[str, ColumnDescriptor]
- spark_dataframe_domain_to_analytics_columns(domain)#
Convert a Spark dataframe domain to Analytics columns.
- Parameters
domain (tmlt.core.domains.base.Domain) –
- Return type
Dict[str, ColumnDescriptor]
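The conversions above go in both directions between Analytics and Spark. A minimal way to picture such a round trip, without depending on Spark itself, is a type-name lookup table applied in each direction. In the sketch below, `to_spark_ddl` and `from_spark_ddl` are hypothetical helpers, and the pairing of Analytics types with Spark SQL type names (`bigint`, `double`, `string`, `date`, `timestamp`) is an assumption for illustration, not the library's documented mapping.

```python
from typing import Dict, Mapping

# Assumed pairing of Analytics type names with Spark SQL type names.
_SPARK_TYPE_NAMES: Dict[str, str] = {
    "INTEGER": "bigint",
    "DECIMAL": "double",
    "VARCHAR": "string",
    "DATE": "date",
    "TIMESTAMP": "timestamp",
}
# Reverse lookup for the Spark-to-Analytics direction.
_ANALYTICS_TYPE_NAMES: Dict[str, str] = {
    spark: analytics for analytics, spark in _SPARK_TYPE_NAMES.items()
}


def to_spark_ddl(column_types: Mapping[str, str]) -> str:
    """Hypothetical helper: render column types as a Spark DDL-style string."""
    return ", ".join(
        f"{name} {_SPARK_TYPE_NAMES[type_name]}"
        for name, type_name in column_types.items()
    )


def from_spark_ddl(ddl: str) -> Dict[str, str]:
    """Hypothetical helper: parse a simple DDL-style string back to column types."""
    columns: Dict[str, str] = {}
    for field in ddl.split(","):
        name, spark_type = field.split()
        columns[name] = _ANALYTICS_TYPE_NAMES[spark_type]
    return columns


original = {"age": "INTEGER", "name": "VARCHAR", "joined": "DATE"}
ddl = to_spark_ddl(original)
round_tripped = from_spark_ddl(ddl)
```

The parser only handles flat `name type` pairs separated by commas; nested or parameterized Spark types are out of scope for this sketch.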
Classes#
ColumnType | The supported SQL92 column types for Analytics data.
ColumnDescriptor | Information about a column.
Schema | Schema class describing the column information of the data.
- class ColumnType#
Bases: enum.Enum
The supported SQL92 column types for Analytics data.
- INTEGER#
Integer column type.
- DECIMAL#
Floating-point column type.
- VARCHAR#
String column type.
- DATE#
Date column type.
- TIMESTAMP#
Timestamp column type.
- name()#
The name of the Enum member.
- value()#
The value of the Enum member.
- class ColumnDescriptor#
Information about a column.
ColumnDescriptors have the following attributes:
- column_type#
A ColumnType, specifying what type this column has.
- class Schema(column_descs, grouping_column=None, id_column=None, id_space=None, default_allow_null=False, default_allow_nan=False, default_allow_inf=False)#
Bases: collections.abc.Mapping
Schema class describing the column information of the data.
- The following SQL92 types are currently supported:
INTEGER, DECIMAL, VARCHAR, DATE, TIMESTAMP
Methods#
columns | Return the names of the columns in the schema.
column_descs | Returns a mapping from column name to column descriptor.
column_types | Returns a mapping from column name to column type.
grouping_column | Returns the optional column that must be grouped by.
id_column | Return whether the grouping column is an ID column.
id_space | Return the ID space for this schema.
__eq__ | Returns True if schemas are equal.
__getitem__ | Returns the data type for the given column.
__iter__ | Return an iterator over the columns in the schema.
__len__ | Return the number of columns in the schema.
get | D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.
keys | D.keys() -> a set-like object providing a view on D’s keys
items | D.items() -> a set-like object providing a view on D’s items
values | D.values() -> an object providing a view on D’s values
- Parameters
column_descs (Mapping[str, Union[str, ColumnType, ColumnDescriptor]]) –
grouping_column (Optional[str]) –
id_column (Optional[str]) –
id_space (Optional[str]) –
default_allow_null (bool) –
default_allow_nan (bool) –
default_allow_inf (bool) –
- __init__(column_descs, grouping_column=None, id_column=None, id_space=None, default_allow_null=False, default_allow_nan=False, default_allow_inf=False)#
Constructor.
- Parameters
column_descs (Mapping[str, Union[str, ColumnType, ColumnDescriptor]]) – Mapping from column names to supported types.
grouping_column (Optional[str], default: None) – Optional column that must be grouped by in this query.
id_column (Optional[str], default: None) – The ID column on this table, if one exists.
id_space (Optional[str], default: None) – The ID space for this table, if one exists.
default_allow_null (bool, default: False) – When a ColumnType or string is used as the value in the ColumnDescriptors mapping, the column will allow_null if default_allow_null is True.
default_allow_nan (bool, default: False) – When a ColumnType or string is used as the value in the ColumnDescriptors mapping, the column will allow_nan if default_allow_nan is True.
default_allow_inf (bool, default: False) – When a ColumnType or string is used as the value in the ColumnDescriptors mapping, the column will allow_inf if default_allow_inf is True.
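The way the `default_allow_*` flags apply only to columns given as a bare type can be sketched with stand-in classes. Everything below is illustrative: the `ColumnDescriptor` dataclass and its `allow_null`, `allow_nan`, and `allow_inf` fields are assumed from the parameter names, `normalize_column_descs` is a hypothetical helper, and keeping explicit descriptors unchanged is an assumption about the behavior rather than something the documentation states.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Union


@dataclass(frozen=True)
class ColumnDescriptor:
    """Stand-in descriptor; the allow_* fields are assumed, not confirmed."""

    column_type: str
    allow_null: bool = False
    allow_nan: bool = False
    allow_inf: bool = False


def normalize_column_descs(
    column_descs: Mapping[str, Union[str, ColumnDescriptor]],
    default_allow_null: bool = False,
    default_allow_nan: bool = False,
    default_allow_inf: bool = False,
) -> Dict[str, ColumnDescriptor]:
    """Hypothetical helper: turn bare type names into full descriptors."""
    normalized: Dict[str, ColumnDescriptor] = {}
    for name, desc in column_descs.items():
        if isinstance(desc, ColumnDescriptor):
            # Assumption: an explicit descriptor is kept exactly as given.
            normalized[name] = desc
        else:
            # A bare type name picks up the schema-wide defaults.
            normalized[name] = ColumnDescriptor(
                column_type=desc,
                allow_null=default_allow_null,
                allow_nan=default_allow_nan,
                allow_inf=default_allow_inf,
            )
    return normalized


descs = normalize_column_descs(
    {"age": "INTEGER", "score": ColumnDescriptor("DECIMAL", allow_nan=True)},
    default_allow_null=True,
)
```

In this sketch, `age` inherits `allow_null=True` from the default, while `score` keeps only the flags set on its own descriptor.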
- property columns#
Return the names of the columns in the schema.
- property column_descs#
Returns a mapping from column name to column descriptor.
- Return type
Dict[str, ColumnDescriptor]
- property column_types#
Returns a mapping from column name to column type.
- property grouping_column#
Returns the optional column that must be grouped by.
- Return type
Optional[str]
- __eq__(other)#
Returns True if schemas are equal.
- __getitem__(column)#
Returns the data type for the given column.
- Parameters
column (str) – The column to get the data type for.
- Return type
- get(key, default=None)#
D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.
- keys()#
D.keys() -> a set-like object providing a view on D’s keys
- items()#
D.items() -> a set-like object providing a view on D’s items
- values()#
D.values() -> an object providing a view on D’s values
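Because `Schema` derives from `collections.abc.Mapping`, methods such as `get`, `keys`, `items`, and `values` come from the abstract base class once `__getitem__`, `__iter__`, and `__len__` are defined. The sketch below shows that mechanism with `TinySchema`, a hypothetical stand-in that stores type names as plain strings; it is not the library's `Schema` class.

```python
from collections.abc import Mapping
from typing import Dict, Iterator


class TinySchema(Mapping):
    """Hypothetical stand-in showing which methods Mapping supplies."""

    def __init__(self, column_types: Dict[str, str]) -> None:
        self._column_types = dict(column_types)

    def __getitem__(self, column: str) -> str:
        # Return the data type for the given column.
        return self._column_types[column]

    def __iter__(self) -> Iterator[str]:
        # Iterate over the column names.
        return iter(self._column_types)

    def __len__(self) -> int:
        # Number of columns in the schema.
        return len(self._column_types)


schema = TinySchema({"age": "INTEGER", "name": "VARCHAR"})

# get, keys, items, values, and `in` are all inherited from Mapping.
missing = schema.get("missing")
column_names = list(schema.keys())
pairs = dict(schema.items())
```

Looking up an unknown column with `schema["missing"]` raises `KeyError` in this sketch, while `get` returns the default instead.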