misc

Miscellaneous helper functions and classes.

Functions

get_nonconflicting_string()

Returns a string distinct from all of the given strings.

print_sdf()

Prints a Spark DataFrame in a deterministic way.

copy_if_mutable()

Returns a deep copy of the argument if it is mutable.

get_fullname()

Returns the fully qualified name of the given object.

escape_column_name()

Escapes the column name if it contains special characters.

get_materialized_df()

Returns a new DataFrame constructed after materializing the given DataFrame.

get_nonconflicting_string(strs)

Returns a string distinct from all of the given strings.

Parameters:

strs (List[str]) – Strings that the returned string must differ from.

Return type:

str
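A minimal sketch of one way to build a string that cannot collide with a given list, for instance when a temporary name is needed. The function name `nonconflicting_string_sketch` and the length-based strategy are assumptions made for illustration; the library's actual implementation may choose the string differently.

```python
from typing import List


def nonconflicting_string_sketch(strs: List[str]) -> str:
    """Return a string that is not an element of strs (illustrative only)."""
    # A string longer than every input cannot be equal to any of them.
    longest = max((len(s) for s in strs), default=0)
    return "_" * (longest + 1)


existing = ["id", "count", "_"]
fresh = nonconflicting_string_sketch(existing)
print(fresh not in existing)
```

The length-based choice keeps the sketch short; any strategy works as long as the result is guaranteed to differ from every input.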

print_sdf(sdf)

Prints a Spark DataFrame in a deterministic way.

Parameters:

sdf (pyspark.sql.DataFrame)

Return type:

None

copy_if_mutable(value)

Returns a deep copy of the argument if it is mutable.

Parameters:

value (T)

Return type:

T
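The idea can be sketched with the standard `copy` module. The name `copy_if_mutable_sketch` is hypothetical, and the rule used here for deciding what counts as mutable (a fixed set of built-in container types) is an assumption; the library may use a different test.

```python
import copy
from typing import TypeVar

T = TypeVar("T")


def copy_if_mutable_sketch(value: T) -> T:
    """Deep-copy built-in mutable containers; return anything else unchanged."""
    # Assumption: only these built-in types are treated as mutable.
    if isinstance(value, (list, dict, set, bytearray)):
        return copy.deepcopy(value)
    return value


original = {"bounds": [0, 10]}
duplicate = copy_if_mutable_sketch(original)
duplicate["bounds"].append(20)  # Changes the copy only.
print(original)
```

Because the copy is deep, nested containers such as the inner list are duplicated as well, so edits to the copy do not leak back into the original.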

get_fullname(obj)

Returns the fully qualified name of the given object.

Parameters:

obj (Any) – Object to get the name of.

Return type:

str
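In Python, a fully qualified name is commonly assembled from the `__module__` and `__qualname__` attributes. The sketch below shows that approach; the name `fullname_sketch` is hypothetical, and treating an instance as if its class had been passed is an assumption rather than documented behavior of `get_fullname`.

```python
from fractions import Fraction
from typing import Any


def fullname_sketch(obj: Any) -> str:
    """Return 'module.QualifiedName' for a class or for an instance's class."""
    # Assumption: instances are described by the name of their class.
    cls = obj if isinstance(obj, type) else type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


print(fullname_sketch(Fraction))
print(fullname_sketch(Fraction(1, 2)))
```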

escape_column_name(column_name)

Escapes the column name if it contains special characters.

Parameters:

column_name (str) – The name of the column to check and potentially escape.

Return type:

str
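A rough sketch of conditional escaping, assuming Spark SQL's backtick quoting for identifiers. The name `escape_column_name_sketch` is hypothetical, and the definition of "special" used here (any character other than an ASCII letter, digit, or underscore) is an assumption; the library's rule may be narrower or broader.

```python
import re


def escape_column_name_sketch(column_name: str) -> str:
    """Wrap the name in backticks unless it is a plain identifier."""
    # Assumption: letters, digits and underscores never need escaping.
    if re.fullmatch(r"[A-Za-z0-9_]+", column_name):
        return column_name
    # Double any embedded backtick so the quoted name stays well-formed.
    return "`" + column_name.replace("`", "``") + "`"


print(escape_column_name_sketch("age"))
print(escape_column_name_sketch("first.name"))
```

Quoting matters for names such as `first.name`, where an unescaped dot could otherwise be read as access to a nested field.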

get_materialized_df(sdf, table_name)

Returns a new DataFrame constructed after materializing the given DataFrame.

Parameters:
  • sdf (pyspark.sql.DataFrame) – DataFrame to be materialized.

  • table_name (str) – Name to be used to refer to the table. If a table with table_name already exists, an error is raised.

Return type:

pyspark.sql.DataFrame