Module metadata

DFParquetMetadata for fetching Parquet file metadata, statistics and schema information.

Structs

CachedParquetMetaData
Wrapper to implement [FileMetadata] for [ParquetMetaData].
DFParquetMetadata
Handles fetching Parquet file schema, metadata and statistics from object store.
StatisticsAccumulators 🔒
Holds the accumulator state for collecting statistics from row groups.

Functions

create_max_min_accs 🔒
get_col_stats 🔒
has_any_exact_match 🔒
Checks if any occurrence of value in array corresponds to a true entry in the exactness array.
min_max_aggregate_data_type 🔒
Min/max aggregation can take Dictionary-encoded input but always produces unpacked (that is, non-Dictionary) output, so the output data type must be adjusted to reflect this. Min/max aggregation produces unpacked output because there is only one min/max value per group; there is no need to keep the values Dictionary encoded.
summarize_min_max_null_counts 🔒
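
The two documented private helpers, has_any_exact_match and min_max_aggregate_data_type, can be illustrated with a minimal sketch. This is not the module's actual code: the real functions operate on Arrow arrays and Arrow's DataType, whereas the sketch below uses plain slices and a hypothetical stand-in enum (SketchType) so that it compiles without external crates. The names unpacked_min_max_type and any_exact_match are likewise invented for illustration.

```rust
/// Hypothetical, simplified stand-in for Arrow's `DataType`
/// (illustration only; not the real type).
#[derive(Debug, Clone, PartialEq)]
enum SketchType {
    Int32,
    Utf8,
    /// Dictionary(key type, value type)
    Dictionary(Box<SketchType>, Box<SketchType>),
}

/// Returns the type a min/max aggregate would produce for `input`.
/// A dictionary input is unpacked to its value type; any other type
/// is returned unchanged.
fn unpacked_min_max_type(input: &SketchType) -> &SketchType {
    match input {
        SketchType::Dictionary(_, value_type) => value_type.as_ref(),
        other => other,
    }
}

/// Returns true if at least one position `i` satisfies both
/// `values[i] == *target` and `exact[i] == true`.
/// Assumes `values` and `exact` are parallel slices; extra elements
/// in the longer slice are ignored by `zip`.
fn any_exact_match<T: PartialEq>(target: &T, values: &[T], exact: &[bool]) -> bool {
    values
        .iter()
        .zip(exact.iter())
        .any(|(value, &is_exact)| is_exact && value == target)
}
```

In this sketch, a value that appears only at positions flagged as inexact does not count as a match, which mirrors the description of checking occurrences against the exactness array.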