compute_file_group_statistics

Function compute_file_group_statistics 

pub fn compute_file_group_statistics(
    file_group: FileGroup,
    file_schema: SchemaRef,
    collect_stats: bool,
) -> Result<FileGroup>

Computes summary statistics for a group of files (FileGroup-level statistics).

This function combines statistics from all files in the file group to create summary statistics. It handles the following aspects:

  • Merges row counts and byte sizes across files
  • Computes column-level statistics like min/max values
  • Maintains appropriate precision information (exact, inexact, absent)
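The merging behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the Precision and FileStats types here are hypothetical stand-ins defined only for this example, a single integer column stands in for per-column statistics, and the rule that an absent input yields an absent result is an assumption of the sketch.

```rust
/// Hypothetical stand-in for a statistic that carries precision information.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision {
    Exact(i64),
    Inexact(i64),
    Absent,
}

impl Precision {
    /// Combine two values with `f`. The result is exact only when both
    /// inputs are exact, and absent when either input is absent
    /// (an assumption made for this sketch).
    fn combine(self, other: Precision, f: impl Fn(i64, i64) -> i64) -> Precision {
        use Precision::*;
        match (self, other) {
            (Exact(a), Exact(b)) => Exact(f(a, b)),
            (Exact(a), Inexact(b)) | (Inexact(a), Exact(b)) | (Inexact(a), Inexact(b)) => {
                Inexact(f(a, b))
            }
            _ => Absent,
        }
    }
}

/// Hypothetical per-file statistics with one integer column.
#[derive(Debug, Clone, Copy, PartialEq)]
struct FileStats {
    num_rows: Precision,
    total_bytes: Precision,
    col_min: Precision,
    col_max: Precision,
}

/// Merge two sets of statistics: counts and sizes are summed,
/// min/max are taken across both inputs.
fn merge(a: FileStats, b: FileStats) -> FileStats {
    FileStats {
        num_rows: a.num_rows.combine(b.num_rows, |x, y| x + y),
        total_bytes: a.total_bytes.combine(b.total_bytes, |x, y| x + y),
        col_min: a.col_min.combine(b.col_min, i64::min),
        col_max: a.col_max.combine(b.col_max, i64::max),
    }
}

/// Fold all per-file statistics into one summary; `None` for an empty group.
fn summarize(files: &[FileStats]) -> Option<FileStats> {
    files.iter().copied().reduce(merge)
}
```

In this sketch, a single inexact or absent input downgrades the corresponding summary value, so the summary never claims more precision than its least precise input.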

§Parameters

  • file_group - The group of files to process
  • file_schema - Schema of the files
  • collect_stats - Whether to collect statistics (if false, returns original file group)

§Returns

A Result containing the file group with summary statistics attached, or the original file group unchanged when collect_stats is false.
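
The ownership and collect_stats contract described above might look like the following sketch. SketchGroup and attach_summary are hypothetical names invented for this example, with a plain row-count sum standing in for the full statistics computation; they are not part of the real API.

```rust
/// Hypothetical stand-in for a file group: per-file row counts plus
/// an optional group-level summary.
#[derive(Debug, Clone, PartialEq)]
struct SketchGroup {
    file_row_counts: Vec<u64>,
    summary_rows: Option<u64>,
}

/// Takes the group by value and returns it, mirroring the shape of the
/// documented signature. When `collect_stats` is false the group is
/// returned untouched.
fn attach_summary(mut group: SketchGroup, collect_stats: bool) -> Result<SketchGroup, String> {
    if !collect_stats {
        // Early return: no statistics are computed or attached.
        return Ok(group);
    }
    let total: u64 = group.file_row_counts.iter().sum();
    group.summary_rows = Some(total);
    Ok(group)
}
```

Taking the group by value lets the caller chain the call without cloning, and the early return keeps the disabled path free of any statistics work.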