async fn get_files_with_limit(
files: impl Stream<Item = Result<PartitionedFile>>,
limit: Option<usize>,
collect_stats: bool,
) -> Result<(FileGroup, bool)>
Processes a stream of partitioned files and returns a FileGroup containing the files.
This function collects files from the provided stream until either:
- The stream is exhausted
- The accumulated number of rows exceeds the provided limit (if specified)
§Arguments
- files - A stream of Result<PartitionedFile> items to process
- limit - An optional row count limit. If provided, the function will stop collecting files once the accumulated number of rows exceeds this limit
- collect_stats - Whether to collect and accumulate statistics from the files
§Returns
A Result containing a FileGroup with the collected files
and a boolean indicating whether the statistics are inexact.
§Note
The function will continue processing files if statistics are not available or if the
limit is not provided. If collect_stats is false, statistics won’t be accumulated
but files will still be collected.
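
The stopping rule described above can be illustrated with a simplified, synchronous sketch. This is not the actual implementation. It substitutes a plain iterator for the async stream, and FileSketch and collect_with_limit are hypothetical names introduced only for this example. It also assumes that the returned flag is set when the limit stops collection while unread files remain, and that rows are only accumulated when collect_stats is true. The description above does not spell out either detail.

```rust
// Hypothetical stand-in for PartitionedFile: a path plus an optional row count.
// `None` models a file whose statistics are not available.
#[derive(Debug, Clone, PartialEq)]
struct FileSketch {
    path: String,
    num_rows: Option<usize>,
}

/// Simplified, synchronous analogue of the documented behavior.
/// Returns the collected files and a flag that is true when the limit
/// stopped collection before the input was exhausted (an assumption about
/// when the statistics count as inexact).
fn collect_with_limit<I>(
    files: I,
    limit: Option<usize>,
    collect_stats: bool,
) -> Result<(Vec<FileSketch>, bool), String>
where
    I: IntoIterator<Item = Result<FileSketch, String>>,
{
    let mut iter = files.into_iter();
    let mut group = Vec::new();
    let mut rows_seen: usize = 0;
    let mut limit_hit = false;

    while let Some(item) = iter.next() {
        // Propagate the first error found in the input.
        let file = item?;

        // Rows are only accumulated when statistics are requested and present.
        if collect_stats {
            if let Some(n) = file.num_rows {
                rows_seen = rows_seen.saturating_add(n);
            }
        }
        group.push(file);

        // Stop once the accumulated row count exceeds the limit, if one is set.
        if let Some(max_rows) = limit {
            if rows_seen > max_rows {
                limit_hit = true;
                break;
            }
        }
    }

    // Files left unread after hitting the limit mean the totals are partial.
    let inexact = limit_hit && iter.next().is_some();
    Ok((group, inexact))
}
```

In this sketch, the file that pushes the total past the limit is still included in the result, because the check runs after the file is added. With no limit, with collect_stats set to false, or with files that carry no row count, every file is collected and the flag stays false.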