Module physical_plan

Execution plans that read file formats

Modules§

arrow
Reexports the datafusion_datasource_arrow::source module, containing Arrow based FileSource.
avro
Reexports the datafusion_datasource_avro::source module, containing Avro based FileSource.
csv
Reexports the datafusion_datasource_csv::source module, containing CSV based FileSource.
json
Reexports the datafusion_datasource_json::source module, containing JSON based FileSource.
parquet
Reexports the datafusion_datasource_parquet crate, containing Parquet based FileSource.

Structs§

ArrowOpener
The struct that implements the FileOpener trait for Arrow files
ArrowSource
Arrow configuration struct that is given to DataSourceExec. Does not hold anything special, since FileScanConfig is sufficient for Arrow.
AvroSource
AvroSource holds the extra configuration that is necessary for opening avro files
CsvOpener
A FileOpener that opens a CSV file and yields a FileOpenFuture
CsvSource
A Config for CsvOpener
FileGroup
Represents a group of partitioned files that’ll be processed by a single thread. Maintains optional statistics across all files in the group.
FileGroupPartitioner
Repartition input files into target_partitions partitions, if the total file size exceeds repartition_file_min_size
FileScanConfig
The base configurations for a DataSourceExec, a physical plan for any given file format.
FileScanConfigBuilder
A builder for FileScanConfig.
FileSinkConfig
The base configurations to provide when creating a physical plan for writing to any given file format.
FileStream
A stream that iterates record batch by record batch, file over file.
JsonOpener
A FileOpener that opens a JSON file and yields a FileOpenFuture
JsonSource
JsonSource holds the extra configuration that is necessary for JsonOpener
ParquetFileMetrics
Stores metrics about the parquet execution for a particular parquet file.
ParquetSource
Execution plan for reading one or more Parquet files.
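
To illustrate the FileGroupPartitioner description above, here is a minimal sketch of size-based repartitioning. It is not DataFusion's implementation: `FileRange` and `repartition_by_size` are hypothetical names, files are modeled as plain (path, size) pairs, and the sketch assumes that files may be split at arbitrary byte offsets and that "exceeds" means strictly greater than the minimum size.

```rust
/// A byte range of one file assigned to a partition (hypothetical type,
/// not part of DataFusion).
#[derive(Debug, Clone, PartialEq)]
struct FileRange {
    path: String,
    start: u64,
    end: u64,
}

/// Split `files` (path, size in bytes) into at most `target_partitions`
/// groups of roughly equal total bytes. Returns `None` when the total
/// size does not exceed `min_size`, meaning "leave the input as it is".
fn repartition_by_size(
    files: &[(String, u64)],
    target_partitions: usize,
    min_size: u64,
) -> Option<Vec<Vec<FileRange>>> {
    let total: u64 = files.iter().map(|(_, size)| *size).sum();
    if target_partitions == 0 || total == 0 || total <= min_size {
        return None;
    }
    // Bytes per partition, rounded up so all bytes fit in the target count.
    let per_partition = (total + target_partitions as u64 - 1) / target_partitions as u64;

    let mut groups: Vec<Vec<FileRange>> = vec![Vec::new()];
    let mut used = 0u64; // bytes already placed in the current group
    for (path, size) in files {
        let mut offset = 0u64;
        while offset < *size {
            if used == per_partition {
                // Current group is full: start a new one.
                groups.push(Vec::new());
                used = 0;
            }
            let take = (per_partition - used).min(*size - offset);
            groups.last_mut().unwrap().push(FileRange {
                path: path.clone(),
                start: offset,
                end: offset + take,
            });
            offset += take;
            used += take;
        }
    }
    Some(groups)
}

fn main() {
    let files = vec![("a.parquet".to_string(), 100), ("b.parquet".to_string(), 50)];
    // 150 bytes over 3 partitions: 50 bytes per partition.
    let groups = repartition_by_size(&files, 3, 10).unwrap();
    for (i, group) in groups.iter().enumerate() {
        println!("partition {}: {:?}", i, group);
    }
}
```

In this sketch a large file can span several groups and a group can hold ranges from several files, which is why each entry carries an explicit byte range rather than a whole file.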

Enums§

OnError
Describes the behavior of the FileStream if file opening or scanning fails
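
As a rough illustration of what an error policy like OnError controls, the sketch below models a scan over several files where each file either yields rows or fails. The variant names `Fail` and `Skip`, the `scan_all` helper, and the use of `Result<Vec<i32>, String>` to stand in for a file are all assumptions made for this example; consult the OnError item documentation for the actual variants.

```rust
/// Hypothetical stand-in for an error policy; the variant names are assumed.
#[derive(Clone, Copy)]
enum OnError {
    /// Stop and return the first error encountered.
    Fail,
    /// Ignore the failing file and continue with the next one.
    Skip,
}

/// Concatenate the rows of every file, applying `on_error` to failures.
/// Each "file" is modeled as a `Result` holding its rows or an error message.
fn scan_all(files: &[Result<Vec<i32>, String>], on_error: OnError) -> Result<Vec<i32>, String> {
    let mut out = Vec::new();
    for file in files {
        match (file, on_error) {
            (Ok(rows), _) => out.extend(rows.iter().copied()),
            (Err(e), OnError::Fail) => return Err(e.clone()),
            (Err(_), OnError::Skip) => continue,
        }
    }
    Ok(out)
}

fn main() {
    let files = vec![Ok(vec![1, 2]), Err("corrupt file".to_string()), Ok(vec![3])];
    println!("{:?}", scan_all(&files, OnError::Skip));
    println!("{:?}", scan_all(&files, OnError::Fail));
}
```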

Traits§

FileOpener
Generic API for opening a file using an ObjectStore and resolving to a stream of RecordBatch
FileSink
General behaviors for files that do DataSink operations
FileSource
File format specific behaviors for elements in DataSource
ParquetFileReaderFactory
Interface for reading parquet files.

Functions§

wrap_partition_type_in_dict
Convert a type to one suitable for use as a ListingTable partition column. Returns Dictionary(UInt16, val_type), a reasonable trade-off between the number of distinct partition values supported and space efficiency.
wrap_partition_value_in_dict
Convert a ScalarValue of a partition column to the dictionary-wrapped type described in the documentation of wrap_partition_type_in_dict.
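
The trade-off behind Dictionary(UInt16, val_type) can be sketched without Arrow: each distinct partition value is stored once, and every row holds only a 2-byte key, which caps the number of distinct values at 65,536. The `DictColumn` type and `dict_encode` function below are hypothetical and exist only to illustrate the idea; they are not the DataFusion or Arrow API.

```rust
use std::collections::HashMap;
use std::convert::TryFrom;

/// Hypothetical dictionary-encoded string column with 16-bit keys.
struct DictColumn {
    /// Each distinct value, stored once.
    values: Vec<String>,
    /// One key per row, indexing into `values`.
    keys: Vec<u16>,
}

/// Dictionary-encode `rows`. Returns `None` if there are more distinct
/// values than a `u16` key can address (more than 65,536).
fn dict_encode(rows: &[&str]) -> Option<DictColumn> {
    let mut index: HashMap<&str, u16> = HashMap::new();
    let mut values: Vec<String> = Vec::new();
    let mut keys: Vec<u16> = Vec::with_capacity(rows.len());
    for &row in rows {
        let key = match index.get(row) {
            Some(&k) => k,
            None => {
                // Fails once the next index no longer fits in a u16.
                let k = u16::try_from(values.len()).ok()?;
                index.insert(row, k);
                values.push(row.to_string());
                k
            }
        };
        keys.push(key);
    }
    Some(DictColumn { values, keys })
}

fn main() {
    // A partition column such as `year=2024/`, repeated for every row of a file.
    let col = dict_encode(&["2024", "2025", "2024", "2024"]).unwrap();
    println!("values = {:?}, keys = {:?}", col.values, col.keys);
}
```

Because a partition value is constant across every row of a file, the repeated value costs two bytes per row in this layout instead of a full copy of the string.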

Type Aliases§

FileOpenFuture
A fallible future that resolves to a stream of RecordBatch