Struct ListingTableConfig 

pub struct ListingTableConfig {
    pub table_paths: Vec<ListingTableUrl>,
    pub file_schema: Option<Arc<Schema>>,
    pub options: Option<ListingOptions>,
    pub(crate) schema_source: SchemaSource,
    pub(crate) schema_adapter_factory: Option<Arc<dyn SchemaAdapterFactory>>,
    pub(crate) expr_adapter_factory: Option<Arc<dyn PhysicalExprAdapterFactory>>,
}

Configuration for creating a crate::ListingTable

§Schema Evolution Support

This configuration supports schema evolution through an optional SchemaAdapterFactory. Consider overriding the default factory when you need:

  • Custom type coercion: custom logic for converting between different Arrow data types (e.g., Int32 ↔ Int64, Utf8 ↔ LargeUtf8)
  • Column mapping: mapping columns with a legacy name to a new name
  • Custom handling of missing columns: by default they are filled in with nulls, but you may want to fill them in with another value such as 0 or "".

If not specified, a datafusion_datasource::schema_adapter::DefaultSchemaAdapterFactory will be used, which handles basic schema compatibility cases.
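
The three cases above can be pictured with a small, standard-library-only model. This is a sketch of the idea, not the SchemaAdapterFactory API: `Value`, `AdapterRules`, and `adapt_row` are hypothetical names invented for illustration, and a real adapter works on Arrow record batches rather than on row maps.

```rust
use std::collections::HashMap;

/// Toy stand-in for a column value (not an Arrow type).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
    Null,
}

/// Hypothetical adapter rules: legacy-name renames plus per-column fill values.
struct AdapterRules {
    /// Legacy column name in the file -> column name in the table.
    renames: HashMap<String, String>,
    /// Table column name -> value used when the file lacks that column.
    defaults: HashMap<String, Value>,
}

/// Projects one file row onto the table's column order.
fn adapt_row(
    table_columns: &[&str],
    file_row: &HashMap<String, Value>,
    rules: &AdapterRules,
) -> Vec<Value> {
    // Column mapping: index the file's values by their table-side names.
    let mut by_table_name: HashMap<&str, &Value> = HashMap::new();
    for (name, value) in file_row {
        let target = rules
            .renames
            .get(name)
            .map(String::as_str)
            .unwrap_or(name.as_str());
        by_table_name.insert(target, value);
    }

    // Missing columns: use the configured fill value, otherwise null.
    table_columns
        .iter()
        .map(|col| {
            by_table_name
                .get(*col)
                .copied()
                .cloned()
                .or_else(|| rules.defaults.get(*col).cloned())
                .unwrap_or(Value::Null)
        })
        .collect()
}
```

With a rename from user_id to id and a fill value of "" for name, a file row that only contains user_id = 7 is adapted to the table columns (id, name, age) as 7, "", and null.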

Fields§

§table_paths: Vec<ListingTableUrl>

Paths on the ObjectStore from which the crate::ListingTable is created. All paths should share the same schema and the same object store.

§file_schema: Option<Arc<Schema>>

Optional SchemaRef for the crate::ListingTable to be created.

See ListingTableConfig::with_schema for details.

§options: Option<ListingOptions>

Optional ListingOptions for the crate::ListingTable to be created.

See ListingTableConfig::with_listing_options for details.

§schema_source: SchemaSource

§schema_adapter_factory: Option<Arc<dyn SchemaAdapterFactory>>

§expr_adapter_factory: Option<Arc<dyn PhysicalExprAdapterFactory>>

Implementations§

impl ListingTableConfig

pub fn new(table_path: ListingTableUrl) -> ListingTableConfig

Creates a new ListingTableConfig for reading the specified URL.

pub fn new_with_multi_paths(table_paths: Vec<ListingTableUrl>) -> ListingTableConfig

Creates a new ListingTableConfig with multiple table paths.

See ListingTableConfigExt::infer_options for details on what happens with multiple paths.

pub fn schema_source(&self) -> SchemaSource

Returns the source of the schema for this configuration.

pub fn with_schema(self, schema: Arc<Schema>) -> ListingTableConfig

Set the schema for the overall crate::ListingTable.

crate::ListingTable will automatically coerce, when possible, the schema for individual files to match this schema.

If a schema is not provided, it is inferred using Self::infer_schema.

If a schema is provided, it must contain only the fields present in the files, excluding the table partitioning columns.

§Example: Specifying Table Schema
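use std::sync::Arc;
use datafusion::arrow::datatypes::{DataType, Field, Schema};
use datafusion::datasource::listing::ListingTableConfig;

let schema = Arc::new(Schema::new(vec![
    Field::new("id", DataType::Int64, false),
    Field::new("name", DataType::Utf8, true),
]));

// `table_path` (a ListingTableUrl) and `listing_options` (a ListingOptions)
// are assumed to have been created earlier.
let config = ListingTableConfig::new(table_path)
    .with_listing_options(listing_options)  // Set options first
    .with_schema(schema);                    // Then set schema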
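
The rule that the provided schema excludes partitioning columns can be illustrated with a standard-library-only sketch. It assumes the full table schema is the file schema followed by the partition columns; `Column` and `table_schema` are hypothetical names used only for this illustration and are not part of DataFusion.

```rust
/// Toy column description (illustrative; not an Arrow Field).
#[derive(Debug, Clone, PartialEq)]
struct Column {
    name: &'static str,
    data_type: &'static str,
}

/// Builds the full table schema from the file schema passed to `with_schema`
/// and the partition columns, which are configured separately.
/// Assumption: partition columns are appended after the file columns.
fn table_schema(file_schema: &[Column], partition_columns: &[Column]) -> Vec<Column> {
    let mut columns = file_schema.to_vec();
    columns.extend_from_slice(partition_columns);
    columns
}
```

For files containing id and name under a directory layout partitioned by year, the schema passed to with_schema lists only id and name; year appears in the resulting table schema without being part of the file schema.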

pub fn with_listing_options(self, listing_options: ListingOptions) -> ListingTableConfig

Set the ListingOptions for this ListingTableConfig.

If not provided, format and other options are inferred via ListingTableConfigExt::infer_options.

§Example: Configuring Parquet Files with Custom Options
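use std::sync::Arc;
use datafusion::datasource::file_format::parquet::ParquetFormat;
use datafusion::datasource::listing::{ListingOptions, ListingTableConfig};

// Configure file format and options
let options = ListingOptions::new(Arc::new(ParquetFormat::default()))
    .with_file_extension(".parquet")
    .with_collect_stat(true);

// `table_path` is assumed to be a previously created ListingTableUrl.
let config = ListingTableConfig::new(table_path).with_listing_options(options);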

pub fn infer_file_extension_and_compression_type(path: &str) -> Result<(String, Option<String>), DataFusionError>

Returns a tuple of (file_extension, optional compression_extension).

For example, a path ending with blah.test.csv.gz returns ("csv", Some("gz")), and a path ending with blah.test.csv returns ("csv", None).
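
The splitting rule described above can be reproduced with a standalone, standard-library-only sketch. It is not DataFusion's implementation: `split_extension_and_compression` is a hypothetical helper, and the list of compression suffixes is an assumption made for this example.

```rust
/// Compression suffixes recognized by this sketch. This list is an
/// assumption; the set accepted by the real method is defined by DataFusion.
const COMPRESSION_SUFFIXES: [&str; 4] = ["gz", "bz2", "xz", "zst"];

/// Splits a path into (file_extension, optional compression_extension)
/// by inspecting the last one or two dot-separated components.
fn split_extension_and_compression(path: &str) -> (String, Option<String>) {
    // Walk the dot-separated components from the end of the path.
    let mut parts = path.rsplit('.');
    let last = parts.next().unwrap_or("");

    let is_compression = COMPRESSION_SUFFIXES
        .iter()
        .any(|suffix| suffix.eq_ignore_ascii_case(last));

    if is_compression {
        // The component before the compression suffix is the file extension.
        let extension = parts.next().unwrap_or("");
        (extension.to_string(), Some(last.to_string()))
    } else {
        (last.to_string(), None)
    }
}
```

Unlike the real method, this helper returns the tuple directly instead of wrapping it in a Result.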

pub async fn infer_schema(self, state: &dyn Session) -> Result<ListingTableConfig, DataFusionError>

Infer the SchemaRef based on table_paths.

This method infers the table schema using the first table_path. See ListingOptions::infer_schema for more details.

§Errors

pub async fn infer_partitions_from_path(self, state: &dyn Session) -> Result<ListingTableConfig, DataFusionError>

Infer the partition columns from table_paths.

§Errors

pub fn with_schema_adapter_factory(self, schema_adapter_factory: Arc<dyn SchemaAdapterFactory>) -> ListingTableConfig

Set the SchemaAdapterFactory for the crate::ListingTable.

The schema adapter factory creates schema adapters, which handle schema evolution and type conversions when reading files whose schemas differ from the table schema.

If not provided, a default schema adapter factory will be used.

§Example: Custom Schema Adapter for Type Coercion
// `MySchemaAdapterFactory` stands for a user-defined type that implements SchemaAdapterFactory.
let config = ListingTableConfig::new(table_path)
    .with_listing_options(listing_options)
    .with_schema(table_schema)
    .with_schema_adapter_factory(Arc::new(MySchemaAdapterFactory));

pub fn schema_adapter_factory(&self) -> Option<&Arc<dyn SchemaAdapterFactory>>

Returns the SchemaAdapterFactory for this configuration, if one has been set.

pub fn with_expr_adapter_factory(self, expr_adapter_factory: Arc<dyn PhysicalExprAdapterFactory>) -> ListingTableConfig

Set the PhysicalExprAdapterFactory for the crate::ListingTable.

The expression adapter factory creates physical expression adapters, which handle schema evolution and type conversions when evaluating expressions against files whose schemas differ from the table schema.

If not provided, a default physical expression adapter factory is used, unless a custom SchemaAdapterFactory is set; in that case, only the SchemaAdapterFactory is used.

See https://github.com/apache/datafusion/issues/16800 for details on this transition.
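
The precedence described above can be written as a small decision function. This is only a reading of the documentation, not DataFusion code: `AdapterChoice` and `choose_adapter` are hypothetical names, and the branch for an explicitly set expression adapter is an assumption, since the text only describes what happens when none is provided.

```rust
/// Which adapter ends up being used (illustrative names only).
#[derive(Debug, PartialEq)]
enum AdapterChoice {
    CustomExprAdapter,
    SchemaAdapterOnly,
    DefaultExprAdapter,
}

/// Models the documented precedence between the two factories.
fn choose_adapter(expr_adapter_set: bool, schema_adapter_set: bool) -> AdapterChoice {
    match (expr_adapter_set, schema_adapter_set) {
        // Assumption: an explicitly provided expression adapter is used as given.
        (true, _) => AdapterChoice::CustomExprAdapter,
        // Documented: no expression adapter but a custom schema adapter
        // means only the schema adapter is used.
        (false, true) => AdapterChoice::SchemaAdapterOnly,
        // Documented: neither is set, so the default expression adapter applies.
        (false, false) => AdapterChoice::DefaultExprAdapter,
    }
}
```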

Trait Implementations§

impl Clone for ListingTableConfig

fn clone(&self) -> ListingTableConfig

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for ListingTableConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter.

impl Default for ListingTableConfig

fn default() -> ListingTableConfig

Returns the “default value” for a type.

impl ListingTableConfigExt for ListingTableConfig

fn infer_options<'life0, 'async_trait>(self, state: &'life0 dyn Session) -> Pin<Box<dyn Future<Output = Result<ListingTableConfig>> + Send + 'async_trait>>
where Self: 'async_trait, 'life0: 'async_trait,

Infer ListingOptions based on table_path and file suffix.

fn infer<'life0, 'async_trait>(self, state: &'life0 dyn Session) -> Pin<Box<dyn Future<Output = Result<Self>> + Send + 'async_trait>>
where Self: 'async_trait, 'life0: 'async_trait,

Convenience method to call both Self::infer_options and ListingTableConfig::infer_schema.
