pub struct CsvReadOptions<'a> {
pub has_header: bool,
pub delimiter: u8,
pub quote: u8,
pub terminator: Option<u8>,
pub escape: Option<u8>,
pub comment: Option<u8>,
pub newlines_in_values: bool,
pub schema: Option<&'a Schema>,
pub schema_infer_max_records: usize,
pub file_extension: &'a str,
pub table_partition_cols: Vec<(String, DataType)>,
pub file_compression_type: FileCompressionType,
pub file_sort_order: Vec<Vec<SortExpr>>,
pub null_regex: Option<String>,
pub truncated_rows: bool,
}
Options that control the reading of CSV files.
Note that this structure is supplied when a datasource is created and
cannot vary from statement to statement. For settings that
can vary from statement to statement, see
ConfigOptions.
Fields
has_header: bool
Does the CSV file have a header?
If schema inference is run on a file with no headers, default column names are created.
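As a rough illustration of the default-name behaviour described above, the following stdlib-only sketch generates placeholder names for a headerless file. The `column_N` (1-based) pattern and both helper functions are assumptions made for illustration; they are not taken from this page, so check the names your version actually produces.

```rust
/// Builds placeholder column names for a headerless file.
/// Assumption: the 1-based `column_N` pattern is illustrative and should be
/// verified against the library version in use.
fn default_column_names(num_columns: usize) -> Vec<String> {
    (1..=num_columns).map(|i| format!("column_{}", i)).collect()
}

/// Hypothetical helper: takes names from the first record when `has_header`
/// is true, otherwise falls back to generated placeholder names.
fn column_names(first_record: &[&str], has_header: bool) -> Vec<String> {
    if has_header {
        first_record.iter().map(|s| s.to_string()).collect()
    } else {
        default_column_names(first_record.len())
    }
}
```

With `has_header` set to false, the first record is treated as data and only its field count is used to size the generated names.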
delimiter: u8
An optional column delimiter. Defaults to b','.
quote: u8
An optional quote character. Defaults to b'"'.
terminator: Option<u8>
An optional terminator character. Defaults to None (CRLF).
escape: Option<u8>
An optional escape character. Defaults to None.
comment: Option<u8>
If enabled, lines beginning with this byte are ignored.
newlines_in_values: bool
Specifies whether newlines in (quoted) values are supported.
Parsing newlines in quoted values may be affected by execution behaviour such as
parallel file scanning. Setting this to true ensures that newlines in values are
parsed successfully, which may reduce performance.
The default behaviour depends on the datafusion.catalog.newlines_in_values setting.
schema: Option<&'a Schema>
An optional schema representing the CSV files. If None, the CSV reader will try to infer it from the data in the file.
schema_infer_max_records: usize
Maximum number of rows to read from CSV files for schema inference, if needed. Defaults to DEFAULT_SCHEMA_INFER_MAX_RECORD.
file_extension: &'a str
File extension; only files with this extension are selected for data input.
Defaults to FileType::CSV.get_ext().as_str().
table_partition_cols: Vec<(String, DataType)>
Partition columns.
file_compression_type: FileCompressionType
File compression type.
file_sort_order: Vec<Vec<SortExpr>>
Indicates how the file is sorted.
null_regex: Option<String>
Optional regex to match null values.
truncated_rows: bool
Whether to allow truncated rows when parsing. Defaults to false, in which case parsing returns an error if the CSV rows have different lengths. When set to true, records with fewer than the expected number of columns are accepted and the missing columns are filled with nulls. If the schema does not allow nulls for those columns, an error is still returned.
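The truncated_rows rule can be modelled with a small stdlib-only sketch. This is not the reader's actual implementation: `pad_truncated_row`, its signature, and the error messages are hypothetical, and the per-column `nullable` slice stands in for the schema.

```rust
/// Pads a record that has fewer fields than the schema with nulls (`None`).
/// Hypothetical model of the `truncated_rows` behaviour, not library code.
/// `nullable[i]` says whether column `i` of the schema accepts nulls.
fn pad_truncated_row(
    fields: Vec<String>,
    nullable: &[bool],
    truncated_rows: bool,
) -> Result<Vec<Option<String>>, String> {
    let expected = nullable.len();
    let found = fields.len();

    // Records longer than the schema are always rejected in this sketch.
    if found > expected {
        return Err(format!("expected {} fields, found {}", expected, found));
    }
    // Short records are only tolerated when truncated rows are allowed.
    if found < expected && !truncated_rows {
        return Err(format!("expected {} fields, found {}", expected, found));
    }
    // Every missing column must be nullable, otherwise it is still an error.
    if let Some(pos) = (found..expected).find(|&i| !nullable[i]) {
        return Err(format!("missing column {} is not nullable", pos));
    }

    // Wrap the present values and fill the remaining columns with nulls.
    let mut row: Vec<Option<String>> = fields.into_iter().map(Some).collect();
    row.resize(expected, None);
    Ok(row)
}
```

In this model a one-field record against a three-column schema becomes one value followed by two nulls when truncated_rows is true, and is rejected when it is false.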
Implementations
impl<'a> CsvReadOptions<'a>
pub fn has_header(self, has_header: bool) -> Self
Configure has_header setting
pub fn terminator(self, terminator: Option<u8>) -> Self
Specify terminator to use for CSV read
pub fn newlines_in_values(self, newlines_in_values: bool) -> Self
Specifies whether newlines in (quoted) values are supported.
Parsing newlines in quoted values may be affected by execution behaviour such as
parallel file scanning. Setting this to true ensures that newlines in values are
parsed successfully, which may reduce performance.
The default behaviour depends on the datafusion.catalog.newlines_in_values setting.
pub fn file_extension(self, file_extension: &'a str) -> Self
Specify the file extension for CSV file selection
pub fn delimiter_option(self, delimiter: Option<u8>) -> Self
Configure the delimiter setting from an Option; a None value is ignored.
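The methods in this impl block take self by value and return Self, so they chain in builder style. The sketch below shows that pattern, including a None passed to delimiter_option leaving the current delimiter untouched. `MiniCsvOptions` is a hypothetical, cut-down stand-in rather than the real type, and its default values are assumptions made for the example.

```rust
/// Hypothetical two-field stand-in for `CsvReadOptions`; not the real type.
#[derive(Debug, Clone, PartialEq)]
struct MiniCsvOptions {
    has_header: bool,
    delimiter: u8,
}

impl Default for MiniCsvOptions {
    fn default() -> Self {
        // Assumed defaults for this sketch: header present, comma delimiter.
        Self { has_header: true, delimiter: b',' }
    }
}

impl MiniCsvOptions {
    /// Consumes `self`, updates one setting, and returns the new value.
    fn has_header(mut self, has_header: bool) -> Self {
        self.has_header = has_header;
        self
    }

    /// Sets the delimiter only when a value is supplied; `None` is ignored.
    fn delimiter_option(mut self, delimiter: Option<u8>) -> Self {
        if let Some(d) = delimiter {
            self.delimiter = d;
        }
        self
    }
}
```

Taking an Option lets a caller forward an optional user setting without branching on whether it was provided.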
pub fn table_partition_cols(self, table_partition_cols: Vec<(String, DataType)>) -> Self
Specify table_partition_cols for partition pruning
pub fn schema_infer_max_records(self, max_records: usize) -> Self
Configure the maximum number of records to read for schema inference
pub fn file_compression_type(self, file_compression_type: FileCompressionType) -> Self
Configure file compression type
pub fn file_sort_order(self, file_sort_order: Vec<Vec<SortExpr>>) -> Self
Configure if file has known sort order
pub fn null_regex(self, null_regex: Option<String>) -> Self
Configure the null parsing regex.
pub fn truncated_rows(self, truncated_rows: bool) -> Self
Configure whether to allow truncated rows when parsing. Defaults to false, in which case parsing returns an error if the CSV rows have different lengths. When set to true, records with fewer than the expected number of columns are accepted and the missing columns are filled with nulls. If the schema does not allow nulls for those columns, an error is still returned.
Trait Implementations
impl<'a> Clone for CsvReadOptions<'a>
fn clone(&self) -> CsvReadOptions<'a>
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Default for CsvReadOptions<'_>
impl ReadOptions<'_> for CsvReadOptions<'_>
fn to_listing_options(&self, config: &SessionConfig, table_options: TableOptions) -> ListingOptions
Helper to convert these user facing options to ListingTable options.
fn get_resolved_schema<'life0, 'life1, 'async_trait>(
&'life0 self,
config: &'life1 SessionConfig,
state: SessionState,
table_path: ListingTableUrl,
) -> Pin<Box<dyn Future<Output = Result<SchemaRef>> + Send + 'async_trait>>
where
Self: 'async_trait,
'life0: 'async_trait,
'life1: 'async_trait,
fn _get_resolved_schema<'life0, 'async_trait>(
&'a self,
config: &'life0 SessionConfig,
state: SessionState,
table_path: ListingTableUrl,
schema: Option<&'a Schema>,
) -> Pin<Box<dyn Future<Output = Result<SchemaRef>> + Send + 'async_trait>>
where
Self: Sync + 'async_trait,
'a: 'async_trait,
'life0: 'async_trait,
Auto Trait Implementations
impl<'a> Freeze for CsvReadOptions<'a>
impl<'a> !RefUnwindSafe for CsvReadOptions<'a>
impl<'a> Send for CsvReadOptions<'a>
impl<'a> Sync for CsvReadOptions<'a>
impl<'a> Unpin for CsvReadOptions<'a>
impl<'a> !UnwindSafe for CsvReadOptions<'a>
Blanket Implementations
impl<T> BorrowMut<T> for T
where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.