pub struct RowGroupAccessPlanFilter {
access_plan: ParquetAccessPlan,
}
Reduces the ParquetAccessPlan based on row group level metadata.
This struct applies the various kinds of pruning available for the row groups of a Parquet file. Each pruning step progressively narrows the set of row groups (and the ranges/selections within those row groups) that should be scanned, based on the available metadata.
Fields
access_plan: ParquetAccessPlan
Implementations
impl RowGroupAccessPlanFilter
pub fn new(access_plan: ParquetAccessPlan) -> RowGroupAccessPlanFilter
Create a new RowGroupAccessPlanFilter for pruning out the row groups to scan,
based on metadata and statistics.
pub fn remaining_row_group_count(&self) -> usize
Return the number of row groups that are currently expected to be scanned.
pub fn build(self) -> ParquetAccessPlan
Consumes the filter and returns the inner access plan.
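The lifecycle described above (wrap a plan, prune it, check what remains, take the plan back) can be sketched with standard-library types only. This is a minimal model, not the DataFusion implementation: `AccessPlan`, `PlanFilter`, and `demo_lifecycle` are hypothetical stand-ins, and the direct call to `skip` stands in for one of the pruning methods.

```rust
/// Hypothetical stand-in for ParquetAccessPlan: one "scan" flag per row group.
#[derive(Debug, Clone, PartialEq)]
struct AccessPlan {
    scan: Vec<bool>,
}

impl AccessPlan {
    /// Start with every row group selected for scanning.
    fn new_all(row_group_count: usize) -> Self {
        Self { scan: vec![true; row_group_count] }
    }

    /// Mark one row group as not to be scanned.
    fn skip(&mut self, idx: usize) {
        self.scan[idx] = false;
    }
}

/// Hypothetical stand-in for RowGroupAccessPlanFilter.
#[derive(Debug, Clone, PartialEq)]
struct PlanFilter {
    access_plan: AccessPlan,
}

impl PlanFilter {
    fn new(access_plan: AccessPlan) -> Self {
        Self { access_plan }
    }

    /// Number of row groups still selected for scanning.
    fn remaining_row_group_count(&self) -> usize {
        self.access_plan.scan.iter().filter(|scan| **scan).count()
    }

    /// Consume the filter and hand back the (possibly narrowed) plan.
    fn build(self) -> AccessPlan {
        self.access_plan
    }
}

/// Returns the counts before and after one pruning step, plus the final plan.
fn demo_lifecycle() -> (usize, usize, AccessPlan) {
    let mut filter = PlanFilter::new(AccessPlan::new_all(4));
    let before = filter.remaining_row_group_count();
    filter.access_plan.skip(2); // stand-in for a real pruning step
    let after = filter.remaining_row_group_count();
    (before, after, filter.build())
}

fn main() {
    let (before, after, plan) = demo_lifecycle();
    println!("{} -> {} row groups, plan: {:?}", before, after, plan);
}
```

Because `build` takes `self` by value, the filter cannot be used again after the plan has been extracted.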
pub fn prune_by_range(&mut self, groups: &[RowGroupMetaData], range: &FileRange)
Prune remaining row groups to only those within the specified range.
Updates this set to mark row groups that should not be scanned.
Panics
if groups.len() != self.len()
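As an illustration of range-based pruning, the sketch below keeps a row group only if its starting byte offset falls inside the requested range. It is a simplified model under stated assumptions: `RowGroupMeta` and its `start_offset` field are hypothetical, and the choice of "starting offset" as the membership test is an assumption about how a row group is assigned to a range.

```rust
use std::ops::Range;

/// Hypothetical, simplified row group metadata: only the byte offset
/// at which the row group's first page starts.
struct RowGroupMeta {
    start_offset: u64,
}

/// Marks every still-selected row group whose start offset lies outside
/// `range` as skipped. Panics if the two slices differ in length,
/// mirroring the documented panic condition.
fn prune_by_range(scan: &mut [bool], groups: &[RowGroupMeta], range: &Range<u64>) {
    assert_eq!(groups.len(), scan.len());
    for (flag, group) in scan.iter_mut().zip(groups) {
        if *flag && !range.contains(&group.start_offset) {
            *flag = false;
        }
    }
}

/// Three row groups, of which only the middle one starts inside 500..2000.
fn demo_range() -> Vec<bool> {
    let groups = [
        RowGroupMeta { start_offset: 4 },
        RowGroupMeta { start_offset: 1000 },
        RowGroupMeta { start_offset: 2000 },
    ];
    let mut scan = vec![true; groups.len()];
    prune_by_range(&mut scan, &groups, &(500..2000));
    scan
}

fn main() {
    println!("{:?}", demo_range());
}
```

Using a half-open range means that adjacent ranges never both claim the same row group, which is why the group starting at offset 2000 is skipped here.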
pub fn prune_by_statistics(
&mut self,
arrow_schema: &Schema,
parquet_schema: &SchemaDescriptor,
groups: &[RowGroupMetaData],
predicate: &PruningPredicate,
metrics: &ParquetFileMetrics,
)
Prune remaining row groups using min/max/null_count statistics and
the PruningPredicate to determine whether the predicate cannot be true.
Updates this set to mark row groups that should not be scanned.
Note: This method currently ignores ColumnOrder (see https://github.com/apache/datafusion/issues/8335).
Panics
if groups.len() != self.len()
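The idea behind statistics-based pruning can be sketched for a single equality predicate on one integer column: a row group is skipped only when its min/max bounds prove that no row can match. This is a minimal model, not the PruningPredicate machinery; `ColumnStats`, `may_match_eq`, and the returned pruned count are hypothetical.

```rust
/// Hypothetical min/max statistics for one integer column in one row group.
#[derive(Clone, Copy)]
struct ColumnStats {
    min: i64,
    max: i64,
}

/// Returns false only when the statistics prove that `column = value`
/// cannot be true for any row. Missing statistics prove nothing,
/// so the row group must be kept.
fn may_match_eq(stats: Option<ColumnStats>, value: i64) -> bool {
    match stats {
        Some(s) => s.min <= value && value <= s.max,
        None => true,
    }
}

/// Skips still-selected row groups that cannot contain `value` and
/// returns how many were pruned (a stand-in for recording metrics).
fn prune_by_statistics(scan: &mut [bool], stats: &[Option<ColumnStats>], value: i64) -> usize {
    assert_eq!(stats.len(), scan.len());
    let mut pruned = 0;
    for (flag, s) in scan.iter_mut().zip(stats) {
        if *flag && !may_match_eq(*s, value) {
            *flag = false;
            pruned += 1;
        }
    }
    pruned
}

/// Looks for the value 25 in three row groups: [0, 10], [20, 30], and no statistics.
fn demo_statistics() -> (Vec<bool>, usize) {
    let stats = [
        Some(ColumnStats { min: 0, max: 10 }),
        Some(ColumnStats { min: 20, max: 30 }),
        None,
    ];
    let mut scan = vec![true; stats.len()];
    let pruned = prune_by_statistics(&mut scan, &stats, 25);
    (scan, pruned)
}

fn main() {
    let (scan, pruned) = demo_statistics();
    println!("scan = {:?}, pruned = {}", scan, pruned);
}
```

Pruning of this kind is conservative: a kept row group may still contain no matching rows, but a skipped one provably contains none.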
pub async fn prune_by_bloom_filters<T>(
&mut self,
arrow_schema: &Schema,
builder: &mut ArrowReaderBuilder<AsyncReader<T>>,
predicate: &PruningPredicate,
metrics: &ParquetFileMetrics,
)
where
T: AsyncFileReader + Send + 'static,
Prune remaining row groups using available bloom filters and the
PruningPredicate.
Updates this set to mark row groups that should not be scanned.
Panics
if the builder does not have the same number of row groups as this set
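A Bloom filter can answer "definitely absent" or "possibly present", so it can only ever remove row groups, never confirm a match. The sketch below shows that property with a toy filter. Everything in it is an assumption made for illustration: `ToyBloom` uses a 64-bit bitmap and two arithmetic hash functions, which is not the Bloom filter format that Parquet files store, and the synchronous `prune_by_bloom` ignores the asynchronous reading that the real method performs.

```rust
/// Toy Bloom filter over u64 values: a 64-bit bitmap and two toy hash functions.
#[derive(Clone, Copy, Default)]
struct ToyBloom {
    bits: u64,
}

impl ToyBloom {
    /// Two bit positions (each in 0..64) derived from the value.
    fn positions(value: u64) -> [u32; 2] {
        [
            (value % 61) as u32,
            (value.wrapping_mul(7).wrapping_add(3) % 64) as u32,
        ]
    }

    fn insert(&mut self, value: u64) {
        for p in Self::positions(value) {
            self.bits |= 1u64 << p;
        }
    }

    /// false: the value was definitely never inserted.
    /// true: the value may have been inserted (false positives are possible).
    fn might_contain(&self, value: u64) -> bool {
        Self::positions(value)
            .iter()
            .all(|&p| self.bits & (1u64 << p) != 0)
    }
}

/// Skips still-selected row groups whose filter rules out `value`.
/// Row groups without a filter are always kept.
fn prune_by_bloom(scan: &mut [bool], filters: &[Option<ToyBloom>], value: u64) {
    assert_eq!(filters.len(), scan.len());
    for (flag, filter) in scan.iter_mut().zip(filters) {
        if let Some(f) = filter {
            if *flag && !f.might_contain(value) {
                *flag = false;
            }
        }
    }
}

/// Looks for the value 30 in three row groups: {10, 20}, {30}, and no filter.
fn demo_bloom() -> Vec<bool> {
    let mut first = ToyBloom::default();
    first.insert(10);
    first.insert(20);
    let mut second = ToyBloom::default();
    second.insert(30);

    let filters = [Some(first), Some(second), None];
    let mut scan = vec![true; filters.len()];
    prune_by_bloom(&mut scan, &filters, 30);
    scan
}

fn main() {
    println!("{:?}", demo_bloom());
}
```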
Trait Implementations
impl Clone for RowGroupAccessPlanFilter
fn clone(&self) -> RowGroupAccessPlanFilter
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for RowGroupAccessPlanFilter
impl PartialEq for RowGroupAccessPlanFilter
impl StructuralPartialEq for RowGroupAccessPlanFilter
Auto Trait Implementations
impl Freeze for RowGroupAccessPlanFilter
impl RefUnwindSafe for RowGroupAccessPlanFilter
impl Send for RowGroupAccessPlanFilter
impl Sync for RowGroupAccessPlanFilter
impl Unpin for RowGroupAccessPlanFilter
impl UnwindSafe for RowGroupAccessPlanFilter
Blanket Implementations
impl<T> BorrowMut<T> for T
where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.