pub struct PagePruningAccessPlanFilter {
predicates: Vec<PruningPredicate>,
}
Filters a ParquetAccessPlan based on the Parquet PageIndex, if present.
It does so by evaluating statistics from the [ParquetColumnIndex] and
[ParquetOffsetIndex] and converting them to a [RowSelection].
For example, given a row group with two column chunks, A and B,
with the following page-level statistics:
ColumnChunk   Page   min   max   first_row   rows
A             1      10    20    0           0 -> 199
A             2      30    40    200         200 -> 299
B             3      "A"   "C"   0           0 -> 99
B             4      "D"   "G"   100         100 -> 249
B             5      "H"   "Z"   250         250 -> 299
Total rows: 300
Given the predicate A > 35 AND B = 'F':
Using A > 35: can rule out all of the values in Page 1 (rows 0 -> 199)
Using B = 'F': can rule out all of the values in Page 3 and Page 5 (rows 0 -> 99 and 250 -> 299)
So we can skip rows 0 -> 199 and 250 -> 299 entirely, because we know they cannot contain rows that match the predicate.
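The arithmetic in this example can be reproduced with a small standalone sketch. It is not the DataFusion implementation: `PageStats`, `surviving_ranges`, `intersect`, and `example_rows_to_scan` are hypothetical names introduced only for illustration, and the sketch assumes each page ends where the next page's `first_row` begins.

```rust
use std::ops::Range;

/// Page-level statistics for one data page (hypothetical type for this sketch).
#[derive(Debug, Clone)]
struct PageStats<T> {
    min: T,
    max: T,
    /// Index of the first row stored in this page.
    first_row: usize,
}

/// Returns the row ranges of the pages that `might_match` cannot rule out.
/// Pages must be ordered by `first_row`; each page ends where the next begins.
fn surviving_ranges<T>(
    pages: &[PageStats<T>],
    total_rows: usize,
    might_match: impl Fn(&T, &T) -> bool,
) -> Vec<Range<usize>> {
    let mut kept = Vec::new();
    for (i, page) in pages.iter().enumerate() {
        let end = pages.get(i + 1).map_or(total_rows, |next| next.first_row);
        if might_match(&page.min, &page.max) {
            kept.push(page.first_row..end);
        }
    }
    kept
}

/// Intersects two sorted, non-overlapping lists of row ranges.
fn intersect(a: &[Range<usize>], b: &[Range<usize>]) -> Vec<Range<usize>> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < a.len() && j < b.len() {
        let start = a[i].start.max(b[j].start);
        let end = a[i].end.min(b[j].end);
        if start < end {
            out.push(start..end);
        }
        // Advance whichever range finishes first.
        if a[i].end <= b[j].end {
            i += 1;
        } else {
            j += 1;
        }
    }
    out
}

/// Rows that may still match `A > 35 AND B = 'F'` in the example row group.
fn example_rows_to_scan() -> Vec<Range<usize>> {
    let col_a = [
        PageStats { min: 10, max: 20, first_row: 0 },
        PageStats { min: 30, max: 40, first_row: 200 },
    ];
    let col_b = [
        PageStats { min: "A", max: "C", first_row: 0 },
        PageStats { min: "D", max: "G", first_row: 100 },
        PageStats { min: "H", max: "Z", first_row: 250 },
    ];
    // A > 35 can only match a page whose max exceeds 35.
    let a = surviving_ranges(&col_a, 300, |_min, max| *max > 35);
    // B = 'F' can only match a page whose [min, max] interval contains "F".
    let b = surviving_ranges(&col_b, 300, |min, max| *min <= "F" && "F" <= *max);
    intersect(&a, &b)
}
```

Calling `example_rows_to_scan()` leaves only rows 200 -> 249 to read, which is the complement of the skipped ranges 0 -> 199 and 250 -> 299.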
§Implementation notes
Single column predicates are evaluated using the PageIndex information for that column to determine which row ranges can be skipped.
The resulting [RowSelection]s are combined into a final
row selection that is added to the ParquetAccessPlan.
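One way to picture the combining step is as an AND over run-length encoded selections: a row is read only if every per-predicate selection reads it. The sketch below is an illustration under that assumption, not the code used by this type; `Run`, `select`, `skip`, `push_run`, and `combine` are hypothetical stand-ins rather than the parquet crate's types.

```rust
/// A run of consecutive rows that are either all read or all skipped.
/// Hypothetical stand-in for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Run {
    row_count: usize,
    skip: bool,
}

fn select(row_count: usize) -> Run {
    Run { row_count, skip: false }
}

fn skip(row_count: usize) -> Run {
    Run { row_count, skip: true }
}

/// Appends a run, merging it into the previous run when both are the same kind.
fn push_run(out: &mut Vec<Run>, run: Run) {
    if run.row_count == 0 {
        return;
    }
    if let Some(last) = out.last_mut() {
        if last.skip == run.skip {
            last.row_count += run.row_count;
            return;
        }
    }
    out.push(run);
}

/// Combines two selections covering the same rows: a row is read only if
/// both inputs read it.
fn combine(a: &[Run], b: &[Run]) -> Vec<Run> {
    let mut out = Vec::new();
    let (mut i, mut j) = (0, 0);
    // Rows not yet consumed in the current run of each input.
    let mut left_a = a.first().map_or(0, |r| r.row_count);
    let mut left_b = b.first().map_or(0, |r| r.row_count);
    while i < a.len() && j < b.len() {
        // Consume the overlap of the two current runs.
        let n = left_a.min(left_b);
        push_run(&mut out, Run { row_count: n, skip: a[i].skip || b[j].skip });
        left_a -= n;
        left_b -= n;
        if left_a == 0 {
            i += 1;
            left_a = a.get(i).map_or(0, |r| r.row_count);
        }
        if left_b == 0 {
            j += 1;
            left_b = b.get(j).map_or(0, |r| r.row_count);
        }
    }
    out
}
```

With the example row group, combining `[skip(200), select(100)]` (from A > 35) with `[skip(100), select(150), skip(50)]` (from B = 'F') yields `[skip(200), select(50), skip(50)]`.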
Fields§
§predicates: Vec<PruningPredicate>
Single column predicates (e.g. col = 5) extracted from the overall
predicate. Must all be true for a row to be included in the result.
Implementations§
impl PagePruningAccessPlanFilter
pub fn new(expr: &Arc<dyn PhysicalExpr>, schema: SchemaRef) -> Self
Create a new PagePruningAccessPlanFilter from a physical
expression.
pub fn prune_plan_with_page_index(
    &self,
    access_plan: ParquetAccessPlan,
    arrow_schema: &Schema,
    parquet_schema: &SchemaDescriptor,
    parquet_metadata: &ParquetMetaData,
    file_metrics: &ParquetFileMetrics,
) -> ParquetAccessPlan
Returns an updated ParquetAccessPlan by applying the predicates to the
Parquet page index, if any.
pub fn filter_number(&self) -> usize
Returns the number of filters in the PagePruningAccessPlanFilter.
Auto Trait Implementations§
impl Freeze for PagePruningAccessPlanFilter
impl !RefUnwindSafe for PagePruningAccessPlanFilter
impl Send for PagePruningAccessPlanFilter
impl Sync for PagePruningAccessPlanFilter
impl Unpin for PagePruningAccessPlanFilter
impl !UnwindSafe for PagePruningAccessPlanFilter
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.