PruningPredicate to apply filter Expr to prune “containers”
based on statistics (e.g. Parquet Row Groups)
Structs§
- BoolVecBuilder 🔒 - Builds the return Vec for PruningPredicate::prune.
- ConstantUnhandledPredicateHook 🔒 - The default handling for unhandled predicates is to return a constant true (meaning don’t prune the container).
- PredicateRewriter - Rewrite a predicate expression in terms of statistics (min/max/null_counts) for use as a PruningPredicate.
- PruningExpressionBuilder 🔒
- PruningPredicate - Used to prove that arbitrary predicates (boolean expressions) cannot possibly evaluate to true given information about a column provided by PruningStatistics.
- RequiredColumns - Describes which column statistics are necessary to evaluate a PruningPredicate.
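To make the idea behind PruningPredicate concrete, here is a minimal, self-contained sketch of statistics-based pruning. It does not use the DataFusion API: `ColumnStats`, `prune_greater_than`, and `prune_equal` are hypothetical names invented for this illustration, and the sketch assumes a single integer column. As in the listing above, a result of true means the container may contain matching rows and must be kept.

```rust
/// Hypothetical per-container statistics for one integer column.
/// `None` means the statistic is unknown.
#[derive(Debug, Clone, Copy)]
struct ColumnStats {
    min: Option<i64>,
    max: Option<i64>,
}

/// One result per container for the predicate `col > threshold`.
/// `true`: some row may match, so the container must be read.
/// `false`: no row can match, so the container can be skipped.
fn prune_greater_than(stats: &[ColumnStats], threshold: i64) -> Vec<bool> {
    stats
        .iter()
        .map(|s| match s.max {
            // `col > threshold` can only hold for some row if `max > threshold`.
            Some(max) => max > threshold,
            // Unknown statistics: stay conservative and keep the container.
            None => true,
        })
        .collect()
}

/// One result per container for the predicate `col = value`.
fn prune_equal(stats: &[ColumnStats], value: i64) -> Vec<bool> {
    stats
        .iter()
        .map(|s| match (s.min, s.max) {
            // `col = value` is only possible if `value` lies in [min, max].
            (Some(min), Some(max)) => min <= value && value <= max,
            // If either bound is unknown, keep the container.
            _ => true,
        })
        .collect()
}
```

The key design choice is that the rewritten predicate may report true for a container that holds no matching rows, but it must never report false for one that does.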
Enums§
Constants§
- MAX_LIST_VALUE_SIZE_REWRITE 🔒 - The maximum number of entries in an InList that might be rewritten into an OR chain
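The InList rewrite mentioned above can be pictured with a small text-level sketch. This is an illustration only, not the DataFusion implementation: `in_list_to_or_chain` is a hypothetical helper, `max_entries` stands in for the limit (its actual value is not stated here), and treating an empty list as not rewritable is an assumption.

```rust
/// Rewrites `column IN (v1, v2, ...)` into `column = v1 OR column = v2 OR ...`
/// as plain text. `max_entries` is a hypothetical stand-in for the limit.
/// Returns `None` when the list is empty or longer than the limit.
fn in_list_to_or_chain(column: &str, values: &[i64], max_entries: usize) -> Option<String> {
    if values.is_empty() || values.len() > max_entries {
        return None;
    }
    // Build one equality term per list entry, then join them with OR.
    let terms: Vec<String> = values
        .iter()
        .map(|v| format!("{} = {}", column, v))
        .collect();
    Some(terms.join(" OR "))
}
```

Each equality term can then be checked against min/max statistics on its own, which is why a cap on the list length keeps the rewritten expression from growing too large.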
Traits§
- PruningStatistics - A source of runtime statistical information to PruningPredicates.
- UnhandledPredicateHook - Rewrites predicates that PredicateRewriter can not handle, e.g. certain complex expressions or predicates that reference columns that are not in the schema.
Functions§
- build_is_null_column_expr 🔒 - Given an expression reference to expr, if expr is a column expression, returns a pruning expression in terms of IsNull that will evaluate to true if the column may contain null, and false if it definitely does not contain null. If with_not is true, build a pruning expression for col IS NOT NULL: col_count != col_null_count. The pruning expression evaluates to true ONLY if the column definitely CONTAINS at least one NULL value. In this case we can know that IS NOT NULL can not be true and thus can prune the row group / value.
- build_like_match 🔒 - Convert column LIKE literal, where P is a constant prefix of the literal, to a range check on the column: P <= column && column < P', where P' is the lowest string after all P* strings.
- build_not_like_match 🔒
- build_predicate_expression 🔒 - Translate logical filter expression into pruning predicate expression that will evaluate to FALSE if it can be determined no rows between the min/max values could pass the predicates.
- build_pruning_predicate - Build a pruning predicate from an optional predicate expression. If the predicate is None or the predicate cannot be converted to a pruning predicate, return None. If there is an error creating the pruning predicate it is recorded by incrementing the predicate_creation_errors counter.
- build_single_column_expr 🔒 - Given a column reference to column, returns a pruning expression in terms of the min and max that will evaluate to true if the column may contain values, and false if it definitely does not contain values.
- build_statistics_expr 🔒
- build_statistics_record_batch 🔒 - Build a RecordBatch from a list of statistics, creating arrays, with one row for each PruningStatistics and columns specified in the required_columns parameter.
- extract_string_literal 🔒
- increment_utf8 🔒 - Increment a UTF8 string by one, returning None if it can’t be incremented. This makes it so that the returned string will always compare greater than the input string or any other string with the same prefix. This is necessary since the statistics may have been truncated: if we have a min statistic of “fo”, that may have originally been “foz” or anything else with the prefix “fo”. E.g. increment_utf8("foo") >= "foo" and increment_utf8("foo") >= "fooz". In this example increment_utf8("foo") == "fop".
- is_always_false 🔒
- is_always_true 🔒
- is_compare_op 🔒
- is_string_type 🔒
- reverse_operator 🔒
- rewrite_column_expr 🔒 - Replaces a column with an old name with a new name in an expression.
- rewrite_expr_to_prunable 🔒 - This function is designed to rewrite the column_expr to ensure the column_expr is monotonically increasing.
- split_constant_prefix 🔒 - Returns unescaped constant prefix of a LIKE pattern (possibly empty) and the remaining pattern (possibly empty).
- unpack_string 🔒 - Returns the string literal of the scalar value if it is a string.
- verify_support_type_for_prune 🔒
- wrap_null_count_check_expr 🔒 - Wrap the statistics expression in a check that skips the expression if the column is all nulls.
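The prefix range check described for build_like_match and the string increment described for increment_utf8 can be sketched together. This is a simplified illustration rather than the private DataFusion functions: `increment_prefix` and `like_prefix_may_match` are hypothetical names, and the range check assumes the min and max statistics are exact, untruncated values.

```rust
/// Returns the smallest string (in this sketch's scheme) that compares
/// greater than every string starting with `prefix`, or `None` if no such
/// string exists (empty prefix, or every char is already `char::MAX`).
fn increment_prefix(prefix: &str) -> Option<String> {
    let mut chars: Vec<char> = prefix.chars().collect();
    while let Some(last) = chars.pop() {
        let mut next = last as u32 + 1;
        // Skip the surrogate range, which holds no valid `char` values.
        if (0xD800..=0xDFFF).contains(&next) {
            next = 0xE000;
        }
        if let Some(c) = char::from_u32(next) {
            chars.push(c);
            return Some(chars.into_iter().collect());
        }
        // `last` was `char::MAX`: drop it and try the previous char.
    }
    None
}

/// For `col LIKE 'prefix%'`: matching rows satisfy `prefix <= col < upper`,
/// so a container with exact bounds [min, max] may match only if
/// `max >= prefix` and `min < upper`.
fn like_prefix_may_match(prefix: &str, min: &str, max: &str) -> bool {
    if max < prefix {
        return false;
    }
    match increment_prefix(prefix) {
        Some(upper) => min < upper.as_str(),
        // No upper bound could be built: keep the container.
        None => true,
    }
}
```

Rust compares strings by their UTF-8 bytes, which preserves code point order, so incrementing the last code point yields a string that sorts after every string sharing the original prefix.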