Module pruning_predicate


PruningPredicate applies a filter Expr to prune “containers” (e.g. Parquet Row Groups) based on their statistics.

Structs§

BoolVecBuilder 🔒
Builds the return Vec for PruningPredicate::prune.
ConstantUnhandledPredicateHook 🔒
The default handling for unhandled predicates is to return a constant true (meaning don’t prune the container)
PredicateRewriter
Rewrite a predicate expression in terms of statistics (min/max/null_counts) for use as a PruningPredicate.
PruningExpressionBuilder 🔒
PruningPredicate
Used to prove that arbitrary predicates (boolean expressions) cannot possibly evaluate to true given information about a column provided by PruningStatistics.
RequiredColumns
Describes which column statistics are necessary to evaluate a PruningPredicate.
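
To make the idea behind PruningPredicate concrete, here is a minimal sketch that uses only the standard library. It assumes the common rewrite of `col = literal` into `min <= literal AND literal <= max`. The names `ColumnStats`, `may_contain_eq` and `prune_eq` are hypothetical and are not part of this module's API.

```rust
/// Hypothetical min/max statistics for one column in one container
/// (for example, one Parquet row group). Not a type from this module.
#[derive(Debug, Clone, Copy)]
pub struct ColumnStats {
    pub min: i64,
    pub max: i64,
}

/// Pruning form of `col = literal`: `min <= literal AND literal <= max`.
/// Returns false only when no row in the container can possibly match.
pub fn may_contain_eq(stats: ColumnStats, literal: i64) -> bool {
    stats.min <= literal && literal <= stats.max
}

/// Evaluate the rewritten predicate once per container.
/// `true` means "keep, the container must be scanned";
/// `false` means "the container can safely be skipped".
pub fn prune_eq(containers: &[ColumnStats], literal: i64) -> Vec<bool> {
    containers
        .iter()
        .map(|stats| may_contain_eq(*stats, literal))
        .collect()
}
```

With containers covering `[0, 4]`, `[3, 9]` and `[10, 20]`, `prune_eq(&containers, 5)` yields `[false, true, false]`: only the second container can hold a row where the column equals 5.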

Enums§

StatisticsType 🔒

Constants§

MAX_LIST_VALUE_SIZE_REWRITE 🔒
The maximum number of entries in an InList that might be rewritten into an OR chain
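
The InList rewrite mentioned above can be pictured as follows. This is only a sketch over a hypothetical, simplified `Expr` enum, not the expression type used by this module, and the `max_len` parameter stands in for a limit such as MAX_LIST_VALUE_SIZE_REWRITE, whose actual value is not assumed here.

```rust
/// Hypothetical, simplified expression tree for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Literal(i64),
    Eq(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

/// Rewrite `column IN (v1, v2, ...)` as `column = v1 OR column = v2 OR ...`.
/// Returns None for an empty list, or when the list has more than `max_len`
/// entries and is therefore left unrewritten.
pub fn in_list_to_or_chain(column: &str, values: &[i64], max_len: usize) -> Option<Expr> {
    if values.is_empty() || values.len() > max_len {
        return None;
    }
    values
        .iter()
        // One equality comparison per list entry.
        .map(|v| {
            Expr::Eq(
                Box::new(Expr::Column(column.to_string())),
                Box::new(Expr::Literal(*v)),
            )
        })
        // Fold the comparisons into a left-deep OR chain.
        .reduce(|acc, eq| Expr::Or(Box::new(acc), Box::new(eq)))
}
```

Capping the list length keeps the rewritten expression small. Each equality in the chain can then be checked against min/max statistics in the same way as a single `col = literal` predicate.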

Traits§

PruningStatistics
A source of runtime statistical information to PruningPredicates.
UnhandledPredicateHook
Rewrites predicates that PredicateRewriter can not handle, e.g. certain complex expressions or predicates that reference columns that are not in the schema.

Functions§

build_is_null_column_expr 🔒
Given an expression reference to expr, if expr is a column expression, returns a pruning expression in terms of IsNull that evaluates to true if the column may contain nulls, and false if it definitely does not contain nulls. If with_not is true, builds a pruning expression for col IS NOT NULL instead: col_count != col_null_count. That expression evaluates to false only if every value in the column is NULL; in that case IS NOT NULL cannot be true for any row, and the row group / value can be pruned.
build_like_match 🔒
Convert column LIKE literal, where P is a constant prefix of the literal, to a range check on the column: P <= column && column < P', where P' is the lowest string after all P* strings.
build_not_like_match 🔒
build_predicate_expression 🔒
Translate logical filter expression into pruning predicate expression that will evaluate to FALSE if it can be determined no rows between the min/max values could pass the predicates.
build_pruning_predicate
Build a pruning predicate from an optional predicate expression. If the predicate is None or the predicate cannot be converted to a pruning predicate, return None. If there is an error creating the pruning predicate it is recorded by incrementing the predicate_creation_errors counter.
build_single_column_expr 🔒
Given a column reference to column, returns a pruning expression in terms of the min and max that will evaluate to true if the column may contain values, and false if it definitely does not contain values.
build_statistics_expr 🔒
build_statistics_record_batch 🔒
Build a RecordBatch from a list of statistics, creating arrays, with one row for each PruningStatistics and columns specified in the required_columns parameter.
extract_string_literal 🔒
increment_utf8 🔒
Increment a UTF8 string by one, returning None if it can’t be incremented. The returned string always compares greater than the input string and any other string with the same prefix. This is necessary since the statistics may have been truncated: a min statistic of “fo” may originally have been “foz” or anything else with the prefix “fo”. E.g. increment_utf8("foo") >= "foo" and increment_utf8("foo") >= "fooz". In this example increment_utf8("foo") == "fop".
is_always_false 🔒
is_always_true 🔒
is_compare_op 🔒
is_string_type 🔒
reverse_operator 🔒
rewrite_column_expr 🔒
Replaces a column that has an old name with a column that has a new name in an expression.
rewrite_expr_to_prunable 🔒
This function is designed to rewrite the column_expr to ensure the column_expr is monotonically increasing.
split_constant_prefix 🔒
Returns unescaped constant prefix of a LIKE pattern (possibly empty) and the remaining pattern (possibly empty)
unpack_string 🔒
Returns the string literal of the scalar value if it is a string.
verify_support_type_for_prune 🔒
wrap_null_count_check_expr 🔒
Wrap the statistics expression in a check that skips the expression if the column is all nulls.
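
The LIKE prefix rewrite and the string increment described in this list can be sketched together. This is a simplified illustration using only the standard library. It assumes the min/max statistics are exact rather than truncated. `increment_prefix` and `may_match_prefix` are hypothetical names, and `increment_prefix` is not the module's `increment_utf8`: it only demonstrates the same idea of computing an upper bound P' for every string that starts with P.

```rust
/// Simplified sketch: bump the last character that can be incremented and
/// drop everything after it. Returns None if no character can be incremented
/// (for example, for the empty string).
pub fn increment_prefix(s: &str) -> Option<String> {
    let mut chars: Vec<char> = s.chars().collect();
    while let Some(last) = chars.pop() {
        // char::from_u32 rejects surrogates and values above char::MAX;
        // in that case the character stays dropped and the loop tries
        // the previous one.
        if let Some(next) = char::from_u32(last as u32 + 1) {
            chars.push(next);
            return Some(chars.into_iter().collect());
        }
    }
    None
}

/// Pruning form of `column LIKE 'P%'`, i.e. `P <= column AND column < P'`,
/// checked against a container's min/max statistics.
/// Returns false only when no value in [min, max] can start with `prefix`.
pub fn may_match_prefix(min: &str, max: &str, prefix: &str) -> bool {
    match increment_prefix(prefix) {
        // The ranges [min, max] and [prefix, upper) must overlap.
        Some(upper) => max >= prefix && min < upper.as_str(),
        // No upper bound available: stay conservative and do not prune.
        None => true,
    }
}
```

Here `increment_prefix("foo")` returns `Some("fop")`. A container with min `"fa"` and max `"fz"` is kept for the prefix `"foo"`, while a container with min `"g"` and max `"h"` can be pruned.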