This file contains common subroutines for symmetric hash join related functionality, used both in join calculations and optimization rules.
Structs
- `PruningJoinHashMap`: The `PruningJoinHashMap` is similar to a regular `JoinHashMap`, but with the capability of pruning elements in an efficient manner. This structure is particularly useful for cases where it's necessary to remove elements from the map based on their buffer order.
- `SortedFilterExpr`: The `SortedFilterExpr` object represents a sorted filter expression. It contains the following information: the origin expression, the filter expression, an interval encapsulating expression bounds, and a stable index identifying the expression in the expression DAG.
- `StreamJoinMetrics`: Metrics for HashJoinExec
- `StreamJoinSideMetrics`
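To make "pruning elements based on their buffer order" more concrete, here is a minimal sketch of the idea using only the standard library. It is not the actual `PruningJoinHashMap` implementation: the type `PruningMapSketch`, its fields, and its methods are hypothetical names chosen for illustration, and plain `u64` keys stand in for row hashes. The sketch assumes rows are appended in arrival order and that pruning always removes the oldest rows first.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical, simplified stand-in for a prunable join hash map.
#[derive(Debug, Default)]
pub struct PruningMapSketch {
    /// key -> (absolute index of the newest row with that key) + 1
    heads: HashMap<u64, u64>,
    /// chain[i] = (absolute index of the previous row with the same key) + 1,
    /// or 0 if there is none. Position i describes absolute row `offset + i`.
    chain: VecDeque<u64>,
    /// Number of rows pruned so far (assumed bookkeeping, not from the docs).
    offset: u64,
}

impl PruningMapSketch {
    /// Appends a row with the given key and returns its absolute row index.
    pub fn insert(&mut self, key: u64) -> u64 {
        let row = self.offset + self.chain.len() as u64;
        let prev = self.heads.insert(key, row + 1).unwrap_or(0);
        self.chain.push_back(prev);
        row
    }

    /// Returns the still-buffered rows for `key`, newest first.
    pub fn rows_for(&self, key: u64) -> Vec<u64> {
        let mut out = Vec::new();
        let mut link = self.heads.get(&key).copied().unwrap_or(0);
        // A link is `row + 1`, so `link > offset` means the row is not pruned.
        while link > self.offset {
            let row = link - 1;
            out.push(row);
            link = self.chain[(row - self.offset) as usize];
        }
        out
    }

    /// Drops the `n` oldest rows from the buffer.
    pub fn prune(&mut self, n: usize) {
        let n = n.min(self.chain.len());
        self.chain.drain(..n);
        self.offset += n as u64;
        let offset = self.offset;
        // Forget keys whose newest row has been pruned.
        self.heads.retain(|_, link| *link > offset);
    }
}
```

In this sketch, pruning costs one front-drain of the deque plus a pass over the key table. Links that point at pruned rows are never rewritten; lookups simply stop when they reach one, which works because every chain runs from newer rows to older ones.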
Functions
- `build_filter_input_order`: This function is used to build the filter expression based on the sort order of input columns.
- `calculate_filter_expr_intervals`: Calculate the filter expression intervals.
- `check_filter_expr_contains_sort_information`
- `combine_two_batches`
- `convert_filter_columns`: Convert a physical expression into a filter expression using the given column mapping information.
- `convert_sort_expr_with_filter_schema`: This function analyzes `PhysicalSortExpr` graphs with respect to output orderings (sorting) properties. This is necessary since monotonically increasing and/or decreasing expressions are required when using join filter expressions for data pruning purposes.
- `get_pruning_anti_indices`: Get the anti join indices from the visited hash set.
- `get_pruning_semi_indices`: This method creates a boolean buffer from the visited rows hash set and the indices of the pruned record batch slice.
- `map_origin_col_to_filter_col`: Create a one to one mapping from main columns to filter columns using filter column indices. A column index looks like:
- `prepare_sorted_exprs`: Prepares and sorts expressions based on a given filter, left and right schemas, and sort expressions.
- `record_visited_indices`: Records the visited indices from the input `PrimitiveArray` of type `T` into the given hash set `visited`. This function will insert the indices (offset by `offset`) into the `visited` hash set.
- `update_filter_expr_interval`: This is a subroutine of the function `calculate_filter_expr_intervals`. It constructs the current interval using the given `batch` and updates the filter expression (i.e. `sorted_expr`) with this interval.
- `update_sorted_exprs_with_node_indices`: Updates sorted filter expressions with corresponding node indices from the expression interval graph.
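The relationship between `record_visited_indices`, `get_pruning_semi_indices`, and `get_pruning_anti_indices` can be illustrated with a small standard-library sketch. It does not reproduce the real signatures: the function names `record_visited`, `semi_indices`, and `anti_indices`, the parameters `prune_length` and `deleted_offset`, and the use of `&[usize]` and `Vec<bool>` in place of Arrow arrays and boolean buffers are all assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Inserts every index, shifted by `offset`, into the `visited` set.
pub fn record_visited(visited: &mut HashSet<usize>, offset: usize, indices: &[usize]) {
    visited.extend(indices.iter().map(|i| i + offset));
}

/// Builds a boolean mask over the pruned slice: mask[i] is true when
/// absolute row `i + deleted_offset` is present in `visited`.
fn visited_mask(
    prune_length: usize,
    deleted_offset: usize,
    visited: &HashSet<usize>,
) -> Vec<bool> {
    (0..prune_length)
        .map(|i| visited.contains(&(i + deleted_offset)))
        .collect()
}

/// Slice-relative indices of pruned rows that were matched (semi join).
pub fn semi_indices(
    prune_length: usize,
    deleted_offset: usize,
    visited: &HashSet<usize>,
) -> Vec<usize> {
    visited_mask(prune_length, deleted_offset, visited)
        .iter()
        .enumerate()
        .filter_map(|(i, &seen)| seen.then_some(i))
        .collect()
}

/// Slice-relative indices of pruned rows that were never matched (anti join).
pub fn anti_indices(
    prune_length: usize,
    deleted_offset: usize,
    visited: &HashSet<usize>,
) -> Vec<usize> {
    visited_mask(prune_length, deleted_offset, visited)
        .iter()
        .enumerate()
        .filter_map(|(i, &seen)| (!seen).then_some(i))
        .collect()
}
```

Under these assumptions, recording indices `[0, 2, 5]` with an offset of 10 marks absolute rows 10, 12, and 15 as visited. Pruning the four rows starting at absolute row 10 then yields the semi indices `[0, 2]` and the anti indices `[1, 3]`, both relative to the pruned slice.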