Sort-Merge Join execution
This module implements the runtime state machine for the Sort-Merge Join
operator. It drives two sorted input streams (the streamed side and the
buffered side), compares join keys, and produces joined RecordBatches.
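As a rough illustration of the merge step described above, here is a minimal sketch of an inner sort-merge join over two key-sorted slices. It is not this module's implementation, which works on Arrow RecordBatches and async streams. The function name `merge_join` and the `(i32, &str)` row shape are assumptions made purely for this example.

```rust
use std::cmp::Ordering;

/// Inner-joins two slices that are already sorted by key.
/// Hypothetical helper for illustration only; not part of this module's API.
fn merge_join<'a>(
    streamed: &[(i32, &'a str)],
    buffered: &[(i32, &'a str)],
) -> Vec<(i32, &'a str, &'a str)> {
    let mut out = Vec::new();
    let mut s = 0; // cursor into the streamed side
    let mut b = 0; // start of the current key group on the buffered side

    while s < streamed.len() && b < buffered.len() {
        match streamed[s].0.cmp(&buffered[b].0) {
            // Streamed key is smaller: no match is possible, advance streamed.
            Ordering::Less => s += 1,
            // Buffered key is smaller: skip ahead on the buffered side.
            Ordering::Greater => b += 1,
            Ordering::Equal => {
                let key = streamed[s].0;
                // Find the end of the contiguous buffered rows sharing this key.
                let mut end = b;
                while end < buffered.len() && buffered[end].0 == key {
                    end += 1;
                }
                // Pair every streamed row that has this key with the whole group.
                while s < streamed.len() && streamed[s].0 == key {
                    for row in &buffered[b..end] {
                        out.push((key, streamed[s].1, row.1));
                    }
                    s += 1;
                }
                b = end;
            }
        }
    }
    out
}
```

The buffered side is handled as a group of contiguous rows with one join key, so each streamed row with that key can be paired against the whole group before either cursor moves past it.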
Structs
- BufferedBatch 🔒 - A buffered batch that contains contiguous rows with the same join key
- BufferedData 🔒 - Buffered data contains all buffered batches with one unique join key
- JoinedRecordBatches 🔒 - Joined batches with attached join filter information
- SortMergeJoinStream 🔒 - Sort-Merge join stream that consumes streamed and buffered data streams and produces joined output stream.
- StreamedBatch 🔒 - Represents a record batch from streamed input.
- StreamedJoinedChunk 🔒 - Represents a chunk of joined data from streamed and buffered side
Enums
- BufferedBatchState 🔒
- BufferedState 🔒 - State of buffered data stream
- SortMergeJoinState 🔒 - State of SMJ stream
- StreamedState 🔒 - State of streamed data stream
Functions
- create_unmatched_columns 🔒
- fetch_right_columns_by_idxs 🔒 - Get buffered_indices rows for buffered_data[buffered_batch_idx] by specific column indices
- fetch_right_columns_from_batch_by_idxs 🔒
- get_corrected_filter_mask 🔒
- get_filter_column 🔒 - Gets the arrays which join filters are applied on.
- is_join_arrays_equal 🔒 - A faster version of compare_join_arrays() that only outputs whether the given two rows are equal
- join_arrays 🔒 - Get join array refs of given batch and join columns
- last_index_for_row 🔒 - True if next index refers to either:
- produce_buffered_null_batch 🔒
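
The entry for is_join_arrays_equal above describes it as an equality-only counterpart to compare_join_arrays(). The sketch below shows that general idea on composite keys held as `Option<i64>` slices. It is not the module's code, which operates on Arrow arrays. The names `compare_keys` and `keys_equal` are hypothetical, and the null handling (nulls sort first, and a null never equals anything) is an assumption chosen for this example.

```rust
use std::cmp::Ordering;

/// Full comparison: returns how two composite keys order relative to each other.
/// Assumption for this sketch: nulls (None) sort before any value.
fn compare_keys(left: &[Option<i64>], right: &[Option<i64>]) -> Ordering {
    for (l, r) in left.iter().zip(right) {
        let ord = l.cmp(r); // Option's Ord places None before Some(_)
        if ord != Ordering::Equal {
            return ord;
        }
    }
    Ordering::Equal
}

/// Equality-only check: answers yes or no and stops at the first mismatch,
/// without ever deciding which key is smaller.
/// Assumption for this sketch: a null never equals anything, including null.
fn keys_equal(left: &[Option<i64>], right: &[Option<i64>]) -> bool {
    left.len() == right.len()
        && left
            .iter()
            .zip(right)
            .all(|(l, r)| matches!((l, r), (Some(a), Some(b)) if a == b))
}
```

A caller that only needs to know whether the next row still belongs to the current key group can use the boolean form. The ordering form is needed when deciding which input to advance.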