Module stream

Module stream 

Source
Expand description

Sort-Merge Join execution

This module implements the runtime state machine for the Sort-Merge Join operator. It drives two sorted input streams (the streamed side and the buffered side), compares join keys, and produces joined RecordBatches.

Structsยง

BufferedBatch ๐Ÿ”’
A buffered batch that contains contiguous rows with same join key
BufferedData ๐Ÿ”’
Buffered data contains all buffered batches with one unique join key
JoinedRecordBatches ๐Ÿ”’
Joined batches with attached join filter information
SortMergeJoinStream ๐Ÿ”’
Sort-Merge join stream that consumes streamed and buffered data streams and produces joined output stream.
StreamedBatch ๐Ÿ”’
Represents a record batch from streamed input.
StreamedJoinedChunk ๐Ÿ”’
Represents a chunk of joined data from streamed and buffered side

Enumsยง

BufferedBatchState ๐Ÿ”’
BufferedState ๐Ÿ”’
State of buffered data stream
SortMergeJoinState ๐Ÿ”’
State of SMJ stream
StreamedState ๐Ÿ”’
State of streamed data stream

Functionsยง

create_unmatched_columns ๐Ÿ”’
fetch_right_columns_by_idxs ๐Ÿ”’
Get buffered_indices rows for buffered_data[buffered_batch_idx] by specific column indices
fetch_right_columns_from_batch_by_idxs ๐Ÿ”’
get_corrected_filter_mask ๐Ÿ”’
get_filter_column ๐Ÿ”’
Gets the arrays which join filters are applied on.
is_join_arrays_equal ๐Ÿ”’
A faster version of compare_join_arrays() that only output whether the given two rows are equal
join_arrays ๐Ÿ”’
Get join array refs of given batch and join columns
last_index_for_row ๐Ÿ”’
True if next index refers to either:
produce_buffered_null_batch ๐Ÿ”’