async fn collect_left_input(
    random_state: RandomState,
    left_stream: SendableRecordBatchStream,
    on_left: Vec<PhysicalExprRef>,
    metrics: BuildProbeJoinMetrics,
    reservation: MemoryReservation,
    with_visited_indices_bitmap: bool,
    probe_threads_count: usize,
    should_compute_bounds: bool,
) -> Result<JoinLeftData>
Collects all batches from the left (build) side stream and creates a hash map for joining.
This function is responsible for:
- Consuming the entire left stream and collecting all batches into memory
- Building a hash map from the join key columns for efficient probe operations
- Computing bounds for dynamic filter pushdown (if enabled)
- Preparing a visited indices bitmap for join types that need to track matched build-side rows (such as outer joins)
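The responsibilities above can be illustrated with a minimal sketch that uses only the standard library. It is not the DataFusion implementation: `BuildSide` and `collect_build_side` are hypothetical names, plain `i64` vectors stand in for Arrow record batches, and `std::collections::HashMap` stands in for the specialized join hash map. Metrics, memory reservation, and bounds computation are left out.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the build-side data.
/// The real `JoinLeftData` holds Arrow data and a specialized hash map.
struct BuildSide {
    /// Key values from every batch, concatenated in arrival order.
    keys: Vec<i64>,
    /// Join key -> indices of the rows in `keys` that carry that key.
    index: HashMap<i64, Vec<usize>>,
    /// One flag per build row, present only when visited tracking is requested.
    visited: Option<Vec<bool>>,
}

/// Consumes all batches, builds the key index, and optionally allocates
/// a visited bitmap (assumption: one `bool` per build row).
fn collect_build_side(batches: Vec<Vec<i64>>, track_visited: bool) -> BuildSide {
    // Step 1: consume the whole input and concatenate it into one buffer.
    let keys: Vec<i64> = batches.into_iter().flatten().collect();

    // Step 2: map each join key to the row indices where it occurs.
    let mut index: HashMap<i64, Vec<usize>> = HashMap::new();
    for (row, key) in keys.iter().enumerate() {
        index.entry(*key).or_default().push(row);
    }

    // Step 3: allocate the visited bitmap only when the join type needs it.
    let visited = track_visited.then(|| vec![false; keys.len()]);

    BuildSide { keys, index, visited }
}
```

Calling `collect_build_side(vec![vec![1, 2], vec![2, 3]], true)` yields four concatenated keys, an index in which key `2` maps to rows 1 and 2, and a bitmap of four unset flags.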
§Parameters
- random_state - Random state for consistent hashing across partitions
- left_stream - Stream of record batches from the build side
- on_left - Physical expressions for the left side join keys
- metrics - Metrics collector for tracking memory usage and row counts
- reservation - Memory reservation tracker for the hash table and data
- with_visited_indices_bitmap - Whether to track visited indices (for outer joins)
- probe_threads_count - Number of threads that will probe this hash table
- should_compute_bounds - Whether to compute min/max bounds for dynamic filtering
§Dynamic Filter Coordination
When should_compute_bounds is true, this function computes the min/max bounds
for each join key column but does NOT update the dynamic filter. Instead, the
bounds are stored in the returned JoinLeftData and later coordinated by
SharedBoundsAccumulator to ensure all partitions contribute their bounds
before updating the filter exactly once.
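The coordination pattern described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `SharedBoundsAccumulator`: the `Bounds`, `State`, and `BoundsAccumulator` types and the `report` method are hypothetical, a single `i64` key column is assumed, and a `std::sync::Mutex` guards the shared state. Each partition reports its locally computed bounds, and only the final report returns the merged result, so a caller would update the filter once.

```rust
use std::sync::Mutex;

/// Min/max of one join key column within one partition (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Bounds {
    min: i64,
    max: i64,
}

/// Shared state: partitions still outstanding and the bounds merged so far.
struct State {
    remaining: usize,
    merged: Option<Bounds>,
}

/// Hypothetical stand-in for a cross-partition bounds accumulator.
struct BoundsAccumulator {
    state: Mutex<State>,
}

impl BoundsAccumulator {
    fn new(partitions: usize) -> Self {
        Self {
            state: Mutex::new(State { remaining: partitions, merged: None }),
        }
    }

    /// Records one partition's bounds (`None` for an empty partition).
    /// Returns the merged bounds only on the final expected report;
    /// every earlier or surplus call returns `None`.
    fn report(&self, bounds: Option<Bounds>) -> Option<Bounds> {
        let mut state = self.state.lock().unwrap();
        if state.remaining == 0 {
            return None; // all partitions have already reported
        }
        if let Some(b) = bounds {
            let merged = match state.merged {
                Some(m) => Bounds { min: m.min.min(b.min), max: m.max.max(b.max) },
                None => b,
            };
            state.merged = Some(merged);
        }
        state.remaining -= 1;
        if state.remaining == 0 { state.merged } else { None }
    }
}
```

Deferring the update to the last report means the filter never sees bounds derived from only some partitions, which could otherwise exclude probe rows that match build rows in a partition that has not finished.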
§Returns
JoinLeftData containing the hash map, consolidated batch, join key values,
visited indices bitmap, and computed bounds (if requested).