LinearSearch

Struct LinearSearch 

Source
pub struct LinearSearch {
    input_buffer_hashes: VecDeque<u64>,
    random_state: RandomState,
    ordered_partition_by_indices: Vec<usize>,
    row_map_batch: HashTable<(u64, usize)>,
    row_map_out: HashTable<(u64, usize, usize)>,
    input_schema: SchemaRef,
}
Expand description

This object encapsulates the algorithm state for a simple linear scan algorithm for computing partitions.

Fields§

§input_buffer_hashes: VecDeque<u64>

Keeps the hash of input buffer calculated from PARTITION BY columns. Its length is equal to the input_buffer length.

§random_state: RandomState

Used during hash value calculation.

§ordered_partition_by_indices: Vec<usize>

Input ordering and partition by key ordering need not be the same, so this vector stores the mapping between them. For instance, if the input is ordered by a, b and the window expression contains a PARTITION BY b, a clause, this attribute stores [1, 0].

§row_map_batch: HashTable<(u64, usize)>

We use this [HashTable] to calculate unique partitions for each new RecordBatch. First entry in the tuple is the hash value, the second entry is the unique ID for each partition (increments from 0 to n).

§row_map_out: HashTable<(u64, usize, usize)>

We use this [HashTable] to calculate the output columns that we can produce at each cycle. First entry in the tuple is the hash value, the second entry is the unique ID for each partition (increments from 0 to n). The third entry stores how many new outputs are calculated for the corresponding partition.

§input_schema: SchemaRef

Implementations§

Source§

impl LinearSearch

Source

fn new( ordered_partition_by_indices: Vec<usize>, input_schema: SchemaRef, ) -> Self

Initialize a new LinearSearch partition searcher.

Source

fn get_per_partition_indices( &mut self, columns: &[ArrayRef], batch: &RecordBatch, ) -> Result<Vec<(PartitionKey, Vec<u32>)>>

Calculate indices of each partition (according to PARTITION BY expression) columns contain partition by expression results.

Source

fn calc_partition_output_indices( &mut self, input_buffer: &RecordBatch, window_agg_states: &[PartitionWindowAggStates], window_expr: &[Arc<dyn WindowExpr>], ) -> Result<Vec<(PartitionKey, Vec<u32>)>>

Calculates partition keys and result indices for each partition. The return value is a vector of tuples where the first entry stores the partition key (unique for each partition) and the second entry stores indices of the rows for which the partition is constructed.

Trait Implementations§

Source§

impl PartitionSearcher for LinearSearch

Source§

fn calculate_out_columns( &mut self, input_buffer: &RecordBatch, window_agg_states: &[PartitionWindowAggStates], partition_buffers: &mut PartitionBatches, window_expr: &[Arc<dyn WindowExpr>], ) -> Result<Option<Vec<ArrayRef>>>

This method constructs output columns using the result of each window expression.

Source§

fn evaluate_partition_batches( &mut self, record_batch: &RecordBatch, window_expr: &[Arc<dyn WindowExpr>], ) -> Result<Vec<(PartitionKey, RecordBatch)>>

Source§

fn prune(&mut self, n_out: usize)

Prunes the state.
Source§

fn mark_partition_end(&self, partition_buffers: &mut PartitionBatches)

Marks the partition as done if we are sure that corresponding partition cannot receive any more values.
Source§

fn is_mode_linear(&self) -> bool

Determine whether [InputOrderMode] is [InputOrderMode::Linear] or not.
Source§

fn input_schema(&self) -> &SchemaRef

Source§

fn update_partition_batch( &mut self, input_buffer: &mut RecordBatch, record_batch: RecordBatch, window_expr: &[Arc<dyn WindowExpr>], partition_buffers: &mut PartitionBatches, ) -> Result<()>

Updates input_buffer and partition_buffers with the new record_batch.

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

§

impl<T> Instrument for T

§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided [Span], returning an Instrumented wrapper. Read more
§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
§

impl<T> PolicyExt for T
where T: ?Sized,

§

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns [Action::Follow] only if self and other return Action::Follow. Read more
§

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns [Action::Follow] if either self or other returns Action::Follow. Read more
Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

§

fn vzip(self) -> V

§

impl<T> WithSubscriber for T

§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a [WithDispatch] wrapper. Read more
§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a [WithDispatch] wrapper. Read more
§

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,

§

impl<T> ErasedDestructor for T
where T: 'static,