pub struct SortPreservingMergeExec {
input: Arc<dyn ExecutionPlan>,
expr: LexOrdering,
metrics: ExecutionPlanMetricsSet,
fetch: Option<usize>,
cache: PlanProperties,
enable_round_robin_repartition: bool,
}Expand description
Sort preserving merge execution plan
§Overview
This operator implements a K-way merge. It is used to merge multiple sorted streams into a single sorted stream and is highly optimized.
§Inputs:
- A list of sort expressions
- An input plan, where each partition is sorted with respect to these sort expressions.
§Output:
- A single partition that is also sorted with respect to the expressions
§Diagram
┌─────────────────────────┐
│ ┌───┬───┬───┬───┐ │
│ │ A │ B │ C │ D │ ... │──┐
│ └───┴───┴───┴───┘ │ │
└─────────────────────────┘ │ ┌───────────────────┐ ┌───────────────────────────────┐
Stream 1 │ │ │ │ ┌───┬───╦═══╦───┬───╦═══╗ │
├─▶│SortPreservingMerge│───▶│ │ A │ B ║ B ║ C │ D ║ E ║ ... │
│ │ │ │ └───┴─▲─╩═══╩───┴───╩═══╝ │
┌─────────────────────────┐ │ └───────────────────┘ └─┬─────┴───────────────────────┘
│ ╔═══╦═══╗ │ │
│ ║ B ║ E ║ ... │──┘ │
│ ╚═══╩═══╝ │ Stable sort if `enable_round_robin_repartition=false`:
└─────────────────────────┘ the merged stream places equal rows from stream 1
Stream 2
Input Partitions Output Partition
(sorted) (sorted)§Error Handling
If any of the input partitions return an error, the error is propagated to the output and inputs are not polled again.
Fields§
§input: Arc<dyn ExecutionPlan>Input plan with sorted partitions
expr: LexOrderingSort expressions
metrics: ExecutionPlanMetricsSetExecution metrics
fetch: Option<usize>Optional number of rows to fetch. Stops producing rows after this fetch
cache: PlanPropertiesCache holding plan properties like equivalences, output partitioning etc.
enable_round_robin_repartition: boolUse round-robin selection of tied winners of loser tree
See Self::with_round_robin_repartition for more information.
Implementations§
Source§impl SortPreservingMergeExec
impl SortPreservingMergeExec
Sourcepub fn new(expr: LexOrdering, input: Arc<dyn ExecutionPlan>) -> Self
pub fn new(expr: LexOrdering, input: Arc<dyn ExecutionPlan>) -> Self
Create a new sort execution plan
Sourcepub fn with_fetch(self, fetch: Option<usize>) -> Self
pub fn with_fetch(self, fetch: Option<usize>) -> Self
Sets the number of rows to fetch
Sourcepub fn with_round_robin_repartition(
self,
enable_round_robin_repartition: bool,
) -> Self
pub fn with_round_robin_repartition( self, enable_round_robin_repartition: bool, ) -> Self
Sets the selection strategy of tied winners of the loser tree algorithm
If true (the default) equal output rows are placed in the merged stream in round robin fashion. This approach consumes input streams at more even rates when there are many rows with the same sort key.
If false, equal output rows are always placed in the merged stream in the order of the inputs, resulting in potentially slower execution but a stable output order.
Sourcepub fn input(&self) -> &Arc<dyn ExecutionPlan>
pub fn input(&self) -> &Arc<dyn ExecutionPlan>
Input schema
Sourcepub fn expr(&self) -> &LexOrdering
pub fn expr(&self) -> &LexOrdering
Sort expressions
Sourcefn compute_properties(
input: &Arc<dyn ExecutionPlan>,
ordering: LexOrdering,
) -> PlanProperties
fn compute_properties( input: &Arc<dyn ExecutionPlan>, ordering: LexOrdering, ) -> PlanProperties
Creates the cache object that stores the plan properties such as schema, equivalence properties, ordering, partitioning, etc.
Trait Implementations§
Source§impl Clone for SortPreservingMergeExec
impl Clone for SortPreservingMergeExec
Source§fn clone(&self) -> SortPreservingMergeExec
fn clone(&self) -> SortPreservingMergeExec
1.0.0 · Source§fn clone_from(&mut self, source: &Self)
fn clone_from(&mut self, source: &Self)
source. Read moreSource§impl Debug for SortPreservingMergeExec
impl Debug for SortPreservingMergeExec
Source§impl DisplayAs for SortPreservingMergeExec
impl DisplayAs for SortPreservingMergeExec
Source§impl ExecutionPlan for SortPreservingMergeExec
impl ExecutionPlan for SortPreservingMergeExec
Source§fn with_fetch(&self, limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>
fn with_fetch(&self, limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>
Sets the number of rows to fetch
Source§fn try_swapping_with_projection(
&self,
projection: &ProjectionExec,
) -> Result<Option<Arc<dyn ExecutionPlan>>>
fn try_swapping_with_projection( &self, projection: &ProjectionExec, ) -> Result<Option<Arc<dyn ExecutionPlan>>>
Tries to swap the projection with its input SortPreservingMergeExec.
If this is possible, it returns the new SortPreservingMergeExec whose
child is a projection. Otherwise, it returns None.
Source§fn name(&self) -> &'static str
fn name(&self) -> &'static str
Source§fn properties(&self) -> &PlanProperties
fn properties(&self) -> &PlanProperties
ExecutionPlan, such as output
ordering(s), partitioning information etc. Read moreSource§fn fetch(&self) -> Option<usize>
fn fetch(&self) -> Option<usize>
None means there is no fetch.Source§fn required_input_distribution(&self) -> Vec<Distribution>
fn required_input_distribution(&self) -> Vec<Distribution>
ExecutionPlan, By default it’s [Distribution::UnspecifiedDistribution] for each child,Source§fn benefits_from_input_partitioning(&self) -> Vec<bool>
fn benefits_from_input_partitioning(&self) -> Vec<bool>
ExecutionPlan benefits from increased
parallelization at its input for each child. Read moreSource§fn required_input_ordering(&self) -> Vec<Option<OrderingRequirements>>
fn required_input_ordering(&self) -> Vec<Option<OrderingRequirements>>
ExecutionPlan. Read moreSource§fn maintains_input_order(&self) -> Vec<bool>
fn maintains_input_order(&self) -> Vec<bool>
false if this ExecutionPlan’s implementation may reorder
rows within or between partitions. Read moreSource§fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>
fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>
ExecutionPlans that act as inputs to this plan.
The returned list will be empty for leaf nodes such as scans, will contain
a single value for unary nodes, or two values for binary nodes (such as
joins).Source§fn with_new_children(
self: Arc<Self>,
children: Vec<Arc<dyn ExecutionPlan>>,
) -> Result<Arc<dyn ExecutionPlan>>
fn with_new_children( self: Arc<Self>, children: Vec<Arc<dyn ExecutionPlan>>, ) -> Result<Arc<dyn ExecutionPlan>>
ExecutionPlan where all existing children were replaced
by the children, in orderSource§fn execute(
&self,
partition: usize,
context: Arc<TaskContext>,
) -> Result<SendableRecordBatchStream>
fn execute( &self, partition: usize, context: Arc<TaskContext>, ) -> Result<SendableRecordBatchStream>
Source§fn metrics(&self) -> Option<MetricsSet>
fn metrics(&self) -> Option<MetricsSet>
Metrics for this
ExecutionPlan. If no Metrics are available, return None. Read moreSource§fn statistics(&self) -> Result<Statistics>
fn statistics(&self) -> Result<Statistics>
partition_statistics method insteadExecutionPlan node. If statistics are not
available, should return Statistics::new_unknown (the default), not
an error. Read moreSource§fn partition_statistics(&self, _partition: Option<usize>) -> Result<Statistics>
fn partition_statistics(&self, _partition: Option<usize>) -> Result<Statistics>
ExecutionPlan node.
If statistics are not available, should return Statistics::new_unknown
(the default), not an error.
If partition is None, it returns statistics for the entire plan.Source§fn supports_limit_pushdown(&self) -> bool
fn supports_limit_pushdown(&self) -> bool
Source§fn static_name() -> &'static strwhere
Self: Sized,
fn static_name() -> &'static strwhere
Self: Sized,
name but can be called without an instance.Source§fn check_invariants(&self, check: InvariantLevel) -> Result<()>
fn check_invariants(&self, check: InvariantLevel) -> Result<()>
Source§fn reset_state(self: Arc<Self>) -> Result<Arc<dyn ExecutionPlan>>
fn reset_state(self: Arc<Self>) -> Result<Arc<dyn ExecutionPlan>>
ExecutionPlan. Read moreSource§fn repartitioned(
&self,
_target_partitions: usize,
_config: &ConfigOptions,
) -> Result<Option<Arc<dyn ExecutionPlan>>>
fn repartitioned( &self, _target_partitions: usize, _config: &ConfigOptions, ) -> Result<Option<Arc<dyn ExecutionPlan>>>
ExecutionPlan to
produce target_partitions partitions. Read moreSource§fn cardinality_effect(&self) -> CardinalityEffect
fn cardinality_effect(&self) -> CardinalityEffect
Source§fn gather_filters_for_pushdown(
&self,
_phase: FilterPushdownPhase,
parent_filters: Vec<Arc<dyn PhysicalExpr>>,
_config: &ConfigOptions,
) -> Result<FilterDescription>
fn gather_filters_for_pushdown( &self, _phase: FilterPushdownPhase, parent_filters: Vec<Arc<dyn PhysicalExpr>>, _config: &ConfigOptions, ) -> Result<FilterDescription>
ExecutionPlan::gather_filters_for_pushdown: Read moreSource§fn handle_child_pushdown_result(
&self,
_phase: FilterPushdownPhase,
child_pushdown_result: ChildPushdownResult,
_config: &ConfigOptions,
) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>>
fn handle_child_pushdown_result( &self, _phase: FilterPushdownPhase, child_pushdown_result: ChildPushdownResult, _config: &ConfigOptions, ) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>>
ExecutionPlan::gather_filters_for_pushdown.
It allows the current node to process the results of filter pushdown from
its children, deciding whether to absorb filters, modify the plan, or pass
filters back up to its parent. Read moreAuto Trait Implementations§
impl Freeze for SortPreservingMergeExec
impl !RefUnwindSafe for SortPreservingMergeExec
impl Send for SortPreservingMergeExec
impl Sync for SortPreservingMergeExec
impl Unpin for SortPreservingMergeExec
impl !UnwindSafe for SortPreservingMergeExec
Blanket Implementations§
Source§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
Source§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
Source§impl<T> CloneToUninit for Twhere
T: Clone,
impl<T> CloneToUninit for Twhere
T: Clone,
§impl<T> Instrument for T
impl<T> Instrument for T
§fn instrument(self, span: Span) -> Instrumented<Self>
fn instrument(self, span: Span) -> Instrumented<Self>
§fn in_current_span(self) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
Source§impl<T> IntoEither for T
impl<T> IntoEither for T
Source§fn into_either(self, into_left: bool) -> Either<Self, Self>
fn into_either(self, into_left: bool) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read more