pub struct SortPreservingMergeExec {
input: Arc<dyn ExecutionPlan>,
expr: LexOrdering,
metrics: ExecutionPlanMetricsSet,
fetch: Option<usize>,
cache: PlanProperties,
enable_round_robin_repartition: bool,
}Expand description
Sort preserving merge execution plan
§Overview
This operator implements a K-way merge. It is used to merge multiple sorted streams into a single sorted stream and is highly optimized.
§Inputs:
- A list of sort expressions
- An input plan, where each partition is sorted with respect to these sort expressions.
§Output:
- A single partition that is also sorted with respect to the expressions
§Diagram
┌─────────────────────────┐
│ ┌───┬───┬───┬───┐ │
│ │ A │ B │ C │ D │ ... │──┐
│ └───┴───┴───┴───┘ │ │
└─────────────────────────┘ │ ┌───────────────────┐ ┌───────────────────────────────┐
Stream 1 │ │ │ │ ┌───┬───╦═══╦───┬───╦═══╗ │
├─▶│SortPreservingMerge│───▶│ │ A │ B ║ B ║ C │ D ║ E ║ ... │
│ │ │ │ └───┴─▲─╩═══╩───┴───╩═══╝ │
┌─────────────────────────┐ │ └───────────────────┘ └─┬─────┴───────────────────────┘
│ ╔═══╦═══╗ │ │
│ ║ B ║ E ║ ... │──┘ │
│ ╚═══╩═══╝ │ Stable sort if `enable_round_robin_repartition=false`:
└─────────────────────────┘ the merged stream places equal rows from stream 1
Stream 2
Input Partitions Output Partition
(sorted) (sorted)§Error Handling
If any of the input partitions return an error, the error is propagated to the output and inputs are not polled again.
Fields§
§input: Arc<dyn ExecutionPlan>§expr: LexOrdering§metrics: ExecutionPlanMetricsSet§fetch: Option<usize>§cache: PlanProperties§enable_round_robin_repartition: boolImplementations§
Source§impl SortPreservingMergeExec
impl SortPreservingMergeExec
Sourcepub fn new(
expr: LexOrdering,
input: Arc<dyn ExecutionPlan>,
) -> SortPreservingMergeExec
pub fn new( expr: LexOrdering, input: Arc<dyn ExecutionPlan>, ) -> SortPreservingMergeExec
Create a new sort execution plan
Sourcepub fn with_fetch(self, fetch: Option<usize>) -> SortPreservingMergeExec
pub fn with_fetch(self, fetch: Option<usize>) -> SortPreservingMergeExec
Sets the number of rows to fetch
Sourcepub fn with_round_robin_repartition(
self,
enable_round_robin_repartition: bool,
) -> SortPreservingMergeExec
pub fn with_round_robin_repartition( self, enable_round_robin_repartition: bool, ) -> SortPreservingMergeExec
Sets the selection strategy of tied winners of the loser tree algorithm
If true (the default) equal output rows are placed in the merged stream in round robin fashion. This approach consumes input streams at more even rates when there are many rows with the same sort key.
If false, equal output rows are always placed in the merged stream in the order of the inputs, resulting in potentially slower execution but a stable output order.
Sourcepub fn input(&self) -> &Arc<dyn ExecutionPlan>
pub fn input(&self) -> &Arc<dyn ExecutionPlan>
Input schema
Sourcepub fn expr(&self) -> &LexOrdering
pub fn expr(&self) -> &LexOrdering
Sort expressions
Trait Implementations§
Source§impl Clone for SortPreservingMergeExec
impl Clone for SortPreservingMergeExec
Source§fn clone(&self) -> SortPreservingMergeExec
fn clone(&self) -> SortPreservingMergeExec
1.0.0 · Source§fn clone_from(&mut self, source: &Self)
fn clone_from(&mut self, source: &Self)
source. Read moreSource§impl Debug for SortPreservingMergeExec
impl Debug for SortPreservingMergeExec
Source§impl DisplayAs for SortPreservingMergeExec
impl DisplayAs for SortPreservingMergeExec
Source§impl ExecutionPlan for SortPreservingMergeExec
impl ExecutionPlan for SortPreservingMergeExec
Source§fn as_any(&self) -> &(dyn Any + 'static)
fn as_any(&self) -> &(dyn Any + 'static)
Return a reference to Any that can be used for downcasting
Source§fn with_fetch(&self, limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>
fn with_fetch(&self, limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>
Sets the number of rows to fetch
Source§fn try_swapping_with_projection(
&self,
projection: &ProjectionExec,
) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
fn try_swapping_with_projection( &self, projection: &ProjectionExec, ) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
Tries to swap the projection with its input SortPreservingMergeExec.
If this is possible, it returns the new SortPreservingMergeExec whose
child is a projection. Otherwise, it returns None.
Source§fn name(&self) -> &'static str
fn name(&self) -> &'static str
Source§fn properties(&self) -> &PlanProperties
fn properties(&self) -> &PlanProperties
ExecutionPlan, such as output
ordering(s), partitioning information etc. Read moreSource§fn fetch(&self) -> Option<usize>
fn fetch(&self) -> Option<usize>
None means there is no fetch.Source§fn required_input_distribution(&self) -> Vec<Distribution>
fn required_input_distribution(&self) -> Vec<Distribution>
ExecutionPlan, By default it’s [Distribution::UnspecifiedDistribution] for each child,Source§fn benefits_from_input_partitioning(&self) -> Vec<bool>
fn benefits_from_input_partitioning(&self) -> Vec<bool>
ExecutionPlan benefits from increased
parallelization at its input for each child. Read moreSource§fn required_input_ordering(&self) -> Vec<Option<OrderingRequirements>>
fn required_input_ordering(&self) -> Vec<Option<OrderingRequirements>>
ExecutionPlan. Read moreSource§fn maintains_input_order(&self) -> Vec<bool>
fn maintains_input_order(&self) -> Vec<bool>
false if this ExecutionPlan’s implementation may reorder
rows within or between partitions. Read moreSource§fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>
fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>
ExecutionPlans that act as inputs to this plan.
The returned list will be empty for leaf nodes such as scans, will contain
a single value for unary nodes, or two values for binary nodes (such as
joins).Source§fn with_new_children(
self: Arc<SortPreservingMergeExec>,
children: Vec<Arc<dyn ExecutionPlan>>,
) -> Result<Arc<dyn ExecutionPlan>, DataFusionError>
fn with_new_children( self: Arc<SortPreservingMergeExec>, children: Vec<Arc<dyn ExecutionPlan>>, ) -> Result<Arc<dyn ExecutionPlan>, DataFusionError>
ExecutionPlan where all existing children were replaced
by the children, in orderSource§fn execute(
&self,
partition: usize,
context: Arc<TaskContext>,
) -> Result<Pin<Box<dyn RecordBatchStream<Item = Result<RecordBatch, DataFusionError>> + Send>>, DataFusionError>
fn execute( &self, partition: usize, context: Arc<TaskContext>, ) -> Result<Pin<Box<dyn RecordBatchStream<Item = Result<RecordBatch, DataFusionError>> + Send>>, DataFusionError>
Source§fn metrics(&self) -> Option<MetricsSet>
fn metrics(&self) -> Option<MetricsSet>
Metrics for this
ExecutionPlan. If no Metrics are available, return None. Read moreSource§fn statistics(&self) -> Result<Statistics, DataFusionError>
fn statistics(&self) -> Result<Statistics, DataFusionError>
partition_statistics method insteadExecutionPlan node. If statistics are not
available, should return Statistics::new_unknown (the default), not
an error. Read moreSource§fn partition_statistics(
&self,
_partition: Option<usize>,
) -> Result<Statistics, DataFusionError>
fn partition_statistics( &self, _partition: Option<usize>, ) -> Result<Statistics, DataFusionError>
ExecutionPlan node.
If statistics are not available, should return Statistics::new_unknown
(the default), not an error.
If partition is None, it returns statistics for the entire plan.Source§fn supports_limit_pushdown(&self) -> bool
fn supports_limit_pushdown(&self) -> bool
Source§fn static_name() -> &'static strwhere
Self: Sized,
fn static_name() -> &'static strwhere
Self: Sized,
name but can be called without an instance.Source§fn check_invariants(&self, check: InvariantLevel) -> Result<(), DataFusionError>
fn check_invariants(&self, check: InvariantLevel) -> Result<(), DataFusionError>
Source§fn reset_state(
self: Arc<Self>,
) -> Result<Arc<dyn ExecutionPlan>, DataFusionError>
fn reset_state( self: Arc<Self>, ) -> Result<Arc<dyn ExecutionPlan>, DataFusionError>
ExecutionPlan. Read moreSource§fn repartitioned(
&self,
_target_partitions: usize,
_config: &ConfigOptions,
) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
fn repartitioned( &self, _target_partitions: usize, _config: &ConfigOptions, ) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
ExecutionPlan to
produce target_partitions partitions. Read moreSource§fn cardinality_effect(&self) -> CardinalityEffect
fn cardinality_effect(&self) -> CardinalityEffect
Source§fn gather_filters_for_pushdown(
&self,
_phase: FilterPushdownPhase,
parent_filters: Vec<Arc<dyn PhysicalExpr>>,
_config: &ConfigOptions,
) -> Result<FilterDescription, DataFusionError>
fn gather_filters_for_pushdown( &self, _phase: FilterPushdownPhase, parent_filters: Vec<Arc<dyn PhysicalExpr>>, _config: &ConfigOptions, ) -> Result<FilterDescription, DataFusionError>
ExecutionPlan::gather_filters_for_pushdown: Read moreSource§fn handle_child_pushdown_result(
&self,
_phase: FilterPushdownPhase,
child_pushdown_result: ChildPushdownResult,
_config: &ConfigOptions,
) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>, DataFusionError>
fn handle_child_pushdown_result( &self, _phase: FilterPushdownPhase, child_pushdown_result: ChildPushdownResult, _config: &ConfigOptions, ) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>, DataFusionError>
ExecutionPlan::gather_filters_for_pushdown.
It allows the current node to process the results of filter pushdown from
its children, deciding whether to absorb filters, modify the plan, or pass
filters back up to its parent. Read moreAuto Trait Implementations§
impl Freeze for SortPreservingMergeExec
impl !RefUnwindSafe for SortPreservingMergeExec
impl Send for SortPreservingMergeExec
impl Sync for SortPreservingMergeExec
impl Unpin for SortPreservingMergeExec
impl !UnwindSafe for SortPreservingMergeExec
Blanket Implementations§
Source§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
Source§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
Source§impl<T> CloneToUninit for Twhere
T: Clone,
impl<T> CloneToUninit for Twhere
T: Clone,
§impl<T> Instrument for T
impl<T> Instrument for T
§fn instrument(self, span: Span) -> Instrumented<Self>
fn instrument(self, span: Span) -> Instrumented<Self>
§fn in_current_span(self) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
Source§impl<T> IntoEither for T
impl<T> IntoEither for T
Source§fn into_either(self, into_left: bool) -> Either<Self, Self>
fn into_either(self, into_left: bool) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read more