Expand description
EnforceDistribution optimizer rule inspects the physical plan with respect
to distribution requirements and adds RepartitionExecs to satisfy them
when necessary. If increasing parallelism is beneficial (and also desirable
according to the configuration), this rule increases partition counts in
the physical plan.
StructsΒ§
- Enforce
Distribution - The
EnforceDistributionrule ensures that distribution requirements are met. In doing so, this rule will increase the parallelism in the plan by introducing repartitioning operators to the physical plan. - Join
KeyPairs π - Repartition
Requirement πStatus - A struct to keep track of repartition requirements for each child node.
FunctionsΒ§
- add_
hash_ πon_ top - Adds a hash repartition operator:
- add_
merge_ πon_ top - Adds a
SortPreservingMergeExecor aCoalescePartitionsExecoperator on top of the given plan node to satisfy a single partition requirement while preserving ordering constraints. - add_
roundrobin_ πon_ top - Adds RoundRobin repartition operator to the plan increase parallelism.
- adjust_
input_ keys_ ordering - When the physical planner creates the Joins, the ordering of join keys is from the original query. That might not match with the output partitioning of the join nodeβs children A Top-Down process will use this method to adjust childrenβs output partitioning based on the parent key reordering requirements:
- ensure_
distribution - This function checks whether we need to add additional data exchange operators to satisfy distribution requirements. Since this function takes care of such requirements, we should avoid manually adding data exchange operators in other places.
- expected_
expr_ πpositions - Return the expected expressions positions. For example, the current expressions are [βcβ, βaβ, βaβ, bβ], the expected expressions are [βbβ, βcβ, βaβ, βaβ],
- extract_
join_ πkeys - get_
repartition_ πrequirement_ status - Calculates the
RepartitionRequirementStatusfor each children to generate consistent and sensible (in terms of performance) distribution requirements. As an example, a hash joinβs left (build) child might produce - new_
join_ πconditions - remove_
dist_ πchanging_ operators - Updates the physical plan inside
DistributionContextso that distribution changing operators are removed from the top. If they are necessary, they will be added in subsequent stages. - reorder_
aggregate_ keys - reorder_
current_ πjoin_ keys - Reorder the current join keys ordering based on either left partition or right partition
- reorder_
join_ keys_ to_ inputs - When the physical planner creates the Joins, the ordering of join keys is from the original query. That might not match with the output partitioning of the join nodeβs children This method will try to change the ordering of the join keys to match with the partitioning of the join nodesβ children. If it can not match with both sides, it will try to match with one, either the left side or the right side.
- reorder_
partitioned_ join_ keys - replace_
order_ preserving_ variants - Updates the
DistributionContextif preserving ordering while changing partitioning is not helpful or desirable. - shift_
right_ πrequired - try_
reorder π - update_
children π
Type AliasesΒ§
- Distribution
Context - Keeps track of distribution changing operators (like
RepartitionExec,SortPreservingMergeExec,CoalescePartitionsExec) and their ancestors. Using this information, we can optimize distribution of the plan if/when necessary. - Plan
With KeyRequirements - Keeps track of parent required key orderings.