EquivalenceProperties

Struct EquivalenceProperties 

Source
pub struct EquivalenceProperties {
    eq_group: EquivalenceGroup,
    oeq_class: OrderingEquivalenceClass,
    oeq_cache: OrderingEquivalenceCache,
    constraints: Constraints,
    schema: SchemaRef,
}
Expand description

EquivalenceProperties stores information about the output of a plan node that can be used to optimize the plan. Currently, it keeps track of:

  • Sort expressions (orderings),
  • Equivalent expressions; i.e. expressions known to have the same value.
  • Constants expressions; i.e. expressions known to contain a single constant value.

Please see the Using Ordering for Better Plans blog for more details.

§Example equivalent sort expressions

Consider table below:

┌-------┐
| a | b |
|---|---|
| 1 | 9 |
| 2 | 8 |
| 3 | 7 |
| 5 | 5 |
└---┴---┘

In this case, both a ASC and b DESC can describe the table ordering. EquivalenceProperties tracks these different valid sort expressions and treat a ASC and b DESC on an equal footing. For example, if the query specifies the output sorted by EITHER a ASC or b DESC, the sort can be avoided.

§Example equivalent expressions

Similarly, consider the table below:

┌-------┐
| a | b |
|---|---|
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 5 | 5 |
└---┴---┘

In this case, columns a and b always have the same value. With this information, Datafusion can optimize various operations. For example, if the partition requirement is Hash(a) and output partitioning is Hash(b), then DataFusion avoids repartitioning the data as the existing partitioning satisfies the requirement.

§Code Example

use datafusion_physical_expr_common::sort_expr::{LexOrdering, PhysicalSortExpr};
// This object represents data that is sorted by a ASC, c DESC
// with a single constant value of b
let mut eq_properties = EquivalenceProperties::new(schema);
eq_properties.add_constants(vec![ConstExpr::from(col_b)]);
eq_properties.add_ordering([
    PhysicalSortExpr::new_default(col_a).asc(),
    PhysicalSortExpr::new_default(col_c).desc(),
]);

assert_eq!(
    eq_properties.to_string(),
    "order: [[a@0 ASC, c@2 DESC]], eq: [{members: [b@1], constant: (heterogeneous)}]"
);

Fields§

§eq_group: EquivalenceGroup

Distinct equivalence classes (i.e. expressions with the same value).

§oeq_class: OrderingEquivalenceClass

Equivalent sort expressions (i.e. those define the same ordering).

§oeq_cache: OrderingEquivalenceCache

Cache storing equivalent sort expressions in normal form (i.e. without constants/duplicates and in standard form) and a map associating leading terms with full sort expressions.

§constraints: Constraints

Table constraints that factor in equivalence calculations.

§schema: SchemaRef

Schema associated with this object.

Implementations§

Source§

impl EquivalenceProperties

Source

pub fn new(schema: SchemaRef) -> Self

Creates an empty EquivalenceProperties object.

Source

pub fn with_constraints(self, constraints: Constraints) -> Self

Adds constraints to the properties.

Source

pub fn new_with_orderings( schema: SchemaRef, orderings: impl IntoIterator<Item = impl IntoIterator<Item = PhysicalSortExpr>>, ) -> Self

Creates a new EquivalenceProperties object with the given orderings.

Source

pub fn schema(&self) -> &SchemaRef

Returns the associated schema.

Source

pub fn oeq_class(&self) -> &OrderingEquivalenceClass

Returns a reference to the ordering equivalence class within.

Source

pub fn eq_group(&self) -> &EquivalenceGroup

Returns a reference to the equivalence group within.

Source

pub fn constraints(&self) -> &Constraints

Returns a reference to the constraints within.

Source

pub fn constants(&self) -> Vec<ConstExpr>

Returns all the known constants expressions.

Source

pub fn output_ordering(&self) -> Option<LexOrdering>

Returns the output ordering of the properties.

Source

pub fn extend(self, other: Self) -> Result<Self>

Extends this EquivalenceProperties with the other object.

Source

pub fn clear_orderings(&mut self)

Clears (empties) the ordering equivalence class within this object. Call this method when existing orderings are invalidated.

Source

pub fn clear_per_partition_constants(&mut self)

Removes constant expressions that may change across partitions. This method should be used when merging data from different partitions.

Source

pub fn add_orderings( &mut self, orderings: impl IntoIterator<Item = impl IntoIterator<Item = PhysicalSortExpr>>, )

Adds new orderings into the existing ordering equivalence class.

Source

pub fn add_ordering( &mut self, ordering: impl IntoIterator<Item = PhysicalSortExpr>, )

Adds a single ordering to the existing ordering equivalence class.

Source

fn update_oeq_cache(&mut self) -> Result<()>

Source

pub fn add_equivalence_group( &mut self, other_eq_group: EquivalenceGroup, ) -> Result<()>

Incorporates the given equivalence group to into the existing equivalence group within.

Source

pub fn normalized_oeq_class(&self) -> OrderingEquivalenceClass

Returns the ordering equivalence class within in normal form. Normalization standardizes expressions according to the equivalence group within, and removes constants/duplicates.

Source

pub fn add_equal_conditions( &mut self, left: Arc<dyn PhysicalExpr>, right: Arc<dyn PhysicalExpr>, ) -> Result<()>

Adds a new equality condition into the existing equivalence group. If the given equality defines a new equivalence class, adds this new equivalence class to the equivalence group.

Source

pub fn add_constants( &mut self, constants: impl IntoIterator<Item = ConstExpr>, ) -> Result<()>

Track/register physical expressions with constant values.

Source

fn discover_new_orderings( &mut self, normal_expr: Arc<dyn PhysicalExpr>, ) -> Result<()>

Discover new valid orderings in light of a new equality. Accepts a single argument (expr) which is used to determine the orderings to update. When constants or equivalence classes change, there may be new orderings that can be discovered with the new equivalence properties. For a discussion, see: https://github.com/apache/datafusion/issues/9812

Source

pub fn reorder( &mut self, ordering: impl IntoIterator<Item = PhysicalSortExpr>, ) -> Result<bool>

Updates the ordering equivalence class within assuming that the table is re-sorted according to the argument ordering, and returns whether this operation resulted in any change. Note that equivalence classes (and constants) do not change as they are unaffected by a re-sort. If the given ordering is already satisfied, the function does nothing.

Source

pub fn normalize_sort_exprs( &self, sort_exprs: impl IntoIterator<Item = PhysicalSortExpr>, ) -> Option<LexOrdering>

Normalizes the given sort expressions (i.e. sort_exprs) using the equivalence group within. Returns a LexOrdering instance if the expressions define a proper lexicographical ordering. For more details, see EquivalenceGroup::normalize_sort_exprs.

Source

pub fn normalize_sort_requirements( &self, sort_reqs: impl IntoIterator<Item = PhysicalSortRequirement>, ) -> Option<LexRequirement>

Normalizes the given sort requirements (i.e. sort_reqs) using the equivalence group within. Returns a LexRequirement instance if the expressions define a proper lexicographical requirement. For more details, see EquivalenceGroup::normalize_sort_exprs.

Source

pub fn ordering_satisfy( &self, given: impl IntoIterator<Item = PhysicalSortExpr>, ) -> Result<bool>

Iteratively checks whether the given ordering is satisfied by any of the existing orderings. See Self::ordering_satisfy_requirement for more details and examples.

Source

pub fn ordering_satisfy_requirement( &self, given: impl IntoIterator<Item = PhysicalSortRequirement>, ) -> Result<bool>

Iteratively checks whether the given sort requirement is satisfied by any of the existing orderings.

§Example Scenarios

In these scenarios, assume that all expressions share the same sort properties.

§Case 1: Sort Requirement [a, c]

Existing orderings: [[a, b, c], [a, d]], constants: []

  1. The function first checks the leading requirement a, which is satisfied by [a, b, c].first().
  2. a is added as a constant for the next iteration.
  3. Normal orderings become [[b, c], [d]].
  4. The function fails for c in the second iteration, as neither [b, c] nor [d] satisfies c.
§Case 2: Sort Requirement [a, d]

Existing orderings: [[a, b, c], [a, d]], constants: []

  1. The function first checks the leading requirement a, which is satisfied by [a, b, c].first().
  2. a is added as a constant for the next iteration.
  3. Normal orderings become [[b, c], [d]].
  4. The function returns true as [d] satisfies d.
Source

fn common_sort_prefix_length( &self, normal_ordering: &LexOrdering, ) -> Result<usize>

Returns the number of consecutive sort expressions (starting from the left) that are satisfied by the existing ordering.

Source

pub fn extract_common_sort_prefix( &self, ordering: LexOrdering, ) -> Result<(Vec<PhysicalSortExpr>, bool)>

Determines the longest normal prefix of ordering satisfied by the existing ordering. Returns that prefix as a new LexOrdering, and a boolean indicating whether all the sort expressions are satisfied.

Source

fn satisfied_by_constraints_ordering( &self, normal_exprs: &[PhysicalSortExpr], ) -> bool

Checks if the sort expressions are satisfied by any of the table constraints (primary key or unique). Returns true if any constraint fully satisfies the expressions (i.e. constraint indices form a valid prefix of an existing ordering that matches the expressions). For unique constraints, also verifies nullable columns.

Source

fn satisfied_by_constraints( &self, normal_reqs: &[PhysicalSortRequirement], ) -> bool

Checks if the sort requirements are satisfied by any of the table constraints (primary key or unique). Returns true if any constraint fully satisfies the requirements (i.e. constraint indices form a valid prefix of an existing ordering that matches the requirements). For unique constraints, also verifies nullable columns.

Source

pub fn requirements_compatible( &self, given: LexRequirement, reference: LexRequirement, ) -> bool

Checks whether the given sort requirements are equal or more specific than the reference sort requirements.

Source

fn substitute_oeq_class( schema: &SchemaRef, mapping: &ProjectionMapping, oeq_class: OrderingEquivalenceClass, ) -> OrderingEquivalenceClass

Modify existing orderings by substituting sort expressions with appropriate targets from the projection mapping. We substitute a sort expression when its physical expression has a one-to-one functional relationship with a target expression in the mapping.

After substitution, we may generate more than one LexOrdering for each existing equivalent ordering. For example, [a ASC, b ASC] will turn into [CAST(a) ASC, b ASC] and [a ASC, b ASC] when applying projection expressions a, b, CAST(a).

TODO: Handle all scenarios that allow substitution; e.g. when x is sorted, atan(x + 1000) should also be substituted. For now, we only consider single-column CAST expressions.

Source

pub fn project_expr( &self, expr: &Arc<dyn PhysicalExpr>, mapping: &ProjectionMapping, ) -> Option<Arc<dyn PhysicalExpr>>

Projects argument expr according to the projection described by mapping, taking equivalences into account.

For example, assume that columns a and c are always equal, and that the projection described by mapping encodes the following:

a -> a1
b -> b1

Then, this function projects a + b to Some(a1 + b1), c + b to Some(a1 + b1) and d to None, meaning that it is not projectable.

Source

pub fn project_expressions<'a>( &'a self, expressions: impl IntoIterator<Item = &'a Arc<dyn PhysicalExpr>> + 'a, mapping: &'a ProjectionMapping, ) -> impl Iterator<Item = Option<Arc<dyn PhysicalExpr>>> + 'a

Projects the given expressions according to the projection described by mapping, taking equivalences into account. This function is similar to Self::project_expr, but projects multiple expressions at once more efficiently than calling project_expr for each expression.

Source

fn construct_dependency_map( &self, oeq_class: OrderingEquivalenceClass, mapping: &ProjectionMapping, ) -> DependencyMap

Constructs a dependency map based on existing orderings referred to in the projection.

This function analyzes the orderings in the normalized order-equivalence class and builds a dependency map. The dependency map captures relationships between expressions within the orderings, helping to identify dependencies and construct valid projected orderings during projection operations.

§Parameters
  • mapping: A reference to the ProjectionMapping that defines the relationship between source and target expressions.
§Returns

A DependencyMap representing the dependency map, where each [DependencyNode] contains dependencies for the key PhysicalSortExpr.

§Example

Assume we have two equivalent orderings: [a ASC, b ASC] and [a ASC, c ASC], and the projection mapping is [a -> a_new, b -> b_new, b + c -> b + c]. Then, the dependency map will be:

a ASC: Node {Some(a_new ASC), HashSet{}}
b ASC: Node {Some(b_new ASC), HashSet{a ASC}}
c ASC: Node {None, HashSet{a ASC}}
Source

fn insert_to_dependency_map( &self, mapping: &ProjectionMapping, sort_expr: PhysicalSortExpr, dependency: Option<PhysicalSortExpr>, map: &mut DependencyMap, ) -> bool

Projects the sort expression according to the projection mapping and inserts it into the dependency map with the given dependency. Returns a boolean flag indicating whether the given expression is projectable.

Source

fn normalize_mapping(&self, mapping: &ProjectionMapping) -> ProjectionMapping

Returns a new ProjectionMapping where source expressions are in normal form. Normalization ensures that source expressions are transformed into a consistent representation, which is beneficial for algorithms that rely on exact equalities, as it allows for more precise and reliable comparisons.

§Parameters
  • mapping: A reference to the original ProjectionMapping to normalize.
§Returns

A new ProjectionMapping with source expressions in normal form.

Source

fn projected_orderings( &self, mapping: &ProjectionMapping, oeq_class: OrderingEquivalenceClass, ) -> Vec<LexOrdering>

Computes projected orderings based on a given projection mapping.

This function takes a ProjectionMapping and computes the possible orderings for the projected expressions. It considers dependencies between expressions and generates valid orderings according to the specified sort properties.

§Parameters
  • mapping: A reference to the ProjectionMapping that defines the relationship between source and target expressions.
  • oeq_class: The OrderingEquivalenceClass containing the orderings to project.
§Returns

A vector of all valid (but not in normal form) orderings after projection.

Source

fn projected_constraints( &self, mapping: &ProjectionMapping, ) -> Option<Constraints>

Projects constraints according to the given projection mapping.

This function takes a projection mapping and extracts column indices of target columns. It then projects the constraints to only include relationships between columns that exist in the projected output.

§Parameters
  • mapping - A reference to the ProjectionMapping that defines the projection operation.
§Returns

Returns an optional Constraints object containing only the constraints that are valid for the projected columns (if any exists).

Source

pub fn project( &self, mapping: &ProjectionMapping, output_schema: SchemaRef, ) -> Self

Projects the equivalences within according to mapping and output_schema.

Source

pub fn find_longest_permutation( &self, exprs: &[Arc<dyn PhysicalExpr>], ) -> Result<(Vec<PhysicalSortExpr>, Vec<usize>)>

Returns the longest (potentially partial) permutation satisfying the existing ordering. For example, if we have the equivalent orderings [a ASC, b ASC] and [c DESC], with exprs containing [c, b, a, d], then this function returns ([a ASC, b ASC, c DESC], [2, 1, 0]). This means that the specification [a ASC, b ASC, c DESC] is satisfied by the existing ordering, and [a, b, c] resides at indices: 2, 1, 0 inside the argument exprs (respectively). For the mathematical definition of “partial permutation”, see:

https://en.wikipedia.org/wiki/Permutation#k-permutations_of_n

Source

pub fn is_expr_constant( &self, expr: &Arc<dyn PhysicalExpr>, ) -> Option<AcrossPartitions>

This function determines whether the provided expression is constant based on the known constants. For example, if columns a and b are constant, then expressions a, b and a + b will all return true whereas expression c will return false.

§Parameters
  • expr: A reference to a Arc<dyn PhysicalExpr> representing the expression to be checked.
§Returns

Returns a Some value if the expression is constant according to equivalence group, and None otherwise. The Some variant contains an AcrossPartitions value indicating whether the expression is constant across partitions, and its actual value (if available).

Source

pub fn get_expr_properties(&self, expr: Arc<dyn PhysicalExpr>) -> ExprProperties

Retrieves the properties for a given physical expression.

This function constructs an [ExprProperties] object for the given expression, which encapsulates information about the expression’s properties, including its [SortProperties] and [Interval].

§Parameters
  • expr: An Arc<dyn PhysicalExpr> representing the physical expression for which ordering information is sought.
§Returns

Returns an [ExprProperties] object containing the ordering and range information for the given expression.

Source

pub fn with_new_schema(self, schema: SchemaRef) -> Result<Self>

Transforms this EquivalenceProperties by mapping columns in the original schema to columns in the new schema by index.

Trait Implementations§

Source§

impl Clone for EquivalenceProperties

Source§

fn clone(&self) -> EquivalenceProperties

Returns a duplicate of the value. Read more
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl Debug for EquivalenceProperties

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl Display for EquivalenceProperties

More readable display version of the EquivalenceProperties.

Format:

order: [[b@1 ASC NULLS LAST]], eq: [{members: [a@0], constant: (heterogeneous)}]
Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl From<EquivalenceProperties> for OrderingEquivalenceClass

Source§

fn from(eq_properties: EquivalenceProperties) -> Self

Converts to this type from the input type.

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

§

impl<T> Instrument for T

§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided [Span], returning an Instrumented wrapper. Read more
§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
§

impl<T> PolicyExt for T
where T: ?Sized,

§

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns [Action::Follow] only if self and other return Action::Follow. Read more
§

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns [Action::Follow] if either self or other returns Action::Follow. Read more
Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T> ToString for T
where T: Display + ?Sized,

Source§

fn to_string(&self) -> String

Converts the given value to a String. Read more
§

impl<T> ToStringFallible for T
where T: Display,

§

fn try_to_string(&self) -> Result<String, TryReserveError>

ToString::to_string, but without panic on OOM.

Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

§

fn vzip(self) -> V

§

impl<T> WithSubscriber for T

§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a [WithDispatch] wrapper. Read more
§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a [WithDispatch] wrapper. Read more
§

impl<T> ErasedDestructor for T
where T: 'static,