Join-related functionality used by both logical and physical plans
Re-exports
Structs
- BatchSplitter 🔒 - Splits large batches into smaller batches with a maximum number of rows.
- BuildProbeJoinMetrics 🔒 - Metrics for build & probe joins
- ColumnIndex - Information about the index and placement (left or right) of the columns
- JoinFilter - Filter applied before join output. Fields are crate-public to allow downstream implementations to experiment with custom joins.
- NoopBatchTransformer 🔒 - A batch transformer that does nothing.
- OnceAsync 🔒 - A `OnceAsync` runs an `async` closure once, where multiple calls to `OnceAsync::try_once` return a `OnceFut` that resolves to the result of the same computation.
- OnceFut 🔒 - A `OnceFut` represents a shared asynchronous computation that will be evaluated once for all `Clone`s, with `OnceFut::get` providing a non-consuming interface to drive the underlying `Future` to completion.
- PartialJoinStatistics 🔒 - A shared state between statistic aggregators for a join operation.
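The idea behind BatchSplitter (capping the number of rows per output batch) can be illustrated with a minimal sketch. This is not the crate's implementation: a plain slice stands in for a record batch, and the function name `split_batch` is hypothetical.

```rust
/// Hypothetical stand-in for batch splitting: a slice of rows plays the role
/// of a record batch. Returns consecutive pieces of at most `max_rows` rows,
/// preserving row order.
fn split_batch<T: Clone>(rows: &[T], max_rows: usize) -> Vec<Vec<T>> {
    // `chunks` panics on a chunk size of zero, so reject it up front.
    assert!(max_rows > 0, "max_rows must be positive");
    rows.chunks(max_rows).map(|chunk| chunk.to_vec()).collect()
}
```

Under these assumptions, splitting ten rows with `max_rows = 4` yields pieces of 4, 4, and 2 rows, and an empty input yields no pieces.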
Enums
- OnceFutState 🔒
- StatefulStreamResult - Represents the result of a stateful operation.
Traits
- BatchTransformer 🔒 - Trait for incrementally generating Join output.
- JoinHashMapType - Maps a `u64` hash value based on the build side ["on" values] to a list of indices with this key's value.
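The mapping described for JoinHashMapType (a `u64` hash of the build-side key pointing at the row indices that share it) can be sketched with the standard library alone. The layout below, a `HashMap<u64, Vec<usize>>`, and the helper names `hash_key`, `build_index`, and `probe` are assumptions made for illustration; the crate's actual storage may differ.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hash a single join-key value to a `u64` (hypothetical helper).
fn hash_key<K: Hash>(key: &K) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Build a map from key hash to the build-side row indices with that hash.
fn build_index<K: Hash>(build_keys: &[K]) -> HashMap<u64, Vec<usize>> {
    let mut map: HashMap<u64, Vec<usize>> = HashMap::new();
    for (row, key) in build_keys.iter().enumerate() {
        map.entry(hash_key(key)).or_default().push(row);
    }
    map
}

/// Return the build-side rows matching `probe_key`. Different keys can share
/// a hash, so each candidate row is re-checked for key equality.
fn probe<K: Hash + Eq>(
    map: &HashMap<u64, Vec<usize>>,
    build_keys: &[K],
    probe_key: &K,
) -> Vec<usize> {
    map.get(&hash_key(probe_key))
        .map(|rows| {
            rows.iter()
                .copied()
                .filter(|&row| build_keys[row] == *probe_key)
                .collect()
        })
        .unwrap_or_default()
}
```

With build keys `["a", "b", "a", "c"]`, probing for `"a"` returns rows 0 and 2, and probing for a key absent from the build side returns no rows.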
Functions
- adjust_indices_by_join_type 🔒 - Takes the matched indices for the left and right sides and adjusts them according to the join type.
- adjust_right_output_partitioning - Adjusts the right output partitioning to the new column index.
- append_probe_indices_in_order 🔒 - Appends probe indices in order by considering the given build indices.
- append_right_indices 🔒 - Appends right indices to left indices based on the specified order mode.
- apply_join_filter_to_indices 🔒
- asymmetric_join_output_partitioning 🔒
- build_batch_empty_build_side 🔒 - Returns a new [RecordBatch] resulting from a join where the build/left side is empty. The resulting batch has [Schema] `schema`.
- build_batch_from_indices 🔒 - Returns a new [RecordBatch] by combining the `left` and `right` according to `indices`. The resulting batch has [Schema] `schema`.
- build_join_schema - Creates a schema for a join operation. The fields from the left side are first
- build_range_bitmap 🔒
- calculate_join_output_ordering - Calculates the output ordering of a given join operation.
- check_join_is_valid - Checks whether the schemas "left" and "right" and columns "on" represent a valid join. They are valid whenever their columns' intersection equals the set `on`.
- check_join_set_is_valid 🔒 - Checks whether the sets left, right and on compose a valid join. They are valid whenever their intersection equals the set `on`.
- compare_join_arrays - Gets the comparison result of two rows of join arrays.
- eq_dyn_null 🔒
- equal_rows_arr 🔒
- estimate_disjoint_inputs 🔒 - Estimates whether the inputs are non-overlapping, using input statistics. If the inputs are disjoint, returns a zero estimate; otherwise returns None.
- estimate_inner_join_cardinality 🔒 - Estimates the inner join cardinality using the basic building blocks of column-level statistics and the total row count. This is a very naive and very conservative implementation that can quickly give up if there are not enough input statistics.
- estimate_join_cardinality 🔒
- estimate_join_statistics 🔒 - Estimates the statistics for the given join's output.
- get_anti_indices 🔒 - Returns `range` indices which are not present in `input_indices`.
- get_final_indices_from_bit_map 🔒 - At the end of join execution, the bitmap of matched indices is used to generate the final left and right indices.
- get_final_indices_from_shared_bitmap 🔒
- get_mark_indices 🔒
- get_semi_indices 🔒 - Returns the intersection of `range` and `input_indices`, omitting duplicates.
- max_distinct_count 🔒 - Estimates the maximum number of distinct values that can be present in the given column from its statistics. If distinct_count is available, it is used directly. Otherwise, if the column is numeric and has min/max values, the maximum distinct count is estimated from those. Otherwise, num_rows is used.
- need_produce_result_in_final 🔒 - Some join types (`join_type`) need to maintain a bitmap of matched indices for the left side and use that bitmap to generate part of the join result.
- need_produce_right_in_final 🔒 - Whether a bitmap should be used to track the "joined" status of each row in each incoming right batch.
- output_join_field 🔒 - Returns the output field given the input field. Outer joins may insert nulls even if the input was not null.
- reorder_output_after_swap - When the order of the join inputs is changed, the output order of columns must remain the same.
- swap_join_projection - This function swaps the given join's projection.
- swap_reverting_projection 🔒 - When the order of the join is changed, the output order of columns must remain the same.
- symmetric_join_output_partitioning 🔒
- update_hash - Updates `hash_map` with new entries from `batch` evaluated against the expressions `on`, using `offset` as a start value for `batch` row indices.
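The descriptions of get_anti_indices and get_semi_indices amount to a set difference and a set intersection between a row range and a list of matched indices. The sketch below shows one way to compute both with a boolean bitmap over the range. It works on plain `usize` values rather than Arrow arrays, and the names `matched_bitmap`, `anti_indices`, and `semi_indices` are hypothetical, not the crate's signatures.

```rust
use std::ops::Range;

/// Mark which positions of `range` occur in `input_indices` (hypothetical
/// helper). Indices outside the range are ignored.
fn matched_bitmap(range: &Range<usize>, input_indices: &[usize]) -> Vec<bool> {
    let mut seen = vec![false; range.len()];
    for &idx in input_indices {
        if range.contains(&idx) {
            seen[idx - range.start] = true;
        }
    }
    seen
}

/// Indices of `range` that are NOT present in `input_indices`.
fn anti_indices(range: Range<usize>, input_indices: &[usize]) -> Vec<usize> {
    let seen = matched_bitmap(&range, input_indices);
    let start = range.start;
    range.filter(|idx| !seen[*idx - start]).collect()
}

/// Indices of `range` that ARE present in `input_indices`, each reported
/// once and in ascending order, even if the input repeats them.
fn semi_indices(range: Range<usize>, input_indices: &[usize]) -> Vec<usize> {
    let seen = matched_bitmap(&range, input_indices);
    let start = range.start;
    range.filter(|idx| seen[*idx - start]).collect()
}
```

For the range `0..5` and matched indices `[3, 1, 3]`, the anti result is `[0, 2, 4]` and the semi result is `[1, 3]`. The bitmap makes both passes linear in the size of the range plus the number of inputs, which is one reasonable design choice among several.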
Type Aliases
- OnceFutPending 🔒 - The shared future type used internally within `OnceAsync`