list_unnest_at_level

Function list_unnest_at_level 

fn list_unnest_at_level(
    batch: &[ArrayRef],
    list_type_unnests: &[ListUnnest],
    temp_unnested_arrs: &mut HashMap<ListUnnest, ArrayRef>,
    level_to_unnest: usize,
    options: &UnnestOptions,
) -> Result<Option<Vec<ArrayRef>>>

This function executes the unnesting of multiple columns at once, but only one level at a time. It is called n times, where n is the highest recursion level among the unnest expressions in the query.

For example, given the following query:

select unnest(colA, max_depth:=3) as P1, unnest(colA,max_depth:=2) as P2, unnest(colB, max_depth:=1) as P3 from temp;

this function is called 3 times in total.
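
The call count for the query above can be sketched as follows. This is a minimal illustration, not the DataFusion implementation: `UnnestSpec` and `passes_needed` are hypothetical names introduced here, and the struct is only a simplified stand-in for an unnest expression (input column plus requested `max_depth`).

```rust
/// Hypothetical, simplified description of one unnest expression:
/// the input column it reads and the requested `max_depth`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct UnnestSpec {
    column: &'static str,
    depth: usize,
}

/// Number of level-by-level passes needed: the largest requested depth,
/// or 0 when there is nothing to unnest.
fn passes_needed(specs: &[UnnestSpec]) -> usize {
    specs.iter().map(|s| s.depth).max().unwrap_or(0)
}

fn main() {
    // The three unnest expressions from the example query (P1, P2, P3).
    let specs = [
        UnnestSpec { column: "colA", depth: 3 },
        UnnestSpec { column: "colA", depth: 2 },
        UnnestSpec { column: "colB", depth: 1 },
    ];
    for s in &specs {
        println!("unnest({}, max_depth := {})", s.column, s.depth);
    }
    println!("passes needed: {}", passes_needed(&specs));
}
```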

The function needs to know which level is currently being unnested, because the same column may be unnested several times with different recursion levels. For example, with unnest(colA, max_depth:=3) and unnest(colA, max_depth:=2), the unnesting of unnest(colA, max_depth:=3) starts at level 3, while the unnesting of unnest(colA, max_depth:=2) has to start at level 2.

Let colA be a 3-dimension column and colB an array (1-dimension). As stated, this function is called in descending order of recursion depth:

Depth = 3

  • colA(3-dimension) unnest into temp column temp_P1(2-dimension) (unnesting of P1 starts from this level)
  • colA(3-dimension) having indices repeated by the unnesting operation above
  • colB(1-dimension) having indices repeated by the unnesting operation above

Depth = 2

  • temp_P1(2-dimension) unnest into temp column temp_P1(1-dimension)
  • colA(3-dimension) unnest into temp column temp_P2(2-dimension) (unnesting of P2 starts from this level)
  • colB(1-dimension) having indices repeated by the unnesting operation above

Depth = 1

  • temp_P1(1-dimension) unnest into P1
  • temp_P2(2-dimension) unnest into P2
  • colB(1-dimension) unnest into P3 (unnesting of P3 starts from this level)
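
The three passes above can be traced with a small simulation that tracks only the number of list dimensions left for each output, not the actual data or the repeated indices. It is a sketch under stated assumptions: `Spec` and `trace` are hypothetical names, and the plain `HashMap` of remaining dimensions merely plays the role that `temp_unnested_arrs` plays in the real signature.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified unnest request (not the DataFusion `ListUnnest` type).
struct Spec {
    alias: &'static str,  // output name, e.g. "P1"
    column: &'static str, // input column, e.g. "colA"
    depth: usize,         // requested max_depth
}

/// Walks the levels in descending order and records, for each pass,
/// (level, alias, dimensions remaining after the pass).
fn trace(
    specs: &[Spec],
    input_dims: &HashMap<&str, usize>,
) -> Vec<(usize, &'static str, usize)> {
    let max_depth = specs.iter().map(|s| s.depth).max().unwrap_or(0);
    // Remaining dimensions of each partially unnested temporary, keyed by alias.
    let mut temp: HashMap<&'static str, usize> = HashMap::new();
    let mut steps = Vec::new();

    for level in (1..=max_depth).rev() {
        for s in specs {
            if s.depth < level {
                // This expression has not started yet at this level.
                continue;
            }
            let current = if s.depth == level {
                // First pass for this expression: read the original column.
                input_dims[s.column]
            } else {
                // Later pass: continue from the temporary of the previous level.
                temp[s.alias]
            };
            // One pass removes one list dimension (never below zero).
            let next = current.saturating_sub(1);
            temp.insert(s.alias, next);
            steps.push((level, s.alias, next));
        }
    }
    steps
}

fn main() {
    let specs = [
        Spec { alias: "P1", column: "colA", depth: 3 },
        Spec { alias: "P2", column: "colA", depth: 2 },
        Spec { alias: "P3", column: "colB", depth: 1 },
    ];
    let input_dims = HashMap::from([("colA", 3), ("colB", 1)]);
    for (level, alias, dims) in trace(&specs, &input_dims) {
        println!("depth {level}: {alias} now has {dims} list dimension(s) left");
    }
}
```

In this trace P1 loses one dimension in every pass, P2 only joins at depth 2, and P3 only at depth 1, which mirrors the bullet lists above.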

The returned array will have the same size as the input batch and will contain only the original columns that are not being unnested.