normalize_union_schema

Function normalize_union_schema 

pub(super) fn normalize_union_schema(plan: &LogicalPlan) -> Result<LogicalPlan>

Normalize the schema of a union plan to remove qualifiers from the schema fields and sort expressions.

DataFusion will return an error if two columns in the schema have the same name with no table qualifiers. There are certain types of UNION queries that can result in having two columns with the same name, and the solution was to add table qualifiers to the schema fields. See https://github.com/apache/datafusion/issues/5410 for more context on this decision.

However, this causes a problem when unparsing these queries back to SQL, because the table qualifier has logically been erased and is no longer a valid reference.

The following input SQL:

SELECT table1.foo FROM table1
UNION ALL
SELECT table2.foo FROM table2
ORDER BY foo

Without this transformation, it would be unparsed into the following invalid SQL:

SELECT table1.foo FROM table1
UNION ALL
SELECT table2.foo FROM table2
ORDER BY table1.foo

This results in a SQL error, because table1.foo is not a valid reference in the context of the UNION.
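The effect of the normalization on a sort expression can be sketched with a toy model. This is only an illustration under stated assumptions: `Field`, `to_sql`, and `strip_qualifiers` are hypothetical names invented here and are not DataFusion types or APIs. The real function operates on a `LogicalPlan` and returns a `Result<LogicalPlan>`, which this sketch does not attempt to reproduce.

```rust
// Toy stand-in for a schema field or sort column.
// Hypothetical type for illustration; NOT part of DataFusion.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    qualifier: Option<String>, // e.g. Some("table1")
    name: String,              // e.g. "foo"
}

impl Field {
    fn new(qualifier: Option<&str>, name: &str) -> Self {
        Field {
            qualifier: qualifier.map(str::to_string),
            name: name.to_string(),
        }
    }

    // Render the field the way it would appear in an ORDER BY clause.
    fn to_sql(&self) -> String {
        match &self.qualifier {
            Some(q) => format!("{}.{}", q, self.name),
            None => self.name.clone(),
        }
    }
}

// Hypothetical helper: drop every table qualifier and keep only the column name,
// mirroring what the normalization does to schema fields and sort expressions.
fn strip_qualifiers(fields: &[Field]) -> Vec<Field> {
    fields
        .iter()
        .map(|f| Field {
            qualifier: None,
            name: f.name.clone(),
        })
        .collect()
}
```

In this model, a sort column built as `Field::new(Some("table1"), "foo")` renders as `table1.foo`, while the result of `strip_qualifiers` renders as `foo`, which is the form that remains valid after the UNION.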