pub(super) fn normalize_union_schema(plan: &LogicalPlan) -> Result<LogicalPlan>
Normalize the schema of a union plan to remove qualifiers from the schema fields and sort expressions.
DataFusion will return an error if two columns in the schema have the same name with no table qualifiers. There are certain types of UNION queries that can result in having two columns with the same name, and the solution was to add table qualifiers to the schema fields. See https://github.com/apache/datafusion/issues/5410 for more context on this decision.
However, this causes a problem when unparsing these queries back to SQL: once the UNION has been applied, the table qualifier has logically been erased and is no longer a valid reference.
The following input SQL:
SELECT table1.foo FROM table1
UNION ALL
SELECT table2.foo FROM table2
ORDER BY foo
Would be unparsed into the following invalid SQL without this transformation:
SELECT table1.foo FROM table1
UNION ALL
SELECT table2.foo FROM table2
ORDER BY table1.foo
Which would result in a SQL error, as table1.foo is not a valid reference in the context of the UNION.
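The effect of the transformation can be illustrated with a minimal, self-contained sketch. This is a toy model under stated assumptions, not the actual implementation: `ColumnRef`, `UnionPlan`, `strip_union_qualifiers`, and `order_by_clause` are hypothetical names invented for illustration and do not correspond to DataFusion's `LogicalPlan`, schema, or unparser types. It only shows why dropping qualifiers from both the schema fields and the sort expressions yields a valid `ORDER BY`.

```rust
// Toy model only: these types are hypothetical stand-ins and are NOT
// DataFusion's real LogicalPlan / schema / column types.

/// A column reference with an optional table qualifier.
#[derive(Debug, Clone, PartialEq)]
struct ColumnRef {
    qualifier: Option<String>,
    name: String,
}

impl ColumnRef {
    fn new(qualifier: Option<&str>, name: &str) -> Self {
        Self {
            qualifier: qualifier.map(str::to_string),
            name: name.to_string(),
        }
    }

    /// Render the reference as it would appear in SQL text.
    fn to_sql(&self) -> String {
        match &self.qualifier {
            Some(q) => format!("{}.{}", q, self.name),
            None => self.name.clone(),
        }
    }
}

/// A drastically simplified union plan: its output schema plus the
/// sort expressions applied on top of the union.
#[derive(Debug, Clone, PartialEq)]
struct UnionPlan {
    schema: Vec<ColumnRef>,
    sort_exprs: Vec<ColumnRef>,
}

/// Return a copy of the plan with every qualifier removed from the
/// schema fields and the sort expressions.
fn strip_union_qualifiers(plan: &UnionPlan) -> UnionPlan {
    let unqualify = |c: &ColumnRef| ColumnRef {
        qualifier: None,
        name: c.name.clone(),
    };
    UnionPlan {
        schema: plan.schema.iter().map(unqualify).collect(),
        sort_exprs: plan.sort_exprs.iter().map(unqualify).collect(),
    }
}

/// Render the ORDER BY clause for the plan's sort expressions.
fn order_by_clause(plan: &UnionPlan) -> String {
    let cols: Vec<String> = plan.sort_exprs.iter().map(ColumnRef::to_sql).collect();
    format!("ORDER BY {}", cols.join(", "))
}
```

In this model, a plan whose sort expression is still qualified with `table1` renders an `ORDER BY` that refers to `table1.foo`, while the plan returned by `strip_union_qualifiers` renders one that refers to the unqualified `foo`, matching the original input SQL.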