[SPARK-56739][SQL] Normalize CTE ids of orphan CTERelationRef in NormalizeCTEIds #56083

Draft
shrirangmhalgi wants to merge 1 commit into apache:master from shrirangmhalgi:SPARK-56739-normalize-orphan-cte-refs

@shrirangmhalgi
Contributor

What changes were proposed in this pull request?

Normalize CTE IDs of orphan CTERelationRef nodes in NormalizeCTEIds. Previously, only CTERelationRef nodes inside WithCTE were normalized via canonicalizeCTE. Refs that exist outside any WithCTE (orphans) kept their original IDs.

Why are the changes needed?

After InlineCTE or MergeSubplans runs, some CTERelationRef nodes can end up outside their parent WithCTE node. When NormalizeCTEIds processes the plan, these orphan refs are skipped, leaving non-normalized IDs. This breaks plan comparison and caching because the same logical plan gets different CTE IDs across sessions (since CTERelationDef uses a global monotonically increasing counter).
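To illustrate why a global counter makes otherwise identical plans compare unequal, here is a minimal toy sketch in Python. It is not Spark's code: `_next_cte_id`, `CTERef`, and `build_plan` are hypothetical stand-ins, and the tuple is only a placeholder for a logical plan.

```python
import itertools
from dataclasses import dataclass

# Hypothetical stand-in for a global, monotonically increasing id counter.
_next_cte_id = itertools.count()


@dataclass(frozen=True)
class CTERef:
    """Toy stand-in for a reference to a CTE definition."""
    cte_id: int


def build_plan():
    """Build the same one-CTE plan shape; only the id depends on the counter."""
    cte_id = next(_next_cte_id)
    return ("WithCTE", cte_id, CTERef(cte_id))


plan_a = build_plan()
plan_b = build_plan()

# Same structure, but the ids differ, so a structural comparison fails.
same = plan_a == plan_b
```

In this toy, `same` is `False` even though both plans have the same shape, which is the situation ID normalization is meant to remove.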

Does this PR introduce any user-facing change?

No. This is an internal plan normalization fix that affects plan caching correctness.

How was this patch tested?

Added NormalizeCTEIdsSuite with a test that constructs a plan with a CTERelationRef outside WithCTE and verifies all ref IDs are normalized. Without the fix, the orphan ref retains its original ID (100); with the fix, it's normalized to 0.
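The shape of that check can be sketched with a small Python toy model. This is an assumption-laden illustration, not the Scala test in the suite: `Ref`, `Def`, `WithCTE`, `Node`, `normalize`, and the `fix_orphans` flag are all hypothetical names, and the traversal order is simplified so that the `WithCTE` is visited before the orphan ref.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Ref:  # toy stand-in for CTERelationRef
    cte_id: int


@dataclass(frozen=True)
class Def:  # toy stand-in for CTERelationDef
    cte_id: int


@dataclass(frozen=True)
class WithCTE:  # holds definitions plus a child plan that references them
    defs: Tuple[Def, ...]
    child: object


@dataclass(frozen=True)
class Node:  # any generic plan node with children
    children: Tuple[object, ...]


def normalize(plan, mapping, fix_orphans, inside=False):
    """Rewrite CTE ids to 0, 1, 2, ... in order of first definition."""
    if isinstance(plan, WithCTE):
        for d in plan.defs:
            mapping.setdefault(d.cte_id, len(mapping))
        defs = tuple(Def(mapping[d.cte_id]) for d in plan.defs)
        return WithCTE(defs, normalize(plan.child, mapping, fix_orphans, True))
    if isinstance(plan, Node):
        return Node(tuple(normalize(c, mapping, fix_orphans, inside)
                          for c in plan.children))
    if isinstance(plan, Ref):
        # Orphan refs (outside any WithCTE) are rewritten only with the fix,
        # and only when their id is already in the mapping.
        if (inside or fix_orphans) and plan.cte_id in mapping:
            return Ref(mapping[plan.cte_id])
    return plan


def ref_ids(plan):
    """Collect the ids of all refs in traversal order."""
    if isinstance(plan, Ref):
        return [plan.cte_id]
    if isinstance(plan, WithCTE):
        return ref_ids(plan.child)
    if isinstance(plan, Node):
        return [i for c in plan.children for i in ref_ids(c)]
    return []


# One ref inside WithCTE and one orphan ref next to it, both with id 100.
plan = Node((WithCTE((Def(100),), Ref(100)), Ref(100)))

without_fix = ref_ids(normalize(plan, {}, fix_orphans=False))
with_fix = ref_ids(normalize(plan, {}, fix_orphans=True))
```

In this toy, `without_fix` is `[0, 100]` (the orphan keeps its original ID) and `with_fix` is `[0, 0]`.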

Was this patch authored or co-authored using generative AI tooling?

Yes.

@shrirangmhalgi marked this pull request as draft May 24, 2026 04:25
…alizeCTEIds

CTERelationRef nodes that exist outside any WithCTE node (e.g., after InlineCTE or MergeSubplans removes the parent WithCTE) were not getting their IDs normalized. This causes plan comparison and caching to fail because the same logical plan gets different CTE IDs across sessions.

Add a case in applyInternal to normalize orphan CTERelationRefs whose cteId is already in the mapping.
@shrirangmhalgi force-pushed the SPARK-56739-normalize-orphan-cte-refs branch from 4d36b6e to 59ebea5 May 24, 2026 04:50