[CALCITE-7514] MultiJoinOptimizeBushyRule throws AssertionError when a join condition references 3 or more factors #4934
Open
sbroeder wants to merge 1 commit into
tmater
reviewed
May 12, 2026
mihaibudiu
approved these changes
May 12, 2026
Contributor
mihaibudiu
left a comment
If there are no other comments, let's merge this.
Please squash the commits to a single one.
tmater
approved these changes
May 12, 2026
xuzifu666
approved these changes
May 13, 2026
…a join condition references 3 or more factors

Conditions in a MultiJoin's joinFilters that reference anything other than exactly two factors cannot be represented as binary join edges. Passing such a condition to createEdge produced an edge with factors.cardinality() != 2, causing an AssertionError in the edge comparator's rowCountDiff method and at two further assertion sites in the greedy loop.

The fix separates these conditions from the edge list upfront. After the greedy join-ordering loop completes, the remaining conditions are remapped from original MultiJoin field positions to the final join tree's output positions via RexPermuteInputsShuttle, then applied as a LogicalFilter above the join tree before the reordering project. For inner joins this is semantically equivalent to applying them as join predicates.

Two TODO items are resolved:
- "Join conditions that touch 3 factors" is fully handled.
- "More than 1 join conditions that touch the same pair of factors" was stale from the original commit; the conditions loop already collects all edges subsumed by newFactors at each greedy step.

A remaining TODO notes that 1-factor conditions are applied as a filter above the join tree rather than pushed down to the individual scan.

Jira Link
CALCITE-7514
Changes Proposed
MultiJoinOptimizeBushyRule crashes with an AssertionError when a MultiJoin's join filters contain a condition that references anything other than exactly two factors (e.g. a CASE expression spanning three tables). Such conditions cannot be represented as binary join edges, so passing them to createEdge produced an edge with factors.cardinality() != 2, which violated assertions in the edge comparator and the greedy ordering loop.
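To make the failure mode concrete, here is a minimal sketch of how a condition's factor set can be derived from the fields it references, and why a three-table expression yields a cardinality of 3. It is not Calcite code: the class FactorSets, its methods, and the plain int[] representation of field references are hypothetical stand-ins for the rule's RexNode-based logic.

```java
import java.util.BitSet;

/** Hypothetical helper (not part of Calcite): maps field references to factors. */
final class FactorSets {
  private FactorSets() {}

  /**
   * Returns the set of factors touched by a condition.
   *
   * @param fieldRefs   field indexes the condition references, in the
   *                    concatenated field space of all factors
   * @param fieldCounts number of fields contributed by each factor, in order
   */
  static BitSet factorsOf(int[] fieldRefs, int[] fieldCounts) {
    BitSet factors = new BitSet(fieldCounts.length);
    for (int ref : fieldRefs) {
      factors.set(factorOf(ref, fieldCounts));
    }
    return factors;
  }

  /** Finds the factor whose field range contains the given field index. */
  private static int factorOf(int fieldRef, int[] fieldCounts) {
    int start = 0;
    for (int f = 0; f < fieldCounts.length; f++) {
      int end = start + fieldCounts[f];
      if (fieldRef >= start && fieldRef < end) {
        return f;
      }
      start = end;
    }
    throw new IllegalArgumentException("field " + fieldRef + " out of range");
  }

  /** Only a condition touching exactly two factors can become a binary join edge. */
  static boolean isBinaryEdge(BitSet factors) {
    return factors.cardinality() == 2;
  }
}
```

With three factors of 2, 3, and 2 fields, a condition referencing fields 0, 3, and 6 touches all three factors, so isBinaryEdge returns false; a condition referencing fields 1 and 2 touches factors 0 and 1 and qualifies as an edge.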
Fix: conditions that do not touch exactly two factors are separated from the edge list before the greedy loop runs. After the join tree is built, these conditions are remapped from the original MultiJoin field positions to the output positions of the final join tree using RexPermuteInputsShuttle, then applied as a LogicalFilter above the join tree (before the reordering project). For inner joins this is semantically equivalent to applying them as join predicates.
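The two steps of the fix can be sketched in plain Java as follows. This is an illustration under simplifying assumptions, not the patch itself: the class DeferredConditions and its methods are hypothetical, conditions are reduced to the field indexes they reference, and the final join tree is assumed to output each factor's fields contiguously in some factor order. In the actual change the remapping is performed on RexNode expressions by RexPermuteInputsShuttle.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch (not Calcite code) of deferring and remapping conditions. */
final class DeferredConditions {
  private DeferredConditions() {}

  /** Returns the indexes of conditions that cannot be binary join edges. */
  static List<Integer> deferredIndexes(List<BitSet> conditionFactors) {
    List<Integer> deferred = new ArrayList<>();
    for (int i = 0; i < conditionFactors.size(); i++) {
      if (conditionFactors.get(i).cardinality() != 2) {
        deferred.add(i);
      }
    }
    return deferred;
  }

  /**
   * Builds mapping[originalFieldIndex] = fieldIndexInJoinTreeOutput, assuming
   * the join tree emits the factors' fields in the order given by factorOrder.
   */
  static int[] buildMapping(int[] factorOrder, int[] fieldCounts) {
    // Start offset of each factor in the original concatenated field space.
    int[] originalStart = new int[fieldCounts.length];
    int total = 0;
    for (int f = 0; f < fieldCounts.length; f++) {
      originalStart[f] = total;
      total += fieldCounts[f];
    }
    int[] mapping = new int[total];
    int next = 0;
    for (int factor : factorOrder) {
      for (int i = 0; i < fieldCounts[factor]; i++) {
        mapping[originalStart[factor] + i] = next++;
      }
    }
    return mapping;
  }

  /** Rewrites a deferred condition's field references to output positions. */
  static int[] remap(int[] fieldRefs, int[] mapping) {
    int[] result = new int[fieldRefs.length];
    for (int i = 0; i < fieldRefs.length; i++) {
      result[i] = mapping[fieldRefs[i]];
    }
    return result;
  }
}
```

For factors with 2, 3, and 2 fields joined in the order 2, 0, 1, the mapping is [2, 3, 4, 5, 6, 0, 1], so a deferred condition on original fields 0, 3, and 6 is rewritten to reference fields 2, 5, and 1 of the join tree's output before it is placed in the filter.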
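This also resolves two long-standing TODOs in the class Javadoc:
- "Join conditions that touch 3 factors" is now fully handled.
- "More than 1 join conditions that touch the same pair of factors" was stale: the conditions loop already collects all edges subsumed by newFactors at each greedy step.

A remaining TODO covers conditions that touch a single factor: they are applied as a filter above the join tree (no crash, correct result); optimal push-down to the scan is left as a future improvement.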
Reproduction: the following query previously threw AssertionError: