Skip to content

Conversation

@nateab
Copy link
Contributor

@nateab nateab commented Feb 12, 2026

What is the purpose of the change

This pull request fixes FLINK-26051: when combining a ROW_NUMBER() window function with a subsequent query containing CASE WHEN expressions and a WHERE clause, the stream planner throws "The window can only be ordered in ASCENDING mode."

The root cause is a rule ordering problem in the stream planner's optimization phases:

  1. LOGICAL phase (Volcano/cost-based): FlinkCalcMergeRule merges two Calc nodes — the inner ROW_NUMBER filter (WHERE row_num <= 2) and the outer CASE WHEN/WHERE — into a single complex Calc.
  2. LOGICAL_REWRITE phase (HEP/sequential): FlinkLogicalRankRule looks for the pattern FlinkLogicalCalc on FlinkLogicalOverAggregate. By this point, the merged Calc is too complex for the rule to match because the remaining predicates access the rank field, so matches() returns false.
  3. The query falls through to StreamExecOverAggregate which enforces ascending-only ordering, causing the error.

The batch planner already handles this correctly by placing rank rules alongside FlinkCalcMergeRule in the same Volcano phase, so the optimizer explores both alternatives.

The fix adds a new Volcano-safe rank rule variant (FlinkLogicalRankRuleForConstantRangeAllFunctions) to LOGICAL_RULES in the stream planner. This rule handles all SqlRankFunction types (ROW_NUMBER, RANK, DENSE_RANK) with both constant and variable rank ranges. Unlike FlinkLogicalRankRuleForRangeEnd, it does not throw exceptions in matches() — it silently rejects ConstantRankRangeWithoutEnd (rank end not specified) rather than throwing, deferring that error to FlinkLogicalRankRuleForRangeEnd in the later HEP phase where exceptions are properly surfaced to the user.

Placing the rule in the same Volcano phase as FlinkCalcMergeRule allows the optimizer to explore the Rank conversion path before or as an alternative to merging Calcs. The existing FlinkLogicalRankRule.INSTANCE is kept in LOGICAL_REWRITE as a safety net.

Brief change log

  • Added FlinkLogicalRankRuleForConstantRangeAllFunctions: a new rule class that accepts all SqlRankFunction types with constant and variable rank ranges, safe for the Volcano optimizer (no exceptions in matches(), silently rejects ConstantRankRangeWithoutEnd)
  • Added FlinkLogicalRankRule.CONSTANT_RANGE_ALL_FUNCTIONS_INSTANCE to LOGICAL_RULES in FlinkStreamRuleSets, along with CalcRankTransposeRule and ConstantRankNumberColumnRemoveRule, placed before the calc rules block — mirroring the batch planner pattern
  • Added regression test testRowNumberWithCaseWhenAndWhereClause in stream RankTest
  • Updated expected plans for testRankFunctionInMiddle (stream RankTest) and testMultiSameRankFunctionsWithSameGroup (FlinkLogicalRankRuleForRangeEndTest) to reflect cosmetic field naming changes from earlier rank conversion

Verifying this change

This change added a test and can be verified as follows:

  • Added testRowNumberWithCaseWhenAndWhereClause that reproduces the exact bug scenario: ROW_NUMBER() OVER (PARTITION BY a ORDER BY c DESC) with WHERE row_num <= 2, wrapped in a query with CASE WHEN c > 10 THEN 'big' ELSE 'small' END and WHERE b <> 'z'. The optimized plan correctly shows a Rank node instead of OverAggregate.
  • Verified all 76 existing rank-related tests pass: stream RankTest (46 tests), batch RankTest (16 tests), FlinkLogicalRankRuleForRangeEndTest (14 tests)

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no (optimizer rule only, no runtime change)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

@flinkbot
Copy link
Collaborator

flinkbot commented Feb 12, 2026

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

…ausing window ordering exception

When combining a ROW_NUMBER() query with a subsequent query containing
CASE WHEN expressions and a WHERE clause, the stream planner throws
"The window can only be ordered in ASCENDING mode." This happens because
FlinkCalcMergeRule in the Volcano phase merges the inner rank filter Calc
with the outer CASE WHEN/WHERE Calc, creating a complex Calc that
FlinkLogicalRankRule can no longer match in the later HEP phase.

Fix: Add a new Volcano-safe rank rule variant
(FlinkLogicalRankRuleForConstantRangeAllFunctions) to LOGICAL_RULES in
the stream planner. This rule handles all SqlRankFunction types
(ROW_NUMBER, RANK, DENSE_RANK) with constant rank ranges, and unlike
FlinkLogicalRankRuleForRangeEnd, does not throw exceptions in matches(),
making it safe for the Volcano optimizer. Placing it in the same phase
as FlinkCalcMergeRule allows the optimizer to explore the Rank conversion
path before or as an alternative to merging Calcs.
@nateab nateab force-pushed the fix/FLINK-26051-row-number-case-when branch from 7733abd to f1ef239 Compare February 12, 2026 06:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants