Fix column-to-column comparison in WHERE clause#18587
Conversation
The PredicateComparisonRewriter previously rewrote column-to-column comparisons like `WHERE a = b` into `minus(a, b) = 0`. This arithmetic rewrite forced DOUBLE coercion on both operands, which threw NumberFormatException for any non-numeric string values (e.g., json_extract_scalar with a STRING result type). The fix replaces the minus() rewrite with the type-safe comparison transform functions (equals, not_equals, greater_than, etc.) that already exist in BinaryOperatorTransformFunction. These properly dispatch on the stored types of both operands — STRING comparisons use String.compareTo(), numeric comparisons use the appropriate Number.compare(), with no forced coercion. This fixes both v1 (single-stage) and v2 (multi-stage) engines, since the v2 leaf stage applies the same rewriter.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #18587 +/- ##
=============================================
- Coverage 64.27% 36.83% -27.45%
+ Complexity 1137 1136 -1
=============================================
Files 3314 3335 +21
Lines 204064 205908 +1844
Branches 31760 32129 +369
=============================================
- Hits 131165 75841 -55324
- Misses 62373 123209 +60836
+ Partials 10526 6858 -3668
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Existing tests in CalciteSqlCompilerTest still expected the old minus(a, b) > 0 pattern; update to match the new greater_than(a, b) = true rewrite. Also add missing <= coverage in PredicateComparisonRewriterTest so all six comparison operators are exercised. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
hmm I thought #10600 is already resolved... no? |
My understanding is that there was a fix in #10444 for comparing the same column. This change also supports different columns The ticket mentions the |
Summary
Fixes #10600
The
PredicateComparisonRewriterpreviously rewrote column-to-column comparisons likeWHERE a = bintominus(a, b) = 0. This arithmetic rewrite forced DOUBLE coercion on both operands, which:Double.parseDouble("some string")throwsNumberFormatExceptionThe fix replaces the
minus()rewrite with the type-safe comparison transform functions (equals,not_equals,greater_than, etc.) that already exist inBinaryOperatorTransformFunction. These dispatch on the actual stored types of both operands —Integer.comparefor INT,Long.comparefor LONG,String.compareTofor STRING — with no forced coercion.Why the rewrite is still needed
The v1 filter engine requires a literal on the RHS of predicates. The rewrite moves the column-to-column comparison into a transform function, leaving a literal
trueon the RHS:Related issues
WHERE string_col = string_colthrowsNumberFormatException(open since Apr 2023)Changes
PredicateComparisonRewriter.java—a op b→EQUALS(comparison_func(a, b), true)instead ofEQUALS(minus(a, b), 0)PredicateComparisonRewriterTest.java— covers all 6 comparison operators, function-on-LHS, operand preservation, and BETWEEN rejectionJsonExtractScalarTest.java— end-to-end integration tests for string and numeric column-to-column comparisons viajson_extract_scalarTest plan
PredicateComparisonRewriterTest— verifies rewrite structure for=,!=,<,>,>=and function-on-LHSJsonExtractScalarTest#testJsonExtractScalarComparedToColumn— end-to-end across all 7 JSON column variants:=(no match → empty result, no error)=same expression both sides (all rows match)!=(all rows satisfy)=(no match → empty result, no error)>(all rows satisfy)json_extract_scalar(properties, '$.hs_name', 'STRING', '') = expectedNamereturns correct matching rows