Summary
The rewriter from #16682 only rewrites the type literal on each CAST node and does not recurse into the first operand (the expression being cast). As a result, CAST nodes nested inside the first operand of another CAST (or deeper in that subtree, e.g. inside a divide) can still leave BIGINT (and similarly VARBINARY, etc.) in the plan. Mixed-version clusters (a 1.4 broker dispatching to 1.3 servers with pre-#16682 semantics) then hit:
QueryExecutionError:
org.apache.pinot.spi.exception.BadQueryRequestException: Caught exception while initializing transform function: cast
at org.apache.pinot.core.operator.transform.function.TransformFunctionFactory.get(TransformFunctionFactory.java:352)
at org.apache.pinot.core.operator.transform.function.TransformFunctionFactory.get(TransformFunctionFactory.java:347)
at org.apache.pinot.core.operator.transform.function.TransformFunctionFactory.get(TransformFunctionFactory.java:347)
at org.apache.pinot.core.operator.transform.function.TransformFunctionFactory.get(TransformFunctionFactory.java:347)
...
Caused by: java.lang.IllegalArgumentException: Unable to cast expression to type - BIGINT
at org.apache.pinot.core.operator.transform.function.CastTransformFunction.init(CastTransformFunction.java:112)
at org.apache.pinot.core.operator.transform.function.BaseTransformFunction.init(BaseTransformFunction.java:120)
at org.apache.pinot.core.operator.transform.function.TransformFunctionFactory.get(TransformFunctionFactory.java:350)
... 19 more
The same error is returned for EXPLAIN PLAN FOR.
During rolling upgrades (controller and broker already on 1.4, servers still on 1.3), queries that nest CAST(... AS LONG) (→ BIGINT in the plan) under another CAST, or under an expression that is the first operand of an outer CAST, can fail on the older servers because CastTypeAliasRewriter never visits those inner CASTs before dispatch.
Sample queries / patterns that can fail
- Minimal nested cast (inner LONG, outer non-rewritten type like DOUBLE):
SELECT CAST(CAST(event_timestamp AS LONG) AS DOUBLE)
FROM my_table
LIMIT 1;
SELECT CAST(CAST(metric_col AS LONG) AS INT)
FROM my_table
LIMIT 10;
- Same shape in GROUP BY:
SELECT COUNT(*)
FROM my_table
GROUP BY
  dateTrunc(
    'DAY',
    CAST((CAST(event_timestamp AS LONG) / 1000) AS DOUBLE),
    'SECONDS',
    'UTC',
    'MILLISECONDS'
  );
- Inner cast under arithmetic inside outer cast:
SELECT CAST((CAST(event_timestamp AS LONG) / 1000) AS DOUBLE)
FROM my_table
LIMIT 1;
- Filter / HAVING (same expression shape; failure is not SELECT-only):
SELECT *
FROM my_table
WHERE CAST(CAST(event_timestamp AS LONG) AS DOUBLE) > 0
LIMIT 1;
A query like CAST(CAST(x AS DOUBLE) AS LONG) may not fail the same way: the inner cast often uses a type that old servers already accept, and the outer LONG/BIGINT sits on the top-level CAST, which the rewriter does visit. The failure mode is specific to an inner LONG→BIGINT (or a similar alias) hidden under another CAST, and it persists until the rewriter recurses into operand 0.
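The difference between the two traversals can be illustrated with a minimal Java sketch. It uses a hypothetical `Expr` tree rather than Pinot's actual expression classes, and the names `rewriteShallow`, `rewriteRecursive`, and `legacyAlias` are invented for illustration. Only the BIGINT → LONG alias is taken from the description above; any other aliases would need to be added to the assumed mapping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a hypothetical expression tree, not Pinot's real classes.
final class CastRewriteSketch {

  /** A leaf (identifier or literal) or a function call with operands. */
  static final class Expr {
    final String name;          // function name, or the leaf's text
    final List<Expr> operands;  // empty for leaves

    Expr(String name, Expr... operands) {
      this.name = name;
      this.operands = new ArrayList<>(Arrays.asList(operands));
    }

    boolean isCast() {
      return name.equals("cast") && operands.size() == 2;
    }

    @Override
    public String toString() {
      if (operands.isEmpty()) {
        return name;
      }
      StringBuilder sb = new StringBuilder(name).append('(');
      for (int i = 0; i < operands.size(); i++) {
        if (i > 0) {
          sb.append(", ");
        }
        sb.append(operands.get(i));
      }
      return sb.append(')').toString();
    }
  }

  /** Assumed alias table: maps a newer type name to the one older servers accept. */
  static String legacyAlias(String type) {
    return type.equals("BIGINT") ? "LONG" : type;
  }

  /** Replaces the type literal (operand 1) of a single CAST node. */
  static void rewriteTypeLiteral(Expr cast) {
    String type = cast.operands.get(1).name;
    cast.operands.set(1, new Expr(legacyAlias(type)));
  }

  /** Models the behavior described above: stops at the first CAST it meets. */
  static void rewriteShallow(Expr e) {
    if (e.isCast()) {
      rewriteTypeLiteral(e);
      return; // operand 0 is never visited, so inner CASTs keep BIGINT
    }
    for (Expr op : e.operands) {
      rewriteShallow(op);
    }
  }

  /** Models the fix: rewrites the literal, then descends into operand 0. */
  static void rewriteRecursive(Expr e) {
    if (e.isCast()) {
      rewriteTypeLiteral(e);
      rewriteRecursive(e.operands.get(0));
      return;
    }
    for (Expr op : e.operands) {
      rewriteRecursive(op);
    }
  }

  /** CAST((CAST(event_timestamp AS BIGINT) / 1000) AS DOUBLE) as a tree. */
  static Expr nestedSample() {
    Expr inner = new Expr("cast", new Expr("event_timestamp"), new Expr("BIGINT"));
    return new Expr("cast", new Expr("divide", inner, new Expr("1000")), new Expr("DOUBLE"));
  }

  public static void main(String[] args) {
    Expr shallow = nestedSample();
    rewriteShallow(shallow);
    System.out.println(shallow);   // inner BIGINT survives

    Expr recursive = nestedSample();
    rewriteRecursive(recursive);
    System.out.println(recursive); // inner type is now LONG
  }
}
```

In this model, a tree shaped like CAST(CAST(x AS DOUBLE) AS BIGINT) is handled correctly even by `rewriteShallow`, because the alias sits on the top-level CAST. That matches the observation that only inner aliases are affected.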