Null Handling in jsonExtractScalar and jsonExtractIndex SV transforms#18569
Open
siddharthteotia wants to merge 11 commits into
Open
Null Handling in jsonExtractScalar and jsonExtractIndex SV transforms#18569siddharthteotia wants to merge 11 commits into
siddharthteotia wants to merge 11 commits into
Conversation
When query-level null handling is enabled and no default literal is set, the typed SV paths now emit SQL NULL for unresolved JSON paths instead of throwing IllegalArgumentException.
Mirror the scalar fix: when query-level null handling is enabled and no default literal is set, the six typed SV paths emit SQL NULL for unresolved JSON paths instead of throwing RuntimeException.
Named single-row JSON constants with pretty-printed comments, named fixture rows, helper methods that eliminate the inline FluentQueryTest boilerplate, and per-test Javadocs with a concrete example query and expected result.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #18569 +/- ##
=============================================
- Coverage 64.27% 36.79% -27.49%
- Complexity 1130 1136 +6
=============================================
Files 3311 3335 +24
Lines 203946 206088 +2142
Branches 31736 32158 +422
=============================================
- Hits 131093 75835 -55258
- Misses 62341 123402 +61061
+ Partials 10512 6851 -3661
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Replace scattered single-row JSON constants with one 4-row x 2-column fixture (flatJson + nestedJson) appended to the end of the test class. Tests are grouped by query shape: projection -> DISTINCT -> GROUP BY, each with NH-on and NH-off variants. Per-test Javadocs include a concrete query and expected result; result rows are ordered ASC NULLS LAST.
…ionTest Replace the minimal unit-level null-handling tests with the same fixture + projection/DISTINCT/GROUP BY structure as the scalar test, so both test files cover the same query shapes against the same data model. The helper configures a JSON index on both columns via FieldConfig.
In each SV transform, by the time control reaches the null-handling branch, the earlier `if (_defaultValue != null)` check has already filled defaults, so the remaining decision reduces to `_nullHandlingEnabled`. getNullBitmap inlines the equivalent short-circuit condition. No behavior change; 209 tests unchanged.
Three tests per file (projection, DISTINCT, GROUP BY) pin the SV transform's priority order: a user-supplied default wins over the null-handling placeholder. Without these, a future refactor could silently swap the order so unresolved rows surface as SQL NULL instead of the user's default.
Default-precedence tests in both files now parameterize over flatJson + nestedJson (mirrors the projection/DISTINCT/GROUP BY data providers). Scalar test adds Section 5 covering 4-arg with SQL NULL literal under NH on, pinning the _defaultIsNull code path in getNullBitmap.
Both test files now have Section 1 (Projection), 2 (DISTINCT), 3 (GROUP BY) where each section owns its own sub-cases (.1 NH-on 3-arg, .2 NH-on with SQL-NULL default [scalar only], .3 NH-on with non-null default, .4 NH-off). Previously, default-precedence and SQL-NULL-default tests lived in separate sections 4/5 even though they belonged to the same three query shapes.
- Cache the per-ValueBlock getValuesSV result so the SV transform and getNullBitmap don't each hit the index reader under null handling (perf MAJOR). - Add null-handling note to the class Javadoc (doc MAJOR). - Use valueBlock.getNumDocs() for the loop bound in getNullBitmap to match sibling SV transforms (MINOR). - Swap @OverRide @nullable to @nullable @OverRide to match the parent's convention (MINOR).
init now detects a SQL NULL literal as the 4-arg default via literal.isNull() and leaves _defaultValue null, so 3-arg semantics apply (NH on -> SQL NULL, NH off -> throw). Without this, STRING/JSON silently emitted "" and numerics failed at init -- both inconsistent with the scalar function's fix. Adds matching Section 1.2 / 2.2 / 3.2 tests in the index test file (6 new parameterized cases) covering flat + nested paths under projection, DISTINCT, and GROUP BY. Also fixes a comment typo (broker -> server) flagged by review.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #18568.
Improvement Summary
Both
JsonExtractScalarTransformFunctionandJsonExtractIndexTransformFunctionhad_nullHandlingEnabledplumbed through (or accessible fromBaseTransformFunction) but their six typed SV methods (transformTo{Int,Long,Float,Double,BigDecimal,String}ValuesSV) ignored it and threw unconditionally when the JSON path didn't resolve and no default literal was supplied. This PR makes both transforms honor_nullHandlingEnabled: when query-level null handling is on, unresolved rows surface as SQLNULL(typed null placeholder + a bit ingetNullBitmap()) instead of throwing.See #18568 for the full behavioral matrix (before/after, NH on/off, with and without default value) and the worked country/clicks example.
MV transforms are unchanged. The legacy throw is preserved when null handling is off, so callers that haven't opted in see no behavior change.
JsonIndexDistinctOperatoralready handles this correctly and is untouched.Testing Done