chore(audit): audit predicate expressions across Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 #4480

Open
andygrove wants to merge 1 commit into apache:main from andygrove:worktree-audit-predicate-funcs

@andygrove
Member

Which issue does this PR close?

Closes #.

Rationale for this change

Continuation of the per-category expression audit. Same pattern as #4479 (bitwise), #4478 (map), #4476 (hash), #4475 (conditional), #4474 (misc), #4473 (collection), #4470 (json), #4469 (struct), using the updated audit-comet-expression skill in #4468.

What changes are included in this PR?

Support-doc audit notes

Add per-version audit sub-bullets to all 19 supported predicate SQL function names (!, <, <=, <=>, =, ==, >, >=, and, between, ilike, in, isnan, isnotnull, isnull, like, not, or, rlike).

The Spark expression classes are byte-for-byte identical across the four versions apart from the NullIntolerant -> nullIntolerant trait refactor that lands in Spark 4.0, which causes no runtime change. Highlights:

  • ! and == are registry aliases for Not and EqualTo.
  • between is rewritten by the parser to expr >= low AND expr <= high.
  • ilike is RuntimeReplaceable and rewrites to Like(Lower(left), Lower(right)).
  • like and rlike cross-reference the existing string-expressions audit (#4461).
  • CometNot already optimizes a few special cases (Not(EqualTo), Not(EqualNullSafe), Not(In)).
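
For readers unfamiliar with the rewrites listed above, the following is a minimal toy sketch of the alias, between, and ilike behaviour as described in this PR. It is not Spark's Catalyst code: the Python classes, the `REGISTRY` dict, and the `between` and `ilike` helpers are hypothetical stand-ins named after the Spark expressions, and details such as the LIKE escape character are deliberately omitted.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for expression nodes; NOT Spark's Catalyst classes.
@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Lit:
    value: object


@dataclass(frozen=True)
class Lower:
    child: object


@dataclass(frozen=True)
class Like:
    left: object
    right: object


@dataclass(frozen=True)
class GreaterThanOrEqual:
    left: object
    right: object


@dataclass(frozen=True)
class LessThanOrEqual:
    left: object
    right: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


# Toy function registry: `!` and `==` map to the same expression
# names as `not` and `=`, as the audit notes describe.
REGISTRY = {
    "not": "Not",
    "!": "Not",
    "=": "EqualTo",
    "==": "EqualTo",
}


def between(expr, low, high):
    """Model the rewrite: expr BETWEEN low AND high -> expr >= low AND expr <= high."""
    return And(GreaterThanOrEqual(expr, low), LessThanOrEqual(expr, high))


def ilike(left, right):
    """Model the rewrite: left ILIKE right -> Like(Lower(left), Lower(right))."""
    return Like(Lower(left), Lower(right))


if __name__ == "__main__":
    print(between(Col("x"), Lit(1), Lit(10)))
    print(ilike(Col("s"), Lit("A%")))
    print(REGISTRY["!"], REGISTRY["=="])
```

If the rewrites behave as described, an engine downstream of them would only see the `And`, comparison, `Like`, and `Lower` nodes rather than separate `between` or `ilike` expressions.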

Support-level consistency fixes

None. The 12 backing serdes were already clean.

Tracking issues filed for follow-up

None.

Audit process

Audited directly using the audit-comet-expression skill (4 Spark versions per #4468). Twelve serdes, no parallel subagents needed.

How are these changes tested?

  • make core succeeds (no code changes; doc only).
  • Existing predicate test coverage in CometExpressionSuite and the various SQL-file suites remains unchanged.

…4.0.1, 4.1.1

Add per-version audit sub-bullets to all 19 supported predicate SQL
function names (`!`, `<`, `<=`, `<=>`, `=`, `==`, `>`, `>=`,
`and`, `between`, `ilike`, `in`, `isnan`, `isnotnull`,
`isnull`, `like`, `not`, `or`, `rlike`) in
`docs/source/contributor-guide/spark_expressions_support.md`.

The Spark expression classes are byte-for-byte identical across the
four versions apart from the `NullIntolerant` -> `nullIntolerant`
trait refactor that lands in Spark 4.0, which causes no runtime
change. `!` and `==` are
registry aliases for `Not` and `EqualTo`. `between` is rewritten by
the parser to `expr >= low AND expr <= high`. `ilike` is
`RuntimeReplaceable` and rewrites to `Like(Lower(left), Lower(right))`.
`like` and `rlike` cross-reference the existing string-expressions
audit (apache#4461).

No support-level consistency issues were found in the predicate serdes.
`CometNot` already optimizes a few special cases (`Not(EqualTo)`,
`Not(EqualNullSafe)`, `Not(In)`). No new tracking issues are filed.