chore(audit): audit struct expressions across Spark 3.4.3, 3.5.8, 4.0.1, 4.1.1 #4469

Open
andygrove wants to merge 1 commit into apache:main from andygrove:worktree-audit-struct-funcs

@andygrove
Member

Which issue does this PR close?

Closes #.

Rationale for this change

Following the same pattern as #4436 (any), #4437 (bit_and), and #4461 (string expressions), this PR audits the struct_funcs category in Comet against Spark 3.4.3, 3.5.8, 4.0.1, and now 4.1.1 (per the updated audit skill in #4468), records the findings inline in the support doc, and applies the one support-level consistency fix surfaced.

What changes are included in this PR?

Support-doc audit notes

Add per-version audit sub-bullets to named_struct and struct in docs/source/contributor-guide/spark_expressions_support.md. The Spark CreateNamedStruct class behaves identically across all four versions; the only differences are internal optimizer flags (stateful added in 4.0, contextIndependentFoldable added in 4.1). Both the named_struct and struct SQL functions lower to the same CreateNamedStruct node, so Comet handles them through one serde.

Support-level consistency fix (in structs.scala)

  • CometCreateNamedStruct: lift the duplicate-field-names fallback out of convert and into getSupportLevel via a shared private val so the dispatcher handles the fallback uniformly and getUnsupportedReasons() documents the restriction for the compatibility guide.
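
For readers unfamiliar with the pattern, the shape of this fix can be sketched roughly as follows. This is a simplified, self-contained illustration and not the actual Comet code: `SupportLevel`, `Compatible`, `Unsupported`, `NamedStructExpr`, `NamedStructSerde`, `Dispatcher`, and the reason string are all hypothetical stand-ins, and the duplicate check assumes exact (case-sensitive) name matching.

```scala
// Hypothetical stand-ins; these are NOT Comet's real types or signatures.
sealed trait SupportLevel
case object Compatible extends SupportLevel
final case class Unsupported(reason: String) extends SupportLevel

// Simplified model of a CreateNamedStruct expression: only the field names.
final case class NamedStructExpr(fieldNames: Seq[String])

object NamedStructSerde {
  // One shared value, so the support check and the generated docs cannot drift apart.
  private val duplicateFieldsReason =
    "named_struct with duplicate field names is not supported"

  // Reasons surfaced to documentation (assumed shape).
  def getUnsupportedReasons(): Seq[String] = Seq(duplicateFieldsReason)

  // The restriction lives here rather than inside convert.
  def getSupportLevel(expr: NamedStructExpr): SupportLevel =
    if (expr.fieldNames.distinct.length != expr.fieldNames.length)
      Unsupported(duplicateFieldsReason)
    else Compatible

  // convert does not re-check duplicates; it relies on the dispatcher
  // having consulted getSupportLevel first.
  def convert(expr: NamedStructExpr): String =
    expr.fieldNames.mkString("struct<", ",", ">")
}

object Dispatcher {
  // Uniform fallback handling: Left means "fall back to Spark" with a reason.
  def dispatch(expr: NamedStructExpr): Either[String, String] =
    NamedStructSerde.getSupportLevel(expr) match {
      case Unsupported(reason) => Left(reason)
      case Compatible          => Right(NamedStructSerde.convert(expr))
    }
}
```

In this sketch the dispatcher is the only place that decides whether to fall back, which is the uniformity the bullet above describes; the real serde in structs.scala may structure the shared value differently.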

Tracking issues filed for follow-up

None. No correctness divergences were found for this category.

Audit process

Audited directly using the audit-comet-expression skill (4 Spark versions per the update in #4468). The category is small (one backing serde), so it did not need parallel subagents.

How are these changes tested?

  • ./mvnw test -Dsuites="org.apache.comet.CometSqlFileTestSuite expressions/struct/" -Dtest=none (5 tests pass)
  • make core succeeds with the serde change.

….1, 4.1.1

Add per-version audit sub-bullets to `named_struct` and `struct` in
`docs/source/contributor-guide/spark_expressions_support.md`. The Spark
`CreateNamedStruct` class is byte-for-byte identical for behaviour
across all four versions; only internal optimizer flags (`stateful`
on 4.0, `contextIndependentFoldable` on 4.1) were added. Both
`named_struct` and `struct` SQL functions lower to the same
`CreateNamedStruct` node, so Comet handles them through one serde.

Apply the one support-level consistency fix surfaced by the audit:

- `CometCreateNamedStruct`: lift the duplicate-field-names fallback
  out of `convert` and into `getSupportLevel`, so the dispatcher
  handles the fallback uniformly and `getUnsupportedReasons()`
  documents the restriction for the compatibility guide.

No correctness divergences were found, so no tracking issues are filed
for this category.