Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
76 changes: 63 additions & 13 deletions .claude/skills/audit-comet-expression/SKILL.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
name: audit-comet-expression
description: Audit an existing Comet expression for correctness and test coverage. Studies the Spark implementation across versions 3.4.3, 3.5.8, and 4.0.1, reviews the Comet and DataFusion implementations, identifies missing test coverage, and offers to implement additional tests.
description: Audit an existing Comet expression for correctness and test coverage. Studies the Spark implementation across versions 3.4.3, 3.5.8, 4.0.1, and 4.1.1, reviews the Comet and DataFusion implementations, identifies missing test coverage, and offers to implement additional tests.
argument-hint: <expression-name>
---

Expand All @@ -10,7 +10,7 @@ Audit the Comet implementation of the `$ARGUMENTS` expression for correctness an

This audit covers:

1. Spark implementation across versions 3.4.3, 3.5.8, and 4.0.1
1. Spark implementation across versions 3.4.3, 3.5.8, 4.0.1, and 4.1.1
2. Comet Scala serde implementation
3. Comet Rust / DataFusion implementation
4. Existing test coverage (Comet SQL Tests and Comet Scala Tests)
Expand All @@ -24,7 +24,7 @@ Clone specific Spark version tags (use shallow clones to avoid polluting the wor

```bash
set -eu -o pipefail
for tag in v3.4.3 v3.5.8 v4.0.1; do
for tag in v3.4.3 v3.5.8 v4.0.1 v4.1.1; do
dir="/tmp/spark-${tag}"
if [ ! -d "$dir" ]; then
git clone --depth 1 --branch "$tag" https://github.com/apache/spark.git "$dir"
Expand All @@ -37,7 +37,7 @@ done
Search the Catalyst SQL expressions source:

```bash
for tag in v3.4.3 v3.5.8 v4.0.1; do
for tag in v3.4.3 v3.5.8 v4.0.1 v4.1.1; do
dir="/tmp/spark-${tag}"
echo "=== $tag ==="
find "$dir/sql/catalyst/src/main/scala" -name "*.scala" | \
Expand All @@ -48,7 +48,7 @@ done
If the expression is not found in catalyst, also check core:

```bash
for tag in v3.4.3 v3.5.8 v4.0.1; do
for tag in v3.4.3 v3.5.8 v4.0.1 v4.1.1; do
dir="/tmp/spark-${tag}"
echo "=== $tag ==="
find "$dir/sql" -name "*.scala" | \
Expand All @@ -73,6 +73,7 @@ Produce a concise diff summary of what changed between:

- 3.4.3 → 3.5.8
- 3.5.8 → 4.0.1
- 4.0.1 → 4.1.1

Pay attention to:

Expand All @@ -87,7 +88,7 @@ Pay attention to:
## Step 2: Locate the Spark Tests

```bash
for tag in v3.4.3 v3.5.8 v4.0.1; do
for tag in v3.4.3 v3.5.8 v4.0.1 v4.1.1; do
dir="/tmp/spark-${tag}"
echo "=== $tag ==="
find "$dir/sql" -name "*.scala" -path "*/test/*" | \
Expand Down Expand Up @@ -411,6 +412,11 @@ For an untested case, run it manually first to determine the current
behaviour, then commit either a regression test (passes) or a
`query ignore(<issue-url>)` test (fails).

Every item in this bucket either becomes an inline fix + test, or a
filed GitHub issue + ignored regression test, in the audit PR. Step 7
spells out the workflow: never leave a high-priority finding as PR-body
prose only.

### Medium priority: missing test coverage

Low-risk coverage gaps: additional input permutations on already-tested
Expand All @@ -429,9 +435,13 @@ etc. These come from the Step 5 consistency audit.

High-priority findings (correctness divergences and high-risk coverage
gaps) and consistency issues from Step 5 / Step 6 must not be left as
prose. Apply them in the same PR as the audit. Only low-risk missing
coverage requires the user's go-ahead, because adding tests for cases
that already work on well-exercised paths is incremental polish.
prose. Apply them in the same PR as the audit. Anything you cannot fix
inline (because it needs a semantics decision, native code change, or
larger design work) must still be captured as a GitHub issue per the
"Findings that need follow-up" section below: prose recommendations in
the PR body alone are insufficient. Only low-risk missing coverage
requires the user's go-ahead, because adding tests for cases that
already work on well-exercised paths is incremental polish.

### High-priority findings: capture as tests

Expand Down Expand Up @@ -502,13 +512,53 @@ need user approval. The classes of fix are:

Each fix is one of these patterns. If a finding requires a semantics
decision (e.g. is a specific branch really `Unsupported`, or is it
`Incompatible`?), do not guess: leave it as a prose recommendation in
the PR description and call it out for the reviewer.
`Incompatible`?), do not guess: **file a GitHub issue per the
"Findings that need follow-up" section below** and link it from the PR
description. Do not leave the recommendation as prose only: prose in a
PR description gets buried as soon as the PR merges.

After every fix, build the affected module to make sure the edit
compiles. Do not run the full suite; targeted tests suffice if the
fix could plausibly affect behaviour.

### Findings that need follow-up: always file a tracking issue

Any high-priority finding (correctness divergence, robustness gap,
behavioural difference from Spark, missing-collation guard, etc.) that
this PR does not fix inline **must** be filed as a GitHub issue before
the PR is opened. This includes:

- Semantics decisions the audit surfaces but should not unilaterally
resolve (e.g. promote `Compatible` to `Incompatible`, change a
default).
- Architectural concerns that span multiple expressions (e.g. Spark 4.0
collation propagation across an entire family).
- Bugs that are fixable in principle but need more design or native
changes than fit in the audit PR.
- Documentation gaps surfaced by the audit (e.g. an expression that
doesn't appear in the auto-generated compatibility doc).

For each follow-up:

1. Search for an existing issue first
(`gh issue list --search "<expression> <symptom> in:title,body" --state all --limit 5`).
If one exists, link it from the PR description and the support-doc
sub-bullet, and stop.
2. If no issue exists, file one with `gh issue create` using the
`correctness` label (or `documentation` for doc-only gaps) plus any
relevant area labels (e.g. `spark 4.0`). Title format:
`[Bug] <expression> <one-line symptom>`. Body includes: Spark version
range affected, a minimal repro, the divergent result, the relevant
Comet file/line, and a one-line note that the issue was surfaced by
this audit PR.
3. Reference the issue number from both the support-doc sub-bullet and
the PR description "What changes are included" section so reviewers
can see what work the audit intentionally deferred.

Prose-only "future work" notes in the PR description are not enough.
The whole point of the audit is to leave behind durable artefacts; an
unfiled finding evaporates after the PR merges.

### Low-risk missing test coverage: ask the user

This is the only Step 7 category that pauses for user input. It only
Expand Down Expand Up @@ -586,7 +636,7 @@ entry in `docs/source/contributor-guide/spark_expressions_support.md`.

Add one sub-bullet per Spark version checked, each including:

- Spark version (e.g. 3.4.3, 3.5.8, 4.0.1)
- Spark version (e.g. 3.4.3, 3.5.8, 4.0.1, 4.1.1)
- Today's date
- A brief note for any version-specific finding (behavioral difference, known incompatibility); omit if nothing notable

Expand All @@ -597,7 +647,7 @@ Add one sub-bullet per Spark version checked, each including:
Present the audit as:

1. **Expression Summary** - Brief description of what `$ARGUMENTS` does, its input/output types, and null behavior
2. **Spark Version Differences** - Summary of any behavioral or API differences across Spark 3.4.3, 3.5.8, and 4.0.1
2. **Spark Version Differences** - Summary of any behavioral or API differences across Spark 3.4.3, 3.5.8, 4.0.1, and 4.1.1
3. **Comet Implementation Notes** - Summary of how Comet implements this expression and any concerns
4. **Coverage Gap Analysis** - The gap table from Step 5, plus implementation gaps
5. **Recommendations** - Prioritized list from Step 6
Expand Down