Skip to content

[VL] Add cross-config / cross-build-cycle invariant tests for ColumnarCachedBatchSerializer#12124

Open
yaooqinn wants to merge 1 commit into
apache:mainfrom
yaooqinn:users/kentyao/cache-stats-fu-b4-test-coverage-v2
Open

[VL] Add cross-config / cross-build-cycle invariant tests for ColumnarCachedBatchSerializer#12124
yaooqinn wants to merge 1 commit into
apache:mainfrom
yaooqinn:users/kentyao/cache-stats-fu-b4-test-coverage-v2

Conversation

@yaooqinn
Copy link
Copy Markdown
Member

What changes are proposed in this pull request?

Extend ColumnarCachedBatchE2ESuite with three lifecycle invariant tests for ColumnarCachedBatchSerializer, exercising the cached-batch wire format across SQLConf transitions:

  1. cross-config (forward): build with spark.gluten.sql.columnar.maxBatchSize-stats path enabled, read with it disabled. The wire format is build-time-decided; a v2-with-stats payload must survive a reader-time downgrade and partition pruning must still engage.
  2. cross-config (reverse): build with stats disabled, read with stats enabled. The legacy v1 payload (stats=null) at build time must NOT be retro-fitted by the reader; the query must fall back to a full scan.
  3. cross-build-cycle: same logical query rebuilt twice with different stats settings. Round 2 must re-honor its own SQLConf rather than reuse stale gate state from round 1.

Each test asserts (a) result row-count correctness, (b) the cached batches are served by ColumnarCachedBatchSerializer (R-path), and (c) when expectPrune=true, that InMemoryTableScanExec.numOutputRows reflects partition pruning (R-prune).

The shared assertion logic is factored into a private helper assertGlutenCachedPlanAndPrune(df, expectPrune). Its scaladoc documents an intentional asymmetry: the reverse "no prune" direction is not observable through numOutputRows on the Gluten native scan path — the existing baseline test "numOutputRows reflects post-filter row count" already notes that outRows may legitimately be 0 even under full pruning, because the surviving row is delivered via the native scan metrics path. The expectPrune=false branch therefore intentionally performs path-only verification.

How was this patch tested?

ColumnarCachedBatchE2ESuite — 15 succeeded / 0 failed / 2 canceled (pre-existing baseline) locally on Spark 3.5, Velox backend.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7

@github-actions github-actions Bot added the VELOX label May 21, 2026
…rCachedBatchSerializer

Extend ColumnarCachedBatchE2ESuite with three lifecycle invariant tests that
exercise the cached-batch wire format across SQLConf transitions:

  1. cross-config: build with stats=true, read with stats=false
     -- wire format is build-time-decided; v2-with-stats payload must
        survive a reader-time downgrade and prune must still engage.
  2. cross-config (reverse): build with stats=false, read with stats=true
     -- legacy v1 payload (stats=null) at build time; reader must NOT
        fabricate stats and must fall back to full scan.
  3. cross-build-cycle: same logical query rebuilt twice with different
     stats settings (round 1 stats=true, round 2 stats=false). Round 2
     must re-honor stats=false; the serializer must not reuse stale gate
     state from round 1.

Each test asserts (R-correct) result row count, (R-path) the cached
batches are served by ColumnarCachedBatchSerializer (not vanilla
DefaultCachedBatchSerializer), and (R-prune, when expectPrune=true)
that InMemoryTableScanExec.numOutputRows reflects partition pruning.

The shared assertion logic is factored into a private helper
`assertGlutenCachedPlanAndPrune(df, expectPrune)`. Its scaladoc
documents an intentional asymmetry: the reverse "no prune" direction
is not observable via numOutputRows on the Gluten native path (the same
baseline test "numOutputRows reflects post-filter row count" already
notes that outRows may legitimately be 0 even with full pruning, because
the surviving row is delivered through the native scan metrics path).
The expectPrune=false branch therefore intentionally performs path-only
verification.

All 15 suite tests pass locally (Spark 3.3, Velox backend).

Generated-by: Claude claude-opus-4.7
@yaooqinn yaooqinn force-pushed the users/kentyao/cache-stats-fu-b4-test-coverage-v2 branch from 021d402 to cfe1058 Compare May 21, 2026 14:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants