[SPARK-56619][SQL][TESTS] Add DSv2 repeated table access tests with internal/external changes #55462

Open
longvu-db wants to merge 4 commits into apache:master from longvu-db:dsv2-pr2-repeated-sql

Contributor

@longvu-db longvu-db commented Apr 21, 2026

What changes were proposed in this pull request?

Add 9 tests verifying that DSv2 tables reflect the latest state when accessed repeatedly via sql() (without CACHE TABLE). Each session.sql("SELECT * FROM t") call creates a fresh QueryExecution, so it always sees the most recent data, schema, and table identity.

The tests are extracted into a shared trait DSv2RepeatedTableAccessWithExternalChangesTests backed by DSv2ExternalMutationTestBase, following the same pattern as DSv2TempViewWithStoredPlanTests (PR #55571). This allows a future Connect suite to reuse the same tests by mixing in the trait and providing Connect-specific implementations.
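
The mix-in pattern described above can be sketched in plain Scala without any Spark dependency. Every name in this block (`ExternalMutationBase`, `RepeatedAccessTests`, `ClassicSuite`, `runQuery`, `externalAppend`) is a hypothetical stand-in chosen for illustration; the PR's real trait and its four abstract methods are not shown here and may look quite different.

```scala
import scala.collection.mutable

// Hypothetical base: each execution mode supplies its own way to query a
// table and to mutate it from outside the session.
trait ExternalMutationBase {
  def runQuery(table: String): Seq[Int]
  def externalAppend(table: String, row: Int): Unit
}

// Hypothetical shared test logic, written only against the abstract methods,
// so any suite that mixes it in gets the same check.
trait RepeatedAccessTests extends ExternalMutationBase {
  def checkExternalWriteIsVisible(table: String): Boolean = {
    val before = runQuery(table)
    externalAppend(table, 42)
    // A repeated query should see the row written outside the session.
    runQuery(table) == (before :+ 42)
  }
}

// Hypothetical "classic" suite: an in-memory map stands in for the catalog.
final class ClassicSuite extends RepeatedAccessTests {
  private val data =
    mutable.Map.empty[String, Vector[Int]].withDefaultValue(Vector.empty)

  def runQuery(table: String): Seq[Int] = data(table)

  def externalAppend(table: String, row: Int): Unit =
    data(table) = data(table) :+ row
}
```

A second suite for another execution mode would extend `RepeatedAccessTests` in the same way and only provide its own `runQuery` and `externalAppend`.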

The tests cover three scenarios, each with a session-write, external-write, and caching-connector variant:

  • Scenario 1 (data writes): After a writer adds rows (via session SQL or catalog API), a subsequent sql() query sees the new data.
  • Scenario 2 (schema changes): After a writer adds a column and inserts data with the new schema (via session SQL or catalog API), a subsequent sql() query reflects the updated schema.
  • Scenario 3 (drop/recreate): After a writer drops and recreates the table (via session SQL or catalog API), a subsequent sql() query sees the empty recreated table.
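
As a rough illustration of what the three scenarios assert, here is a toy model in plain Scala. It is not Spark code: `ToyTable`, `ToyCatalog`, and all of their methods are assumptions made for this sketch, and `select` merely stands in for a `sql("SELECT * FROM t")` call that resolves the table afresh each time.

```scala
import scala.collection.mutable

// Toy model: a table has an identity, a schema, and rows of optional ints.
final case class ToyTable(
    id: Int,
    columns: Vector[String],
    rows: Vector[Vector[Option[Int]]])

final class ToyCatalog {
  private val tables = mutable.Map.empty[String, ToyTable]
  private var nextId = 0

  // Each create assigns a new identity, so a recreated table is distinguishable.
  def create(name: String, columns: Vector[String]): Unit = {
    nextId += 1
    tables(name) = ToyTable(nextId, columns, Vector.empty)
  }

  def drop(name: String): Unit = { tables.remove(name); () }

  def insert(name: String, row: Vector[Option[Int]]): Unit = {
    val t = tables(name)
    tables(name) = t.copy(rows = t.rows :+ row)
  }

  // Existing rows get an empty value for the newly added column.
  def addColumn(name: String, column: String): Unit = {
    val t = tables(name)
    tables(name) = t.copy(
      columns = t.columns :+ column,
      rows = t.rows.map(r => r :+ Option.empty[Int]))
  }

  // Stand-in for a repeated query: nothing is memoized between calls.
  def select(name: String): ToyTable = tables(name)
}
```

In this model, a `select` after `insert` sees the new row (scenario 1), a `select` after `addColumn` sees the wider schema (scenario 2), and a `select` after `drop` followed by `create` sees an empty table with a different `id` (scenario 3).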

For each scenario, the caching-connector variant (cachingcat) demonstrates that a connector with its own loadTable cache returns stale results until REFRESH TABLE invalidates the cache.
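
The stale-until-refresh behaviour can be modelled with a small, self-contained Scala sketch. This is only an assumption about the general shape of such a connector, not the actual caching catalog used by the tests; `BackingStore`, `CachingCatalog`, and `invalidate` are hypothetical names, with `invalidate` playing the role that REFRESH TABLE plays in the tests.

```scala
import scala.collection.mutable

// Toy stand-in, not a Spark class: a table snapshot is just its rows.
final case class TableSnapshot(rows: Vector[Int])

// Hypothetical backing store that always holds the latest table state.
final class BackingStore {
  private val tables = mutable.Map.empty[String, TableSnapshot]
  def write(name: String, rows: Vector[Int]): Unit =
    tables(name) = TableSnapshot(rows)
  def read(name: String): TableSnapshot = tables(name)
}

// Hypothetical connector that memoizes loadTable results until invalidated.
final class CachingCatalog(store: BackingStore) {
  private val cache = mutable.Map.empty[String, TableSnapshot]

  // The first call reads the store; later calls return the cached snapshot.
  def loadTable(name: String): TableSnapshot =
    cache.getOrElseUpdate(name, store.read(name))

  // Dropping the cached entry forces the next loadTable to re-read the store.
  def invalidate(name: String): Unit = { cache.remove(name); () }
}
```

With this model, a write that goes straight to the `BackingStore` stays invisible through `loadTable` until `invalidate` is called, which mirrors the stale-result behaviour the caching-connector variants check.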

External writes use direct catalog API calls (loadTable with write privileges, alterTable, dropTable/createTable), matching the pattern used by the existing temp view tests in the same suite.

New files

Modified files

  • DataSourceV2DataFrameSuite: Mixes in DSv2RepeatedTableAccessWithExternalChangesTests and implements the 4 classic-mode abstract methods.

Why are the changes needed?

These tests document and lock down the expected behavior: repeated sql() access without CACHE TABLE always sees the latest table state. This prevents regressions if internal resolution or caching logic changes. The trait extraction enables Connect-mode reuse without duplicating test logic.

Does this PR introduce any user-facing change?

No. This PR is test-only.

How was this patch tested?

9 new tests in DataSourceV2DataFrameSuite, all passing:

```
build/sbt 'sql/testOnly *DataSourceV2DataFrameSuite -- -z "repeated sql()"'
```

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (claude-opus-4-6)

@longvu-db longvu-db marked this pull request as draft April 21, 2026 20:48
@longvu-db longvu-db force-pushed the dsv2-pr2-repeated-sql branch from 77c6144 to 242bdaa Compare April 24, 2026 14:42
@longvu-db longvu-db changed the title [SPARK-XXXXX][SQL][TESTS] Add DSv2 repeated SQL access refresh tests [SPARK-XXXXX][SQL][TESTS] Add DSv2 repeated table access tests with external changes Apr 24, 2026
@longvu-db longvu-db changed the title [SPARK-XXXXX][SQL][TESTS] Add DSv2 repeated table access tests with external changes [SPARK-XXXXX][SQL][TESTS] Add DSv2 repeated table access tests with internal/external changes Apr 24, 2026
@longvu-db longvu-db changed the title [SPARK-XXXXX][SQL][TESTS] Add DSv2 repeated table access tests with internal/external changes [SPARK-56619][SQL][TESTS] Add DSv2 repeated table access tests with internal/external changes Apr 24, 2026
@longvu-db longvu-db marked this pull request as ready for review April 24, 2026 15:15
@longvu-db longvu-db force-pushed the dsv2-pr2-repeated-sql branch from 4b5c3a5 to 489e519 Compare April 30, 2026 16:52
Contributor

@andreaschat-db andreaschat-db left a comment

LGTM. Please update the PR description with the correct number of tests added.

@longvu-db longvu-db force-pushed the dsv2-pr2-repeated-sql branch from a668d67 to 49fc102 Compare May 20, 2026 14:09
longvu-db added 2 commits May 24, 2026 20:06
…nternal/external changes

Rebased on latest master. Removed CachingInMemoryTableCatalog.scala
from the PR since it was already merged via SPARK-56643.
Move the 9 repeated-sql tests from DataSourceV2DataFrameSuite into a
DSv2RepeatedSqlTests trait backed by DSv2ExternalMutationTestBase,
following the same pattern as DSv2TempViewWithStoredPlanTests (PR apache#55571).
This allows a future Connect suite to reuse the same tests by mixing in
the trait and providing Connect-specific implementations.

Co-authored-by: Isaac
@longvu-db longvu-db force-pushed the dsv2-pr2-repeated-sql branch from 49fc102 to 7621d0d Compare May 24, 2026 20:06
longvu-db added 2 commits May 24, 2026 20:09
- Remove build/sbt-launch-1.12.8.jar.part (accidental partial download)
- Fix DSv2ExternalMutationTestBase Scaladoc: DSv2RepeatedSqlTests ->
  DSv2RepeatedTableAccessWithExternalChangesTests

Co-authored-by: Isaac