
[SPARK-55706][SQL][TESTS] Disable DB2 JDBC Driver tests #54505

Closed
dongjoon-hyun wants to merge 1 commit into apache:master from dongjoon-hyun:SPARK-55706

Conversation

@dongjoon-hyun
Member

@dongjoon-hyun dongjoon-hyun commented Feb 26, 2026

What changes were proposed in this pull request?

This PR aims to disable DB2 JDBC driver test coverage until the driver's side effect is removed.

  • ConnectionProviderSuite is revised to use another connection provider instead of DB2.
  • SPARK-55707 is filed to re-enable the disabled tests.

Why are the changes needed?

To avoid a side effect of the DB2 test dependency on the LZ4 upgrade. We had better focus on the non-test dependency first.

BACKGROUND
The discussion was originally initiated over 2 months ago, and we tried many approaches to minimize the gap as much as possible. DB2 JDBC was the last hurdle blocking Apache Spark from upgrading our LZ4 library.

DB2 is just a test dependency (test coverage only), but it causes java.lang.NoSuchMethodError: 'net.jpountz.lz4.LZ4BlockInputStream$Builder net.jpountz.lz4.LZ4BlockInputStream.newBuilder()' at org.apache.spark.io.LZ4CompressionCodec.compressedInputStream(CompressionCodec.scala:156) when there is a mismatch between Apache Spark's built-in LZ4 library and DB2's embedded LZ4 library.
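The NoSuchMethodError above is the classic symptom of two jars shipping the same classes. As a hedged illustration (not part of this PR; the jar names are hypothetical), a small script can scan jars for class entries under net/jpountz/lz4/ to see which ones bundle their own copy of the LZ4 classes:

```python
# Hypothetical diagnostic: find which jars bundle .class entries under a
# package prefix such as net/jpountz/lz4/, i.e. which jars could conflict
# with Spark's own lz4-java on the classpath.
import zipfile

def jars_providing(package_prefix, jar_paths):
    """Return the jars that contain .class entries under package_prefix."""
    hits = []
    for jar in jar_paths:
        with zipfile.ZipFile(jar) as zf:
            if any(name.startswith(package_prefix) and name.endswith(".class")
                   for name in zf.namelist()):
                hits.append(jar)
    return hits

# Usage (hypothetical jar names):
#   jars_providing("net/jpountz/lz4/", ["lz4-java-1.8.0.jar", "jcc.jar"])
```

A jar is just a zip archive, so this needs nothing beyond the standard library.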

The Apache Spark distribution has had no documentation, assumptions, or requirements for JDBC drivers so far. That's why I suggested simply disabling the DB2 test coverage for a while in the alternative #53920 .

spark/pom.xml

Lines 1360 to 1365 in 1d56813

<dependency>
  <groupId>com.ibm.db2</groupId>
  <artifactId>jcc</artifactId>
  <version>${db2.jcc.version}</version>
  <scope>test</scope>
</dependency>

We don't want to give an extra recommendation like using 12.1.3.0_special_74723, which may contaminate Spark's classpath. IBM may want to give a recommendation from their side.
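The contamination risk comes down to JVM class resolution: when two classpath entries provide the same class, the first entry wins. A toy model of that "first entry wins" rule (hypothetical jar names; real JVM class loading is more involved) shows how a driver jar embedding an old LZ4 can shadow Spark's newer one:

```python
# Toy model (hypothetical jar names) of JVM "first classpath entry wins"
# resolution: if a driver jar embedding an old LZ4 precedes Spark's LZ4
# jar, its classes shadow Spark's, which surfaces as NoSuchMethodError
# when Spark calls a method that only exists in the newer LZ4.
def resolve(classname, classpath):
    """Return the first classpath entry claiming to provide classname."""
    for entry, provided in classpath:
        if classname in provided:
            return entry
    raise LookupError(f"ClassNotFoundException: {classname}")

classpath = [
    ("db2-jcc.jar",  {"net.jpountz.lz4.LZ4BlockInputStream"}),  # embeds old LZ4
    ("lz4-java.jar", {"net.jpountz.lz4.LZ4BlockInputStream"}),  # Spark's LZ4
]

print(resolve("net.jpountz.lz4.LZ4BlockInputStream", classpath))  # db2-jcc.jar wins
```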

We still have two ongoing JIRA issues.

  1. [SPARK-54571][CORE][SQL] Use LZ4 safeDecompressor to mitigate perf regression #53454 is the main effort, which @pan3793 has been driving (after @dbtsai).
  2. SPARK-55707 aims to re-enable the disabled tests after [SPARK-54571][CORE][SQL] Use LZ4 safeDecompressor to mitigate perf regression #53454 lands and IBM delivers a JDBC driver jar that works with Apache Spark distributions across different LZ4 versions.

I'm not sure when (2) will be done because it's a third-party decision. The bottom line is that Apache Spark should be able to proceed without being blocked by a third-party library packaging issue.

Does this PR introduce any user-facing change?

No. There is no behavior change in Spark because this is only a test dependency and test coverage.

How was this patch tested?

Pass the CIs.

Was this patch authored or co-authored using generative AI tooling?

No.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-55706] Disable DB2 JDBC Driver tests [SPARK-55706][SQL][TESTS] Disable DB2 JDBC Driver tests Feb 26, 2026
@dongjoon-hyun
Member Author

cc @pan3793 and @LuciferYang

@pan3793
Member

pan3793 commented Feb 26, 2026

@dongjoon-hyun thanks, I rebased #53454 on top of this one; let's wait for the CI result. I think it should pass.

@dongjoon-hyun
Member Author

Thank you.

@dongjoon-hyun
Member Author

Oops, I forgot to run docker-integration-tests. Let me verify further.

@dongjoon-hyun dongjoon-hyun marked this pull request as draft February 26, 2026 07:33
@dongjoon-hyun dongjoon-hyun marked this pull request as ready for review February 26, 2026 07:35
@dongjoon-hyun dongjoon-hyun marked this pull request as draft February 26, 2026 07:43
@dongjoon-hyun
Member Author

$ build/sbt -Pdocker-integration-tests "docker-integration-tests/testOnly *DB2*Suite"
...
[info] DB2IntegrationSuite:
[info] - SPARK-52184: Wrap external engine syntax error !!! IGNORED !!!
[info] - SPARK-53386: Parameter `query` should work when ending with semicolons !!! IGNORED !!!
[info] - Basic test !!! IGNORED !!!
[info] - Numeric types !!! IGNORED !!!
[info] - Date types !!! IGNORED !!!
[info] - String types !!! IGNORED !!!
[info] - Basic write test !!! IGNORED !!!
[info] - query JDBC option !!! IGNORED !!!
[info] - SPARK-30062 !!! IGNORED !!!
[info] - SPARK-42534: DB2 Limit pushdown test !!! IGNORED !!!
[info] - SPARK-48269: boolean type !!! IGNORED !!!
[info] - SPARK-48269: GRAPHIC types !!! IGNORED !!!
[info] - SPARK-48269: binary types !!! IGNORED !!!
[info] DB2IntegrationSuite:
[info] - SPARK-33034: ALTER TABLE ... add new columns !!! IGNORED !!!
[info] - SPARK-33034: ALTER TABLE ... drop column !!! IGNORED !!!
[info] - SPARK-33034: ALTER TABLE ... update column type !!! IGNORED !!!
[info] - SPARK-33034: ALTER TABLE ... rename column !!! IGNORED !!!
[info] - SPARK-33034: ALTER TABLE ... update column nullability !!! IGNORED !!!
[info] - CREATE TABLE with table comment !!! IGNORED !!!
[info] - CREATE TABLE with table property !!! IGNORED !!!
[info] - SPARK-36895: Test INDEX Using SQL !!! IGNORED !!!
[info] - SPARK-48172: Test CONTAINS !!! IGNORED !!!
[info] - SPARK-48172: Test ENDSWITH !!! IGNORED !!!
[info] - SPARK-48172: Test STARTSWITH !!! IGNORED !!!
[info] - SPARK-48172: Test LIKE !!! IGNORED !!!
[info] - SPARK-37038: Test TABLESAMPLE (true) !!! IGNORED !!!
[info] - SPARK-37038: Test TABLESAMPLE (false) !!! IGNORED !!!
[info] - simple scan (true) !!! IGNORED !!!
[info] - simple scan (false) !!! IGNORED !!!
[info] - simple scan with LIMIT (true) !!! IGNORED !!!
[info] - simple scan with LIMIT (false) !!! IGNORED !!!
[info] - simple scan with top N (true) !!! IGNORED !!!
[info] - simple scan with top N (false) !!! IGNORED !!!
[info] - simple scan with OFFSET (true) !!! IGNORED !!!
[info] - simple scan with OFFSET (false) !!! IGNORED !!!
[info] - simple scan with LIMIT and OFFSET (true) !!! IGNORED !!!
[info] - simple scan with LIMIT and OFFSET (false) !!! IGNORED !!!
[info] - simple scan with paging: top N and OFFSET (true) !!! IGNORED !!!
[info] - simple scan with paging: top N and OFFSET (false) !!! IGNORED !!!
[info] - scan with aggregate push-down: VAR_POP with DISTINCT (true) !!! IGNORED !!!
[info] - scan with aggregate push-down: VAR_POP with DISTINCT (false) !!! IGNORED !!!
[info] - scan with aggregate push-down: VAR_SAMP with DISTINCT (true) !!! IGNORED !!!
[info] - scan with aggregate push-down: VAR_SAMP with DISTINCT (false) !!! IGNORED !!!
[info] - scan with aggregate push-down: STDDEV_POP with DISTINCT (true) !!! IGNORED !!!
[info] - scan with aggregate push-down: STDDEV_POP with DISTINCT (false) !!! IGNORED !!!
[info] - scan with aggregate push-down: STDDEV_SAMP with DISTINCT (true) !!! IGNORED !!!
[info] - scan with aggregate push-down: STDDEV_SAMP with DISTINCT (false) !!! IGNORED !!!
[info] - scan with aggregate push-down: COVAR_POP with DISTINCT (true) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: COVAR_POP with DISTINCT (false) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: COVAR_SAMP with DISTINCT (true) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: COVAR_SAMP with DISTINCT (false) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: CORR with DISTINCT (true) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: CORR with DISTINCT (false) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_INTERCEPT with DISTINCT (true) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_INTERCEPT with DISTINCT (false) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_SLOPE with DISTINCT (true) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_SLOPE with DISTINCT (false) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_R2 with DISTINCT (true) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_R2 with DISTINCT (false) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_SXY with DISTINCT (true) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_SXY with DISTINCT (false) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: VAR_POP without DISTINCT (true) !!! IGNORED !!!
[info] - scan with aggregate push-down: VAR_POP without DISTINCT (false) !!! IGNORED !!!
[info] - scan with aggregate push-down: VAR_SAMP without DISTINCT (true) !!! IGNORED !!!
[info] - scan with aggregate push-down: VAR_SAMP without DISTINCT (false) !!! IGNORED !!!
[info] - scan with aggregate push-down: STDDEV_POP without DISTINCT (true) !!! IGNORED !!!
[info] - scan with aggregate push-down: STDDEV_POP without DISTINCT (false) !!! IGNORED !!!
[info] - scan with aggregate push-down: STDDEV_SAMP without DISTINCT (true) !!! IGNORED !!!
[info] - scan with aggregate push-down: STDDEV_SAMP without DISTINCT (false) !!! IGNORED !!!
[info] - scan with aggregate push-down: COVAR_POP without DISTINCT (true) !!! IGNORED !!!
[info] - scan with aggregate push-down: COVAR_POP without DISTINCT (false) !!! IGNORED !!!
[info] - scan with aggregate push-down: COVAR_SAMP without DISTINCT (true) !!! IGNORED !!!
[info] - scan with aggregate push-down: COVAR_SAMP without DISTINCT (false) !!! IGNORED !!!
[info] - scan with aggregate push-down: CORR without DISTINCT (true) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: CORR without DISTINCT (false) (excluded) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_INTERCEPT without DISTINCT (true) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_INTERCEPT without DISTINCT (false) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_SLOPE without DISTINCT (true) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_SLOPE without DISTINCT (false) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_R2 without DISTINCT (true) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_R2 without DISTINCT (false) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_SXY without DISTINCT (true) !!! IGNORED !!!
[info] - scan with aggregate push-down: REGR_SXY without DISTINCT (false) !!! IGNORED !!!
[info] - SPARK-48618: Renaming the table to the name of an existing table !!! IGNORED !!!
[info] - SPARK-48618: Test table does not exists error !!! IGNORED !!!
[info] - scan with filter push-down with date time functions !!! IGNORED !!!
[info] - SPARK-50792: Format binary data as a binary literal in JDBC. !!! IGNORED !!!
[info] - SPARK-52262: FAILED_JDBC.TABLE_EXISTS not thrown on connection error !!! IGNORED !!!
[info] DB2NamespaceSuite:
[info] - listNamespaces: basic behavior !!! IGNORED !!!
[info] - Drop namespace !!! IGNORED !!!
[info] DB2KrbIntegrationSuite:
[info] - Basic read test in query option !!! IGNORED !!!
[info] - Basic read test in create table path !!! IGNORED !!!
[info] - Basic write test !!! IGNORED !!!
[info] - SPARK-35226: JDBCOption should accept refreshKrb5Config parameter !!! IGNORED !!!
[info] Run completed in 958 milliseconds.
[info] Total number of tests run: 0
[info] Suites: completed 4, aborted 0
[info] Tests: succeeded 0, failed 0, canceled 0, ignored 94, pending 0
[info] No tests were executed.
[success] Total time: 14 s, completed Feb 25, 2026, 11:48:08 PM

@dongjoon-hyun dongjoon-hyun marked this pull request as ready for review February 26, 2026 07:49
@dongjoon-hyun
Member Author

Since it's too late for me in Cupertino, I'll check the CI result tomorrow morning. I'll ping you next time after the PR passes the CI. Sorry for the trouble.

@pan3793
Member

pan3793 commented Feb 26, 2026

@dongjoon-hyun thank you, I will keep an eye on this.

@pan3793
Member

pan3793 commented Feb 26, 2026

thanks, merging to master

@pan3793 pan3793 closed this in 811c3fa Feb 26, 2026
@viirya
Member

viirya commented Feb 26, 2026

It would be good to explain what the side effect is in the PR description. Neither the JIRA nor this PR has a clear description of it. Although #53920 has the discussion, it would be good to explicitly write it here.

@pan3793
Member

pan3793 commented Feb 26, 2026

@viirya #53454 (comment) might answer your question

@viirya
Member

viirya commented Feb 26, 2026

@viirya #53454 (comment) might answer your question

Yeah, I know. What I mean is that we should make the PR self-descriptive, so that others who don't know the context can understand this change immediately instead of needing to look for information across several PRs.

@pan3793
Member

pan3793 commented Feb 26, 2026

What I mean is that we should make the PR self-descriptive, so that others who don't know the context can understand this change immediately instead of needing to look for information across several PRs.

Definitely! Sorry for the inconvenience.

@dongjoon-hyun
Member Author

Thank you, @viirya. You are right. I agree with you that the current description (and the commit log) wasn't enough. Let me revise the PR title at least (although it's already committed).

To all reviewers:

Just for the record, the discussion was originally initiated over 2 months ago, and we tried many approaches to minimize the gap as much as possible. DB2 JDBC was the last hurdle blocking Apache Spark from upgrading our LZ4 library.

DB2 is just a test dependency (test coverage only). The Apache Spark distribution has had no documentation, assumptions, or requirements for JDBC drivers so far. That's why I suggested simply disabling the DB2 test coverage for a while in the alternative #53920 .

spark/pom.xml

Lines 1360 to 1365 in 1d56813

<dependency>
  <groupId>com.ibm.db2</groupId>
  <artifactId>jcc</artifactId>
  <version>${db2.jcc.version}</version>
  <scope>test</scope>
</dependency>

We don't want to give an extra recommendation like using 12.1.3.0_special_74723, which may contaminate Spark's classpath. IBM may want to give a recommendation from their side.

We still have two ongoing JIRA issues.

  1. [SPARK-54571][CORE][SQL] Use LZ4 safeDecompressor to mitigate perf regression #53454 is the main effort, which @pan3793 has been driving (after @dbtsai).
  2. SPARK-55707 aims to re-enable the disabled tests after [SPARK-54571][CORE][SQL] Use LZ4 safeDecompressor to mitigate perf regression #53454 lands and IBM delivers a JDBC driver jar that works with Apache Spark distributions across different LZ4 versions.

I'm not sure when (2) will be done because it's a third-party decision. The bottom line is that Apache Spark should be able to proceed without being blocked by a third-party library packaging issue.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-55706 branch February 26, 2026 16:55
@dongjoon-hyun
Member Author

The PR description is updated, @viirya and @pan3793. It's my bad.

@viirya
Member

viirya commented Feb 26, 2026

Thank you @dongjoon-hyun @pan3793 for making the description better and unblocking https://github.com/apache/spark/pull/53454.

@dongjoon-hyun
Member Author

Thank you so much for your consistently thoughtful and sincere reviews, @viirya . I truly appreciate the time and care you put into every review. 😄

