Skip to content

ByteBuf leak fixes#1876

Open
rozza wants to merge 7 commits intomongodb:ByteBuffrom
rozza:JAVA-6081
Open

ByteBuf leak fixes#1876
rozza wants to merge 7 commits intomongodb:ByteBuffrom
rozza:JAVA-6081

Conversation

@rozza
Copy link
Member

@rozza rozza commented Feb 2, 2026

  • Ensure Default Server Monitor calls close on resources before interrupt
  • Update ByteBufferBsonOutput documentation
  • Improve ReplyHeader testing and ensure resources are closed
  • Improve ServerSessionPool testing
  • Ensure reactive client session closing is idempotent
  • Added System.gc to unified test cleanup. Should cause more gc when testing.

JAVA-6081

- Ensure Default Server Monitor calls close on resources before interrupt
- Update ByteBufferBsonOutput documentation
- Improve ReplyHeader testing and ensure resources are closed
- Improve ServerSessionPool testing
- Ensure reactive client session closing is idempotent
- Added System.gc to unified test cleanup. Should cause more gc when testing.

JAVA-6081
@rozza rozza requested a review from a team as a code owner February 2, 2026 15:11
@rozza rozza requested a review from katcharov February 2, 2026 15:11
@rozza
Copy link
Member Author

rozza commented Feb 2, 2026

DefaultServerMonitor

  • Resource cleanup before thread interrupt
  • Prevents leaks during shutdown scenarios

Other changes

  • Test migrations
  • Improved documentation in ByteBufBsonOutput
  • Improved resource handling in BsonDocument & legacy
  • Updated reactive stream Fixture to increase use of Netty when transport setting configured
  • Updated unified tests to increase the gc churn

@rozza rozza requested a review from stIncMale February 2, 2026 15:14
@strogiyotec strogiyotec mentioned this pull request Feb 9, 2026
@katcharov katcharov requested review from nhachicha and removed request for katcharov and stIncMale February 19, 2026 16:31
@nhachicha nhachicha requested a review from Copilot March 10, 2026 12:58
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets ByteBuf/leak-related issues by tightening resource lifecycle management in core/legacy/reactive code paths and by updating tests/docs to better enforce buffer ownership semantics.

Changes:

  • Adjust server monitor and reactive session shutdown paths to close/cleanup resources more safely and make session close idempotent.
  • Replace Spock specifications with JUnit 5 tests for ServerSessionPool and ReplyHeader, and add new monitor edge-case tests.
  • Clarify/document ByteBuf ownership in ByteBufferBsonOutput and add explicit ByteBuf releasing in BSON serialization code.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
driver-sync/src/test/functional/com/mongodb/client/unified/UnifiedTest.java Adds GC request during unified-test cleanup to help leak detection.
driver-sync/src/test/functional/com/mongodb/client/AbstractSessionsProseTest.java Avoids assuming lsid exists in all commands during sessions prose tests.
driver-reactive-streams/src/test/functional/com/mongodb/reactivestreams/client/Fixture.java Applies overridden TransportSettings to reactive-streams test client settings.
driver-reactive-streams/src/main/com/mongodb/reactivestreams/client/internal/ClientSessionPublisherImpl.java Makes reactive client session closing idempotent and clears transaction context on close.
driver-legacy/src/main/com/mongodb/DBDecoderAdapter.java Converts manual closes to try-with-resources for BSON writer/output.
driver-core/src/test/unit/com/mongodb/internal/session/ServerSessionPoolTest.java New JUnit 5 tests replacing the removed Spock spec.
driver-core/src/test/unit/com/mongodb/internal/session/ServerSessionPoolSpecification.groovy Removed Spock specification (replaced by JUnit 5 test).
driver-core/src/test/unit/com/mongodb/internal/connection/DefaultServerMonitorTest.java Adds tests for close-during-open and null connection description scenarios.
driver-core/src/test/functional/com/mongodb/internal/connection/ReplyHeaderTest.java New JUnit 5 tests replacing the removed Spock spec.
driver-core/src/test/functional/com/mongodb/internal/connection/ReplyHeaderSpecification.groovy Removed Spock specification (replaced by JUnit 5 test).
driver-core/src/main/com/mongodb/internal/connection/DefaultServerMonitor.java Changes monitor shutdown/heartbeat flow to close connections under lock and handle null descriptions.
driver-core/src/main/com/mongodb/internal/connection/ByteBufferBsonOutput.java Expands ByteBuf ownership documentation and uses ResourceUtil.release for safer cleanup.
config/spotbugs/exclude.xml Adds a SpotBugs exclusion entry for DefaultServerMonitor.
bson/src/main/org/bson/BsonDocument.java Explicitly releases ByteBufs during serialization proxy creation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@rozza rozza requested a review from Copilot March 10, 2026 13:47
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 7 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

List<ByteBuf> byteBuffers = buffer.getByteBuffers();
for (ByteBuf cur : byteBuffers) {
System.arraycopy(cur.array(), cur.position(), bytes, curPos, cur.limit());
curPos += cur.position();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling cur.release() was also an option, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it is, this follows the pattern used through out to clean up after the loop.

return result;
});

if (localConnection != null) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Condition 'localConnection != null' is always 'true' ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If close was called twice it would be null on the second iteration

return null;
});
if (description != null) {
if (!shouldStreamResponses) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a sample should not be gated by description != null RTT sampling is independent of logging

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch

while (!isClosed) {
try {
if (connection == null) {
InternalConnection localConnection = connection;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use withLock to dereference the connection?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous lock comment

logStateChange(previousServerDescription, currentServerDescription);
sdamProvider.get().monitorUpdate(currentServerDescription);

InternalConnection localConnection = connection;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Connection accessed without the lock?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this use case is fine as its a single snapshot of a volatile here.

I double checked this with claude:

The simple snapshots (L228, L243, L260, L622, L646) are just a single volatile read followed by use of the local. They don't need atomicity with any other field — they just need to avoid reading connection twice. volatile alone is sufficient for that.

So the lock is needed reading a volatile field multiple times without atomicity. Between any two reads, close() or cancelCurrentCheck() on another thread can null the field, causing the second read to return null and NPE on the dereference.

eg Previous Code:

// This contains two read checks and the connection could be nulled inbetween.
if (connection == null || connection.isClosed()) {
    // Do something
}

Fix for only do if not closed:

InternalConnection localConnection = withLock(lock, () -> {
    if (connection == null || connection.isClosed()) {
        return null;
    }
    return connection;
});

if (localConnection != null) {
    // Do something
}

Copy link
Member Author

@rozza rozza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Responded to comments.

logStateChange(previousServerDescription, currentServerDescription);
sdamProvider.get().monitorUpdate(currentServerDescription);

InternalConnection localConnection = connection;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this use case is fine as its a single snapshot of a volatile here.

I double checked this with claude:

The simple snapshots (L228, L243, L260, L622, L646) are just a single volatile read followed by use of the local. They don't need atomicity with any other field — they just need to avoid reading connection twice. volatile alone is sufficient for that.

So the lock is needed reading a volatile field multiple times without atomicity. Between any two reads, close() or cancelCurrentCheck() on another thread can null the field, causing the second read to return null and NPE on the dereference.

eg Previous Code:

// This contains two read checks and the connection could be nulled inbetween.
if (connection == null || connection.isClosed()) {
    // Do something
}

Fix for only do if not closed:

InternalConnection localConnection = withLock(lock, () -> {
    if (connection == null || connection.isClosed()) {
        return null;
    }
    return connection;
});

if (localConnection != null) {
    // Do something
}

return null;
});
if (description != null) {
if (!shouldStreamResponses) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch

while (!isClosed) {
try {
if (connection == null) {
InternalConnection localConnection = connection;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous lock comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants