fix: free line number tables while class pins held#510
fix: free line number tables while class pins held#510
Conversation
cleanupUnreferencedMethods() was called after finishChunk() released GetLoadedClasses pins. If a JVM freed JVMTI-allocated line number table memory on class unload, the subsequent jvmti->Deallocate in SharedLineNumberTable::~SharedLineNumberTable() would hit freed memory. Move the cleanup call into finishChunk() before releasing class pins so Deallocate always runs while the corresponding class is still loaded. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nup) Exercise the crash scenario: dynamic class gets profiled (line number table stored in method_map), class loader is dropped and GC'd, then AGE_THRESHOLD+1 dump cycles age the method out and trigger cleanup. Profiler must survive. Catches the bug on JVMs that free JVMTI- allocated memory on class unload. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CI Test ResultsRun: #25502881382 | Commit:
Status Overview
Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled Summary: Total: 32 | Passed: 32 | Failed: 0 Updated: 2026-05-07 15:21:32 UTC |
|
@codex review |
💡 Codex ReviewIn runs where this loader is not cleared before the 8-second deadline, the loop silently falls through and the test continues with the dynamic class still loaded. That means the later dump cycles exercise cleanup while the class is still pinned/live, so the regression can pass without reproducing the class-unload scenario it is meant to guard; please assert or use an assumption that ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
Addressed — added |
- CleanupAfterClassUnloadTest.java:98 — assumeTrue after GC poll so test is skipped (not silently passed) on JVMs where class unloading is non-deterministic (J9, Graal, musl variants) Co-Authored-By: muse <muse@noreply>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b8f17cf30f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| } finally { | ||
| try { Files.deleteIfExists(baseFile); } catch (IOException ignored) {} | ||
| try { Files.deleteIfExists(dumpFile); } catch (IOException ignored) {} | ||
| } |
There was a problem hiding this comment.
Stop the manually restarted profiler on abort
When this test aborts at the class-unload assumption or throws anywhere after the manual profiler.execute(...), the finally block only deletes files and leaves the profiler running. Because stopProfiler() was called before the manual start, AbstractProfilerTest.stopped remains true, so the @AfterEach cleanup also skips profiler.stop(). This can leave an active recording that makes subsequent tests fail or observe the wrong profiler state; the manual start should be paired with a stop in this finally path.
Useful? React with 👍 / 👎.
What does this PR do?:
Fixes SIGSEGV in
Recording::cleanupUnreferencedMethodsby moving the cleanup call insidefinishChunk(), beforeGetLoadedClassespins are released viaDeleteLocalRef.Motivation:
PROF-14545: Production crash (SIGSEGV) in
Recording::cleanupUnreferencedMethods, first seen in dd-trace-java 1.56.1 after PR #327 introduced method-map cleanup.Root cause:
cleanupUnreferencedMethods()was called afterfinishChunk()returned, at which point the local references fromGetLoadedClasseshad already been released. On JVM implementations that free JVMTI-allocated line number table memory when a class is unloaded, thejvmti->Deallocate(_ptr)call insideSharedLineNumberTable::~SharedLineNumberTable()would access freed memory → SIGSEGV.finishChunk()callsGetLoadedClassesat its start, which pins all loaded classes in memory for the duration. By moving cleanup to run beforeDeleteLocalRefreleases those pins, we guarantee thatDeallocateis called while the corresponding class is still loaded.Additional Notes:
The fix is a 2-line change:
finishChunk()gains ado_cleanup=falseoptional parameter, andswitchChunk()passesdo_cleanup=trueinstead of callingcleanupUnreferencedMethods()separately.How to test the change?:
Added
CleanupAfterClassUnloadTestwhich reproduces the crash scenario:GetLineNumberTableand stores the table inmethod_map)System.gc()is called until the class is confirmed unloaded viaWeakReferenceAGE_THRESHOLD+1dump cycles age the method out ofmethod_mapand trigger cleanupWithout the fix, this test will crash on JVM implementations that free JVMTI-allocated memory on class unload. With the fix, it reliably passes.
For Datadog employees: