Preserve live-allocation tracking across worker resets #550
Workers are restarted by forking a fresh process from the parent, which loses everything in DDProfWorkerContext, including the heap-tracking aggregator in LiveAllocation. Until natural alloc/free traffic refills the map, live-heap is undercounted for the rest of the target's life.

Add a serialisation path that survives the fork:
- main_loop allocates a memfd that the parent keeps open and every worker child inherits.
- On 'restart_worker', the outgoing child resolves its UnwindOutput handles back to portable strings (via libdatadog Function2/Mapping2 read-back) and writes a self-owned snapshot to the memfd.
- The new child reads the snapshot in worker_library_init, re-interns mappings/functions into its fresh ProfilesDictionary and rebuilds the LiveAllocation maps before the poll loop starts draining events.
- LiveAllocation owns a string deque backing the string_views of restored UnwindOutputs; live entries built from incoming events keep using Process/base-frame views.

Budget enforcement, value-preserving:
- Default target 4 MB, hard ceiling 20 MB.
- When over budget, rank stacks by aggregate value and drop the lowest; their addresses are remapped to a synthetic [live-alloc cleared] common frame so per-PID heap totals remain correct.
- If still over after dropping all stacks, drop entire PIDs from the lowest aggregate value upwards.

In-flight events between the old child exit and the new child's first poll are still lost; a library-side pause hook is a separate change.
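A minimal sketch of the stack-shedding step of the budget enforcement described in this commit. Only the 4 MB / 20 MB figures and the "drop lowest aggregate value, remap to a cleared frame" rule come from the commit message; the `Stack`/`Alloc` types, the size model, the function names and the `k_cleared` sentinel value are assumptions made for illustration, and the per-PID fallback is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <numeric>
#include <vector>

// Budget figures are from the commit message; all names and the size model
// below are illustrative assumptions, not the PR's code.
constexpr std::size_t k_default_budget = 4U << 20; // 4 MB default target
constexpr std::size_t k_hard_ceiling = 20U << 20;  // 20 MB hard ceiling
constexpr std::uint32_t k_cleared =
    std::numeric_limits<std::uint32_t>::max(); // assumed sentinel index

struct Stack {
  std::size_t bytes; // serialized size of this stack's frames
};
struct Alloc {
  std::uint64_t addr;
  std::int64_t value;
  std::uint32_t stack_idx; // index into the stack table, or k_cleared
};

// A requested budget never exceeds the hard ceiling.
std::size_t clamp_budget(std::size_t requested) {
  return std::min(requested, k_hard_ceiling);
}

std::size_t snapshot_bytes(const std::vector<Stack> &stacks,
                           const std::vector<Alloc> &allocs) {
  std::size_t total = allocs.size() * sizeof(Alloc);
  for (const Stack &s : stacks) {
    total += s.bytes;
  }
  return total;
}

// Drops the lowest-value stacks until the snapshot fits the budget and remaps
// their allocations to k_cleared. Returns the number of stacks cleared.
std::size_t shed_stacks(std::vector<Stack> &stacks, std::vector<Alloc> &allocs,
                        std::size_t budget) {
  // Aggregate value per stack.
  std::vector<std::int64_t> value(stacks.size(), 0);
  for (const Alloc &a : allocs) {
    if (a.stack_idx == k_cleared) continue;
    assert(a.stack_idx < stacks.size());
    value[a.stack_idx] += a.value;
  }
  // Visit stacks from lowest to highest aggregate value.
  std::vector<std::uint32_t> order(stacks.size());
  std::iota(order.begin(), order.end(), 0U);
  std::sort(order.begin(), order.end(),
            [&](std::uint32_t l, std::uint32_t r) { return value[l] < value[r]; });

  std::size_t total = snapshot_bytes(stacks, allocs);
  std::vector<bool> cleared(stacks.size(), false);
  std::size_t n_cleared = 0;
  for (const std::uint32_t idx : order) {
    if (total <= budget) break;
    total -= stacks[idx].bytes;
    stacks[idx].bytes = 0; // frames are no longer serialized
    cleared[idx] = true;
    ++n_cleared;
  }
  // Addresses keep their values; only the stack detail is shed.
  for (Alloc &a : allocs) {
    if (a.stack_idx != k_cleared && cleared[a.stack_idx]) a.stack_idx = k_cleared;
  }
  return n_cleared;
}
```

Because the allocations themselves are never removed in this step, the sum of `value` over all entries is unchanged, which is what keeps the per-PID heap total correct.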
Add a third live-heap variant to simple_malloc-ut.sh that drives the
worker into at least one reset (upload_period=2s, worker_period=2) with
--skip-free 100 keeping ~99% of allocations live, and checks:
- at least one '[live-alloc] Snapshot restored' log line
- zero 'Tracked address count mismatch' warnings between the profiler
and the in-target library after restore
Adds ~7s to the simple_malloc suite (target needs to outlive 2 export
cycles). Same test runs under DD_PROFILING_REORDER_EVENTS=1 too.
clang-tidy errors flagged by CI:
- readability-math-missing-parentheses on sizeof(T) * N + ... arithmetic
- cppcoreguidelines-avoid-const-or-ref-data-members on Writer::_out (switch the reference member to a non-owning pointer)
- readability-uppercase-literal-suffix (0u -> 0U)
- misc-const-correctness on loop indices (uint32_t idx -> uint32_t const idx)

Also adds a TODO block above portable_to_uo() spelling out the four overlapping caches (ProfilesDictionary, SymbolTable/MapInfoTable, RuntimeSymbolLookup et al., _restored_strings), the duplicate-entry cost we accept on the restore path, and how a future PR can unify the model by making FunLoc identity content-based on libdatadog handles.
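For readers unfamiliar with the Writer::_out finding, here is a small sketch of the pattern. Only the member name `_out` and the reference-to-pointer switch come from the commit message; the buffer type, `put()` and `payload_bytes()` are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the PR's Writer; the output type is assumed.
class Writer {
public:
  explicit Writer(std::vector<std::uint8_t> *out) : _out(out) {
    assert(out != nullptr);
  }

  // Appends the raw bytes of a trivially copyable value.
  template <typename T> void put(const T &v) {
    static_assert(std::is_trivially_copyable_v<T>);
    const auto *p = reinterpret_cast<const std::uint8_t *>(&v);
    _out->insert(_out->end(), p, p + sizeof(T));
  }

private:
  std::vector<std::uint8_t> *_out; // non-owning pointer, not a reference member
};

// With a reference member the implicit copy assignment would be deleted.
static_assert(std::is_copy_assignable_v<Writer>);

// Explicit parentheses around the product, as the parentheses check asks for.
template <typename T> constexpr std::size_t payload_bytes(std::size_t n) {
  return (sizeof(T) * n) + sizeof(std::uint32_t);
}
```

The pointer keeps the class assignable, so a Writer can be retargeted at another buffer without being rebuilt from scratch.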
- DD_PROFILING_NATIVE_LIVE_ALLOC_SNAPSHOT_MAX_BYTES overrides the per-capture budget. Capped at the hard ceiling. Lets tests force the cleared-stack remap path and the dropped-pid fallback without rebuilding the binary.
- simple_malloc --unique-sites N spreads allocations across up to 256 templated alloc_at_site<Tag> instantiations, each producing a distinct innermost frame to the unwinder. Used to stress-test the snapshot path with many unique stacks per cycle.

Verified locally at three budget levels: full preservation, cleared remap (stacks=30 cleared=582 dropped_pids=0 at 240 KB), and pid drop (dropped_pids=1 at 8 KB). All paths keep 'Tracked address count mismatch' warnings at zero in the steady state.
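One way the --unique-sites spreading could be built is sketched below. The name alloc_at_site<Tag> and the limit of 256 come from the commit message; the dispatch table, `alloc_spread()`, and the `g_last_site` global are assumptions, and the `[[gnu::noinline]]` attribute assumes GCC or Clang.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <utility>

// Records the last site used. Writing it after malloc keeps every
// instantiation's body unique and stops malloc from becoming a tail call,
// so the site's frame is still on the stack while malloc runs.
volatile std::size_t g_last_site = 0;

// Each Tag yields a distinct function, hence a distinct innermost frame.
template <std::size_t Tag>
[[gnu::noinline]] void *alloc_at_site(std::size_t size) {
  void *p = std::malloc(size);
  g_last_site = Tag;
  return p;
}

using AllocFn = void *(*)(std::size_t);

// Builds a table with one function pointer per instantiation.
template <std::size_t... Tags>
constexpr std::array<AllocFn, sizeof...(Tags)>
make_sites(std::index_sequence<Tags...> /*tags*/) {
  return {{&alloc_at_site<Tags>...}};
}

constexpr auto k_sites = make_sites(std::make_index_sequence<256>{});

// Routes the i-th allocation to one of min(unique_sites, 256) sites.
void *alloc_spread(std::size_t i, std::size_t unique_sites, std::size_t size) {
  assert(unique_sites > 0);
  const std::size_t n = std::min(unique_sites, k_sites.size());
  return k_sites[i % n](size);
}
```

Calling `alloc_spread(i, N, size)` in a loop then cycles through N different call sites, which gives the unwinder N different stacks for otherwise identical allocations.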
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a74f2b018a
  const UnwindOutput &uo = (a.stack_idx == k_cleared_stack_idx)
      ? cleared_uo
      : rebuilt[a.stack_idx];
  live_alloc.register_allocation(uo, a.addr, a.value, p.watcher_pos, p.pid);
Register restored PIDs with process tracking
When a worker restores live allocations for a PID and that process exits before the new worker unwinds or maps that PID, the PID is only reintroduced into LiveAllocation here, not into process_hdr. clear_unvisited_pids() only removes PIDs known to process_hdr, and ddprof_pr_exit() only logs exits, so the restored live allocations can survive as stale in-use heap samples for a dead process across later exports.
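The failure mode in this comment can be reproduced with a toy model. Everything here is a hypothetical stand-in (the real LiveAllocation and process_hdr are much richer); the only behaviour taken from the review is that cleanup considers just the PIDs known to the process tracker.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>

using Pid = int;

// Toy stand-in for the per-PID live-heap state.
struct ToyLiveAllocation {
  std::unordered_map<Pid, std::int64_t> live_bytes;
};

// Toy stand-in for the process tracker.
struct ToyProcessHdr {
  std::unordered_set<Pid> known;

  // Models the behaviour the review describes: only PIDs known to the
  // tracker are candidates for cleanup.
  void clear_unvisited_pids(const std::unordered_set<Pid> &visited,
                            ToyLiveAllocation &live) {
    for (auto it = known.begin(); it != known.end();) {
      if (visited.count(*it) == 0) {
        live.live_bytes.erase(*it);
        it = known.erase(it);
      } else {
        ++it;
      }
    }
  }
};

// Restores one PID's live bytes, optionally registering it with the tracker.
void restore_pid(ToyLiveAllocation &live, ToyProcessHdr &hdr, Pid pid,
                 std::int64_t bytes, bool register_with_tracker) {
  assert(bytes >= 0);
  live.live_bytes[pid] += bytes;
  if (register_with_tracker) {
    hdr.known.insert(pid);
  }
}
```

In this model a PID restored without registration survives every cleanup pass, while a registered PID that is never visited again is removed on the next pass.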
What
Snapshot LiveAllocation to a parent-held memfd before a worker restart and restore it in the newly forked worker, so live-heap tracking survives the periodic worker reset instead of starting from zero each time.
Why
Workers are reset by forking a fresh process from the parent (every worker_period exports). Everything in DDProfWorkerContext, including the heap-tracking aggregator in LiveAllocation, is discarded. The target process keeps allocating addresses the profiler no longer has stacks for, so live-heap is undercounted until natural alloc/free traffic refills the map. The library has no way to replay since it only tracks addresses, not stacks.
How
- main_loop creates a memfd that the parent keeps open and every worker child inherits.
- On restart_worker, the outgoing child writes a self-owned snapshot to the memfd: UnwindOutput handles are resolved back to portable strings via libdatadog Function2/Mapping2 read-back; all string_views into Process/base-frame caches are copied.
- […] final synchronous export).
- The new child reads the snapshot in worker_library_init after the new SymbolHdr/Symbolizer/ProfilesDictionary are constructed but before the poll loop drains events. Mappings and functions are re-interned into the fresh dictionary; LiveAllocation owns a string deque backing the string_views of restored UnwindOutputs.
- […] state into the next worker.
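The hand-off between two sibling children through a parent-held memfd can be sketched as follows (Linux only). `memfd_create`, `fork`, `pwrite` and `pread` are the real system interfaces; the function names, the memfd name and the payload are made up for the example, and the real snapshot format is not shown.

```cpp
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <string>
#include <string_view>

// Runs fn in a forked child; true if the child exited with status 0.
template <typename Fn> bool run_in_child(Fn fn) {
  const pid_t pid = fork();
  if (pid < 0) return false;
  if (pid == 0) _exit(fn() ? 0 : 1); // child: skip atexit handlers and flushes
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return false;
  return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

// Parent creates the memfd, one child writes, a later child reads it back.
bool handoff_through_memfd(std::string_view payload) {
  assert(!payload.empty());
  // The parent owns the fd. Forked children inherit it; MFD_CLOEXEC only
  // affects exec, not fork.
  const int fd = memfd_create("live_alloc_snapshot", MFD_CLOEXEC);
  if (fd < 0) return false;

  // Outgoing worker: replace the memfd content with the snapshot bytes.
  // pwrite/pread take explicit offsets because forked processes share the
  // file offset of the same open file description.
  const bool wrote = run_in_child([&] {
    return ftruncate(fd, 0) == 0 &&
        pwrite(fd, payload.data(), payload.size(), 0) ==
        static_cast<ssize_t>(payload.size());
  });

  // Next worker, forked from the parent afterwards: read the bytes back.
  const bool read_back = wrote && run_in_child([&] {
    std::string buf(payload.size(), '\0');
    return pread(fd, buf.data(), buf.size(), 0) ==
        static_cast<ssize_t>(buf.size()) &&
        buf == payload;
  });

  close(fd);
  return read_back;
}
```

The two children never exist at the same time and share no memory; the memfd kept open by the parent is the only thing that carries state from one to the other.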
Budget enforcement (value-preserving)
- When over budget, stacks are ranked by aggregate value and the lowest are dropped; their addresses get remapped to a synthetic [live-alloc cleared] common frame so per-PID heap totals remain correct even when detail is shed.
- […] live_alloc.snapshot.bytes, live_alloc.snapshot.cleared_stacks, live_alloc.snapshot.dropped_pids.
Known gap
Events arriving between the old child's exit and the new child's first poll
are still lost. A library-side pause hook can be added separately.
Tests
- live_allocation_snapshot-ut (binary round-trip, bad-magic and truncation rejection, empty snapshot).
- live_allocation-ut still green.
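To make the cases listed for live_allocation_snapshot-ut concrete, here is a sketch of a round-trip with bad-magic and truncation rejection. The layout (magic, count, then addr/value pairs), the magic value and all names are assumptions for illustration; the PR's actual format also carries stacks, mappings and per-PID data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <type_traits>
#include <vector>

constexpr std::uint32_t k_magic = 0x50414E53U; // hypothetical magic value

struct Entry {
  std::uint64_t addr;
  std::int64_t value;
};

using Bytes = std::vector<std::uint8_t>;

// Appends the raw bytes of a trivially copyable value.
template <typename T> void put(Bytes &out, const T &v) {
  static_assert(std::is_trivially_copyable_v<T>);
  const auto *p = reinterpret_cast<const std::uint8_t *>(&v);
  out.insert(out.end(), p, p + sizeof(T));
}

// Bounds-checked read: returns false instead of reading past the end.
template <typename T> bool get(const Bytes &in, std::size_t &off, T &v) {
  assert(off <= in.size());
  if (in.size() - off < sizeof(T)) return false;
  std::memcpy(&v, in.data() + off, sizeof(T));
  off += sizeof(T);
  return true;
}

Bytes serialize(const std::vector<Entry> &entries) {
  Bytes out;
  put(out, k_magic);
  put(out, static_cast<std::uint32_t>(entries.size()));
  for (const Entry &e : entries) {
    put(out, e.addr);
    put(out, e.value);
  }
  return out;
}

// Returns nullopt on a bad magic or a truncated buffer.
std::optional<std::vector<Entry>> parse(const Bytes &in) {
  std::size_t off = 0;
  std::uint32_t magic = 0;
  std::uint32_t count = 0;
  if (!get(in, off, magic) || magic != k_magic) return std::nullopt;
  if (!get(in, off, count)) return std::nullopt;
  std::vector<Entry> entries;
  for (std::uint32_t i = 0; i < count; ++i) {
    Entry e{};
    if (!get(in, off, e.addr) || !get(in, off, e.value)) return std::nullopt;
    entries.push_back(e);
  }
  return entries;
}
```

The parser never trusts the declared count for allocation: entries are appended one at a time, so a corrupted count can only lead to a rejected snapshot.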