feat(pprof): emit per-sample process_language label #552
Draft
r1viollet wants to merge 1 commit into
Heuristically detect the native language family (go/rust/cpp) of each
profiled process' main executable and attach it as a pprof per-sample
label `process_language`. Unknown/mixed cases fall back to the existing
"native" tag (no label emitted).
Detection runs once per PID, lazily on first sample, using the Elf*
already opened by libdwfl (via dwfl_module_getelf) -- no extra file
open. The check is cheap and never reads DWARF:
* Go -> .go.buildinfo / .gopclntab ELF section
* Rust -> .note.rustc section, or rustc-mangled symbols
  (`_R...` v0, or legacy `...17h<16 hex>E` tail)
* Cpp -> any `_Z...` Itanium-mangled symbol
* else -> kUnknown (caller leaves label unset)
Symbol-table scan is bounded to 4096 entries. Result is cached on the
Process object and cleared with the rest of its state on PID exit.
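
The detection order and the bounded symbol scan described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: it works on section and symbol names that have already been extracted into vectors instead of reading them from an `Elf*`, and the helper names (`classify`, `is_rust_v0`, `is_rust_legacy`, `is_itanium`, `has_section`) are hypothetical. It also assumes that Rust is checked before Cpp, since legacy Rust symbols also start with `_Z`.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string_view>
#include <vector>

// Enumerator names other than kUnknown are assumptions made for this sketch.
enum class NativeLanguage { kUnknown, kGo, kRust, kCpp };

// Rust v0 mangling: symbols start with "_R".
inline bool is_rust_v0(std::string_view sym) {
  return sym.size() > 2 && sym.substr(0, 2) == "_R";
}

// Legacy Rust mangling: symbol ends with "17h" + 16 hex digits + "E".
inline bool is_rust_legacy(std::string_view sym) {
  constexpr std::size_t k_tail_len = 3 + 16 + 1;
  if (sym.size() <= k_tail_len) {
    return false;
  }
  const std::string_view tail = sym.substr(sym.size() - k_tail_len);
  if (tail.substr(0, 3) != "17h" || tail.back() != 'E') {
    return false;
  }
  const std::string_view hash = tail.substr(3, 16);
  return std::all_of(hash.begin(), hash.end(),
                     [](unsigned char c) { return std::isxdigit(c) != 0; });
}

// Itanium C++ ABI mangling: symbols start with "_Z".
inline bool is_itanium(std::string_view sym) {
  return sym.size() > 2 && sym.substr(0, 2) == "_Z";
}

inline bool has_section(const std::vector<std::string_view> &sections,
                        std::string_view name) {
  return std::find(sections.begin(), sections.end(), name) != sections.end();
}

// Section checks first (cheapest), then a symbol scan capped at max_symbols.
inline NativeLanguage classify(const std::vector<std::string_view> &sections,
                               const std::vector<std::string_view> &symbols,
                               std::size_t max_symbols = 4096) {
  if (has_section(sections, ".go.buildinfo") ||
      has_section(sections, ".gopclntab")) {
    return NativeLanguage::kGo;
  }
  if (has_section(sections, ".note.rustc")) {
    return NativeLanguage::kRust;
  }
  bool saw_itanium = false;
  const std::size_t limit = std::min(symbols.size(), max_symbols);
  for (std::size_t i = 0; i < limit; ++i) {
    // Rust wins over Cpp: a legacy Rust symbol is also a valid "_Z" symbol.
    if (is_rust_v0(symbols[i]) || is_rust_legacy(symbols[i])) {
      return NativeLanguage::kRust;
    }
    saw_itanium = saw_itanium || is_itanium(symbols[i]);
  }
  return saw_itanium ? NativeLanguage::kCpp : NativeLanguage::kUnknown;
}
```

In this sketch a binary whose first 4096 symbols are all unmangled C names yields `kUnknown`, which matches the "caller leaves label unset" case.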
A fallback path opening /proc/<pid>/exe is provided for callers without
an Elf* in hand (unit tests, early bring-up).
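
The "once per PID, lazily on first sample" caching could look something like the sketch below. `LanguageCache` and its `Detector` callback are hypothetical stand-ins used only for illustration; in the PR the result lives on the existing Process object, whose layout is not shown here.

```cpp
#include <functional>
#include <sys/types.h>
#include <unordered_map>
#include <utility>

enum class NativeLanguage { kUnknown, kGo, kRust, kCpp };

// Hypothetical stand-in for per-process state; not ddprof's Process type.
class LanguageCache {
public:
  using Detector = std::function<NativeLanguage(pid_t)>;

  explicit LanguageCache(Detector detect) : _detect(std::move(detect)) {}

  // Runs the detector only on the first lookup for a PID. The result is
  // cached even when it is kUnknown, so detection is never retried per sample.
  NativeLanguage get(pid_t pid) {
    auto it = _cache.find(pid);
    if (it == _cache.end()) {
      it = _cache.emplace(pid, _detect(pid)).first;
    }
    return it->second;
  }

  // Drops the cached result on PID exit, so a recycled PID is detected again.
  void clear(pid_t pid) { _cache.erase(pid); }

private:
  Detector _detect;
  std::unordered_map<pid_t, NativeLanguage> _cache;
};
```

The detector is injected as a callback so that the same cache can sit in front of either the libdwfl-backed path or the /proc/<pid>/exe fallback.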
@codex review
Codex Review: Didn't find any major issues. 🎉
What
Emit a per-sample pprof label `process_language` carrying a heuristic native-language family for the profiled process' main executable (go/rust/cpp). When detection is inconclusive, no label is set and the existing `language="native"` tag continues to be the sole signal.

Why
Today every native sample is reported as `language="native"`: Rust, C++, plain C and (mostly) Go are indistinguishable on the backend side. With this label, the backend can refine the per-process language for the dominant case where one language clearly drives a binary, without splitting profiles.

This mirrors the approach we will likely use in the OTel eBPF profiler / Full-Host Profiler (also no DWARF involvement, also a per-sample label rather than a global tag), so the two producers will be consistent.

How
* `NativeLanguage` enum + cheap detection (`include/native_language.hpp` / `src/native_language.cc`):
  * Go -> `.go.buildinfo` / `.gopclntab` ELF section
  * Rust -> `.note.rustc` section, or rustc-mangled symbols in `.symtab` / `.dynsym` (`_R…` v0, or legacy `…17h<16 hex>E` tail)
  * Cpp -> any `_Z…` Itanium-mangled symbol
  * else -> `kUnknown` (label omitted)
* Detection reuses the `Elf*` already cached by libdwfl (`dwfl_module_getelf`), so there is no extra file open.
* The result is cached on `Process` and cleared via the existing `ProcessHdr::clear(pid)` on process exit.
* A `/proc/<pid>/exe` fallback path is kept for callers without an `Elf*` in hand (tests / early bring-up).
* `process_language` is emitted alongside `process_id`, `process_name`, etc. (`k_max_pprof_labels` raised 8 → 9).

Verification
* `tools/style-check.sh`
* `ninja` (853 targets)
* `ctest -j4`: the failing tests (`ddprof_stats-ut`, `ipc-ut`, `simple_malloc-with-event-reordering`) reproduce on `main` without this patch; pre-existing, unrelated

Smoke-tested end-to-end with `ddprof -p <pid>`:
* Go target: `[NATIVE-LANG] -> go` (single detection log across all samples; libdwfl-cached `Elf*` reused)
* C++ target: `cpp`
* `/bin/cat`, `sleep` (C) → `kUnknown` → no label, falls back to the `native` tag

Notes for review
* The existing `language="native"` tag is unchanged.
* Without `.dynsym` C++ symbols, Cpp detection may miss; this is expected, and the label stays unset → `native`.
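
To illustrate the "label omitted on `kUnknown`" behaviour, a minimal sketch of the mapping from the detection result to the label value might look like this. The `Label` alias and the two helper functions are hypothetical; the actual pprof-building code in ddprof is not shown in this PR description and will differ.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

enum class NativeLanguage { kUnknown, kGo, kRust, kCpp };

// Returns the label value, or nullopt when no label should be emitted.
inline std::optional<std::string_view> language_label_value(NativeLanguage lang) {
  switch (lang) {
  case NativeLanguage::kGo:
    return std::string_view{"go"};
  case NativeLanguage::kRust:
    return std::string_view{"rust"};
  case NativeLanguage::kCpp:
    return std::string_view{"cpp"};
  case NativeLanguage::kUnknown:
    break;
  }
  return std::nullopt;
}

// Hypothetical key/value label representation, for illustration only.
using Label = std::pair<std::string, std::string>;

// Appends process_language only when detection was conclusive; otherwise the
// label list is left untouched and the sample keeps only its existing tags.
inline void append_language_label(std::vector<Label> &labels, NativeLanguage lang) {
  if (const auto value = language_label_value(lang)) {
    labels.emplace_back("process_language", std::string(*value));
  }
}
```

Returning an optional, as opposed to an empty string, keeps the "no label" case explicit at the call site.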