feat: add Perfetto performance tracing support #497

Open
wdconinc wants to merge 9 commits into JeffersonLab:master from wdconinc:perfetto

@wdconinc
Member

@wdconinc wdconinc commented May 20, 2026

Briefly, what does this PR introduce?

Adds optional Perfetto SDK integration to JANA2 (-DUSE_PERFETTO=ON) that enables real-time, per-thread tracing of factory execution for identifying performance bottlenecks and inter-factory blocking in multithreaded runs. The resulting .perfetto trace file can be opened at https://ui.perfetto.dev.

Six trace categories are recorded:

| Category | Content |
| --- | --- |
| jana | Arrow dispatch in the execution engine (one span per arrow->Fire()) |
| factory | Full factory activation: Init + run callbacks + Process (one span per factory per event) |
| factory_init | Init() callback; fires once, on first factory activation |
| factory_begin_run | BeginRun() callback; fires on run-number change |
| factory_change_run | ChangeRun() callback; fires on run-number change |
| factory_end_run | EndRun() callback; fires on run-number change |

Inter-factory dependency flow arrows are drawn between parent and child factory spans, showing exactly which factory triggered which other factory and at what point in its execution, making the data-dependency graph directly visible in the timeline.

What is the urgency of this PR?

  • High (please describe reason below)
  • Medium
  • Low

What kind of change does this PR introduce?

  • Bug fix (issue #__)
  • New feature (issue #__)
  • Optimization (issue #__)
  • Updated parameters, constants (issue #__)
  • Updated documentation
  • other: __

Please check if any of the following apply

  • This PR requires changes to geometry (epic PR: __)
  • This PR requires changes to EDM4eic (EDM PR: __)
  • This PR introduces breaking changes. Please describe changes users need to make below.
  • This PR changes default behavior. Please describe changes below.
  • AI was used in preparing this PR. Please describe usage below.

Perfetto is opt-in at both build time (-DUSE_PERFETTO=ON, off by default) and runtime (-Pplugins=perfetto). No existing behavior is changed when Perfetto is not enabled.

AI usage: GitHub Copilot was used to design and implement the tracing integration, debug timing gaps, and refine the category structure.


Implementation details

Build system (CMakeLists.txt): new USE_PERFETTO option; downloads Perfetto SDK v55.3 amalgamation from GitHub releases; builds perfetto_sdk static library; sets JANA2_HAVE_PERFETTO for conditional compilation. JVersion.h.in gains a HasPerfetto() capability flag.
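
The capability flag can be illustrated with a minimal sketch. It assumes the build system defines JANA2_HAVE_PERFETTO as 0 or 1; `VersionSketch` and `DescribeTracingSupport` are hypothetical names used only for illustration and are not the actual JVersion API.

```cpp
#include <string>

// Assumption: the build system normally defines this macro (for example through
// a configured header). Defaulting to 0 keeps the sketch self-contained.
#ifndef JANA2_HAVE_PERFETTO
#define JANA2_HAVE_PERFETTO 0
#endif

// Hypothetical stand-in for the version class that exposes the capability flag.
struct VersionSketch {
    static constexpr bool HasPerfetto() { return JANA2_HAVE_PERFETTO != 0; }
};

// Hypothetical helper: downstream code can branch on the flag at run time
// without including any Perfetto headers itself.
inline std::string DescribeTracingSupport() {
    return VersionSketch::HasPerfetto() ? "perfetto: enabled"
                                        : "perfetto: disabled";
}
```

Because the flag is a constexpr function, callers can also use it in `static_assert` or `if constexpr` when they need a compile-time decision.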

JPerfettoService (new JService): manages the Perfetto in-process tracing session (256 MB ring buffer); names each worker thread track on startup via RegisterCurrentThread(worker_id); flushes and writes the trace file on shutdown.

JFactory.cc: the factory span (TRACE_EVENT_BEGIN/END with RAII SpanGuard) wraps the entire activation — DoInit() + run callbacks + Process() — so that factory_init/factory_begin_run/factory_change_run spans appear as children in the hierarchy. A thread_local g_current_executing_factory pointer tracks the currently-executing factory on each thread; when a child factory's Create() is entered, a TRACE_EVENT_INSTANT with Flow::Global(id) is emitted inside the still-open parent span, and the child's span opens with TerminatingFlow::Global(id), drawing a dependency arrow in the UI.
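
The parent/child bookkeeping described above can be sketched without the Perfetto SDK. In this sketch a thread-local string log stands in for the trace backend; `FactorySketch`, `g_trace_log`, and `CreateSketch` are hypothetical names, and only `g_current_executing_factory`, `SpanGuard`, and `CallerGuard` echo names used in this PR. It is not the code in JFactory.cc.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct FactorySketch { std::string name; };  // hypothetical stand-in for JFactory

// One "currently executing factory" pointer per worker thread.
thread_local const FactorySketch* g_current_executing_factory = nullptr;

// Stand-in for the trace backend: events are recorded as strings.
thread_local std::vector<std::string> g_trace_log;

// RAII: closes the span on scope exit, including during exception unwinding.
class SpanGuard {
public:
    explicit SpanGuard(std::string name) : m_name(std::move(name)) {
        g_trace_log.push_back("begin " + m_name);
    }
    ~SpanGuard() { g_trace_log.push_back("end " + m_name); }
    SpanGuard(const SpanGuard&) = delete;
    SpanGuard& operator=(const SpanGuard&) = delete;
private:
    std::string m_name;
};

// RAII: marks a factory as current and restores the previous one on scope exit.
class CallerGuard {
public:
    explicit CallerGuard(const FactorySketch* f)
        : m_self(f), m_previous(g_current_executing_factory) {
        g_current_executing_factory = f;
    }
    ~CallerGuard() {
        assert(g_current_executing_factory == m_self);  // guards nest strictly
        g_current_executing_factory = m_previous;
    }
    CallerGuard(const CallerGuard&) = delete;
    CallerGuard& operator=(const CallerGuard&) = delete;
private:
    const FactorySketch* m_self;
    const FactorySketch* m_previous;
};

// Sketch of a factory activation: the flow tail is recorded while the parent
// span is still open, then the child span opens and becomes the current factory.
template <typename Body>
void CreateSketch(const FactorySketch& factory, Body&& body) {
    if (const FactorySketch* parent = g_current_executing_factory) {
        g_trace_log.push_back("flow " + parent->name + " -> " + factory.name);
    }
    SpanGuard span(factory.name);
    CallerGuard caller(&factory);
    body();  // may call CreateSketch for dependencies
}
```

Calling `CreateSketch(a, [&] { CreateSketch(b, [] {}); })` records the flow event between the begin of A and the begin of B, which is the ordering the dependency arrow relies on.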

JExecutionEngine.cc: TRACE_EVENT around arrow->Fire() in the worker loop (category jana); RegisterCurrentThread() at worker startup.

perfetto plugin (src/plugins/perfetto/): loads JPerfettoService when -Pplugins=perfetto is passed.

Usage

cmake -DUSE_PERFETTO=ON -DCMAKE_INSTALL_PREFIX=install ...
eicrecon -Pplugins=perfetto -Pjana:perfetto_output=trace.perfetto input.edm4hep.root
# Open trace.perfetto at https://ui.perfetto.dev

wdconinc and others added 6 commits May 20, 2026 13:00
Adds optional Perfetto SDK integration (USE_PERFETTO=ON) that enables
real-time per-thread tracing of JANA2 factory execution for identifying
performance bottlenecks and inter-factory blocking in multithreaded runs.

Changes:
- CMakeLists.txt: new USE_PERFETTO option; downloads Perfetto SDK v55.3
  amalgamation from GitHub releases; builds perfetto_sdk static library;
  sets JANA2_HAVE_PERFETTO for conditional compilation
- JVersion.h.in: adds JANA2_HAVE_PERFETTO capability flag and HasPerfetto()
- JPerfettoService: new JService managing the Perfetto tracing session;
  initializes in-process backend, writes trace file on shutdown;
  exposes RegisterCurrentThread(worker_id) for named thread tracks
- JFactory.cc: TRACE_EVENT_BEGIN/END around Process() calls (category 'factory')
- JExecutionEngine.cc: TRACE_EVENT_BEGIN/END around arrow->Fire() calls
  (category 'jana'); RegisterCurrentThread() at worker startup
- perfetto plugin: loads JPerfettoService via -Pplugins=perfetto

Usage:
  cmake -DUSE_PERFETTO=ON ...
  eicrecon -Pplugins=perfetto -Pjana:perfetto_output=trace.perfetto ...
  # Open trace.perfetto at https://ui.perfetto.dev

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add a new 'factory_init' Perfetto category to distinguish factory
lifecycle callbacks from per-event Process() execution:

- factory_init: Init(), ChangeRun(), BeginRun(), EndRun() callbacks
- factory:      Process() per event (existing)

Each factory_init span carries a 'phase' annotation ('Init',
'ChangeRun', 'BeginRun', 'EndRun') and a 'run_nr' annotation on
run-boundary spans. The factory name is used as the slice name in
both categories, making the timeline immediately readable.

The new category doubles the trace size on a typical EICrecon run
(349 kB → 797 kB) because every factory initializes on first use.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tions

Replace all TRACE_EVENT_BEGIN/END pairs with the scoped TRACE_EVENT
macro, which uses RAII to call TRACE_EVENT_END in its destructor.
This ensures trace spans are always properly closed even when factory
callbacks (Process, Init, ChangeRun, BeginRun, EndRun) or arrow Fire()
throw exceptions.

Previously, factories that threw (e.g. 'No beam protons found') would
appear as 'did not end' in the Perfetto UI because the exception
propagated past the TRACE_EVENT_END call to the outer catch block.

Changes:
- JFactory.cc: DoInit(), Create() Process/EndRun/ChangeRun/BeginRun
  all wrapped in explicit {} blocks with scoped TRACE_EVENT
- JExecutionEngine.cc: arrow->Fire() wrapped in explicit {} block
- Remove now-redundant manual TRACE_EVENT_END from DoInit catch block
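
The "did not end" failure mode can be reproduced in isolation. The sketch below is an illustration only: `SpanCounter`, `ScopedSpan`, `RunManual`, and `RunScoped` are hypothetical names, and a counter of open spans replaces the real trace macros.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical tracer stand-in: counts spans that are currently open.
struct SpanCounter { int open = 0; };

// Scoped span: the destructor runs during stack unwinding, so the span is
// closed even when the wrapped callback throws.
class ScopedSpan {
public:
    explicit ScopedSpan(SpanCounter& counter) : m_counter(counter) { ++m_counter.open; }
    ~ScopedSpan() {
        assert(m_counter.open > 0);
        --m_counter.open;
    }
    ScopedSpan(const ScopedSpan&) = delete;
    ScopedSpan& operator=(const ScopedSpan&) = delete;
private:
    SpanCounter& m_counter;
};

// Manual begin/end pair: the "end" is skipped if the callback throws.
inline void RunManual(SpanCounter& counter, const std::function<void()>& callback) {
    ++counter.open;   // begin
    callback();
    --counter.open;   // end; never reached when callback() throws
}

// Scoped variant: the span closes on every exit path.
inline void RunScoped(SpanCounter& counter, const std::function<void()>& callback) {
    ScopedSpan span(counter);
    callback();
}
```

With a callback that throws, `RunManual` leaves one span open after the exception is caught by the caller, while `RunScoped` leaves none.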

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Previously the factory TRACE_EVENT only covered Process(), making Init
and run callback (BeginRun/ChangeRun/EndRun) spans appear as siblings
before the factory span rather than children inside it.

Now the factory span opens before DoInit(), so all factory_init category
spans (Init, ChangeRun, BeginRun, EndRun) appear as visible children
inside the factory span in the Perfetto UI. Clicking any factory span
will show the Init phase as a nested child span with its own duration
and phase annotation.

SQL verification on a 10-event trace confirms:
  factory -> factory_init: 3372 (all Init/run callback spans are children)
  factory -> factory:      2320 (dependent factory spans as before)
  jana    -> factory:       880 (top-level factory activations from engine)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When factory B is triggered from within factory A's Process(), Perfetto
now draws an arrow from the point in A's span where B was triggered to
the start of B's span, visualizing the data-dependency graph.

Implementation:
- Thread-local g_current_executing_factory tracks which factory span is
  currently open on each thread (set on span open, restored on close via
  RAII CallerGuard).
- When B's Create() is entered and a caller factory is recorded, compute
  a unique FNV-1a-inspired flow ID from (caller, callee, event_number).
- Emit a TRACE_EVENT_INSTANT with Flow::Global(flow_id) while still
  executing within A's open span — this is the arrow's tail.
- Open B's span with TRACE_EVENT_BEGIN + TerminatingFlow::Global(flow_id)
  — this is the arrow's head.  Using TRACE_EVENT_BEGIN instead of the
  scoped TRACE_EVENT macro avoids the if-else early-close problem (the
  scoped TRACE_EVENT in an if-else branch would close before DoInit/Process).
- A SpanGuard RAII struct calls TRACE_EVENT_END on destruction, ensuring
  the span is always closed even on exception.
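
A flow ID of the kind described above might look like the following sketch. The exact mixing used in JFactory.cc is not shown in this PR text, so the function name `FlowIdSketch` and the choice to fold each input in as one 64-bit word are assumptions; the constants are the standard 64-bit FNV-1a offset basis and prime.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: three XOR+multiply rounds, one per input, in the style of FNV-1a.
// Pointers go through uintptr_t before widening to 64 bits, which stays
// well-defined on platforms where pointers are narrower than 64 bits.
inline std::uint64_t FlowIdSketch(const void* caller, const void* callee,
                                  std::uint64_t event_number) {
    assert(caller != nullptr && callee != nullptr);
    constexpr std::uint64_t kOffsetBasis = 0xcbf29ce484222325ULL;
    constexpr std::uint64_t kPrime = 0x100000001b3ULL;

    std::uint64_t h = kOffsetBasis;
    h = (h ^ static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(caller))) * kPrime;
    h = (h ^ static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(callee))) * kPrime;
    h = (h ^ event_number) * kPrime;
    return h;
}
```

The same (caller, callee, event_number) triple always yields the same ID, so the tail and the head of an arrow can be computed independently and still match.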

SQL verification on a 10-event trace:
  2320 flow-start instants (= factory spans that have a parent factory)
  Top edge: CalorimeterClusterShape → CalorimeterClusterRecoCoG (240 times)
  Full dependency graph visible: CKFTracking → AmbiguitySolver →
    ActsToTracks → TrackerHitReconstruction → SiliconTrackerDigi etc.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…run/end_run

Use separate Perfetto categories for each factory lifecycle callback so they
appear as distinct tracks in the trace, instead of all being labeled factory_init:
- factory_init       : Init() callback (first activation only)
- factory_begin_run  : BeginRun() callback (on run number change)
- factory_change_run : ChangeRun() callback (on run number change)
- factory_end_run    : EndRun() callback (on run number change)

The redundant 'phase' annotation is removed since the category name is
now self-describing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@wdconinc wdconinc marked this pull request as ready for review May 21, 2026 15:14
Copilot AI review requested due to automatic review settings May 21, 2026 15:14
@wdconinc
Member Author

Sample trace: eicrecon_trace_v5.perfetto.zip


Copilot AI left a comment

Pull request overview

Adds optional Perfetto SDK integration (build-time USE_PERFETTO, runtime -Pplugins=perfetto) to emit per-thread tracing spans for arrow dispatch and factory execution, including inter-factory dependency flow arrows for diagnosing multithreaded performance bottlenecks.

Changes:

  • Introduces JPerfettoService to manage an in-process Perfetto tracing session and write a .perfetto trace file on shutdown.
  • Instruments core execution paths (JExecutionEngine::RunWorker, JFactory::Create/DoInit) with Perfetto trace events and dependency flows.
  • Extends the build system and plugin system to optionally build/link Perfetto and provide a perfetto plugin.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| CMakeLists.txt | Adds USE_PERFETTO option, downloads/builds perfetto_sdk, and sets JANA2_HAVE_PERFETTO. |
| src/libraries/JANA/CMakeLists.txt | Links perfetto_sdk into JANA targets and conditionally adds JPerfettoService.cc. |
| src/libraries/JANA/JVersion.h.in | Adds JANA2_HAVE_PERFETTO and JVersion::HasPerfetto(). |
| src/libraries/JANA/Services/JPerfettoService.h | Declares JPerfettoService and Perfetto trace categories. |
| src/libraries/JANA/Services/JPerfettoService.cc | Implements tracing session startup, flush/stop, and trace file output. |
| src/libraries/JANA/JFactory.cc | Wraps factory activation in Perfetto spans; emits dependency flow arrows; adds callback spans. |
| src/libraries/JANA/Engine/JExecutionEngine.cc | Adds Perfetto spans around arrow->Fire() and registers worker thread tracks. |
| src/plugins/CMakeLists.txt | Adds the new perfetto plugin subdirectory. |
| src/plugins/perfetto/CMakeLists.txt | Builds the perfetto plugin when enabled and adds a basic integration test. |
| src/plugins/perfetto/perfetto_plugin.cc | Plugin entrypoint that provides JPerfettoService and enables call graph recording. |

Comment thread CMakeLists.txt
Comment thread CMakeLists.txt Outdated
Comment thread src/libraries/JANA/JFactory.cc Outdated
Comment thread src/libraries/JANA/JFactory.cc
Comment thread src/libraries/JANA/CMakeLists.txt Outdated
@wdconinc
Member Author

FYI @nathanwbrei. This is a useful way to trace performance and visually identify misbehaving factories. Tags, event number, and phase are stored as metadata and can be queried in the SQLite database. Typical overhead per trace event is (empirically) on the order of 5 to 10 µs, and none is incurred when Perfetto is not enabled at compile time or when the plugin is not loaded.

wdconinc and others added 3 commits May 21, 2026 10:35
The FNV-1a hash (3 XOR+MUL ops) and GetEventNumber() ran unconditionally
whenever a parent factory was present, even when the 'factory' category is
disabled (no active tracing session). These are not protected by the lazy
lambda evaluation inside TRACE_EVENT_INSTANT/BEGIN.

Add an explicit TRACE_EVENT_CATEGORY_ENABLED("factory") check so the hash
and all flow-related work is skipped entirely when tracing is off.
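
The effect of the explicit check can be sketched with a plain flag in place of the Perfetto macro. Everything here is hypothetical scaffolding: `g_factory_category_enabled` stands in for the result of TRACE_EVENT_CATEGORY_ENABLED("factory"), and `g_hash_computations` exists only to make the skipped work observable.

```cpp
#include <atomic>
#include <cstdint>

// Stand-in for the category-enabled query; false means no active session.
std::atomic<bool> g_factory_category_enabled{false};

// Instrumentation for the sketch only: counts how often the hash is computed.
int g_hash_computations = 0;

// Placeholder for the flow-ID computation that should be skipped when idle.
inline std::uint64_t ExpensiveFlowId(std::uint64_t event_number) {
    ++g_hash_computations;
    return (event_number ^ 0xcbf29ce484222325ULL) * 0x100000001b3ULL;
}

// Returns true if a flow event would have been emitted.
inline bool MaybeEmitFlow(bool has_parent, std::uint64_t event_number) {
    if (!has_parent) return false;
    // Early exit: no hash and no event-number lookup when tracing is off.
    if (!g_factory_category_enabled.load(std::memory_order_relaxed)) return false;
    const std::uint64_t flow_id = ExpensiveFlowId(event_number);
    (void)flow_id;  // in real code this would be handed to the trace macro
    return true;
}
```

With the flag off, `MaybeEmitFlow(true, 7)` returns false and the counter stays at zero; with the flag on, the hash runs exactly once per call that has a parent.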

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Five issues raised in PR JeffersonLab#497 review:

1. Add SHA256 integrity hash for Perfetto SDK download (supply-chain
   pinning). CMake will fail if the downloaded zip doesn't match.

2. Guard -Wno-* compile flags on perfetto_sdk with a CXX_COMPILER_ID
   generator expression (GNU/Clang/AppleClang only) so MSVC and other
   toolchains don't receive unknown flags.

3. factory_end_run span now records mPreviousRunNumber (the run being
   ended) instead of run_number (the new run). The previous value is
   the semantically correct label for the EndRun callback.

4. Flow-ID hash casts JFactory* through uintptr_t before widening to
   uint64_t, avoiding direct reinterpret_cast<uint64_t> on a pointer
   which is non-portable on non-64-bit platforms.

5. JPerfettoService.cc is now always compiled (not gated on
   USE_PERFETTO). The file already contains a JANA2_HAVE_PERFETTO==0
   stub, so downstream code that includes JPerfettoService.h will
   link correctly regardless of whether Perfetto was built in.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@wdconinc
Member Author

The failing halld_recon check is due to a disk space issue: https://github.com/JeffersonLab/JANA2/actions/runs/26239971577/job/77223373341?pr=497#step:4:3030
