Skip to content

fix(tracing): finish active trace on crash#1667

Open
jpnurmi wants to merge 21 commits into
masterfrom
jpnurmi/fix/trace-finish
Open

fix(tracing): finish active trace on crash#1667
jpnurmi wants to merge 21 commits into
masterfrom
jpnurmi/fix/trace-finish

Conversation

@jpnurmi
Copy link
Copy Markdown
Collaborator

@jpnurmi jpnurmi commented Apr 23, 2026

Before this PR, an in-flight trace was silently dropped on crash. And even with a manual sentry_transaction_finish right before the crash, the crash event still floated at the trace root rather than nesting under the active span — sentry_transaction_finish clears scope->span / scope->transaction_object, so by the time the crash handler captures its event, scope-apply falls through to the stale propagation context.

This PR auto-finishes the active span/transaction on crash with status aborted, and preserves scope->span / scope->transaction_object across the finish, so the trace reaches Sentry and the crash nests under it.

Before (manual finish) After (auto-finish)
image image

For comparison:

Comment thread src/sentry_tracing.c
Comment thread src/sentry_tracing.c
Comment thread src/sentry_tracing.c
Comment thread src/sentry_tracing.c
Comment thread src/sentry_tracing.c Outdated
Comment thread src/sentry_tracing.c
Comment thread src/sentry_core.c
@jpnurmi jpnurmi force-pushed the jpnurmi/fix/trace-finish branch from 1764758 to 570ed57 Compare April 24, 2026 15:35
@jpnurmi jpnurmi changed the title WIP: fix(tracing): finish active trace on crash fix(tracing): finish active trace on crash Apr 25, 2026
@jpnurmi jpnurmi force-pushed the jpnurmi/fix/trace-finish branch from 570ed57 to 264a69b Compare April 27, 2026 08:13
Comment thread src/sentry_tracing.c
Comment thread src/sentry_tracing.c
@jpnurmi jpnurmi changed the title fix(tracing): finish active trace on crash WIP: fix(tracing): finish active trace on crash Apr 27, 2026
Comment thread src/backends/sentry_backend_breakpad.cpp Outdated
Comment thread src/backends/sentry_backend_breakpad.cpp Outdated
Comment thread src/sentry_envelope.c Outdated
Comment thread src/backends/native/sentry_crash_daemon.c
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 4 total unresolved issues (including 2 from previous reviews).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 6a2e71b. Configure here.

Comment thread src/backends/sentry_backend_inproc.c
Comment thread src/backends/sentry_backend_native.c
jpnurmi and others added 12 commits May 20, 2026 16:16
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The prior crash-path trace finish only closed `scope->span` (the deepest
active span) and its root transaction. Intermediate spans between them —
e.g. Qt event dispatch handlers nested above a crashing slot — were never
finished, never added to `tx.spans`, and ended up orphaning the crash
event at the trace root in Sentry's UI.

Track every span on its root transaction under `children_mutex` at
`sentry__span_new`, deregister on normal finish, and drain the list in
leaf-first order inside `sentry__trace_finish`. Matches sentry-cocoa's
`SentryTracer` `_children` + `finishForCrash` and sentry-java's
`SentryTracer.forceFinish`.

Preserve `scope->span` / `scope->transaction_object` across the drain so
the subsequent crash event inherits the full trace context from scope
(mirrors cocoa's `finishTracer:shouldCleanUp:NO`). Without this, the
crash event fell through to the stale propagation context and Sentry
could not nest it under the active span chain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trim WHAT-narration comments in sentry__trace_finish and its tests, keeping
the sentry-cocoa alignment anchor and the non-obvious "detach to skip
remove-scan" rationale. Warn on the tracking-list allocation failure so a
silent crash-finalize gap is audible. Use the typedef name for the forward
declaration in sentry_sampling_context.h for consistency with the rest of
the codebase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Split sentry__trace_finish into save_active_trace / restore_active_trace
(as a pair around the scope mutation) and finish_children for the atomic
children swap + finish loop, so the top-level function reads as a short
narrative. No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sentry_transaction_finish_ts consumes its argument, so the ref that
save_active_trace took for active_tx is released by the finish call.
restore_active_trace was decref'ing it again. Harmless in crash context
where the process exits, but a real refcount underflow otherwise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sentry_span_finish_ts consumes its argument (decref at sentry_core.c:1607
on success, :1611 on fail), so the explicit sentry__span_decref after the
finish call released the children-list ref a second time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The transaction's `children` list held an incref'd reference on each live
span, while each span holds a ref on its transaction via `span->transaction`.
Any code path that decref'd a span without going through `sentry_span_finish_ts`
(e.g. the `span_data` unit test using the low-level `sentry__span_new` /
`sentry__span_decref` API) left both sides stuck at refcount 1, leaking both.

Make `tx->children` weak: `sentry__span_new` no longer increfs on add, and
`sentry__span_decref` unlists the span from the children list on its final
drop. `sentry__transaction_remove_child` correspondingly no longer decrefs.
The drain path in `sentry__trace_finish` continues to work — the spans on
the swapped-out list are alive via their other refs (scope / saved_span /
user var), and `sentry_span_finish_ts` consumes one of those refs as the
caller's, exactly as in the non-crash flow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After the weak-children refactor, `finish_children` handed each span to
`sentry_span_finish_ts` without a caller ref. `finish_ts` consumes one
ref on its argument, so when the drained span had only the user's own ref
outstanding, finish_ts would drop it to zero and free the span — leaving
the user's pointer dangling.

Incref inside the drain loop so finish_ts consumes "our" ref, leaving user
refs intact. Also update the `trace_finish` unit test to release the refs
it held on `tx`/`child`/`grand` (without the children-list strong ref,
unfinished spans now properly leak if the user forgets to decref).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SENTRY_WITH_SCOPE_MUT calls sentry__scope_flush_unlock on exit, which
invokes the backend's flush_scope_func — file I/O in both crashpad and
native backends. sentry__trace_finish runs from signal-handler context
in the inproc/breakpad/native crash handlers, so the flush is unsafe
there. Use the NO_FLUSH variant the codebase provides for this exact
situation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Call restore_active_trace before the early return so any refs taken by
save_active_trace are released even if active_tx is NULL. No-op under
the current invariant (saved_span->transaction is always non-NULL), but
defensive if the invariant ever changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jpnurmi jpnurmi force-pushed the jpnurmi/fix/trace-finish branch from 6a2e71b to 336c007 Compare May 21, 2026 05:15
@jpnurmi jpnurmi changed the title WIP: fix(tracing): finish active trace on crash fix(tracing): finish active trace on crash May 22, 2026
Comment thread src/backends/sentry_backend_inproc.c
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant