Skip to content

fix(tracing): shut down replaced trace provider in set_trace_provider#3241

Draft
adityasingh2400 wants to merge 1 commit intoopenai:mainfrom
adityasingh2400:fix/tracing-setup-shutdown-replaced-provider
Draft

fix(tracing): shut down replaced trace provider in set_trace_provider#3241
adityasingh2400 wants to merge 1 commit intoopenai:mainfrom
adityasingh2400:fix/tracing-setup-shutdown-replaced-provider

Conversation

@adityasingh2400
Copy link
Copy Markdown
Contributor

Summary

  • set_trace_provider swaps the global provider but never shuts down the previous one, leaking the default BatchTraceProcessor worker thread, the exporter's httpx.Client, and any custom processors that hold sockets / file handles. Anything that re-registers a provider (tests, hot-reload, switching exporters between OpenAI and a self-hosted backend) accumulates zombie threads and open connections until interpreter exit.
  • Snapshot the existing provider under the lock, install the new one, then call previous.shutdown() outside the lock so a slow shutdown cannot block concurrent get_trace_provider() callers. Identity-check first so re-registering the same provider stays idempotent. Swallow shutdown() exceptions so a misbehaving processor cannot prevent the swap.

Files

  • src/agents/tracing/setup.py (+15/-1)
  • tests/tracing/test_setup.py (+44/-0)

Test plan

  • pytest tests/tracing tests/test_trace*.py -v -> 78 passed (existing test_set_trace_provider_registers_shutdown_once updated to assert the previous provider is shut down exactly once).
  • New test verifies the replaced provider's shutdown() is called.
  • New test verifies a RuntimeError raised by the previous provider's shutdown() is logged but does not abort the swap (global state still points at the new provider).
  • New test verifies re-registering the same provider instance does not call shutdown() (idempotent).
  • ruff check clean on changed files.

Surgical 16-line fix in setup.py, focused regression tests. No docs / behavior changes for users who never call set_trace_provider twice.

`set_trace_provider` swapped the global provider but never shut down the
previous one, so its background processors (the default
`BatchTraceProcessor` worker thread, the exporter's `httpx.Client`, any
custom processors holding sockets/file handles) leaked until the
interpreter exited. Code that swaps providers — tests, hot-reload
workflows, applications switching exporters between OpenAI and a local
backend — accumulated zombie threads and open connections.

Snapshot the existing provider under the lock, install the new one, then
shut the old one down outside the lock so a slow `shutdown()` call does
not block other callers of `get_trace_provider()`. Identity-check first
so re-registering the same provider stays idempotent, and swallow
exceptions from the previous provider's `shutdown()` so a misbehaving
processor cannot prevent the swap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added bug Something isn't working feature:tracing labels May 8, 2026
@seratch
Copy link
Copy Markdown
Member

seratch commented May 8, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b80dea99e1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


if previous is not None and previous is not provider:
try:
previous.shutdown()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid shutting down the active trace provider

Because shutdown() runs after releasing the global-provider lock, a racing set_trace_provider(previous) can reinstall this previous instance before this line executes. This stale caller then shuts down the provider that is again GLOBAL_TRACE_PROVIDER; with the default batch processor, shutdown leaves its event set so later queued spans are not exported.

Useful? React with 👍 / 👎.

@seratch seratch marked this pull request as draft May 8, 2026 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working feature:tracing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants