fix(tracing): shut down replaced trace provider in set_trace_provider#3241
fix(tracing): shut down replaced trace provider in set_trace_provider#3241adityasingh2400 wants to merge 1 commit intoopenai:mainfrom
Conversation
`set_trace_provider` swapped the global provider but never shut down the previous one, so its background processors (the default `BatchTraceProcessor` worker thread, the exporter's `httpx.Client`, any custom processors holding sockets/file handles) leaked until the interpreter exited. Code that swaps providers — tests, hot-reload workflows, applications switching exporters between OpenAI and a local backend — accumulated zombie threads and open connections. Snapshot the existing provider under the lock, install the new one, then shut the old one down outside the lock so a slow `shutdown()` call does not block other callers of `get_trace_provider()`. Identity-check first so re-registering the same provider stays idempotent, and swallow exceptions from the previous provider's `shutdown()` so a misbehaving processor cannot prevent the swap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b80dea99e1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| if previous is not None and previous is not provider: | ||
| try: | ||
| previous.shutdown() |
There was a problem hiding this comment.
Avoid shutting down the active trace provider
Because shutdown() runs after releasing the global-provider lock, a racing set_trace_provider(previous) can reinstall this previous instance before this line executes. This stale caller then shuts down the provider that is again GLOBAL_TRACE_PROVIDER; with the default batch processor, shutdown leaves its event set so later queued spans are not exported.
Useful? React with 👍 / 👎.
Summary
set_trace_providerswaps the global provider but never shuts down the previous one, leaking the defaultBatchTraceProcessorworker thread, the exporter'shttpx.Client, and any custom processors that hold sockets / file handles. Anything that re-registers a provider (tests, hot-reload, switching exporters between OpenAI and a self-hosted backend) accumulates zombie threads and open connections until interpreter exit.previous.shutdown()outside the lock so a slow shutdown cannot block concurrentget_trace_provider()callers. Identity-check first so re-registering the same provider stays idempotent. Swallowshutdown()exceptions so a misbehaving processor cannot prevent the swap.Files
src/agents/tracing/setup.py(+15/-1)tests/tracing/test_setup.py(+44/-0)Test plan
pytest tests/tracing tests/test_trace*.py -v-> 78 passed (existingtest_set_trace_provider_registers_shutdown_onceupdated to assert the previous provider is shut down exactly once).shutdown()is called.RuntimeErrorraised by the previous provider'sshutdown()is logged but does not abort the swap (global state still points at the new provider).shutdown()(idempotent).ruff checkclean on changed files.Surgical 16-line fix in
setup.py, focused regression tests. No docs / behavior changes for users who never callset_trace_providertwice.