fix(intercept): scoped intercept_context() — prevent ContextVar leak across async tool/LLM boundaries by aural-psynapse · Pull Request #11 · ProvablyAI/verifiable-data-agentkit

aural-psynapse · 2026-05-06T09:59:16Z

What

Adds provably.intercept.intercept_context() — a context manager that scopes the agent_id / action_name / intercept_index tag to a with block and resets the underlying ContextVars on exit.

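import requests

from provably.intercept import intercept_context

@function_tool  # decorator supplied by the agent framework in use
def get_temperature():
    # Only HTTP calls made inside this block carry the get_weather tag.
    with intercept_context(agent_id="demo", action_name="get_weather"):
        return requests.get("https://api.example.com/...").json()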

Removes set_interceptor_context() entirely. The fire-and-forget setter was the leaky API — there's no useful callsite for it that wouldn't be better served by intercept_context. No backward-compat shim. Callers must migrate to the new context manager (one-line change).
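
For readers unfamiliar with the pattern, the following is a minimal sketch of a token-based scoped setter, not the SDK's actual implementation. The names `_agent_id`, `_action_name`, `intercept_context_sketch`, and `current_tag` are hypothetical stand-ins, the `"unknown"` default is taken from the test description in this PR, and `intercept_index` is omitted for brevity.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Hypothetical stand-ins for the SDK's internal ContextVars; real names may differ.
_agent_id = ContextVar("agent_id", default="unknown")
_action_name = ContextVar("action_name", default="unknown")


@contextmanager
def intercept_context_sketch(agent_id, action_name):
    """Tag intercepts inside the block, then restore whatever was set before."""
    agent_token = _agent_id.set(agent_id)
    action_token = _action_name.set(action_name)
    try:
        yield
    finally:
        # reset() restores the previous value, so exit runs cleanly
        # both when nested and when the body raises.
        _action_name.reset(action_token)
        _agent_id.reset(agent_token)


def current_tag():
    """Return the (agent_id, action_name) pair visible to the caller."""
    return _agent_id.get(), _action_name.get()


with intercept_context_sketch("demo", "get_weather"):
    inside = current_tag()
after = current_tag()
```

Here `inside` holds the tag that was active in the block and `after` holds the restored defaults. Resetting via the token returned by `ContextVar.set()` is what lets an enclosing scope's values come back on exit instead of being overwritten.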

Why — the bug being fixed

set_interceptor_context() wrote to a ContextVar and never reset it. Inside an async agent loop the tool function and the subsequent LLM turn run in the same asyncio.Task, so a tag set inside the tool keeps applying to every HTTP call that fires after the tool returns. Those LLM calls get persisted into provably_intercepts carrying the tool's action_name.
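
The leak can be reproduced without the SDK. The sketch below assumes a single hypothetical ContextVar named `action_name` and a fake `http_call` recorder in place of the real HTTP interceptor; it only illustrates the asyncio behavior described above.

```python
import asyncio
from contextvars import ContextVar

# Hypothetical stand-in for the SDK's tag variable.
action_name = ContextVar("action_name", default="unknown")
recorded = []  # (label, action_name) for each simulated HTTP call


def http_call(label):
    """Pretend interceptor: record the tag that is active at call time."""
    recorded.append((label, action_name.get()))


async def tool():
    action_name.set("get_weather")  # raw set, never reset
    http_call("tool GET")


async def llm_turn(label):
    http_call(label)


async def agent_loop():
    await llm_turn("LLM POST 1")
    await tool()  # awaited directly, so it shares the Task and its context
    await llm_turn("LLM POST 2")


asyncio.run(agent_loop())
```

After the run, the second LLM call is recorded with the tool's `get_weather` tag, because a coroutine awaited directly runs in the caller's Task and therefore in the same context.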

Two downstream failure modes in user code that calls build_handoff_payload:

| # | Code path | What goes wrong |
|---|-----------|-----------------|
| 1 | `load_latest_intercept_payload(pg_url, action_name, agent_id)` runs `ORDER BY created_at DESC LIMIT 1` against `(agent_id, action_name)` | Returns the LLM POST that fired after the tool, not the tool's own GET. The claim's `request_payload["url"]` ends up being the LLM provider URL, which trips the trust gate's "endpoints missing from trusted snapshot" check. |
| 2 | `get_intercept_row_id(agent_id, action_name)` looks up the in-memory `_action_row_ids` dict, which was overwritten with the LLM row's id | The Provably query record gets created over the LLM row, so the indexed value is the LLM completion JSON. Verbatim claim comparison in `evaluate_handoff` then returns `CAUGHT` no matter what the agent claims. |

Both symptoms have one root cause; this PR fixes them both.
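
Failure mode 1 can be illustrated with a toy table. This sketch swaps in an in-memory SQLite database for the Postgres store behind `pg_url`. Only the table name `provably_intercepts`, the `(agent_id, action_name)` key, and the `ORDER BY created_at DESC LIMIT 1` clause come from this PR; the column layout and the URLs are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only the table name and the lookup key come from the PR.
conn.execute(
    "CREATE TABLE provably_intercepts ("
    "agent_id TEXT, action_name TEXT, method TEXT, url TEXT, created_at INTEGER)"
)
rows = [
    # The tool's own request, correctly tagged.
    ("demo", "get_weather", "GET", "https://api.example.com/weather", 1),
    # Leaked tag: the later LLM call is stored under the tool's action_name.
    ("demo", "get_weather", "POST", "https://llm.example.com/v1/chat", 2),
]
conn.executemany("INSERT INTO provably_intercepts VALUES (?, ?, ?, ?, ?)", rows)

latest = conn.execute(
    "SELECT method, url FROM provably_intercepts "
    "WHERE agent_id = ? AND action_name = ? "
    "ORDER BY created_at DESC LIMIT 1",
    ("demo", "get_weather"),
).fetchone()
```

With the leaked row present, `latest` is the LLM POST rather than the tool's GET, which is the row a "latest intercept" lookup would then hand to the trust gate.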

Migration

If you previously had:

set_interceptor_context(agent_id="demo", action_name="get_weather")
result = requests.get(...).json()

Replace with:

with intercept_context(agent_id="demo", action_name="get_weather"):
    result = requests.get(...).json()

That's the entire migration. There's no reason to call the setter outside a scope — the leak made the unscoped pattern unsafe in async contexts and offered no advantages in sync contexts.

Tests

tests/unit/test_intercept_context.py — 6 new tests, all green:

intercept_context sets values inside the block ✅
exit restores the default state when no prior values were set ✅
exit restores prior values when nested ✅
exit fires even when the body raises ✅
rationale documentation (test_naked_ctx_var_set_leaks_into_subsequent_calls): simulates an LLM → tool → LLM sequence in one asyncio.Task using a raw ContextVar.set() (which is what set_interceptor_context used to do internally) and asserts that the leak still happens with that pattern. Documents why the context manager is necessary and pins the asyncio behavior we're working around. ✅
the actual fix (test_intercept_context_does_not_leak_into_subsequent_calls): same scenario with intercept_context; asserts that the second LLM call goes back to ("unknown", "unknown"). ✅

Full suite: 88/88 pass (was 82, +6 new). Ruff clean.

Out of scope (deliberately separate)

Updating any examples / demos that previously called set_interceptor_context — those live on a different branch and will be migrated in their respective PRs (this PR only changes the SDK surface; downstream callers migrate independently).

Suggested reviewer hops

src/provably/intercept/interceptor.py: set_interceptor_context removed, intercept_context added.
src/provably/intercept/__init__.py and src/provably/__init__.py: export changes.
tests/unit/test_intercept_context.py: the rationale-documenting test (test_naked_ctx_var_set_leaks_into_subsequent_calls) is the most important one to read; it documents the exact failure mode being prevented.

…ntextVar leak

`set_interceptor_context` is fire-and-forget: once called, it never resets the underlying ContextVars. Inside an async agent loop, the tool function and the subsequent LLM call run in the same `asyncio.Task`, so the tag set inside the tool keeps applying to LLM calls fired after the tool returns. Those LLM calls end up recorded in `provably_intercepts` with the tool's `action_name`.

Two downstream symptoms in user code that calls `build_handoff_payload`:

1. `load_latest_intercept_payload(pg_url, action_name, agent_id)` does `ORDER BY created_at DESC LIMIT 1` and returns the most recent row matching the (agent_id, action_name) key, which, because of the leak, is the LLM POST that fired AFTER the tool, not the tool's own GET. The claim's `request_payload` then carries the LLM provider URL and trips the trust gate's "endpoints missing from trusted snapshot" check.
2. `get_intercept_row_id(agent_id, action_name)` returns the same wrong row id, so the Provably query record indexes the LLM completion JSON. The verbatim claim comparison in `evaluate_handoff` then always returns CAUGHT even when the agent's claim and the actual data agree.

Fix: add `intercept_context()`, a contextlib context manager that calls `ContextVar.set` on enter, captures the tokens, and calls `reset` on exit. Same pattern as the existing `provably_self_egress()` manager.

`set_interceptor_context` is kept for backward compatibility; its docstring now carries a warning recommending `intercept_context` for tool bodies.

Includes a regression test that pins the leaky behavior of `set_interceptor_context` and a parallel test that proves `intercept_context` does not leak.

88/88 tests pass (was 82, +6 new).

The fire-and-forget setter was the leaky API; ship intercept_context as the single supported way to tag intercepts. Drop the deprecation shim and update the regression test to demonstrate the leak via raw ContextVar.set() instead of the removed public function. 88/88 tests still pass.

…shooting checklist

Two doc-only additions surfaced by a debugging session where multiple unrelated misuses of the SDK all produced the same CAUGHT outcome:

- intercept_context docstring: explicit "must be used with `with`" callout warning that a bare call is a no-op, plus a note that agent_id must match the intercept_agent_id passed to build_handoff_payload.
- README `### eval`: short troubleshooting checklist for unexpected CAUGHT covering the four common causes (tool body never ran, bare context-manager call, agent_id mismatch, wrong row-id helper).

No code changes. 88/88 tests still pass.

aural-psynapse requested a review from SimoneBottoni May 6, 2026 09:59

aural-psynapse self-assigned this May 6, 2026

aural-psynapse added 2 commits May 6, 2026 12:09

aural-psynapse mentioned this pull request May 6, 2026

feat(examples): manual-loop demo (no agent framework) #12

Closed

SimoneBottoni merged commit f1c52ab into main May 6, 2026
4 checks passed

SimoneBottoni merged 3 commits into main from fix/intercept-context-leak
