Skip to content

Commit ce4c5df

Browse files
bokelleyclaude
andauthored
docs(handler-authoring): expand with salesagent migration production patterns (#326)
* docs(handler-authoring): expand with salesagent migration production patterns Closes #229 Adds six advanced production patterns that surfaced from the salesagent MCP migration, expanding the existing guide from 838 to 1061 lines: - ResolvedIdentity DB-enrichment flow (second DB hop beyond auth) - Pattern 2b: subdomain tenant routing with two-middleware shape - validate_discovery_set() usage for discovery extension guard - @store.wrap idempotency wiring with factory prerequisites - Error handling section fixed (return not raise) + ADCPTaskError client/server distinction + error-code taxonomy table - Troubleshooting section (5 symptom → cause → fix entries) Also adds back-reference to advanced patterns in examples/mcp_with_auth_middleware.py docstring. https://claude.ai/code/session_01FxyYMBreWYeJxcJQCqsBbn * docs(handler-authoring): fix raise/return contradiction and validate_discovery_set comment - _resolve_identity example now returns None on failure so handlers can convert to adcp_error("AUTH_REQUIRED") — eliminates contradiction with the Troubleshooting section - validate_discovery_set inline comment updated to mention it also rejects mutating tools (not only unknown names) https://claude.ai/code/session_01FxyYMBreWYeJxcJQCqsBbn * docs(handler-authoring): apply review fixups Three small fixes on top of triage's PR #326: 1. Stale _impl skeleton at line 84 still showed raise AuthenticationRequired(), which contradicts the new return-None pattern documented immediately below. A first-time reader copying that line would hit the exact 500 failure mode the new section warns against. Replace with the return-None shape and add a one-line callout pointing at the Error handling section. 2. PgBackend import-path missing. Wiring example imported MemoryBackend from adcp.server then said 'swap MemoryBackend for PgBackend' without showing PgBackend's actual import. PgBackend lives in adcp.server.idempotency, not the top-level adcp.server namespace; agent code-gen would guess wrong. Add the explicit import line. 3. Cross-reference inflation. README footer claimed examples/mcp_with_auth_middleware.py demonstrated 'the tenant- routing middleware pattern from Pattern 2b' — it doesn't, the example only wires BearerTokenAuthMiddleware. Soften to 'foundation for Pattern 2b; bring your own subdomain-routing middleware on top.' Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>
1 parent d92cfb1 commit ce4c5df

2 files changed

Lines changed: 272 additions & 14 deletions

File tree

docs/handler-authoring.md

Lines changed: 268 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -77,18 +77,69 @@ from myagent.identity import ResolvedIdentity
7777
class MyAgent(ADCPHandler):
7878
async def get_products(self, params, context: ToolContext | None = None):
7979
identity = _resolve_identity(context)
80+
if identity is None:
81+
return adcp_error("AUTH_REQUIRED", "Authentication required")
8082
return await get_products_impl(params, identity=identity)
8183

82-
def _resolve_identity(ctx: ToolContext | None) -> ResolvedIdentity:
84+
def _resolve_identity(ctx: ToolContext | None) -> ResolvedIdentity | None:
8385
if ctx is None or ctx.caller_identity is None:
84-
raise AuthenticationRequired()
86+
return None
8587
return ResolvedIdentity(
8688
principal_id=ctx.caller_identity,
8789
tenant_id=ctx.tenant_id,
8890
# … adapter config, feature flags, etc. from your DB
8991
)
9092
```
9193

94+
**Why `return None`, not raise.** Raising a non-``ADCPError`` exception
95+
produces a 500 to the client (see *Error handling* below); the
96+
``return None`` shape lets the handler turn the failure into a
97+
spec-compliant ``adcp_error`` envelope. The next section shows the
98+
DB-enrichment variant of the same pattern.
99+
100+
### ResolvedIdentity with DB enrichment
101+
102+
The `# … adapter config, feature flags, etc. from your DB` comment hides
103+
a second DB hop that most production handlers need. `context_factory`
104+
resolves `caller_identity` from the bearer token; `_resolve_identity`
105+
enriches it with per-principal config that isn't available at auth time.
106+
Return `None` on failure so the calling handler converts it to an error
107+
dict (raising a non-`ADCPError` exception produces a 500 — see
108+
[Troubleshooting](#troubleshooting)):
109+
110+
```python
111+
async def _resolve_identity(ctx: ToolContext | None) -> ResolvedIdentity | None:
112+
if ctx is None or ctx.caller_identity is None:
113+
return None
114+
row = await pool.fetchrow(
115+
"SELECT tenant_id, db_url, feature_flags "
116+
"FROM principals WHERE id = $1",
117+
ctx.caller_identity,
118+
)
119+
if row is None:
120+
return None
121+
return ResolvedIdentity(
122+
principal_id=ctx.caller_identity,
123+
tenant_id=row["tenant_id"],
124+
db_url=row["db_url"],
125+
feature_flags=frozenset(row["feature_flags"] or ()),
126+
)
127+
```
128+
129+
**Resolve once per request** at the top of the handler and check for
130+
`None` before delegating to `_impl`:
131+
132+
```python
133+
async def get_products(self, params, context: ToolContext | None = None):
134+
identity = await _resolve_identity(context)
135+
if identity is None:
136+
return adcp_error("AUTH_REQUIRED")
137+
return await get_products_impl(params, identity=identity)
138+
```
139+
140+
Passing the resolved identity through avoids compounding DB round-trips
141+
when a single handler call delegates to multiple `_impl`s.
142+
92143
## Typed handler params
93144

94145
Handler methods may declare their `params` as a Pydantic model instead
@@ -209,6 +260,61 @@ middleware that populates `adcp.server.auth.current_principal` /
209260
`current_tenant` yourself and keep using `auth_context_factory` — the
210261
`ContextVar`s are the contract, not the middleware class.
211262

263+
#### Pattern 2b — tenant routing via subdomain (nginx → bearer)
264+
265+
Production multi-tenant deployments sometimes route to per-tenant
266+
databases by subdomain (`acme.ads.example.com` → Postgres for tenant
267+
`acme`) before validating the bearer token. The correct shape is two
268+
separate middleware layers — not subdomain logic inside `validate_token`:
269+
270+
```python
271+
from contextvars import ContextVar
272+
from starlette.middleware.base import BaseHTTPMiddleware
273+
from adcp.server import BearerTokenAuthMiddleware, Principal
274+
275+
# Populated by SubdomainTenantMiddleware before BearerTokenAuthMiddleware runs.
276+
_routing_tenant: ContextVar[str | None] = ContextVar("routing_tenant", default=None)
277+
278+
279+
class SubdomainTenantMiddleware(BaseHTTPMiddleware):
280+
"""Extracts tenant from the leftmost hostname label (acme.ads.example.com → 'acme')."""
281+
282+
async def dispatch(self, request, call_next):
283+
host = request.headers.get("host", "")
284+
tenant = host.split(".")[0] if host.count(".") >= 2 else None
285+
token = _routing_tenant.set(tenant)
286+
try:
287+
return await call_next(request)
288+
finally:
289+
_routing_tenant.reset(token)
290+
291+
292+
async def validate_token(token: str) -> Principal | None:
293+
routing_tenant = _routing_tenant.get()
294+
row = await db.fetchrow(
295+
"SELECT principal_id, tenant_id FROM tokens "
296+
"WHERE token_hash = digest($1, 'sha256') AND revoked_at IS NULL",
297+
token,
298+
)
299+
if row is None:
300+
return None
301+
# Reject if the subdomain tenant disagrees with the token's tenant —
302+
# guards against cross-tenant token replay.
303+
if routing_tenant and row["tenant_id"] != routing_tenant:
304+
return None
305+
return Principal(caller_identity=row["principal_id"], tenant_id=row["tenant_id"])
306+
307+
308+
app.add_middleware(BearerTokenAuthMiddleware, validate_token=validate_token)
309+
app.add_middleware(SubdomainTenantMiddleware) # outermost → runs first
310+
```
311+
312+
> **Middleware order.** Starlette applies `add_middleware` calls from
313+
> bottom to top — `SubdomainTenantMiddleware` is added last so it wraps
314+
> outermost and runs first, populating `_routing_tenant` before
315+
> `BearerTokenAuthMiddleware` calls `validate_token`. Invert the order
316+
> and `_routing_tenant.get()` returns `None` on every request.
317+
212318
### Discovery tools bypass auth
213319

214320
Per AdCP spec, `get_adcp_capabilities` is the handshake — clients MUST
@@ -234,6 +340,19 @@ spec (e.g. a public `list_public_formats`); extend with `DISCOVERY_TOOLS
234340
[tools/list is unauthenticated by default](#toolslist-is-unauthenticated-by-default)
235341
for the MCP-layer handshake methods this same gate covers.
236342

343+
Call `validate_discovery_set` at import time to guard against accidentally
344+
including non-discovery tools in your extension (a common copy-paste error):
345+
346+
```python
347+
from adcp.server import DISCOVERY_TOOLS, validate_discovery_set
348+
349+
MY_DISCOVERY_TOOLS = DISCOVERY_TOOLS | {"list_public_formats", "get_vendor_catalog"}
350+
validate_discovery_set(MY_DISCOVERY_TOOLS) # raises ValueError for unknown names or mutating tools
351+
```
352+
353+
`validate_discovery_set` does not register the tools — it only validates
354+
the set you pass to your middleware's discovery bypass.
355+
237356
### `tools/list` is unauthenticated by default
238357

239358
MCP's streamable-HTTP transport accepts three JSON-RPC methods as
@@ -371,29 +490,114 @@ of re-executing the handler.
371490

372491
The store keys on `ToolContext.caller_identity` — if your transport
373492
doesn't populate it, per-principal scoping falls through and dedup is
374-
skipped (with a UserWarning). A2A populates it automatically from
493+
skipped (with a `UserWarning`). A2A populates it automatically from
375494
`ServerCallContext.user`; MCP requires you to wire `context_factory`.
376495

377496
Don't rebuild idempotency in your handler. Import the middleware.
378497

379-
## Error handling
498+
### Wiring `@store.wrap` (production pattern)
499+
500+
Decorate the mutating handler methods — `create_media_buy`,
501+
`update_media_buy`, and any other operation your agent implements that
502+
has side effects — with `@idempotency.wrap`:
503+
504+
```python
505+
from adcp.server import ADCPHandler, IdempotencyStore, MemoryBackend, ToolContext
506+
from adcp.server.responses import capabilities_response
507+
508+
idempotency = IdempotencyStore(backend=MemoryBackend(), ttl_seconds=86_400)
509+
510+
511+
class MySeller(ADCPHandler):
512+
@idempotency.wrap
513+
async def create_media_buy(self, params, context: ToolContext | None = None):
514+
return my_create_logic(params)
515+
516+
@idempotency.wrap
517+
async def update_media_buy(self, params, context: ToolContext | None = None):
518+
return my_update_logic(params)
519+
520+
async def get_adcp_capabilities(self, params, context: ToolContext | None = None):
521+
return capabilities_response(["media_buy"], idempotency=idempotency.capability())
522+
```
523+
524+
For production, swap `MemoryBackend()` for `PgBackend` (note the
525+
import path — `PgBackend` lives in `adcp.server.idempotency`, not the
526+
top-level `adcp.server`):
527+
528+
```python
529+
from adcp.server.idempotency import PgBackend
530+
idempotency = IdempotencyStore(backend=PgBackend(pool=pg_pool), ttl_seconds=86_400)
531+
```
532+
533+
The Pg-backed store survives restarts and is shared across workers.
534+
`PgBackend` commits the cached response atomically with your handler's
535+
business write when both run inside the same transaction — no window
536+
where the side effect lands
537+
but the cache entry doesn't.
380538

381-
Raise `AdCPError` (or a subclass: `ADCPTaskError`, `IdempotencyConflictError`)
382-
from handler code. The SDK translates to the wire-level error shape the
383-
AdCP spec mandates — MCP gets a `ToolError` with the spec error code in
384-
the message, A2A gets a `JSON-RPC error` with the code populated.
539+
**`caller_identity` + `tenant_id` must be populated.** The store keys
540+
its cache on `(tenant_id, caller_identity, idempotency_key)`. If
541+
`context.caller_identity` is `None`, the middleware emits a `UserWarning`
542+
and falls through to your handler with no dedup — repeated requests
543+
re-execute and can double-allocate. Always wire `context_factory` on MCP
544+
servers so the auth middleware populates these fields before the handler
545+
runs.
385546

386-
Use the error classification helpers:
547+
## Error handling
548+
549+
**Handler methods return error dicts — they do not raise.** Use
550+
`adcp_error(code)` from `adcp.server`:
387551

388552
```python
389553
from adcp.server import adcp_error
390554

391-
raise adcp_error("BUDGET_TOO_LOW") # auto-classifies as correctable
392-
raise adcp_error("DOWNSTREAM_TIMEOUT") # auto-classifies as transient
555+
async def create_media_buy(self, params, context=None):
556+
if params.get("budget", 0) < 500:
557+
return adcp_error("BUDGET_TOO_LOW", "Budget must be ≥ $500",
558+
field="budget", suggestion="Increase to at least $500")
559+
if rate_limiter.is_over_limit(context.caller_identity):
560+
return adcp_error("RATE_LIMITED", retry_after=30)
561+
return my_create_logic(params)
562+
```
563+
564+
`adcp_error` builds the spec-mandated `{"errors": [...]}` dict and
565+
auto-populates the `recovery` field from a 20+ code table — no
566+
hand-maintaining recovery hints. The SDK translates the returned dict to
567+
the correct wire shape: `ToolError` on MCP, `JSON-RPC error` on A2A.
568+
569+
### Error-code taxonomy
570+
571+
| Recovery | Codes (sample) | Client action |
572+
|---|---|---|
573+
| `transient` | `RATE_LIMITED`, `SERVICE_UNAVAILABLE` | Retry with backoff |
574+
| `correctable` | `BUDGET_TOO_LOW`, `INVALID_REQUEST`, `MEDIA_BUY_NOT_FOUND`, `CONFLICT` | Fix the request and resubmit |
575+
| `terminal` | `AUTH_REQUIRED`, `ACCOUNT_NOT_FOUND`, `ACCOUNT_SUSPENDED` | Stop; require human intervention |
576+
577+
Full list: `adcp.server.helpers.STANDARD_ERROR_CODES`.
578+
579+
### `adcp_error` vs `ADCPTaskError`
580+
581+
`ADCPTaskError` is the exception the **client SDK** raises when it
582+
receives an error response. Server-side handler authors never construct
583+
or raise it. The distinction matters when you're writing both sides:
584+
585+
```python
586+
# SERVER — return a structured error dict:
587+
async def create_media_buy(self, params, context=None):
588+
return adcp_error("PRODUCT_NOT_FOUND", field="product_id",
589+
suggestion="Use get_products to discover available products")
590+
591+
# CLIENT — catch the exception the SDK raises on your behalf:
592+
try:
593+
await client.create_media_buy(params)
594+
except ADCPTaskError as exc:
595+
if "PRODUCT_NOT_FOUND" in exc.error_codes:
596+
products = await client.get_products(...)
393597
```
394598

395-
The recovery hint (transient / correctable / terminal) gets populated
396-
from 20+ standard codes — don't reinvent the table.
599+
Custom error codes (outside `STANDARD_ERROR_CODES`) default to
600+
`recovery="terminal"`. Override with `adcp_error("MY_CODE", recovery="correctable")`.
397601

398602
## Response builders
399603

@@ -826,10 +1030,60 @@ Sellers typically need both.
8261030
lives at `adcp.types` or `adcp` — and the internal paths renumber
8271031
between releases (see `MIGRATION_v3_to_v4.md`).
8281032

1033+
## Troubleshooting
1034+
1035+
**Idempotency dedup isn't firing — repeated creates still execute.**
1036+
1037+
Check that `context.caller_identity` is non-`None` when the handler
1038+
runs. The idempotency middleware silently falls through (with a
1039+
`UserWarning` in server logs) when it can't scope the cache namespace.
1040+
On MCP servers, this means `context_factory` is absent or returns a
1041+
`ToolContext` without `caller_identity`. On A2A servers, it means the
1042+
request arrived without a `ServerCallContext.user`. Fix: wire
1043+
`context_factory=auth_context_factory` on `create_mcp_server`, and
1044+
ensure your `validate_token` returns a `Principal` with
1045+
`caller_identity` set.
1046+
1047+
**`context_factory` returned a plain `dict` and now the handler explodes
1048+
with `AttributeError: 'dict' object has no attribute 'caller_identity'`.**
1049+
1050+
`context_factory` must return a `ToolContext` instance (or a subclass),
1051+
not a dict. The SDK's dispatcher reads `context.caller_identity`,
1052+
`context.tenant_id`, and any subclass fields as attributes. Returning a
1053+
dict is a type error at dispatch time. Fix: return
1054+
`ToolContext(caller_identity=..., tenant_id=...)` or your own subclass.
1055+
1056+
**`tools/list` returns an empty tool list (or just `get_adcp_capabilities`).**
1057+
1058+
By default the SDK only advertises tools whose handler methods your
1059+
subclass actually overrides. A handler that overrides only
1060+
`get_adcp_capabilities` + `get_products` surfaces exactly those two.
1061+
If you expect all 57 spec tools to appear for a storyboard client,
1062+
pass `advertise_all=True` to `serve()` / `create_mcp_server()`.
1063+
1064+
**`validate_discovery_set` raises `ValueError` listing a tool I know is valid.**
1065+
1066+
The function checks that every name in the extended set is either in
1067+
`DISCOVERY_TOOLS` or an AdCP-defined pre-auth name it recognises. If
1068+
you added a vendor-specific handshake tool, the function can't
1069+
auto-classify it. Pass the validated set directly to your middleware's
1070+
discovery bypass and skip `validate_discovery_set` for your extension
1071+
names, or file an issue to add the name to the shipped default.
1072+
1073+
**Handler raises `AuthenticationRequired` but the client sees `500 Internal Server Error`.**
1074+
1075+
`AuthenticationRequired` (or any exception that isn't an `ADCPError`
1076+
subclass) is translated to an opaque 500 by the executor — intentional,
1077+
to avoid leaking server internals. Return `adcp_error("AUTH_REQUIRED")`
1078+
instead; the SDK maps it to an authenticated-but-rejected error shape the
1079+
client can handle programmatically.
1080+
8291081
## Where to look next
8301082

8311083
- `examples/minimal_sales_agent.py` — handler-only starting point.
832-
- `examples/mcp_with_auth_middleware.py` — full auth + typed context.
1084+
- `examples/mcp_with_auth_middleware.py` — full auth + typed context
1085+
via `BearerTokenAuthMiddleware`. Foundation for Pattern 2b; bring
1086+
your own subdomain-routing middleware on top.
8331087
- `src/adcp/server/responses.py` — response builder reference.
8341088
- `src/adcp/server/helpers.py` — error codes, state machine, account
8351089
resolution.

examples/mcp_with_auth_middleware.py

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,10 @@
1919
agents also typically load tokens from a database — swap
2020
``validator_from_token_map`` for an ``async def validate_token`` that
2121
hits your token store.
22+
23+
For advanced production patterns — subdomain-based tenant routing
24+
(Pattern 2b), ResolvedIdentity DB enrichment, idempotency wiring, and
25+
error classification — see ``docs/handler-authoring.md``.
2226
"""
2327

2428
from __future__ import annotations

0 commit comments

Comments
 (0)