Merged
6 changes: 6 additions & 0 deletions .changeset/4593-test-surfaces-verifiability-gap.md
@@ -0,0 +1,6 @@
---
---

docs(conformance): document test surfaces, the SDK bridge, and the `_bridge` marker (#4593).

Adds a "Test surfaces and the storyboard loop" section to the Conformance Specification covering the two implementations of the test-surface pattern — DB-backed `seed_*` for state-local sellers (SSPs, creative agents) and the TypeScript SDK's `TestControllerBridge` for upstream-proxy sellers (DSPs, retail-media networks, signals brokers). Frames both as the same pattern, not different seller categories, and clarifies that both earn `(Spec)` while neither is what `(Sandbox)` attests. Documents the SDK's non-normative `_bridge` response marker (shipped in adcp-client#1786) and pins the underscore-prefix convention for SDK/runner-stamped metadata reserved for testing tooling. Adds a three-signal disambiguation table covering test controller availability, the `account.sandbox` flag, and `_bridge` participation. The `comply_test_controller` doc keeps a short pointer back to the canonical section, and the AAO Verified doc cross-links into it from the existing controller-relationship Note.
6 changes: 6 additions & 0 deletions docs/building/by-layer/L3/comply-test-controller.mdx
@@ -611,6 +611,12 @@ State transition scenarios (`force_*`) are idempotent: forcing a status that mat

Simulation scenarios (`simulate_*`) are NOT idempotent — `simulate_delivery` adds to existing totals, while `simulate_budget_spend` replaces the current spend level.
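
The difference between these semantics can be sketched with a small in-memory model. This is only an illustration under stated assumptions: the `CampaignState` shape, the field names, and the helper functions are hypothetical and are not the controller's actual request or storage format.

```typescript
// Hypothetical in-memory state; field names are illustrative, not from the spec.
interface CampaignState {
  status: string;
  deliveredImpressions: number;
  budgetSpend: number;
}

// force_*: forcing the status the entity already has leaves state unchanged.
function forceStatus(state: CampaignState, status: string): CampaignState {
  return { ...state, status };
}

// simulate_delivery: adds to existing totals, so a repeated call accumulates.
function simulateDelivery(state: CampaignState, impressions: number): CampaignState {
  return {
    ...state,
    deliveredImpressions: state.deliveredImpressions + impressions,
  };
}

// simulate_budget_spend: replaces the current spend level instead of adding to it.
function simulateBudgetSpend(state: CampaignState, spend: number): CampaignState {
  return { ...state, budgetSpend: spend };
}
```

In this model, a runner that retries `simulateDelivery` after a timeout would double-count impressions, which is why retries of `simulate_*` calls deserve more care than retries of `force_*` calls.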

## Test surfaces

Where a seller's state-of-record lives determines how the storyboard test loop closes. State-local sellers (typically SSPs, creative agents) write to the seller's DB via the `seed_*` scenarios above; the seller's read handlers consume the same store, and the seed→read loop closes naturally. Upstream-proxy sellers (DSPs proxying to platforms, retail-media networks reading retailer catalogs, signals brokers) cannot close the loop that way because their read handlers reach a system the seller does not control; the TypeScript SDK ships a `TestControllerBridge` that runs the real adapter call first, then merges seeded fixtures into the response. Either path earns the wire-format pass that `AAO Verified (Spec)` attests. Neither path is what `(Sandbox)` attests — that's a separate axis covering whether the seller's production stack honors `account.sandbox: true` without real-world side effects.

The cross-page framing for both implementations of this pattern, the SDK's `_bridge` advisory marker, and the runtime-signals disambiguation table all live in the Conformance Specification → [Test surfaces and the storyboard loop](/docs/building/verification/conformance#test-surfaces-and-the-storyboard-loop).

## Compliance testing modes

The presence of `comply_test_controller` in a seller's tool list determines which mode a compliance tester uses:
2 changes: 1 addition & 1 deletion docs/building/verification/aao-verified.mdx
@@ -57,7 +57,7 @@ The seller-side gate is normative: every comply_test_controller request includes
The (Sandbox) qualifier replaces the earlier draft's `Verified (Live)` framing. The change: instead of attesting "your real-money production code path delivers impressions correctly" (canonical campaigns running through your stack), (Sandbox) attests "your real production code path correctly handles sandbox-flagged traffic across the full storyboard suite." Both are real-prod-surface claims; the difference is what gets tested. (Sandbox) is universally achievable across specialisms with no new AAO operational infrastructure. See [#4379](https://github.com/adcontextprotocol/adcp/issues/4379) for the reframe verdict.

<Note>
**Re: `comply_test_controller`**: the controller is a **dev/staging-only** affordance for adopters' own integration testing. AAO's (Sandbox) grading does not require or use it. Sellers MAY implement controller endpoints in their dev environment to support deterministic local testing, but the production stack does not need to expose `comply_test_controller` to earn (Sandbox). The seller-side sandbox gate is what (Sandbox) attests — schema and lifecycle correctness under flagged traffic, on real prod.
**Re: `comply_test_controller`**: the controller is a **dev/staging-only** affordance for adopters' own integration testing. AAO's (Sandbox) grading does not require or use it. Sellers MAY implement controller endpoints in their dev environment to support deterministic local testing, but the production stack does not need to expose `comply_test_controller` to earn (Sandbox). The seller-side sandbox gate is what (Sandbox) attests — schema and lifecycle correctness under flagged traffic, on real prod. How the dev-time test surface itself is stood up — DB-backed `seed_*` for state-local sellers vs the SDK's `TestControllerBridge` for upstream-proxy sellers — is covered in [Test surfaces and the storyboard loop](/docs/building/verification/conformance#test-surfaces-and-the-storyboard-loop).
</Note>

## Naming history
30 changes: 30 additions & 0 deletions docs/building/verification/conformance.mdx
@@ -30,6 +30,36 @@ A second axis — **AAO Verified (Sandbox)** — verifies the seller's real prod

The two qualifiers share one brand mark — **AAO Verified** — and an agent can earn either or both. **(Spec) and (Sandbox) are independent**: each independently demonstrates conformance through different evidence. (Spec) attests wire-format conformance against any registered endpoint; (Sandbox) attests the production code path correctly tolerates sandbox-flagged traffic. See [AAO Verified](/docs/building/verification/aao-verified) for the qualifier model and the [Sandbox framing verdict](https://github.com/adcontextprotocol/adcp/issues/4379); the rest of this page indexes the storyboards that back both qualifiers.

## Test surfaces and the storyboard loop

Every seller exposes a *test surface* — the mechanism that lets a storyboard runner exercise the seller's tools deterministically without triggering real-world side effects. The test surface is what (Spec) is graded against. How a seller stands up that surface depends on where their state-of-record lives; the implementation differs, the goal does not:

| Where state-of-record lives | How the test loop closes |
|---|---|
| Local DB only (typically SSPs, creative agents) | The storyboard runner writes fixtures via `comply_test_controller.seed_*`; the seller's read handlers consume the same store. The seed → read loop closes naturally. |
| Upstream system the seller does not control (DSPs proxying to platforms, retail-media networks reading retailer catalogs, signals brokers) | Seeded writes never reach the read handler, because it reads from the upstream system rather than the seeded store. The TypeScript SDK ships a `TestControllerBridge` that runs the real adapter call first (so a broken upstream call still fails the gate), then merges seeded fixtures into the response. |
| Mixed (some tools local, some upstream) | Both, per tool. |

Both paths earn `(Spec)`; both prove the seller's wire format matches the storyboards. The bridge is **one implementation** of the test-surface pattern, not a separate seller category. A state-local seller without wired seeds and an upstream-proxy seller without a wired bridge are in the same position: storyboards cannot run end-to-end against them. Neither path is what `(Sandbox)` attests; `(Sandbox)` is the separate axis covering whether the seller's production stack honors `account.sandbox: true` without real-world side effects.
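
The upstream-proxy row can be sketched as "call the real adapter, then merge". The code below is a minimal illustration of that ordering, not the SDK's `TestControllerBridge` API: the `Product` shape, the function names, the rule that a seeded fixture overrides an upstream item with the same `id`, and the way `mergedCount` is computed are all assumptions made for the example.

```typescript
// Hypothetical item shape; real tool responses carry richer objects.
interface Product {
  id: string;
  name: string;
}

interface MergeResult {
  items: Product[];
  mergedCount: number; // how many seeded fixtures were merged in (assumed meaning)
}

// Pure merge step. Assumption: a seeded fixture replaces an upstream item with the same id.
function mergeSeededFixtures(upstream: Product[], seeded: Product[]): MergeResult {
  const byId = new Map<string, Product>();
  for (const item of upstream) byId.set(item.id, item);
  for (const item of seeded) byId.set(item.id, item);
  return { items: Array.from(byId.values()), mergedCount: seeded.length };
}

// The real adapter call runs first. If it rejects, the error propagates and
// no fixtures are merged, so a broken upstream call still fails the step.
async function readThroughBridge(
  callUpstream: () => Promise<Product[]>,
  seeded: Product[],
): Promise<MergeResult> {
  const upstream = await callUpstream();
  return mergeSeededFixtures(upstream, seeded);
}
```

Keeping the merge as a pure function separate from the adapter call makes the ordering guarantee easy to see: fixtures can only be applied to a response the upstream call actually produced.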

### Distinguishing fixture-merged from upstream-derived responses

When a response passes through the SDK's `TestControllerBridge`, the SDK stamps a `_bridge: { callback, tool, merged_count }` marker on the response. Marker presence on a step means the response content was merged from a seeded fixture after the seller's handler returned; marker absence means the response came from the seller's adapter end-to-end (or from a local DB the runner seeded directly). The marker is advisory metadata for runners and downstream leaderboards — it is **not** part of the wire contract. Sellers MUST NOT emit it, and conformance checks ignore it. The leading underscore marks the field as SDK/runner-stamped metadata reserved for testing tooling; future fields with the same prefix follow the same rule.

Marker design: [`adcp-client#1775`](https://github.com/adcontextprotocol/adcp-client/issues/1775). Shipped: [`adcp-client#1786`](https://github.com/adcontextprotocol/adcp-client/pull/1786). Leaderboard policy that consumes the marker: [`adcp-client#1782`](https://github.com/adcontextprotocol/adcp-client/issues/1782).
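
A runner might consume the marker roughly as follows. This is a hedged sketch: the `_bridge` field names come from the text above, but their types, the `StepResponse` alias, both function names, and the choice to strip only top-level underscore-prefixed keys are assumptions, not the runner's actual behavior.

```typescript
// Marker fields as named in the text; the field types are assumptions.
interface BridgeMarker {
  callback: string;
  tool: string;
  merged_count: number;
}

type StepResponse = Record<string, unknown>;

// Marker presence means fixture content was merged in; absence means the
// response came from the adapter end-to-end or from a directly seeded local DB.
function classifyStep(response: StepResponse): "fixture-merged" | "adapter-or-local" {
  return response["_bridge"] !== undefined ? "fixture-merged" : "adapter-or-local";
}

// Drop underscore-prefixed tooling metadata before wire-format checks,
// since such fields are not part of the wire contract.
function stripToolingMetadata(response: StepResponse): StepResponse {
  const out: StepResponse = {};
  for (const key of Object.keys(response)) {
    if (!key.startsWith("_")) out[key] = response[key];
  }
  return out;
}
```

Stripping by prefix rather than by the literal key `_bridge` means any future underscore-prefixed tooling field would be ignored the same way.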

### Three signals — don't conflate them

Adopters often read these three controls as the same thing. They answer different questions:

| Signal | Question it answers |
|---|---|
| Test controller availability (`comply_test_controller` in `tools/list`) | "Has the seller exposed deterministic-mode forces?" |
| Sandbox flag (`account.sandbox` on requests) | "Is the targeted account a sandbox account, with no real-world side effects?" |
| Bridge participation (`_bridge` marker on a response) | "Did this response come from the adapter's upstream call, or from a fixture the SDK merged in?" |

These are **runtime controls** on individual storyboard steps — distinct from the `(Spec)` and `(Sandbox)` verification qualifiers, which describe what a storyboard pass *attests* over time. A storyboard pass can carry any combination of the three signals.
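
To make the independence of the three signals concrete, here is a small sketch that reads each one from a different place. The `StepSignals` type, the `readSignals` function, and the request shape are hypothetical; only the tool name `comply_test_controller`, the `account.sandbox` flag, and the `_bridge` key come from the text above.

```typescript
interface StepSignals {
  controllerAvailable: boolean; // comply_test_controller appears in tools/list
  sandboxAccount: boolean; // account.sandbox is true on the request
  bridgeMerged: boolean; // _bridge marker is present on the response
}

// Hypothetical request shape, reduced to the one field this sketch reads.
interface StepRequest {
  account?: { sandbox?: boolean };
}

function readSignals(
  toolNames: string[],
  request: StepRequest,
  response: Record<string, unknown>,
): StepSignals {
  return {
    controllerAvailable: toolNames.indexOf("comply_test_controller") !== -1,
    sandboxAccount: request.account?.sandbox === true,
    bridgeMerged: response["_bridge"] !== undefined,
  };
}
```

Each field is derived from a separate input (tool list, request, response), so no signal can be inferred from another.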

## Storyboards are the truth

Rather than restate every MUST in prose — which would inevitably drift from the executable suite — **the storyboards ARE the conformance specification.** This document is a navigational index to them, grouped by the declaration that obligates the storyboard to run.