fix(runner): attribute Zod schema rejects to response_schema (#1709)#1712
Merged
Conversation
When the SDK's response unwrapper rejects an agent response against the codegen-emitted Zod schema for the tool, the exception previously propagated as a generic Error. The runner caught it as stepResult.error but never attributed the failure to schema validation in step.validations. Step-scope invariants then ran against the malformed payload, and whichever fired first (canonically context.no_secret_echo) became the surfaced failure. BidMachine spent 10+ deploys chasing the wrong cause (adcontextprotocol/adcp#4419). Three layers: 1. `ResponseSchemaValidationError` typed error in response-unwrapper.ts. Carries toolName, issues, data. Replaces the two generic throws. Stable `name === 'ResponseSchemaValidationError'` for cross-bundle detection without instanceof. 2. `runStep` in testing/client.ts returns `{ result?, step, caughtError? }`. The new caughtError preserves the raw thrown value so callers can pattern-match on typed exceptions. Pre-existing callers unaffected. 3. Storyboard runner `executeStep` threads caughtError, and on ResponseSchemaValidationError prepends a synthesized response_schema ValidationResult. extractFailures.find now picks it up before any inline or invariant entry. Step-scope invariants in executeStoryboardPass short-circuit when the synthesized entry is present — each bypassed invariant emits a skip marker so consumers see WHICH was bypassed and why. Tests: 6/6 in test/lib/storyboard-schema-validation-attribution.test.js cover synthesized entry presence, prepend ordering, invariant short-circuit + skip markers, step.passed=false correctness, and the typed error class shape. Sequencing dependency for #1707 (dynamic schema fetch + strict→passthrough + codegen regen). Lands first so any remaining Zod rejects during #1707's rollout produce honest signal instead of misattributing. Coordinated stance: #1685 (the SDK is a witness, not a translator). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 12, 2026
bokelley
added a commit
that referenced
this pull request
May 12, 2026
…med fields (#1713) (#1714) The default context.no_secret_echo invariant flagged any response field whose name was in SUSPECT_PROPERTY_NAMES (authorization, api_key, apikey, bearer, x-api-key) regardless of the field's value. This over-rejected spec-legitimate structured fields — notably the authorization object on validation-result.json (structured authorization- validation payload, not a credential) and any seller-side extension fields named authorization under sync_accounts_response.accounts[] (additionalProperties: true). Symptom: BidMachine's sync_accounts failures in adcontextprotocol/adcp#4419 all surfaced as context.no_secret_echo. Diagnosis in #4419 pointed at Zod .strict() codegen lag, but verification of published SDK tarballs (5.25.1 / 6.12.0 / current) shows SyncAccountsResponseSchema.accounts[] already uses .passthrough(). Zod silently accepts the authorization field; the NAME dragnet in this invariant is the actual rejection. Fix: narrow findSecretEcho so the suspect-name check only fires when the field VALUE is a non-empty string. Structured object/array values on suspect-named fields pass through to the recursive walk, which still scans nested strings against BEARER_TOKEN_PATTERN and caller-supplied secret literals. The actual leak shapes the invariant was designed to catch remain caught. Five new test cases: - Suspect name with object value passes (the BidMachine case) - Suspect name with array value passes - Suspect name with empty string passes - Suspect name with non-empty string value still fails - Bearer literal nested inside structured suspect-named object still fails via the value-scan regression guard 113/113 invariant tests pass. Sequencing: - #1709 (merged via #1712) addresses Zod-reject misattribution (different path); this fix addresses the case where Zod ACCEPTS but the invariant over-rejects on field name. - #1707 (dynamic schema fetch + strict→passthrough) remains real architectural cleanup but does NOT unblock BidMachine. - #1711 (fgranata): the "Zod .strict() codegen lag" diagnosis was incorrect for the published SDK; this fix is the actual unblocker. Coordinated stance: #1685 (the SDK is a witness, not a translator). Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Merged
This was referenced May 12, 2026
bokelley
added a commit
that referenced
this pull request
May 12, 2026
…#1708) (#1715) Locks the post-7.1.0 attribution invariants so future refactors of comply()'s extractFailures can't silently reintroduce the BidMachine misattribution shape (adcp#4419). What's locked: 1. A storyboard step carrying both a synthesized response_schema failure (prepended by the runner per #1709 / PR #1712) and an assertion entry surfaces validation.check === 'response_schema' in ComplianceResult.failures — never 'assertion'. The attribution that was silently broken pre-7.1.0 (Zod rejects fell through to the next invariant, canonically context.no_secret_echo). 2. Skipped-invariant markers (passed: true entries the runner emits when short-circuiting invariants downstream of a schema failure per #1712) are correctly filtered out — only failed validations surface in `failures`. A future change that included passed: true entries would crowd out the real failure. 3. A clean BidMachine-shape response (structured authorization field passing no_secret_echo per #1713 / PR #1714) produces zero failures through the aggregation layer. 4. Multi-storyboard aggregation preserves per-storyboard (storyboard_id, step_id, validation.check) tuples. 5/6. Clean pass paths (no failures, skipped steps) produce empty failures. API change (minor): extractFailures (previously file-internal) is now exported from src/lib/testing/compliance/comply.ts so the regression test can call it directly with synthetic StoryboardResult fixtures. Functionally identical; just visibility. Scope correction relative to the original #1708 framing: the "cross-evaluator divergence" symptom was version-driven (different @adcp/sdk versions hitting #1713 and #1709 differently), not a true parity gap. Both root causes shipped in 7.1.0; this test is the durable guard for the aggregation-layer invariants those fixes depend on. 7/7 tests pass. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Merged
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #1709.
The bug
When the SDK's response unwrapper rejects an agent response against the codegen-emitted Zod schema for the tool, the exception previously propagated as a generic
Error('Response validation failed for ${toolName}: ...'). The runner's step-execution catch absorbed it intostepResult.errorbut never attributed the failure to schema validation instep.validations[]. Step-scope invariants then ran against the malformed payload, and whichever fired first (canonicallycontext.no_secret_echo— the next default-invariant in the registered queue) became the surfaced failure inextractFailures.BidMachine spent 10+ deploys chasing the
no_secret_echoghost before the strict-Zod root cause surfaced. Full diagnostic trace at adcontextprotocol/adcp#4419.The fix — three layers
1. Typed error class (
src/lib/utils/response-unwrapper.ts).New
ResponseSchemaValidationErrorextendsError. Carries:toolName: string— the AdCP tool whose response failedissues: z.core.$ZodIssue[]— raw Zod issues for structured attributiondata: unknown— the rejected payload for diagnostic reportingname === 'ResponseSchemaValidationError'so cross-bundle consumers can detect by string match withoutinstanceofReplaces the two generic
new Error(...)throws inunwrapProtocolResponse.2.
runSteppreserves the raw thrown value (src/lib/testing/client.ts).Return signature widened to
{ result?, step, caughtError? }. The newcaughtErroris the raw thrown value (typedunknown) so callers can pattern-match on typed exceptions. Pre-existing callers consuming onlyresultandstepare unaffected.3. Storyboard runner attribution + invariant short-circuit (
src/lib/testing/storyboard/runner.ts).executeStepthreadscaughtErrorfrom the dispatch fn. WhencaughtError instanceof ResponseSchemaValidationError:response_schemaValidationResult with the structured Zod issues, the failing tool name, an RFC 6901-shapedjson_pointer, and the rejection message.step.validations[]soextractFailures.find(v => !v.passed)resolves it before any inline or invariant entry.In
executeStoryboardPass, step-scope invariants short-circuit when the synthesized entry is present. Each bypassed invariant emits a single skip marker (passed: true,description: '${spec.id}: skipped — response failed schema validation') so consumers see WHICH invariants were bypassed and why, without crowding out the schema failure inextractFailures.Tests
New
test/lib/storyboard-schema-validation-attribution.test.js:response_schemaentry present instep.validationswhen unwrapper rejectsresponse_schemais FIRST instep.validationssoextractFailurespicks it over invariantsrunStoryboardfull-pass)step.passed === false(schema rejection is a real failure)ResponseSchemaValidationErrorclass shape carries toolName / issues / datanamefield is the literal string for cross-bundle detection6/6 pass. Sweep of
test/lib/storyboard-*.test.jscovered 50+ tests with no regressions through the in-progress monitor.Sequencing
This PR is the upstream dependency for #1707:
schema_version_fetch_failedadvisory needs#1709 lands first so any remaining Zod rejects during #1707's rollout produce honest signal as the strict→passthrough flip reduces reject frequency — instead of misattributing the residual cases.
Validation hook
fgranata (BidMachine) volunteered as a retest target once #1707 + #1709 land. Current comply-suite score is 63/128 (49%) vs CLI runner 45/59 (76%) — the 27-point delta is the diagnostic signal both fixes should close. Filed under adcp-client#1711.
Test plan
npm run buildcleannpm run format:checkcleannode --test test/lib/storyboard-schema-validation-attribution.test.js— 6/6 passnode --test test/lib/storyboard-*.test.js— 50+ tests passing, no regressionssync_accountsfailures should attribute toresponse_schemainstead ofcontext.no_secret_echo; one-step verification once they retest.Part of the #1685 coordinated stance ("the SDK is a witness, not a translator"). Same anti-pattern as adcp-client#1702 / #1705: the SDK was fabricating a different failure (
no_secret_echo) than what actually went wrong (schema rejection).🤖 Generated with Claude Code