Skip to content

fix(addie): get_schema truncates large schemas at 6K chars#4398

Draft
bokelley wants to merge 3 commits into
mainfrom
claude/issue-4397-fix-get-schema-truncation
Draft

fix(addie): get_schema truncates large schemas at 6K chars#4398
bokelley wants to merge 3 commits into
mainfrom
claude/issue-4397-fix-get-schema-truncation

Conversation

@bokelley
Copy link
Copy Markdown
Contributor

@bokelley bokelley commented May 11, 2026

Closes #4397

Summary

The get_schema Addie tool truncated JSON schemas at a hardcoded 6,000-char limit using a raw substring() cut, producing syntactically broken output and hiding entire oneOf branches. creative/preview-render.json (~7.7K) had its third output_format: "both" branch cut mid-sentence; creative/preview-creative-response.json (~11K) had the variant branch hidden entirely. The truncation note compounded the issue by directing the AI to the property parameter — which only works for schema.properties, not for union-type schemas.

Changes:

  • Raise SCHEMA_MAX_DISPLAY_CHARS from 6K → 50K, covering all current v3 schemas except the largest union enumerations (get-adcp-capabilities-response ~75K, brand.json ~74K — documented as known remaining limitation; both reported schemas and core/format.json ~29K, core/product.json ~25K now return in full)
  • Extract formatSchemaJson() as an exported pure function, making truncation logic unit-testable without HTTP mocks
  • Context-aware truncation note: property-hint for schemas with top-level properties; validate_json guidance for oneOf/allOf/anyOf-only schemas (replaces the previous dead-end list_schemas redirect)
  • Fix propNames derivation when property is set: truncation hint now uses displaySchema.properties so the note points to paths the agent can actually drill into, not siblings it has already passed
  • Raise PRESERVE_TOOL_RESULTS threshold in token-limiter.ts from 20K → 55K to stay coherent with new 50K JSON ceiling
  • Bump CODE_VERSION to 2026.05.2 per playbook (mcp/*.ts tool implementation change)
  • Add 8 unit tests for formatSchemaJson including regression guards for get_schema tool truncates large schemas mid-content #4397

Known remaining limitation: Schemas exceeding 50K chars (get-adcp-capabilities-response.json ~75K, brand.json ~74K, adagents.json ~50K borderline) will still trigger the truncation path. For these, validate_json guidance in the truncation note is the correct fallback.

Non-breaking justification

Server-side Addie tool output change only. Existing callers that worked before still work; callers that were receiving truncated data now receive complete data. No protocol schemas, task definitions, or public API surfaces changed.

Pre-PR review

  • code-reviewer: approved — fixed toLocaleString('en-US') locale safety, stale JSDoc. Noted truncated JSON-in-fence is pre-existing; no blockers.
  • prompt-engineer (iterative review): two blockers addressed — 20K → 50K covers high-traffic schemas; union-only truncation note replaces dead-end list_schemas with actionable validate_json guidance. propNames derivation fix added for sub-property truncation.

Triage-managed PR. This bot does not currently iterate on
review comments or PR conversation threads (only on the source
issue). To unblock:

  • Push fixup commits directly: gh pr checkout <num>
    fix → push.
  • Or re-trigger: comment /triage execute on the source
    issue.

See #3121
for context.

Session: https://claude.ai/code/session_01Qy5CniM5du4RLEDiTadFuJ

…ncation note

The 6,000-char substring cut silently hid entire oneOf branches:
creative/preview-render.json (~7.7K) and creative/preview-creative-response.json
(~11K) were both truncated, with the third oneOf branch in preview-render and
the variant branch in preview-creative-response completely hidden from Addie.

Additionally, the old truncation note directed the AI to use the `property`
parameter, which only works for schema.properties — not for union-type schemas
(oneOf/allOf/anyOf). This pointed Addie toward a dead end, causing hallucination.

Changes:
- Raise SCHEMA_MAX_DISPLAY_CHARS from 6K to 20K
- Extract formatSchemaJson() pure function for testability
- Provide context-aware truncation note: property-hint for schemas with
  top-level properties, union-type hint for union-only schemas
- Raise PRESERVE_TOOL_RESULTS threshold in token-limiter.ts from 20K to 25K
  to keep the two layers coherent (25K = 20K JSON + header overhead)
- Bump CODE_VERSION to 2026.05.2 per playbook (mcp/*.ts change)
- Add 6 unit tests for formatSchemaJson covering regression guard for #4397

Fixes #4397.

https://claude.ai/code/session_01Qy5CniM5du4RLEDiTadFuJ
@bokelley bokelley added the claude-triaged Issue has been triaged by the Claude Code triage routine. Remove to re-triage. label May 11, 2026
claude added 2 commits May 11, 2026 13:12
Regenerated by build; reflects the primary_brand_domain → Linked Domains
UI description update committed in #4313 that was not yet reflected in dist.

https://claude.ai/code/session_01Qy5CniM5du4RLEDiTadFuJ
…note

Follow-up to initial fix — prompt-engineer review flagged two blockers:

1. 20K ceiling still truncates core/format.json (~29K) and core/product.json
   (~25K), the most commonly queried schemas. Raise SCHEMA_MAX_DISPLAY_CHARS
   to 50K, covering all current v3 schemas except the largest union enumerations
   (get-adcp-capabilities-response ~75K, brand.json ~74K — documented as known
   remaining limitation). Raise PRESERVE_TOOL_RESULTS in token-limiter to 55K
   to stay coherent.

2. Union-only truncation note suggested 'list_schemas' — a dead end for inline
   union schemas (brand.json, adagents.json) whose branches are not separate
   registry files. Replace with actionable 'validate_json' guidance.

Also: fix propNames derivation when `property` is set — truncation hint now
uses displaySchema.properties (the sub-property's own children) rather than
root schema.properties, so the hint points to paths the agent can actually
drill into next. Add 2 tests covering these regression cases.

https://claude.ai/code/session_01Qy5CniM5du4RLEDiTadFuJ
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

claude-triaged Issue has been triaged by the Claude Code triage routine. Remove to re-triage.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

get_schema tool truncates large schemas mid-content

2 participants