Improve remote site handling: prefetch plan and truncate large API responses#3011

Open
lezama wants to merge 3 commits into trunk from improve-remote-site-plan-check
@lezama lezama commented Apr 8, 2026

Related issues

How AI was used in this PR

Pair-programmed with Claude Code. All code reviewed and tested manually against a live WordPress.com site.

Proposed Changes

Two improvements to the remote WordPress.com site experience in studio ai:

  • Prefetch plan at site selection: Include plan in the /me/sites API request so it's already available when the user selects a remote site. The agent receives the plan in its prompt context (e.g., plan: free_plan) and no longer needs a separate GET / call — which returned ~103K chars and exceeded the SDK's MCP output token limit.

  • Truncate oversized API responses: When any wpcom_request response exceeds 80K characters (~20K tokens), truncate it and append a hint telling the agent to use the fields query parameter. This is a generic safety net — the SDK's limit is 25K tokens (MAX_MCP_OUTPUT_TOKENS), and at ~4 chars/token, 80K chars stays safely under it.
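A minimal sketch of the prefetch idea, assuming the site list is requested with a restricted `fields` query parameter. The helper name `buildSitesPath` and every field name other than `plan` are illustrative assumptions, not taken from the diff.

```typescript
// Hypothetical helper: builds the /me/sites path with a restricted field list,
// so the plan arrives together with the basic site data in one request.
function buildSitesPath( fields: string[] ): string {
	// Field names are simple identifiers; encode each one defensively anyway.
	const query = fields.map( ( field ) => encodeURIComponent( field ) ).join( ',' );
	return `/me/sites?fields=${ query }`;
}

// Assumed field list for illustration; only `plan` is named in the PR description.
const sitesPath = buildSitesPath( [ 'ID', 'name', 'URL', 'plan' ] );
console.log( sitesPath );
```

Requesting the plan alongside the site list is what removes the need for the later GET / call.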

Testing Instructions

  1. Build the CLI: npm install && npm run cli:build
  2. Run studio ai, select a remote WordPress.com site
  3. Ask: "What can I do with templates on this site?"
  4. Verify:
    • The agent does not make a GET / call to check the plan
    • The plan is shown in the active site context (e.g., plan: free_plan)
    • No "exceeds maximum allowed tokens" errors
    • If on a free plan, the agent correctly notes design limitations

Pre-merge Checklist

  • Have you checked for TypeScript, React or other console errors?

🤖 Generated with Claude Code

@lezama lezama requested review from epeicher and youknowriad April 8, 2026 13:04
name: activeSite.name,
url: activeSite.url ?? '',
id: activeSite.wpcomSiteId!,
planSlug: activeSite.planSlug,
Contributor

Are there cases where the plan is undefined and we need a fallback? (I don't know wp.com well enough.)


// Response too large — return top-level keys and a hint
const keys = typeof result === 'object' && result !== null ? Object.keys( result ) : [];
const truncated = json.slice( 0, MAX_RESPONSE_CHARS );
Contributor

So this is a string truncation, which probably results in invalid JSON. Does that matter or cause any issues?

Contributor Author

I don't think so; the agent reads it as text rather than parsing it programmatically 🤔. The hint appended at the end explicitly says the response is truncated, lists the top-level keys, and tells the agent to retry with a fields parameter.
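For reference, a rough sketch of the truncate-and-hint behavior discussed in this thread. The function name `formatResponse`, the `maxChars` parameter, and the exact hint wording are assumptions for illustration; only the `keys` and `truncated` lines mirror the diff above.

```typescript
// Assumed limit: the PR description uses 80K characters for the first iteration.
const MAX_RESPONSE_CHARS = 80000;

// Hypothetical wrapper around the serialization step of the tool handler.
function formatResponse( result: unknown, maxChars: number = MAX_RESPONSE_CHARS ): string {
	const json = JSON.stringify( result, null, 2 );
	if ( json.length <= maxChars ) {
		return json;
	}

	// Response too large: keep a prefix and tell the agent how to narrow the request.
	const keys = typeof result === 'object' && result !== null ? Object.keys( result ) : [];
	const truncated = json.slice( 0, maxChars );
	const hint =
		`[Response truncated: showing the first ${ maxChars } of ${ json.length } characters. ` +
		`Top-level keys: ${ keys.join( ', ' ) || '(none)' }. ` +
		'Retry with the "fields" query parameter to request only what you need.]';

	return `${ truncated }\n\n${ hint }`;
}
```

The truncated prefix is not valid JSON on its own, which is acceptable here only because the consumer is a language model reading text, not a parser.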

Contributor

@youknowriad youknowriad left a comment

Still testing, but it's working well so far.

@youknowriad
Contributor

I'm still seeing this kind of error from time to time. It doesn't seem to impact the behavior.

Error: result (93,893 characters) exceeds maximum allowed tokens.

@youknowriad
Contributor

The plan is shown in the active site context (e.g., plan: free_plan)

Where can I see this context?

@wpmobilebot
Collaborator

wpmobilebot commented Apr 8, 2026

📊 Performance Test Results

Comparing 4f69fa6 vs trunk

app-size

| Metric | trunk | 4f69fa6 | Diff | Change |
|---|---|---|---|---|
| App Size (Mac) | 1252.13 MB | 1252.13 MB | +0.00 MB | ⚪ 0.0% |

site-editor

| Metric | trunk | 4f69fa6 | Diff | Change |
|---|---|---|---|---|
| load | 1857 ms | 1939 ms | +82 ms | 🔴 4.4% |

site-startup

| Metric | trunk | 4f69fa6 | Diff | Change |
|---|---|---|---|---|
| siteCreation | 9108 ms | 9159 ms | +51 ms | 🔴 0.6% |
| siteStartup | 4171 ms | 4929 ms | +758 ms | 🔴 18.2% |

Results are median values from multiple test runs.

Legend: 🟢 Improvement (faster) | 🔴 Regression (slower) | ⚪ No change (<50ms diff)

@lezama
Contributor Author

lezama commented Apr 8, 2026

I'm still seeing this kind of error from time to time. It doesn't seem to impact the behavior.

@youknowriad did you find a way to reproduce it consistently?

Where can I see this context?

It's injected in the enriched prompt that the agent receives at the start of each turn in commands/ai/index.ts line 377. It shows as [Active site: "Open me Carefully" (ID: 123) at https://... (WordPress.com, plan: free_plan)].
It's not visible in the UI, only in the agent's context. Should we surface it somewhere in the TUI as well 🤔 ?
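A small sketch of how that context line could be assembled, including a fallback for sites where the plan is undefined. The `ActiveSite` shape and the `formatActiveSiteContext` name are assumptions for illustration and may not match the code in commands/ai/index.ts.

```typescript
// Assumed shape, loosely based on the fields visible in the diff (name, url, id, planSlug).
interface ActiveSite {
	name: string;
	id: number;
	url: string;
	planSlug?: string;
}

// Hypothetical formatter for the context line injected into the agent prompt.
function formatActiveSiteContext( site: ActiveSite ): string {
	// Omit the plan segment entirely when no plan was returned for the site.
	const plan = site.planSlug ? `, plan: ${ site.planSlug }` : '';
	return `[Active site: "${ site.name }" (ID: ${ site.id }) at ${ site.url } (WordPress.com${ plan })]`;
}

const context = formatActiveSiteContext( {
	name: 'Open me Carefully',
	id: 123,
	url: 'https://example.com',
	planSlug: 'free_plan',
} );
console.log( context );
```

Dropping the segment when `planSlug` is missing keeps the agent from reading a literal "undefined" as a plan name.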

@youknowriad
Contributor

I can reproduce it consistently by picking a site with a paid plan from the wp.com site list
and saying: "Redesign this site"

One of the first requests (not the very first one) results in:

 ⏺ mcp__studio__wpcom_request (1.3s)
   ⎿ Error: result (93,858 characters) exceeds maximum allowed tokens. Output has been saved to /Users/riad/.claude/projects/-Users-riad-Studio/55ff4fbd-87fa-4395-a749-ee4c4a2c1fc6/tool-results/mcp-studio-wpcom_request-1775665358263.txt.
     Format: JSON array with schema: [{type: string, text: string}]
     Use offset and limit parameters to read specific portions of the file, the Grep tool to search for specific content, and jq to make structured queries.
     REQUIREMENTS FOR SUMMARIZATION/ANALYSIS/REVIEW:
     - Yo…
   ⎿ {
       "id": 5,
       "title": {
         "raw": "Riopark Luxury White",
         "rendered": "Riopark Luxury White"
     ... 19 more lines · ctrl+o to expand

lezama and others added 2 commits April 8, 2026 16:57
…sponses

Two improvements to the remote WordPress.com site experience in `studio ai`:

1. **Prefetch plan at site selection**: Include `plan` in the
   `/me/sites` request so it's available when the site is selected.
   The agent no longer needs a separate API call to check the plan —
   it arrives in the prompt context alongside the site name and URL.

2. **Truncate oversized API responses**: When a `wpcom_request` response
   exceeds 80K characters (~20K tokens), truncate it and append a hint
   telling the agent to use the `fields` query parameter. This prevents
   the SDK's 25K token MCP output limit from triggering a file-save
   fallback that breaks the conversation flow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ds, lower limit

The Claude Agent SDK counts actual tokens (not chars/4) for responses over 50K chars.
JSON is token-dense (~3 chars/token), so the previous 80K char limit could exceed the
25K token limit. This fixes it with four changes:

- Compact JSON (no indentation) — ~20-30% fewer chars on every response
- Strip noisy fields (_links, _embedded, guid, rendered when raw exists) — removes
  bloat that isn't useful to the agent
- Lower truncation limit to 50K chars (~16.7K tokens, safely under 25K)
- Tool description now nudges agent to use "fields" and "per_page" upfront

Tested: 35% reduction on /posts endpoint. No more "exceeds maximum allowed tokens" errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@lezama lezama force-pushed the improve-remote-site-plan-check branch from 5f1bba8 to 4a38704 Compare April 8, 2026 21:08
@lezama
Contributor Author

lezama commented Apr 8, 2026

Pushed some further improvements. I am no longer seeing "exceeds maximum allowed tokens" errors in my testing 😅

  • Compact JSON — removed pretty-printing (JSON.stringify(result) instead of JSON.stringify(result, null, 2)), ~30-40% fewer chars on typical responses
  • Strip noisy fields — when a response exceeds the 50K char limit, strips _links, _embedded, guid, ping_status, comment_status, generated_slug, permalink_template, and drops rendered when raw exists. The agent is told which fields were stripped so it can re-request with fields if needed.
  • Lower truncation limit — 80K → 50K chars. The SDK does actual token counting (not estimation) for large responses, and JSON averages ~3 chars/token, not 4. At 50K chars ≈ 16.7K tokens, safely under the 25K limit.
  • Tool description hint — tells the agent upfront to use fields and per_page to avoid oversized responses
  • Tests — added stripNoisyFields unit tests
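The arithmetic behind the lower limit, as a quick sketch. The chars-per-token ratios are the rough estimates quoted in this PR (4 in the original description, 3 for JSON), and `estimateTokens` is an illustrative helper, not the SDK's tokenizer.

```typescript
// SDK limit quoted in the PR description (MAX_MCP_OUTPUT_TOKENS).
const MAX_MCP_OUTPUT_TOKENS = 25000;

// Rough estimate only: divides the character count by an assumed chars-per-token ratio.
function estimateTokens( chars: number, charsPerToken: number ): number {
	return Math.ceil( chars / charsPerToken );
}

const scenarios = [
	{ label: '80K chars at 4 chars/token', chars: 80000, ratio: 4 },
	{ label: '80K chars at 3 chars/token', chars: 80000, ratio: 3 },
	{ label: '50K chars at 3 chars/token', chars: 50000, ratio: 3 },
];

for ( const { label, chars, ratio } of scenarios ) {
	const tokens = estimateTokens( chars, ratio );
	const verdict = tokens <= MAX_MCP_OUTPUT_TOKENS ? 'under' : 'over';
	console.log( `${ label }: ~${ tokens } tokens (${ verdict } the limit)` );
}
```

At 4 chars/token the old 80K limit looks safe (about 20,000 tokens), but at 3 chars/token it comes to about 26,667 tokens and crosses the 25K limit, while 50K chars stays near 16,667.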

@lezama
Contributor Author

lezama commented Apr 8, 2026

Pushed some more improvements! I couldn't hit "exceeds maximum allowed tokens" in my testing so far:

  • Compact JSON — removed pretty-printing (JSON.stringify(result) instead of JSON.stringify(result, null, 2)), ~30-40% fewer chars on typical responses
  • Strip noisy fields — recursively removes _links, _embedded, guid, ping_status, comment_status, generated_slug, permalink_template, and drops rendered when raw exists in the same object. Tested at 35% reduction on /posts.
  • Lower truncation limit — 80K → 50K chars. The SDK does actual token counting (not estimation) for large responses, and JSON averages ~3 chars/token, not 4. At 50K chars ≈ 16.7K tokens, safely under the 25K limit.
  • Tool description hint — tells the agent upfront to use fields and per_page to avoid oversized responses
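A sketch of what the recursive stripping could look like, assuming the key list from the bullet above. This is an illustration of the described behavior, not the `stripNoisyFields` implementation from the diff.

```typescript
// Keys listed in this comment as noisy; treated here as an assumption about the real set.
const NOISY_KEYS = new Set( [
	'_links',
	'_embedded',
	'guid',
	'ping_status',
	'comment_status',
	'generated_slug',
	'permalink_template',
] );

// Recursively removes noisy keys, and drops `rendered` when `raw` exists in the same object.
function stripNoisyFields( value: unknown ): unknown {
	if ( Array.isArray( value ) ) {
		return value.map( ( item ) => stripNoisyFields( item ) );
	}
	if ( typeof value !== 'object' || value === null ) {
		return value;
	}

	const source = value as Record< string, unknown >;
	const hasRaw = 'raw' in source;
	const result: Record< string, unknown > = {};
	for ( const key of Object.keys( source ) ) {
		if ( NOISY_KEYS.has( key ) ) continue;
		if ( key === 'rendered' && hasRaw ) continue;
		result[ key ] = stripNoisyFields( source[ key ] );
	}
	return result;
}

const stripped = stripNoisyFields( {
	id: 5,
	title: { raw: 'Riopark Luxury White', rendered: 'Riopark Luxury White' },
	content: { rendered: '<p>Hello</p>' },
	guid: { rendered: 'https://example.com/?p=5' },
	_links: { self: [] },
} );

// Compact serialization (no indentation), as in the first bullet.
console.log( JSON.stringify( stripped ) );
```

Note that `content.rendered` survives because there is no sibling `raw`; only redundant copies are dropped.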

'comment_status',
'generated_slug',
'permalink_template',
] );
Contributor

There are situations where we need these fields, so this is a bit dangerous IMO. For example, if we want to retrieve the global styles, we need the links. It also feels a bit arbitrary: we seem to be stripping random fields from all requests.

Contributor Author

The idea is to strip them only when results exceed the limit, and note which ones were stripped and why in the reply so the agent is aware and can ask again with some filtering if it needs the full response.

Contributor Author

note which ones were stripped and why in the reply so the agent is aware and can ask again with some filtering if it needs the full response

here: https://github.com/Automattic/studio/pull/3011/changes#diff-8bc987fb45f0d99a9ad64c2c77c6fb430ec58bd3d1be1a3a300004f5e70f4a79R172
