
fix: estimate input/output token split for copilot-cli provider#684

Merged
christso merged 5 commits into main from fix/copilot-cli-token-usage-estimation
Mar 20, 2026
Conversation


@christso christso commented Mar 19, 2026

Summary

  • Copilot CLI's ACP usage_update only reports cumulative context window tokens (used), not separate input/output counts — output was hardcoded to 0
  • Tracks characters flowing as input (prompt + tool results) vs output (agent message chunks) and pro-rates used tokens proportionally
  • Zero performance impact — just incrementing two counters on events already being processed

Closes #683
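The counter-and-split approach described above can be sketched as follows. This is a minimal illustrative sketch, not the actual provider code — the class and method names are hypothetical:

```typescript
// Hypothetical sketch: pro-rating a cumulative `used` token count into
// input/output using character counts observed in each direction.
class TokenSplitEstimator {
  private inputChars = 0;  // prompt text + tool result payloads
  private outputChars = 0; // agent message chunks

  addInput(text: string): void {
    this.inputChars += text.length;
  }

  addOutput(text: string): void {
    this.outputChars += text.length;
  }

  // Split the cumulative `used` count proportionally to observed chars.
  split(used: number): { input: number; output: number } {
    const total = this.inputChars + this.outputChars;
    if (total === 0) return { input: used, output: 0 };
    const output = Math.round((this.outputChars / total) * used);
    return { input: used - output, output };
  }
}
```

Because the counters only increment on events the provider already handles, the overhead claim above holds: the split itself is a single division at reporting time.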

Test plan

  • Full test suite passes (1472/1472, matches main)
  • Typecheck clean
  • Lint clean
  • Pre-push hooks all green (Build, Typecheck, Lint, Test)
  • Manual E2E test against a copilot CLI target

🤖 Generated with Claude Code

The ACP usage_update event only reports cumulative context window tokens
(used), not separate input/output counts. Previously output was hardcoded
to 0, making token_usage misleading for copilot-cli targets.

This change tracks characters flowing in each direction (prompt + tool
results as input, agent message chunks as output) and pro-rates the
total used tokens proportionally to estimate the split.

Closes #683

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

cloudflare-workers-and-pages bot commented Mar 19, 2026

Deploying agentv with Cloudflare Pages

Latest commit: d02905e
Status: ✅  Deploy successful!
Preview URL: https://51667ae5.agentv.pages.dev
Branch Preview URL: https://fix-copilot-cli-token-usage.agentv.pages.dev


Copilot CLI does not currently emit usage_update events via ACP — the
usage data is tracked internally but marked ephemeral and not sent to
clients (see github/copilot-cli#1152). Previously this meant token_usage
was always undefined for copilot-cli targets.

This change estimates token usage from observed character counts:
- Input chars: prompt text + tool result payloads flowing to the agent
- Output chars: agent_message_chunk text flowing from the agent
- Both converted to tokens using a ~4 chars/token heuristic

When/if copilot CLI starts emitting usage_update events, the provider
will prefer those values (with char-based output estimation to split
the cumulative `used` count into input/output).

Closes #683

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
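The char-to-token estimation described in that commit message could look roughly like this. A sketch only, assuming the ~4 chars/token heuristic stated above; function and field names are illustrative, not the real provider API:

```typescript
// Rough average for English text and code output from LLMs.
const CHARS_PER_TOKEN = 4;

function estimateTokens(chars: number): number {
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

// Input chars = prompt text + tool result payloads sent to the agent.
// Output chars = agent_message_chunk text received from the agent.
function estimateUsage(inputChars: number, outputChars: number) {
  return {
    input_tokens: estimateTokens(inputChars),
    output_tokens: estimateTokens(outputChars),
  };
}
```

As the follow-up comment notes, the input side is a lower bound: characters the client never observes (system prompt, tool definitions) cannot be counted this way.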
@christso christso marked this pull request as ready for review March 19, 2026 23:05
@christso
Collaborator Author

Note on estimation accuracy:

  • Output tokens: Reasonable estimate — ~4 chars/token holds well for English/code LLM output
  • Input tokens: Significantly underestimated — we only see the user prompt and tool results, not copilot's internal system prompt, tool definitions, and context window setup (easily 2,000-10,000+ tokens). The reported input value is a lower bound, not a true count

This is an inherent limitation of the ACP protocol not exposing token data. Accurate counts are blocked on github/copilot-cli#1152.

christso and others added 3 commits March 19, 2026 23:43
The ACP PromptResponse includes a Usage field with inputTokens,
outputTokens, thoughtTokens, and cachedReadTokens. Although copilot
CLI v1.0.9 doesn't populate this yet (marked @experimental/UNSTABLE
in the ACP spec), this positions the provider to use accurate token
counts as soon as copilot starts returning them.

Token usage resolution order:
1. PromptResponse.usage (accurate, from ACP — not yet populated)
2. usage_update session events (not yet emitted via ACP)
3. Character-based estimation (~4 chars/token heuristic)
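That three-tier resolution order might be expressed as the following fallback chain. This is an illustrative sketch, not the merged code — the `Usage` shape loosely follows the ACP field names mentioned above, and the function signature is hypothetical:

```typescript
// Loosely modeled on the ACP PromptResponse usage fields.
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

function resolveTokenUsage(
  promptUsage: Usage | undefined,  // 1. PromptResponse.usage (accurate)
  sessionUsed: number | undefined, // 2. cumulative count from usage_update
  inputChars: number,
  outputChars: number,
): { input: number; output: number } | undefined {
  if (promptUsage) {
    return { input: promptUsage.inputTokens, output: promptUsage.outputTokens };
  }
  if (sessionUsed !== undefined) {
    // Split the cumulative count by observed character proportions.
    const total = inputChars + outputChars;
    const output = total > 0 ? Math.round((outputChars / total) * sessionUsed) : 0;
    return { input: sessionUsed - output, output };
  }
  // 3. Pure char-based estimation (~4 chars/token heuristic).
  if (inputChars + outputChars > 0) {
    return {
      input: Math.ceil(inputChars / 4),
      output: Math.ceil(outputChars / 4),
    };
  }
  return undefined;
}
```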

Also makes raceWithTimeout generic so it preserves the PromptResponse
return value instead of discarding it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
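The generic raceWithTimeout mentioned in that commit could be sketched like this — an assumed implementation of the described behavior (preserving the resolved value's type), not the repository's actual code:

```typescript
// Race a promise against a timeout while preserving its resolved type T.
function raceWithTimeout<T>(
  promise: Promise<T>,
  ms: number,
  message = "operation timed out",
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(message)), ms);
  });
  // Clear the timer on either outcome so it can't keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Typing the timeout branch as `Promise<never>` lets `Promise.race` infer `Promise<T>`, so callers get the PromptResponse back instead of `Promise<void>`.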
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rces

Char-based estimation produced misleading values (e.g. input: 65 when
real input was 3000+ tokens) because we can't observe copilot's system
prompt and internal context. Better to report nothing than to mislead.

What remains:
- PromptResponse.usage capture (accurate when copilot populates it)
- usage_update event handler with cost accumulation
- Generic raceWithTimeout preserving PromptResponse return value
- Code comments documenting why token_usage is currently undefined

Blocked on github/copilot-cli#1152 for accurate token reporting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@christso christso merged commit 4c67ebc into main Mar 20, 2026
1 check passed
@christso christso deleted the fix/copilot-cli-token-usage-estimation branch March 20, 2026 00:02


Development

Successfully merging this pull request may close these issues.

copilot-cli provider: token_usage missing output tokens for target agent
