fix: estimate input/output token split for copilot-cli provider #684
Merged
Conversation
The ACP usage_update event only reports cumulative context window tokens (`used`), not separate input/output counts. Previously, output was hardcoded to 0, making token_usage misleading for copilot-cli targets. This change tracks characters flowing in each direction (prompt + tool results as input; agent message chunks as output) and pro-rates the total `used` tokens proportionally to estimate the split.

Closes #683

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
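The pro-rating step described above can be sketched as follows. This is a minimal illustration of the approach, not the provider's actual code; the function name and signature are hypothetical:

```typescript
// Split a cumulative `used` token count into estimated input/output
// tokens, proportional to the observed character counts in each
// direction. Hypothetical sketch of the pro-rating described in the PR.
function splitUsedTokens(
  usedTokens: number,
  inputChars: number,  // prompt text + tool result payloads
  outputChars: number, // agent_message_chunk text
): { inputTokens: number; outputTokens: number } {
  const totalChars = inputChars + outputChars;
  if (totalChars === 0) {
    // Nothing observed yet: attribute the whole count to input.
    return { inputTokens: usedTokens, outputTokens: 0 };
  }
  const outputTokens = Math.round(usedTokens * (outputChars / totalChars));
  return { inputTokens: usedTokens - outputTokens, outputTokens };
}
```

For example, with `used = 1000` and a 3:1 input/output character ratio, this attributes roughly 750 tokens to input and 250 to output.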
Deploying agentv with
Latest commit: d02905e
Status: ✅ Deploy successful!
Preview URL: https://51667ae5.agentv.pages.dev
Branch Preview URL: https://fix-copilot-cli-token-usage.agentv.pages.dev
Copilot CLI does not currently emit usage_update events via ACP; the usage data is tracked internally but marked ephemeral and not sent to clients (see github/copilot-cli#1152). Previously this meant token_usage was always undefined for copilot-cli targets.

This change estimates token usage from observed character counts:
- Input chars: prompt text + tool result payloads flowing to the agent
- Output chars: agent_message_chunk text flowing from the agent
- Both converted to tokens using a ~4 chars/token heuristic

When/if Copilot CLI starts emitting usage_update events, the provider will prefer those values (with char-based output estimation to split the cumulative `used` count into input/output).

Closes #683

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
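The chars-to-tokens conversion above is simple enough to sketch directly. The constant and function name here are illustrative assumptions, not identifiers from the repo:

```typescript
// ~4 chars/token heuristic, as described in the PR. The accumulation
// shown in the comments mirrors the two directions tracked by the
// provider; the names are hypothetical.
const CHARS_PER_TOKEN = 4;

function estimateTokensFromChars(chars: number): number {
  // Round up so that any non-empty text counts as at least one token.
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

// Usage sketch: accumulate characters per direction as events arrive,
// then convert once at the end of the turn.
//   inputChars  += promptText.length + toolResultPayload.length;
//   outputChars += agentMessageChunkText.length;
//   const inputTokens  = estimateTokensFromChars(inputChars);
//   const outputTokens = estimateTokensFromChars(outputChars);
```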
Note on estimation accuracy: this is an inherent limitation of the ACP protocol not exposing token data. Accurate counts are blocked on github/copilot-cli#1152.
The ACP PromptResponse includes a Usage field with inputTokens, outputTokens, thoughtTokens, and cachedReadTokens. Although Copilot CLI v1.0.9 doesn't populate this yet (marked @experimental/UNSTABLE in the ACP spec), this positions the provider to use accurate token counts as soon as Copilot starts returning them.

Token usage resolution order:
1. PromptResponse.usage (accurate, from ACP; not yet populated)
2. usage_update session events (not yet emitted via ACP)
3. Character-based estimation (~4 chars/token heuristic)

Also makes raceWithTimeout generic so it preserves the PromptResponse return value instead of discarding it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rces

Char-based estimation produced misleading values (e.g. input: 65 when the real input was 3000+ tokens) because we can't observe Copilot's system prompt and internal context. Better to report nothing than to mislead.

What remains:
- PromptResponse.usage capture (accurate when Copilot populates it)
- usage_update event handler with cost accumulation
- Generic raceWithTimeout preserving the PromptResponse return value
- Code comments documenting why token_usage is currently undefined

Blocked on github/copilot-cli#1152 for accurate token reporting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- usage_update only reports cumulative context window tokens (`used`), not separate input/output counts; output was hardcoded to 0
- Tracks characters flowing in each direction and pro-rates the `used` tokens proportionally to estimate the split
- Closes #683
Test plan
🤖 Generated with Claude Code