Before submitting
Area
apps/web
Problem or use case
T3 Code already exposes some useful run state, such as the current model selection in the composer, elapsed message metadata, and a context-window meter. What is still hard to see at a glance is the compact per-turn/session statistics that make a coding run easy to audit after the fact.
A useful reference is the t3.chat footer style, which shows a small row like:
GPT-5.4 (High) · 59.17 tok/sec · 3276 tokens · Time-to-First: 4.3 sec · 1 tool call
For long coding sessions, this information helps answer questions like:
- Which model/effort produced this response?
- Was the run slow because first-token latency was high, because generation was slow, or because tool work dominated?
- How many tokens did this turn consume?
- How tool-heavy was this turn?
- Is one provider/model configuration clearly performing better than another?
Today, users have to infer this from scattered UI state, context-window popovers, message timing, or logs. That makes performance/provenance comparison much harder than it needs to be.
Proposed solution
Add a compact per-turn stats footer under each assistant response, or in the existing completion-divider/message metadata area, with graceful hiding for unavailable fields.
Suggested fields, when available:
- provider/model/effort snapshot for the turn, e.g. Codex · GPT-5.5 · High
- elapsed duration, e.g. 2m 14s
- output/turn token count and/or total processed tokens
- generation throughput, e.g. 59 tok/sec, computed only when the numerator and denominator are reliable
- time to first assistant output, e.g. TTFT 4.3s
- tool call count, e.g. 7 tool calls
The footer can stay compact by default and put secondary detail in a tooltip/popover. Missing provider fields should simply be omitted, not shown as zero.
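As a sketch of the "omit rather than fabricate" rule, the footer row could be assembled from a partial stats object, computing throughput only when both the token count and the generation window are known. All names here (TurnFooterStats, throughputLabel, footerRow) are illustrative, not existing code:

```typescript
// Hypothetical per-turn stats shape; field names are illustrative,
// not taken from the actual contracts package.
interface TurnFooterStats {
  outputTokens?: number;
  generationMs?: number; // assistant-output window only, excluding tool time
  ttftMs?: number;
  toolCalls?: number;
}

// Compute tok/sec only when both numerator and denominator are reliable;
// return undefined (omit the field) otherwise, never zero.
function throughputLabel(stats: TurnFooterStats): string | undefined {
  if (
    stats.outputTokens === undefined ||
    stats.generationMs === undefined ||
    stats.generationMs <= 0
  ) {
    return undefined;
  }
  const tokPerSec = stats.outputTokens / (stats.generationMs / 1000);
  return `${tokPerSec.toFixed(2)} tok/sec`;
}

// Assemble the compact footer row, skipping unavailable fields entirely.
function footerRow(stats: TurnFooterStats): string {
  const parts = [
    throughputLabel(stats),
    stats.outputTokens !== undefined ? `${stats.outputTokens} tokens` : undefined,
    stats.ttftMs !== undefined ? `TTFT ${(stats.ttftMs / 1000).toFixed(1)}s` : undefined,
    stats.toolCalls
      ? `${stats.toolCalls} tool call${stats.toolCalls === 1 ? "" : "s"}`
      : undefined,
  ];
  return parts.filter((p): p is string => p !== undefined).join(" · ");
}
```

With all fields present this yields a row in the t3.chat style; with a partial snapshot the missing segments simply disappear instead of rendering as zeros.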
Implementation notes from a Repomix pass over main:
- packages/contracts/src/providerRuntime.ts already defines ThreadTokenUsageSnapshot with usedTokens, totalProcessedTokens, input/output/reasoning token fields, toolUses, durationMs, and compaction-related metadata.
- apps/server/src/provider/Layers/CodexAdapter.ts already normalizes Codex thread/tokenUsage/updated notifications into ThreadTokenUsageSnapshot.
- apps/server/src/provider/Layers/ClaudeAdapter.ts already normalizes Claude usage and emits thread.token-usage.updated from result/task progress events.
- apps/server/src/orchestration/Layers/ProviderRuntimeIngestion.ts already projects thread.token-usage.updated into context-window.updated thread activities.
- apps/web/src/lib/contextWindow.ts and apps/web/src/components/chat/ContextWindowMeter.tsx already consume the latest context-window snapshot in the composer footer.
- apps/web/src/components/ChatView.tsx, apps/web/src/session-logic.ts, and apps/web/src/components/chat/MessagesTimeline.tsx already derive/display turn timing from latestTurn.startedAt, latestTurn.completedAt, and assistant message timestamps.
The feature therefore looks feasible as an incremental UI/projection addition rather than a full provider rewrite. The likely missing pieces are turn-local stats instead of only latest-thread context-window state, a reliable first assistant-output timestamp, and a per-turn provider/model/effort snapshot if #2481 does not cover that first.
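One way to get turn-local stats out of thread-cumulative snapshots is to diff the snapshot at turn start against the snapshot at turn completion. This is a minimal sketch under the assumption that the snapshot counters are cumulative per thread; the local UsageSnapshot/TurnStats shapes and the turnDelta helper are hypothetical, not the real contract types:

```typescript
// Field names mirror the ThreadTokenUsageSnapshot description above,
// but this local shape is illustrative, not the real contract type.
interface UsageSnapshot {
  outputTokens: number;
  toolUses: number;
  durationMs: number;
}

interface TurnStats {
  outputTokens: number;
  toolUses: number;
  durationMs: number;
}

// Turn-local stats as the delta between the snapshot at turn start and
// the snapshot at turn completion. If a provider resets its counters
// mid-thread, negatives are clamped to zero rather than displayed.
function turnDelta(atStart: UsageSnapshot, atEnd: UsageSnapshot): TurnStats {
  const clamp = (n: number) => Math.max(0, n);
  return {
    outputTokens: clamp(atEnd.outputTokens - atStart.outputTokens),
    toolUses: clamp(atEnd.toolUses - atStart.toolUses),
    durationMs: clamp(atEnd.durationMs - atStart.durationMs),
  };
}
```

Persisting this delta alongside the turn is what would keep older turns exact after later context-window updates.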
Why this matters
Coding-agent sessions are often long, mixed-provider, and tool-heavy. A compact stats footer makes the transcript self-auditing: users can compare model behavior, understand slow turns, and reconstruct what happened without opening logs or guessing from the current composer state.
This is especially valuable when experimenting with provider/model choices. A user should be able to look back at a thread and quickly see that one turn was GPT-5.5 High, another was Claude Opus, one had a high time-to-first, and another spent most of its time in tools.
Smallest useful scope
A first pass could be narrow:
- show a compact footer for the latest completed turn / assistant message only
- reuse existing latestTurn.startedAt / completedAt for elapsed time
- count tool activities for the turn from existing tool.* activities
- display available token data from the latest matching context-window.updated activity for that turn
- add a tooltip explaining when a metric is approximate or unavailable
A stronger follow-up could persist/project an explicit per-turn stats object so older turns remain exact after reloads and after later context-window updates.
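The first-pass derivation above could look roughly like this, assuming a simplified activity feed keyed by kind and timestamp (the Activity/Turn shapes and helper names are illustrative, not the real apps/web types):

```typescript
interface Activity {
  kind: string; // e.g. "tool.call", "context-window.updated"
  at: number; // epoch ms
  usedTokens?: number; // present on context-window.updated activities
}

interface Turn {
  startedAt: number;
  completedAt: number;
}

const inTurn = (a: Activity, turn: Turn): boolean =>
  a.at >= turn.startedAt && a.at <= turn.completedAt;

// Count tool.* activities that fall inside the turn's time window.
function toolCallsForTurn(activities: Activity[], turn: Turn): number {
  return activities.filter((a) => a.kind.startsWith("tool.") && inTurn(a, turn))
    .length;
}

// Latest context-window.updated activity inside the turn, if any;
// undefined means the footer omits the token field.
function latestUsageForTurn(
  activities: Activity[],
  turn: Turn,
): number | undefined {
  const matches = activities.filter(
    (a) => a.kind === "context-window.updated" && inTurn(a, turn),
  );
  return matches.at(-1)?.usedTokens;
}

// Elapsed time from the existing turn timestamps.
function elapsedLabel(turn: Turn): string {
  const totalSec = Math.round((turn.completedAt - turn.startedAt) / 1000);
  const m = Math.floor(totalSec / 60);
  const s = totalSec % 60;
  return m > 0 ? `${m}m ${s}s` : `${s}s`;
}
```

Because this reads only the live activity feed, it is accurate for the latest turn but not durable across reloads, which is exactly the gap the follow-up projection would close.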
Alternatives considered
Risks or tradeoffs
- Provider parity: not every provider exposes identical token or timing fields. The UI should omit unavailable values instead of fabricating them.
- Metric semantics: tokens/sec can be misleading if it includes tool time, waiting time, or first-token latency. Label the denominator clearly, or only compute it from a well-defined assistant-output window.
- Historical correctness: if the UI reads only the latest context-window snapshot, old turns may show stale/newer usage. A durable per-turn projection is cleaner for full accuracy.
- Visual clutter: the row should be compact and low-contrast, with richer detail behind hover/click.
Examples or references
- t3.chat footer format: Model (effort) · tok/sec · tokens · Time-to-First · tool calls
- related issue: usage / quota visibility for Codex sessions and accounts
- related issue: Show the provider and model used for each assistant message
Contribution