feat: in-session model switching and cost awareness (#32) #35

Merged
Keith-CY merged 5 commits into main from feature/issue-32-session-model-switching on Mar 2, 2026

@dev01lay2
Collaborator

Closes #32

Changes

Backend

  • Per-session model override (preferences.rs): in-memory map behind a static Mutex, with set_session_model_override, get_session_model_override, and clear_session_model_override Tauri commands
  • Per-session token tracking (process.rs): record_session_usage and get_session_usage, with a get_session_usage_stats Tauri command
  • Cost estimation (new cost.rs): hardcoded pricing for gpt-4o, gpt-4o-mini, gpt-4.1, claude-3.7-sonnet, claude-3.5-haiku, gemini-2.0-flash, kimi-k2.5, with an estimate_query_cost Tauri command
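
For readers unfamiliar with this kind of estimator, here is a minimal sketch of how a hardcoded pricing table can turn token counts into a cost estimate. It is not the PR's cost.rs: the names `ModelPricing`, `pricing_for`, and `estimate_cost` are hypothetical, and the model names and rates are placeholders rather than real provider prices.

```rust
/// Per-million-token rates in USD.
/// NOTE: every number in this sketch is a PLACEHOLDER, not a real price.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ModelPricing {
    input_per_million: f64,
    output_per_million: f64,
}

/// Hypothetical lookup table keyed by model name.
/// Unknown models return None so the caller can hide the estimate.
fn pricing_for(model: &str) -> Option<ModelPricing> {
    match model {
        "example-large" => Some(ModelPricing {
            input_per_million: 2.0,
            output_per_million: 8.0,
        }),
        "example-small" => Some(ModelPricing {
            input_per_million: 0.5,
            output_per_million: 1.5,
        }),
        _ => None,
    }
}

/// Estimated cost in USD for one query, or None if the model has no pricing.
fn estimate_cost(model: &str, prompt_tokens: u64, completion_tokens: u64) -> Option<f64> {
    let pricing = pricing_for(model)?;
    let input = prompt_tokens as f64 / 1_000_000.0 * pricing.input_per_million;
    let output = completion_tokens as f64 / 1_000_000.0 * pricing.output_per_million;
    Some(input + output)
}
```

Returning `Option<f64>` rather than defaulting to zero lets a UI distinguish "free" from "unknown model", which matters once a session can switch to a model that is missing from the table.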

Frontend

  • TokenBadge component: Shows session token count and estimated cost, auto-refreshes every 5s
  • ModelSwitcher component: Popover dropdown to switch model per session with "Session override" indicator
  • Wired into Doctor page zeroclaw engine view

- Add per-session model override commands (set/get/clear) in preferences.rs
- Add per-session token tracking (record_session_usage, get_session_usage) in process.rs
- Add cost estimation module (cost.rs) with hardcoded pricing for common models
- Add TokenBadge component showing session token usage and estimated cost
- Add ModelSwitcher component for switching models per session
- Wire TokenBadge and ModelSwitcher into Doctor page zeroclaw view
- Register all new Tauri commands in lib.rs

Closes #32

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 75675a040f

STORE.get_or_init(|| Mutex::new(std::collections::HashMap::new()))
}

pub fn record_session_usage(session_id: &str, prompt_tokens: u64, completion_tokens: u64) {

P1: Wire per-session usage recording into runtime calls

This new record_session_usage path is never invoked by the zeroclaw execution flow, which still updates only the global store via record_zeroclaw_usage in run_zeroclaw_message (src-tauri/src/runtime/zeroclaw/process.rs). As a result, get_session_usage_stats reads default-zero values for every session, so the per-session token/cost feature introduced in this change does not produce real data.
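
To make the finding concrete, the following is a sketch of a per-session usage store and the call site that has to feed it. It assumes the store is a `OnceLock<Mutex<HashMap<..>>>`, as the quoted diff context suggests. The `SessionUsage` struct, the `session_usage_store` accessor, and the `run_message` stand-in are hypothetical names, not the PR's actual code.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

/// Hypothetical per-session counters.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct SessionUsage {
    prompt_tokens: u64,
    completion_tokens: u64,
}

/// Lazily initialized global store, keyed by session ID.
fn session_usage_store() -> &'static Mutex<HashMap<String, SessionUsage>> {
    static STORE: OnceLock<Mutex<HashMap<String, SessionUsage>>> = OnceLock::new();
    STORE.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Adds the given token counts to the session's running totals.
fn record_session_usage(session_id: &str, prompt_tokens: u64, completion_tokens: u64) {
    if let Ok(mut map) = session_usage_store().lock() {
        let entry = map.entry(session_id.to_string()).or_default();
        entry.prompt_tokens = entry.prompt_tokens.saturating_add(prompt_tokens);
        entry.completion_tokens = entry.completion_tokens.saturating_add(completion_tokens);
    }
}

/// Returns the session's totals, or zeros if nothing was ever recorded.
fn get_session_usage(session_id: &str) -> SessionUsage {
    session_usage_store()
        .lock()
        .ok()
        .and_then(|map| map.get(session_id).copied())
        .unwrap_or_default()
}

/// Hypothetical stand-in for the runtime execution path. Without the
/// record_session_usage call here, every read returns the zero default.
fn run_message(session_id: &str) {
    let (prompt_tokens, completion_tokens) = (120, 45); // pretend these were parsed from output
    record_session_usage(session_id, prompt_tokens, completion_tokens);
}
```

A store that is defined but never written reads as all zeros for every session, which is the behavior the review describes.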

Comment on lines +130 to +136
pub fn set_session_model_override(session_id: String, model: String) -> Result<(), String> {
    let trimmed = model.trim().to_string();
    if trimmed.is_empty() {
        return Err("model must not be empty".into());
    }
    if let Ok(mut map) = session_model_overrides().lock() {
        map.insert(session_id, trimmed);

P1: Apply session model override during zeroclaw execution

set_session_model_override stores overrides in-memory, but the runtime path does not read this map when selecting model/provider (it still uses global preference resolution in run_zeroclaw_message). That means the new model switch API updates state that never affects actual requests, so in-session model switching is functionally broken.
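
One way the runtime could consult the override is sketched below, under stated assumptions. `set_session_model_override` and `session_model_overrides` mirror the names in the quoted diff. `lookup_session_model_override` and `resolve_model` are hypothetical helpers, and the `HashMap<String, String>` type and the plain-string global preference are simplifications.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

/// In-memory override map, keyed by session ID (assumed shape).
fn session_model_overrides() -> &'static Mutex<HashMap<String, String>> {
    static OVERRIDES: OnceLock<Mutex<HashMap<String, String>>> = OnceLock::new();
    OVERRIDES.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Stores a trimmed, non-empty model name for the session.
fn set_session_model_override(session_id: String, model: String) -> Result<(), String> {
    let trimmed = model.trim().to_string();
    if trimmed.is_empty() {
        return Err("model must not be empty".into());
    }
    let mut map = session_model_overrides()
        .lock()
        .map_err(|e| e.to_string())?;
    map.insert(session_id, trimmed);
    Ok(())
}

/// Hypothetical non-command helper that a runtime path could call directly.
fn lookup_session_model_override(session_id: &str) -> Option<String> {
    let map = session_model_overrides().lock().ok()?;
    map.get(session_id).cloned()
}

/// Hypothetical resolution step: session override first, then the global preference.
fn resolve_model(session_id: &str, global_preference: &str) -> String {
    lookup_session_model_override(session_id)
        .unwrap_or_else(|| global_preference.to_string())
}
```

The point of the sketch is that the fallback has to happen at model-selection time. Storing the override is not enough unless the execution path calls something like `resolve_model` before each request.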

Comment on lines +1043 to +1044
<TokenBadge sessionId="zeroclaw-doctor" model="gpt-4o" />
<ModelSwitcher sessionId="zeroclaw-doctor" defaultModel="gpt-4o" />

P1: Use active doctor session key for usage/override widgets

Both widgets are bound to the fixed ID zeroclaw-doctor, but doctor sessions are created with a per-diagnosis UUID key in use-doctor-agent (sessionKeyRef.current = `agent:...:${crypto.randomUUID()}`). Because these IDs do not match the runtime session key, the badge/override calls target a different session namespace than live conversations, so usage remains empty and overrides cannot apply to the active diagnosis.

Collaborator

@Keith-CY Keith-CY left a comment

Blocking findings

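  1. src-tauri/src/runtime/zeroclaw/process.rs defines per-session model overrides, but no model execution path uses them. The PR adds set/get/clear_session_model_override in src-tauri/src/commands/preferences.rs (around lines 130-155) and registers set_session_model_override/get_session_model_override in src-tauri/src/lib.rs, but model selection calls are still based on the global model preference and never consult the per-session override. Result: the model switcher changes UI state only and does not affect the backend runtime model choice.

Impact: session-level model switcher is effectively non-functional.

  2. src-tauri/src/commands/preferences.rs exposes get_session_usage_stats, and src-tauri/src/runtime/zeroclaw/process.rs adds record_session_usage/get_session_usage, but no execution path in the runtime writes to per-session usage.

record_session_usage (around lines 280+) is never called from the run_zeroclaw_once/streaming execution paths. As added, session usage stays zero and TokenBadge remains empty for real traffic.

Impact: Token/cost badge is misleading and non-functional in practice.

  3. src/pages/Doctor.tsx hardcodes sessionId="zeroclaw-doctor" and model="gpt-4o" when rendering TokenBadge/ModelSwitcher (around lines 1040-1042).

This is not guaranteed to match the actual per-session runtime ID or the active model used for Doctor messages.

Impact: displayed usage/cost can be incorrect even if backend tracking were wired.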

Please rework model override application + per-session usage recording before merge.

Collaborator

@Keith-CY Keith-CY left a comment

Blocking findings

  1. src-tauri/src/runtime/zeroclaw/process.rs defines per-session model overrides, but no model execution path uses them. The PR adds set/get/clear_session_model_override in src-tauri/src/commands/preferences.rs, and set_session_model_override/get_session_model_override are registered in src-tauri/src/lib.rs, but model selection remains global-only and never consults a session override. As a result, session model switching in UI does not affect actual runtime model selection.

Impact: session-level model switching is non-functional.

  2. src-tauri/src/commands/preferences.rs exposes get_session_usage_stats, and src-tauri/src/runtime/zeroclaw/process.rs adds record_session_usage/get_session_usage, but the runtime execution path never calls record_session_usage for actual requests.

record_session_usage is added in src-tauri/src/runtime/zeroclaw/process.rs, yet no usage stats updates are wired into run_zeroclaw_once/streaming invocation paths. Session stats will stay zero for real sessions, so TokenBadge cannot show real usage/cost.

Impact: token and cost UI is misleading/non-functional.

  3. src/pages/Doctor.tsx hardcodes sessionId="zeroclaw-doctor" and model="gpt-4o" for TokenBadge/ModelSwitcher.

This ID/model may not map to the actual runtime session key or selected model, so usage/cost display can report the wrong session and wrong model.

Impact: displayed metrics can be incorrect even after the runtime wiring is fixed.

Please address these before merging.

@dev01lay2 dev01lay2 force-pushed the feature/issue-32-session-model-switching branch from e13a309 to af6988c Compare March 2, 2026 09:09
@dev01lay2
Collaborator Author

Addressed all 3 blocking review findings:

1. Session model override now wired into runtime

  • Added lookup_session_model_override() helper in preferences.rs (non-Tauri, callable from runtime)
  • run_zeroclaw_message now checks per-session override (keyed on instance_id) before falling back to global preference

2. Per-session usage recording now functional

  • try_once closure in run_zeroclaw_message now calls record_session_usage() after parsing tokens from stdout/stderr, and from trace-based fallback path

3. Doctor.tsx uses dynamic session ID and model

  • doctorSessionId derived from instanceId (stable, matches backend lookup key)
  • runtimeModel fetched from get_zeroclaw_runtime_target on mount
  • Removed hardcoded "zeroclaw-doctor" / "gpt-4o"

The frontend passes instanceId as the session key for model override
and usage tracking. The backend was using session_scope (which is the
full storage_key like 'zeroclaw:doctor:local:...'), causing a key
mismatch. Switch to instance_id which matches what the frontend sends.

Addresses review feedback from Keith-CY on blocking findings 1 & 2.
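
The key mismatch described in this commit message can be reproduced in a few lines. This is an illustrative sketch only: `storage_key` builds a hypothetical key format (the thread shows only a truncated example of the real one), and the plain `HashMap<String, u64>` stands in for the real stores.

```rust
use std::collections::HashMap;

/// Hypothetical scoped key; the real storage_key format is only partly
/// visible in the thread, so this layout is an assumption.
fn storage_key(instance_id: &str) -> String {
    format!("zeroclaw:doctor:{instance_id}:example-suffix")
}

/// Reads a token total, defaulting to zero when the key is absent.
fn read_usage(store: &HashMap<String, u64>, key: &str) -> u64 {
    store.get(key).copied().unwrap_or(0)
}

/// Returns (value read when the writer used the scoped key,
///          value read when the writer used the bare instance ID).
fn demo() -> (u64, u64) {
    let instance_id = "local";

    // Writer uses the long scoped key, reader uses the bare instance ID.
    let mut mismatched = HashMap::new();
    mismatched.insert(storage_key(instance_id), 165);

    // Writer and reader agree on the bare instance ID.
    let mut matched = HashMap::new();
    matched.insert(instance_id.to_string(), 165);

    (
        read_usage(&mismatched, instance_id),
        read_usage(&matched, instance_id),
    )
}
```

Because a missing key silently falls back to a default, this class of bug produces empty data rather than an error, so it is easy to miss without a test that writes and reads through both sides.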
@dev01lay2
Collaborator Author

Hi @Keith-CY, thanks for the thorough review — all three blocking findings were legit. Here's what I've fixed across the last two commits:

Finding 1 (model override not wired into runtime):

  • Added lookup_session_model_override() in preferences.rs
  • Wired it into run_zeroclaw_message — it now takes priority over global preference
  • Fixed key mismatch: was using session_scope (full storage_key), now uses instance_id to match the frontend

Finding 2 (usage tracking not wired):

  • Added record_session_usage() calls in both the parse-from-text path and trace fallback path
  • Same key fix: uses instance_id instead of session_scope

Finding 3 (hardcoded session/model in Doctor.tsx):

  • sessionId now uses instanceId || "local" (matches backend)
  • model now fetched dynamically via get_zeroclaw_runtime_target

CI is green. Would appreciate a re-review when you get a chance!

@dev01lay2
Collaborator Author

@Keith-CY All three blocking findings have been fixed in follow-up commits; please re-review 🙏

Finding 1 (model override not wired into runtime):
af6988ca: run_zeroclaw_message now calls lookup_session_model_override(instance_id) before falling back to global preference. Override flows through to provider_order_for_runtime.

Finding 2 (session usage never recorded):
af6988ca: record_session_usage is now called in run_zeroclaw_message for both parsed stdout/stderr usage and builtin trace fallback paths.

Finding 3 (hardcoded sessionId/model in Doctor.tsx):
3aeb7cb6: Doctor.tsx now uses instanceId || "local" as doctorSessionId and fetches runtimeModel from get_zeroclaw_runtime_target() at mount. No more hardcoded values.

CI is green (both frontend and rust pass).

@dev01lay2 dev01lay2 requested a review from Keith-CY March 2, 2026 10:00
@dev01lay2
Collaborator Author

Hey @Keith-CY — all three blocking findings from your review have been addressed in the follow-up commits:

  1. Session model override now wired into runtime (af6988ca): run_zeroclaw_message calls lookup_session_model_override(instance_id) before falling back to the global preference, so the per-session switcher actually drives model selection.

  2. Per-session usage recording wired in (af6988ca): record_session_usage is now called from run_zeroclaw_message after parsing usage from stdout/stderr (and from builtin traces as a fallback), so TokenBadge will display real token counts and cost.

  3. Doctor.tsx no longer hardcodes session/model (3aeb7cb6): doctorSessionId is derived from instanceId || "local", and the model comes from get_zeroclaw_runtime_target() at runtime — no more hardcoded "zeroclaw-doctor" or "gpt-4o".

CI is green (frontend + rust both pass). Re-review requested — would appreciate another look when you get a chance.

Collaborator

@Keith-CY Keith-CY left a comment

Findings (blocking)

  1. Streaming runtime still ignores per-session model override
     • Severity: Blocking
     • File: src-tauri/src/runtime/zeroclaw/process.rs (run_zeroclaw_message_streaming and its call path)
     • The new session override lookup is wired into run_zeroclaw_message, but run_zeroclaw_message_streaming still reads only load_zeroclaw_model_preference(). Doctor + chat streaming paths call the streaming adapter (ZeroclawDoctorAdapter::start_streaming/send_streaming) which delegates to run_zeroclaw_message_streaming, so the UI ModelSwitcher still won’t affect actual runtime model for real Doctor usage.
  2. Streaming runtime still writes usage to global counters only
     • Severity: Blocking
     • File: src-tauri/src/runtime/zeroclaw/process.rs (stream_once / streaming path)
     • record_session_usage() is added and used in non-streaming execution, but streaming path keeps record_zeroclaw_usage() only (and trace fallback updates usage_store only). Doctor sessions are streaming, so get_session_usage_stats(sessionId) can stay zero while tokens are actually consumed.
  3. Cost estimate may show wrong model
     • Severity: Medium
     • File: src/pages/Doctor.tsx + src/components/TokenBadge.tsx
     • TokenBadge receives model={runtimeModel} from get_zeroclaw_runtime_target() while session override can change model via ModelSwitcher. With overrides, estimated cost may be calculated against stale global runtime model instead of active per-session model.

Please re-apply the override+recording logic to the streaming path (run_zeroclaw_message_streaming + stream_once), and pass the effective session model into TokenBadge.

Finding 3: TokenBadge was using the global runtimeModel for cost
estimation, ignoring per-session model overrides set via ModelSwitcher.

Changes:
- Doctor.tsx tracks sessionModelOverride state, derives effectiveModel
  (override ?? runtimeModel), passes it to TokenBadge
- ModelSwitcher accepts onModelChange callback to notify parent when
  the user switches model or clears override
- TokenBadge now receives the effective model so cost estimates match
  the actual model being used for the session

Note: Findings 1 & 2 (streaming runtime override + usage) reference
run_zeroclaw_message_streaming / stream_once which exist only in
PR #34 (feat/zeroclaw-streaming-response), not in this branch's base.
Will wire session override + usage into the streaming path when #34
merges.
@dev01lay2
Collaborator Author

Re-review response (10:08 UTC findings)

Finding 3 (cost estimate may show wrong model) — ✅ Fixed

  • Doctor.tsx now tracks sessionModelOverride state and derives effectiveModel = sessionModelOverride ?? runtimeModel
  • TokenBadge receives the effective model so cost estimates match the actual session model
  • ModelSwitcher notifies parent via onModelChange callback when user switches or clears override

Finding 1 & 2 (streaming runtime ignores override + doesn't record session usage) — ⚠️ Not applicable to this branch

The functions mentioned (run_zeroclaw_message_streaming, stream_once, ZeroclawDoctorAdapter::start_streaming/send_streaming) exist only in PR #34 (feat/zeroclaw-streaming-response). This PR's base is main, which has no streaming execution path — the only execution path is run_zeroclaw_message, which is already wired with both session override and usage tracking.

Plan: When #34 merges to main, I'll rebase this PR and wire the session override + per-session usage recording into the streaming path (run_zeroclaw_message_streaming + stream_once).

Collaborator Author

@dev01lay2 dev01lay2 left a comment

Addressed finding #3 — effective model for TokenBadge

Pushed 6522b01: Doctor.tsx now tracks the session model override via onModelChange from ModelSwitcher and derives effectiveModel = sessionModelOverride ?? runtimeModel, which is passed to TokenBadge. Cost estimates will now match the actual per-session model.

Findings #1 & #2 — streaming path

run_zeroclaw_message_streaming and stream_once don't exist in this branch — they're introduced by PR #34 (feat/zeroclaw-streaming-response). The non-streaming execution path (run_zeroclaw_message) already has session override lookup and per-session usage recording wired in.

Once #34 merges, I'll rebase this branch and wire the session override + usage recording into the streaming path too. Happy to handle that as a follow-up commit here or in a separate PR — whatever you prefer.

(The CI failure on the latest push is a flaky "Text file busy" error in an unrelated test; re-triggered.)

Collaborator

@Keith-CY Keith-CY left a comment

Re-reviewed PR #35 on latest HEAD. The previous three regressions are now addressed: session override is consulted in runtime execution, per-session usage is recorded in run_zeroclaw_message, and Doctor now passes effective session model (override -> runtime) into TokenBadge. No remaining blocking issues found in this revision.

@Keith-CY Keith-CY merged commit b478e2e into main Mar 2, 2026
3 of 4 checks passed
@Keith-CY Keith-CY deleted the feature/issue-32-session-model-switching branch March 5, 2026 08:36
Development

Successfully merging this pull request may close these issues.

Zeroclaw: In-session model switching and cost awareness
