feat: live model switching for running sessions #1239
Gkrumbach07 wants to merge 5 commits into main from
Conversation
Allow users to switch the LLM model on a running session without
stopping it. The flow is: frontend → backend updates CR → backend
proxies to runner → runner stops current workers, updates config,
and emits an ag-ui confirmation event.
- Backend: POST /api/projects/{project}/agentic-sessions/{session}/model
- Runner: POST /model endpoint with mid-generation guard (422)
- Frontend: LiveModelSelector in chat input toolbar, disabled during runs
- AG-UI: ambient:model_switched custom event renders as system message
Closes #1090
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
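As a quick client-side illustration of the endpoint contract described above, here is a hedged Python sketch; only the path shape and `{ model }` body come from the PR description, while the host, helper name, and example values are assumptions:

```python
import json

def build_model_switch_request(base_url: str, project: str, session: str, model: str):
    """Compose the URL and JSON body for the live model-switch endpoint."""
    url = f"{base_url}/api/projects/{project}/agentic-sessions/{session}/model"
    return url, json.dumps({"model": model})

url, body = build_model_switch_request(
    "http://backend.example", "demo", "my-session", "model-a"
)
print(url)  # → http://backend.example/api/projects/demo/agentic-sessions/my-session/model
```

A real client would POST this body and handle 200 (switched), 409/422 (busy or mid-generation), and 502 (runner failure) per the flow above.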
Note: Reviews paused. This branch is under active development, so CodeRabbit automatically paused this review to avoid an influx of comments on new commits.
📝 Walkthrough
Adds live model switching: a backend Gin handler to update an AgenticSession CR (with RFC 1123 name validation and rollback on runner errors) and its route; a frontend selector, API call, and mutation wired into the session UI; a runner POST /model endpoint that enforces locks, updates the LLM env, emits an AG-UI event, and triggers an adapter rebuild; plus tests.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Frontend as Client/Frontend
    participant Backend as Backend API
    participant K8s as Kubernetes API
    participant Runner as Session Runner
    Frontend->>Backend: POST /api/projects/:projectName/agentic-sessions/:sessionName/model { model }
    Backend->>Backend: auth, bind JSON, sanitize names
    Backend->>K8s: GET AgenticSession CR (dynamic client)
    K8s-->>Backend: AgenticSession resource
    Backend->>Backend: ensureRuntimeMutationAllowed, compare current model
    alt Model unchanged
        Backend-->>Frontend: 200 current AgenticSession
    else Model differs
        Backend->>K8s: PATCH CR.spec.llmSettings.model
        K8s-->>Backend: Updated AgenticSession
        Backend->>Runner: POST http://session-<name>.<project>.svc.cluster.local:8001/model { model }
        Runner->>Runner: validate payload, ensure bridge.context
        Runner->>Runner: check generation lock (per-session)
        Runner->>Runner: acquire module-level _model_change_lock
        Runner->>Runner: update os.environ["LLM_MODEL"] (clear Vertex ID if needed)
        Runner->>Runner: emit ambient:model_switched to between-run queue
        Runner->>Runner: bridge.mark_dirty()
        Runner-->>Backend: 200 { model, previousModel }
        Backend-->>Frontend: 200 updated AgenticSession
    end
    alt Runner failure
        Backend->>Backend: revert CR.spec.llmSettings.model (revertModelSwitch)
        Backend-->>Frontend: 502 or forwarded runner status/body
    end
```
Important: Pre-merge checks failed. Please resolve all errors before merging; addressing warnings is optional.
❌ Failed checks (1 error)
✅ Passed checks (7 passed)
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/backend/handlers/sessions.go`:
- Around line 1415-1466: After binding req.Model and before comparing to
previousModel, run the same model validation performed in CreateSession to
ensure the requested model is supported by the session's runner/provider: read
spec and llmSettings from item (same way you do into spec and llmSettings),
extract the runner/provider info, and reuse the validation logic from
CreateSession (the model-checking block around lines 678-692) to check
req.Model; if validation fails, return HTTP 400 with a clear error message;
otherwise continue with the existing mutation flow (variables: req.Model, spec,
llmSettings, previousModel, and the CreateSession validation routine).
- Around line 1467-1507: The code currently calls the runner (runnerURL /
client.Do) before updating the session CR
(k8sDyn.Resource(...).Namespace(project).Update), which allows the live runner
to change even if the CR update is denied; fix by performing the authoritative
CR update first: fetch the latest item, mutate spec["llmSettings"]["model"] (or
a dedicated pending field) on the item and attempt
k8sDyn.Resource(gvr).Namespace(project).Update using that item (so the write is
permission-checked and uses the up-to-date resourceVersion), handle update
errors/conflicts and only upon a successful Update proceed to POST to runnerURL;
use the same symbols llmSettings, spec, item and the existing Update call
location to implement this reorder and error handling.
In `@components/frontend/src/app/projects/[name]/sessions/[sessionName]/components/live-model-selector.tsx`:
- Around line 32-36: The useModels hook result (currently only destructuring
data into modelsData) lacks loading/error state handling, causing the UI to show
"No models available" even while loading or on failure; update the destructuring
to include isLoading and isError from useModels(projectName, true, provider),
then modify the component rendering for the models dropdown (where
modelsData/models are used) to show a loading spinner or placeholder when
isLoading===true and an error message when isError===true, otherwise map
modelsData.models to the models array as before.
In `@components/frontend/src/hooks/agui/event-handlers.ts`:
- Around line 1065-1076: The model-switch assistant message created in the
ambient:model_switched handler lacks a timestamp, which can break chronological
ordering used by insertByTimestamp and snapshot merging; update the msg object
in that handler (PlatformMessage creation) to include a timestamp (e.g.,
Date.now() or the same time basis used elsewhere) so the message sort/merge
logic treats it correctly and remains ordered with other messages.
In `@components/runners/ambient-runner/ambient_runner/endpoints/model.py`:
- Around line 55-62: The current pattern using if not
_model_change_lock.locked() then async with _model_change_lock is racy; replace
it with a non-blocking acquire on _model_change_lock and act based on the return
value: attempt to acquire the lock with
_model_change_lock.acquire(blocking=False) (or the async equivalent) and if
acquisition succeeds call and await _perform_model_switch(bridge, context,
new_model, previous_model) inside the held lock, otherwise raise the same
HTTPException(409, "A model switch is already in progress"); ensure you release
the lock if acquired (or use the lock's context manager when acquisition
succeeded) so no deadlock occurs.
- Around line 78-82: The event emission is happening after bridge.mark_dirty(),
which clears the session manager so _emit_model_switched_event(bridge, context,
new_model, previous_model) finds no worker and skips sending the
ambient:model_switched event; move the call to _emit_model_switched_event(...)
to occur before bridge.mark_dirty() so the event is emitted while the session
manager and workers still exist, keeping the rest of the logic intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b18bf6c6-3847-48fd-a4e7-217744678a4c
📒 Files selected for processing (13)

- components/backend/handlers/sessions.go
- components/backend/routes.go
- components/frontend/src/app/projects/[name]/sessions/[sessionName]/components/__tests__/live-model-selector.test.tsx
- components/frontend/src/app/projects/[name]/sessions/[sessionName]/components/live-model-selector.tsx
- components/frontend/src/app/projects/[name]/sessions/[sessionName]/page.tsx
- components/frontend/src/components/chat/ChatInputBox.tsx
- components/frontend/src/components/session/MessagesTab.tsx
- components/frontend/src/hooks/agui/event-handlers.ts
- components/frontend/src/services/api/sessions.ts
- components/frontend/src/services/queries/use-sessions.ts
- components/runners/ambient-runner/ambient_runner/app.py
- components/runners/ambient-runner/ambient_runner/endpoints/model.py
- components/runners/ambient-runner/tests/test_model_endpoint.py
```go
var req struct {
    Model string `json:"model" binding:"required"`
}
if err := c.ShouldBindJSON(&req); err != nil {
    c.JSON(http.StatusBadRequest, gin.H{"error": "Invalid request body: model is required"})
    return
}

if req.Model == "" {
    c.JSON(http.StatusBadRequest, gin.H{"error": "model must not be empty"})
    return
}

gvr := GetAgenticSessionV1Alpha1Resource()

// Get current session
item, err := k8sDyn.Resource(gvr).Namespace(project).Get(context.TODO(), sessionName, v1.GetOptions{})
if err != nil {
    if errors.IsNotFound(err) {
        c.JSON(http.StatusNotFound, gin.H{"error": "Session not found"})
        return
    }
    c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to get session"})
    return
}

// Ensure session is Running
if err := ensureRuntimeMutationAllowed(item); err != nil {
    c.JSON(http.StatusConflict, gin.H{"error": err.Error()})
    return
}

// Get current model for comparison
spec := item.Object["spec"].(map[string]interface{})
llmSettings, _, _ := unstructured.NestedMap(spec, "llmSettings")
previousModel, _ := llmSettings["model"].(string)

// No-op if same model
if previousModel == req.Model {
    session := types.AgenticSession{
        APIVersion: item.GetAPIVersion(),
        Kind:       item.GetKind(),
        Metadata:   item.Object["metadata"].(map[string]interface{}),
    }
    session.Spec = parseSpec(spec)
    if status, ok := item.Object["status"].(map[string]interface{}); ok {
        session.Status = parseStatus(status)
    }
    c.JSON(http.StatusOK, session)
    return
}
```
Validate the requested model before proxying.
After line 1423, this path only rejects empty input. Unlike CreateSession (lines 678-692), it never verifies that req.Model is valid for the session's runner/provider, so an unsupported model can be accepted and persisted, and only fails on the next LLM call instead of returning 400 here.
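The reviewer's point can be sketched language-agnostically. A minimal Python model of validate-before-persist (illustrative names only, not the Go handler's real symbols):

```python
def handle_model_switch(requested_model, supported_models, persist_model):
    """Return (status, detail); persist only after the model passes validation."""
    if not requested_model:
        return 400, "model must not be empty"
    if requested_model not in supported_models:
        # Reject unsupported models up front so the CR is never written
        # with a model that will fail on the next LLM call.
        return 400, f"model {requested_model!r} is not supported"
    persist_model(requested_model)
    return 200, requested_model

persisted = []
status, detail = handle_model_switch(
    "bad-model", {"model-a", "model-b"}, persisted.append
)
print(status, persisted)  # → 400 []  (nothing persisted on invalid input)
```

The key property is that the persistence step is unreachable for invalid input, which is exactly what the review asks the Go handler to guarantee.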
Outdated review comment (resolved): ...s/frontend/src/app/projects/[name]/sessions/[sessionName]/components/live-model-selector.tsx
```python
if not _model_change_lock.locked():
    async with _model_change_lock:
        return await _perform_model_switch(bridge, context, new_model, previous_model)
else:
    raise HTTPException(
        status_code=409,
        detail="A model switch is already in progress",
    )
```
TOCTOU race condition in lock check.
The check `if not _model_change_lock.locked()` followed by `async with _model_change_lock` is racy: two concurrent requests can both pass the check before either acquires the lock.
Use a non-blocking acquire instead:
Proposed fix

```diff
-if not _model_change_lock.locked():
-    async with _model_change_lock:
-        return await _perform_model_switch(bridge, context, new_model, previous_model)
-else:
-    raise HTTPException(
-        status_code=409,
-        detail="A model switch is already in progress",
-    )
+if _model_change_lock.locked():
+    raise HTTPException(
+        status_code=409,
+        detail="A model switch is already in progress",
+    )
+async with _model_change_lock:
+    return await _perform_model_switch(bridge, context, new_model, previous_model)
```

Note: this still has a small race window. For true atomicity, attempt a non-blocking acquire and check its result, or accept that the 409 is best-effort.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
if _model_change_lock.locked():
    raise HTTPException(
        status_code=409,
        detail="A model switch is already in progress",
    )
async with _model_change_lock:
    return await _perform_model_switch(bridge, context, new_model, previous_model)
```
Outdated review comment (resolved): components/runners/ambient-runner/ambient_runner/endpoints/model.py
- Runner: emit model_switched event BEFORE mark_dirty() so the session manager still exists when the event is pushed to the between-run queue
- Runner: add comment clarifying lock safety in asyncio (single-threaded, no yield between locked() check and acquisition)
- Backend: update CR before proxying to runner (validates RBAC first), revert CR if runner rejects the switch
- Frontend: add loading/error states to LiveModelSelector dropdown
- Frontend: add timestamp to model switch message for proper ordering

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
🧹 Nitpick comments (2)
components/backend/handlers/sessions.go (1)
1484: Hardcoded port bypasses dynamic port resolution. Uses `:8001` directly instead of `getRunnerEndpoint()`, which respects custom runner ports via `sessionPortMap`. This matches other handlers in this file but diverges from the WebSocket proxy pattern. If custom runner ports are a supported configuration, consider using the helper function for consistency.
components/runners/ambient-runner/ambient_runner/endpoints/model.py (1)
45-48: Accessing private bridge internals. `_session_manager` and `_between_run_queue` are internal attributes. This coupling is acceptable for same-codebase integration, but note that changes to bridge internals would require updating this endpoint. Also applies to: 109-114.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/backend/handlers/sessions.go`:
- Around line 1499-1512: The revert is modifying the stale llmSettings map
(extracted from item) instead of the live CR returned in updated, so the Update
call won't correctly revert the model; change both revert blocks to mutate the
spec llmSettings inside the updated object (e.g.,
updated.Object["spec"].(map[string]interface{})["llmSettings"].(map[string]interface{})["model"]
= previousModel) before calling
k8sDyn.Resource(gvr).Namespace(project).Update(...) so you update the
server-side object rather than the old llmSettings reference; apply this fix to
the earlier revert and the second revert that runs when resp.StatusCode !=
http.StatusOK.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 72d06b65-2dec-4469-8050-65224e21b4b0
📒 Files selected for processing (4)

- components/backend/handlers/sessions.go
- components/frontend/src/app/projects/[name]/sessions/[sessionName]/components/live-model-selector.tsx
- components/frontend/src/hooks/agui/event-handlers.ts
- components/runners/ambient-runner/ambient_runner/endpoints/model.py
✅ Files skipped from review due to trivial changes (1)
- components/frontend/src/hooks/agui/event-handlers.ts
- Use getRunnerServiceName(item.GetName()) to construct the runner URL from validated CR metadata instead of the raw URL parameter
- Extract revert logic into a revertModelSwitch helper that operates on the server-returned `updated` object instead of the stale `llmSettings` reference from the original `item`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a regex-based sanitizeK8sName() that validates names against the RFC 1123 DNS label format before they reach network calls, breaking the taint chain for CodeQL's SSRF detection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
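The commit above describes the helper in Go; here is a hedged Python rendering of the same RFC 1123 DNS label check (the function name mirrors the Go helper, and the regex encodes the standard label rules: lowercase alphanumerics and `-`, must start and end alphanumeric, at most 63 characters):

```python
import re

# RFC 1123 DNS label: 1-63 chars, [a-z0-9-], no leading/trailing hyphen.
_DNS_LABEL = re.compile(r"^[a-z0-9]([-a-z0-9]{0,61}[a-z0-9])?$")

def sanitize_k8s_name(name: str) -> str:
    """Return name unchanged if it is a valid RFC 1123 DNS label, else raise."""
    if not _DNS_LABEL.fullmatch(name):
        raise ValueError(f"invalid RFC 1123 DNS label: {name!r}")
    return name

print(sanitize_k8s_name("session-abc123"))  # → session-abc123
```

Validating before interpolating the name into a cluster-internal URL is what breaks the taint chain for SSRF analysis: only strings matching the strict pattern can reach the network call.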
🔴 Circuit Breaker Triggered (retry_count = 3)
Automated maintenance has reached the maximum retry limit. Escalating to human review.
Status Summary
What needs human attention
cc @Gkrumbach07
Actionable comments posted: 1
♻️ Duplicate comments (1)
components/backend/handlers/sessions.go (1)
1424-1427: ⚠️ Potential issue | 🟠 Major: Validate model availability before persisting and proxying.
Unlike CreateSession (lines 690-693), this handler only rejects empty input. An unsupported model will be persisted to the CR and only fail on the next LLM call, rather than returning 400 here.
Proposed fix

```diff
 if req.Model == "" {
 	c.JSON(http.StatusBadRequest, gin.H{"error": "model must not be empty"})
 	return
 }
+
+// Validate model availability (reuse CreateSession logic)
+reqK8s, _ := GetK8sClientsForRequest(c)
+if reqK8s != nil && !isModelAvailable(c.Request.Context(), reqK8s, req.Model, "", project) {
+	c.JSON(http.StatusBadRequest, gin.H{"error": "Model is not available"})
+	return
+}
```

Note: you'll also need to resolve the runner's provider from the session spec for full parity with CreateSession's provider-matching logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/backend/handlers/sessions.go`:
- Line 1449: The unchecked type assertion spec :=
item.Object["spec"].(map[string]interface{}) can panic if spec is missing or not
a map; change it to use the safe comma-ok form (and handle nil/missing case) —
e.g., retrieve item.Object["spec"], verify it's a map[string]interface{} before
using it, and return or handle an error/empty map instead of letting it panic;
apply the same safe-assertion pattern for metadata accesses referenced in the
diff (the similar usages around the metadata handling near the other
occurrences, e.g., the accesses noted at the locations corresponding to lines
1458 and 1536) so no direct type assertions remain without an ok-check and
proper fallback/early return.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 41961da0-f48e-4948-a029-a4b25da93100
📒 Files selected for processing (1)
components/backend/handlers/sessions.go
@ambient-code Looks like we need to fix the e2e tests; a rebase would also be helpful for the e2e test you need to fix. When you're done, look for edge cases and simplify this if you can. It doesn't need to be complicated.
Address CodeRabbit review: use comma-ok pattern for spec and metadata type assertions to prevent panics per project guidelines. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🔴 Circuit Breaker Triggered (retry_count = 3)
Automated maintenance has reached the maximum retry limit. Escalating to human review.
What was fixed this iteration
Current CI Status
Action needed
🤖 Session
Actionable comments posted: 2
♻️ Duplicate comments (1)
components/backend/handlers/sessions.go (1)
1409-1427: ⚠️ Potential issue | 🟠 Major: Validate requested model before persisting the CR change.
Line 1424 only rejects empty input; unsupported model names are still accepted and written to spec.llmSettings.model. That violates the endpoint contract for invalid models (400) and can leave sessions in an unusable model state.
Suggested fix

```diff
-_, k8sDyn := GetK8sClientsForRequest(c)
-if k8sDyn == nil {
+reqK8s, k8sDyn := GetK8sClientsForRequest(c)
+if reqK8s == nil || k8sDyn == nil {
 	c.JSON(http.StatusUnauthorized, gin.H{"error": "Invalid or missing token"})
 	c.Abort()
 	return
 }
@@
-if req.Model == "" {
+req.Model = strings.TrimSpace(req.Model)
+if req.Model == "" {
 	c.JSON(http.StatusBadRequest, gin.H{"error": "model must not be empty"})
 	return
 }
+if !isModelAvailable(c.Request.Context(), reqK8s, req.Model, "", project) {
+	c.JSON(http.StatusBadRequest, gin.H{"error": "Model is not available for this session"})
+	return
+}
```

Also applies to: 1448-1483
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/backend/handlers/sessions.go`:
- Around line 1483-1488: The Update call to
k8sDyn.Resource(gvr).Namespace(project).Update(...) can return a
resource-version conflict which must be surfaced as HTTP 409 rather than 500;
modify the error handling around that Update (the block referencing sessionName,
project, item and v1.UpdateOptions) to detect a Kubernetes conflict using
apierrors.IsConflict(err) (from k8s.io/apimachinery/pkg/api/errors) and return
c.JSON(http.StatusConflict, ...) with a concise conflict/retry message, while
preserving the existing 500 path for other errors.
- Around line 1558-1566: revertModelSwitch currently writes a single Update
against an object that may be stale; change it to perform a safe retry loop:
fetch the latest object with k8sDyn.Resource(gvr).Namespace(namespace).Get(...),
modify its spec.llmSettings.model to previousModel, then attempt Update and on
conflict (or error indicating resourceVersion mismatch) re-fetch and retry a few
times (use exponential/backoff and a limited retry count) until success or final
failure; ensure you preserve/modify the live object's spec (using
updated.GetName()/GetResourceVersion() from the fetched object) rather than
updating the originally passed-in stale Unstructured.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 03a55a1e-b9c8-40b5-aa13-5c279ae6dd80
📒 Files selected for processing (1)
components/backend/handlers/sessions.go
```go
updated, err := k8sDyn.Resource(gvr).Namespace(project).Update(context.TODO(), item, v1.UpdateOptions{})
if err != nil {
    log.Printf("Failed to update session CR %s for model switch: %v", sessionName, err)
    c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to update session record"})
    return
}
```
Concurrent switch conflicts are returned as 500 instead of 409.
At line 1483, a resource-version conflict during rapid switches is returned as 500. It should be surfaced as a conflict/retry condition, not an internal error.
Suggested fix

```diff
 updated, err := k8sDyn.Resource(gvr).Namespace(project).Update(context.TODO(), item, v1.UpdateOptions{})
 if err != nil {
+	if errors.IsConflict(err) {
+		c.JSON(http.StatusConflict, gin.H{"error": "Model switch conflict, please retry"})
+		return
+	}
 	log.Printf("Failed to update session CR %s for model switch: %v", sessionName, err)
 	c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to update session record"})
 	return
 }
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```go
updated, err := k8sDyn.Resource(gvr).Namespace(project).Update(context.TODO(), item, v1.UpdateOptions{})
if err != nil {
    if errors.IsConflict(err) {
        c.JSON(http.StatusConflict, gin.H{"error": "Model switch conflict, please retry"})
        return
    }
    log.Printf("Failed to update session CR %s for model switch: %v", sessionName, err)
    c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to update session record"})
    return
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/backend/handlers/sessions.go` around lines 1483-1488: the Update
call to k8sDyn.Resource(gvr).Namespace(project).Update(...) can return a
resource-version conflict which must be surfaced as HTTP 409 rather than 500;
modify the error handling around that Update (the block referencing sessionName,
project, item and v1.UpdateOptions) to detect a Kubernetes conflict using
apierrors.IsConflict(err) (from k8s.io/apimachinery/pkg/api/errors) and return
c.JSON(http.StatusConflict, ...) with a concise conflict/retry message, while
preserving the existing 500 path for other errors.
```go
func revertModelSwitch(updated *unstructured.Unstructured, previousModel string, k8sDyn dynamic.Interface, gvr schema.GroupVersionResource, namespace string) {
	if updatedSpec, ok := updated.Object["spec"].(map[string]interface{}); ok {
		if updatedLLM, ok := updatedSpec["llmSettings"].(map[string]interface{}); ok {
			updatedLLM["model"] = previousModel
			_, err := k8sDyn.Resource(gvr).Namespace(namespace).Update(context.TODO(), updated, v1.UpdateOptions{})
			if err != nil {
				log.Printf("Failed to revert model switch for session %s: %v", updated.GetName(), err)
			}
		}
	}
}
```
Rollback is not durable under concurrent updates
revertModelSwitch does a single Update on a previously returned object. If that object is stale, rollback fails and the CR may remain on the new model even though the runner rejected the switch.
Suggested fix

```diff
 func revertModelSwitch(updated *unstructured.Unstructured, previousModel string, k8sDyn dynamic.Interface, gvr schema.GroupVersionResource, namespace string) {
-	if updatedSpec, ok := updated.Object["spec"].(map[string]interface{}); ok {
-		if updatedLLM, ok := updatedSpec["llmSettings"].(map[string]interface{}); ok {
-			updatedLLM["model"] = previousModel
-			_, err := k8sDyn.Resource(gvr).Namespace(namespace).Update(context.TODO(), updated, v1.UpdateOptions{})
-			if err != nil {
-				log.Printf("Failed to revert model switch for session %s: %v", updated.GetName(), err)
-			}
-		}
-	}
+	for i := 0; i < 3; i++ {
+		current, err := k8sDyn.Resource(gvr).Namespace(namespace).Get(context.TODO(), updated.GetName(), v1.GetOptions{})
+		if err != nil {
+			log.Printf("Failed to fetch session %s for model revert: %v", updated.GetName(), err)
+			return
+		}
+		spec, found, _ := unstructured.NestedMap(current.Object, "spec")
+		if !found || spec == nil {
+			return
+		}
+		llm, _, _ := unstructured.NestedMap(spec, "llmSettings")
+		if llm == nil {
+			llm = map[string]interface{}{}
+		}
+		llm["model"] = previousModel
+		spec["llmSettings"] = llm
+		current.Object["spec"] = spec
+
+		if _, err = k8sDyn.Resource(gvr).Namespace(namespace).Update(context.TODO(), current, v1.UpdateOptions{}); err == nil {
+			return
+		}
+		if !errors.IsConflict(err) {
+			log.Printf("Failed to revert model switch for session %s: %v", updated.GetName(), err)
+			return
+		}
+	}
+	log.Printf("Failed to revert model switch for session %s after retries", updated.GetName())
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/backend/handlers/sessions.go` around lines 1558-1566:
revertModelSwitch currently writes a single Update against an object that may be
stale; change it to perform a safe retry loop: fetch the latest object with
k8sDyn.Resource(gvr).Namespace(namespace).Get(...), modify its
spec.llmSettings.model to previousModel, then attempt Update and on conflict (or
error indicating resourceVersion mismatch) re-fetch and retry a few times (use
exponential/backoff and a limited retry count) until success or final failure;
ensure you preserve/modify the live object's spec (using
updated.GetName()/GetResourceVersion() from the fetched object) rather than
updating the originally passed-in stale Unstructured.
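The fetch-modify-update loop the prompt describes is the same pattern client-go ships as `retry.RetryOnConflict` (in `k8s.io/client-go/util/retry`). A dependency-free sketch of the idea, with `errConflict` standing in for the API server's resource-version conflict error (detected in real code with `apierrors.IsConflict`):

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the Kubernetes resource-version conflict error.
var errConflict = errors.New("conflict: object has been modified")

// retryOnConflict re-runs the update closure until it succeeds or hits a
// non-conflict error, up to a bounded number of attempts. The closure is
// expected to re-fetch the latest object on each call, mirroring what
// client-go's retry.RetryOnConflict does.
func retryOnConflict(attempts int, update func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = update(); err == nil || !errors.Is(err, errConflict) {
			return err
		}
	}
	return err
}

func main() {
	// Simulate two concurrent writers bumping the resourceVersion before
	// the revert lands; the third attempt succeeds.
	failures := 2
	err := retryOnConflict(3, func() error {
		if failures > 0 {
			failures--
			return errConflict
		}
		return nil
	})
	fmt.Println("revert succeeded:", err == nil)
}
```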
Summary
- Backend: `POST /api/projects/{project}/agentic-sessions/{session}/model` endpoint that validates the session is Running, updates the CR (validates RBAC), proxies the model switch to the runner, and reverts the CR if the runner rejects it
- Runner: `/model` endpoint that guards against mid-generation switches (422), updates the `LLM_MODEL` env var, emits an `ambient:model_switched` ag-ui event, then calls `mark_dirty()` to rebuild the adapter
- Frontend: `LiveModelSelector` dropdown in the chat input toolbar (visible only when the session is Running, disabled while the agent is generating) with loading/error states
- AG-UI: `ambient:model_switched` custom event injects a timestamped confirmation message in the conversation

Test plan
- `ambient:model_switched` custom event

Closes #1090
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
UX
Reliability
Tests