fix(auth): prevent stale runtime state inheritance from disabled auth entries#2133
Conversation
…ntries When an auth file is deleted and re-created with the same path/ID, the new auth could inherit stale ModelStates (cooldown/backoff) from the previously disabled entry, preventing it from being routed. Gate runtime state inheritance (ModelStates, LastRefreshedAt, NextRefreshAfter) on both existing and incoming auth being non-disabled in Manager.Update and Service.applyCoreAuthAddOrUpdate. Closes router-for-me#2061
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves an issue where re-adding an authentication configuration that was previously disabled could lead to it inheriting stale runtime state, such as cooldown or backoff information. The core problem was that disabled authentication entries were retained in the system, and their state was unconditionally inherited upon re-creation. The implemented solution introduces checks to ensure that runtime state inheritance (including Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request correctly prevents stale runtime state from being inherited by disabled auth entries. The logic is sound and the added tests cover all the required scenarios. I have a couple of suggestions to improve the readability and maintainability of the new conditional logic, which is currently duplicated in two files.
| if !existing.Disabled && existing.Status != StatusDisabled && !auth.Disabled && auth.Status != StatusDisabled { | ||
| if len(auth.ModelStates) == 0 && len(existing.ModelStates) > 0 { | ||
| auth.ModelStates = existing.ModelStates | ||
| } | ||
| } |
There was a problem hiding this comment.
This condition is a bit long and is also duplicated in sdk/cliproxy/service.go. To improve readability and maintainability, I suggest introducing local variables with an explanatory comment. This makes the logic clearer and easier to manage. I've added a similar suggestion for the other file where this logic is duplicated.
// An auth entry is considered enabled if it is not explicitly disabled and its status is not StatusDisabled.
// We check both fields for robustness as they might be updated from different sources.
isExistingEnabled := !existing.Disabled && existing.Status != StatusDisabled
isIncomingEnabled := !auth.Disabled && auth.Status != StatusDisabled
if isExistingEnabled && isIncomingEnabled {
if len(auth.ModelStates) == 0 && len(existing.ModelStates) > 0 {
auth.ModelStates = existing.ModelStates
}
}| if !existing.Disabled && existing.Status != coreauth.StatusDisabled && !auth.Disabled && auth.Status != coreauth.StatusDisabled { | ||
| auth.LastRefreshedAt = existing.LastRefreshedAt | ||
| auth.NextRefreshAfter = existing.NextRefreshAfter | ||
| if len(auth.ModelStates) == 0 && len(existing.ModelStates) > 0 { | ||
| auth.ModelStates = existing.ModelStates | ||
| } | ||
| } |
There was a problem hiding this comment.
As mentioned in the comment for sdk/cliproxy/auth/conductor.go, this condition is duplicated. Extracting the logic into local variables improves readability and maintainability by making the code self-documenting and reducing redundancy.
| if !existing.Disabled && existing.Status != coreauth.StatusDisabled && !auth.Disabled && auth.Status != coreauth.StatusDisabled { | |
| auth.LastRefreshedAt = existing.LastRefreshedAt | |
| auth.NextRefreshAfter = existing.NextRefreshAfter | |
| if len(auth.ModelStates) == 0 && len(existing.ModelStates) > 0 { | |
| auth.ModelStates = existing.ModelStates | |
| } | |
| } | |
| // An auth entry is considered enabled if it is not explicitly disabled and its status is not StatusDisabled. | |
| // We check both fields for robustness as they might be updated from different sources. | |
| isExistingEnabled := !existing.Disabled && existing.Status != coreauth.StatusDisabled | |
| isIncomingEnabled := !auth.Disabled && auth.Status != coreauth.StatusDisabled | |
| if isExistingEnabled && isIncomingEnabled { | |
| auth.LastRefreshedAt = existing.LastRefreshedAt | |
| auth.NextRefreshAfter = existing.NextRefreshAfter | |
| if len(auth.ModelStates) == 0 && len(existing.ModelStates) > 0 { | |
| auth.ModelStates = existing.ModelStates | |
| } | |
| } |
luispater
left a comment
There was a problem hiding this comment.
Summary:
This change matches the reported root cause. The new guards in both Manager.Update and Service.applyCoreAuthAddOrUpdate prevent stale runtime state from being inherited when either the existing auth or the incoming auth is disabled.
I did not find any blocking correctness issues.
Non-blocking:
- Consider adding a service-level regression test for the watcher delete -> re-add flow so
LastRefreshedAt,NextRefreshAfter, and the service-side inheritance path are covered directly.
Test plan:
- Verified
go test ./sdk/cliproxy/auth - Verified
go test ./sdk/cliproxy - Verified
go build -o test-output ./cmd/server
This is an automated Codex review result and still requires manual verification by a human reviewer.
|
Thanks for the review. I added the suggested service-level regression test for the watcher delete -> re-add path, covering |
Summary
ModelStates,LastRefreshedAt,NextRefreshAfter) in bothManager.UpdateandService.applyCoreAuthAddOrUpdateso that state is only carried over when both the existing and incoming auth are non-disabledRoot Cause
When an auth file is deleted,
applyCoreAuthRemovalmarks the runtime entry asDisabled=true/Status=StatusDisabledbut does not remove it fromcoreManager. When the same file path is re-added, the update branch unconditionally inheritedModelStatesfrom the old disabled entry, causing the new auth to carry over stale cooldown/backoff state and become unroutable.Changes
sdk/cliproxy/auth/conductor.go— Add disabled guard toModelStatesinheritance inManager.Updatesdk/cliproxy/service.go— Add disabled guard toLastRefreshedAt,NextRefreshAfter, andModelStatesinheritance inService.applyCoreAuthAddOrUpdatesdk/cliproxy/auth/conductor_update_test.go— Add 4 test cases covering all disabled/active transition combinationsTest plan
go test ./sdk/cliproxy/auth/andgo test ./sdk/cliproxy/pass with no regressionsCloses #2061