Add dynamic router for manual model selection #504
Conversation
xingyaoww left a comment:
Overall looks great
```diff
@@ -30,11 +31,12 @@ class RouterLLM(LLM):
     """

     router_name: str = Field(default="base_router", description="Name of the router")
-    llms_for_routing: dict[str, LLM] = Field(
+    llms_for_routing: dict[str, LLMBase] = Field(
```
Let's confirm this will be persisted correctly
Let's use an immutable mapping for this:
```python
from types import MappingProxyType

from pydantic import BaseModel, ConfigDict


class MyModel(BaseModel):
    model_config = ConfigDict(frozen=True)
    data: dict[str, int]

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        object.__setattr__(self, "data", MappingProxyType(self.data))


m = MyModel(data={"a": 1})
m.data["a"] = 2  # ❌ TypeError: 'mappingproxy' object does not support item assignment
```
I updated the example to assert the correct active model and added immutability for llms_for_routing!
```python
conversation.run()

# Switch back to primary model for complex task
print("Switching back to claude for complex reasoning...")
```
I wonder how the user would know in advance that they'll need to switch?
I think we were designing the DynamicRouter as a way for the user to "manually select a model name from a dropdown"?
And further down the road, we should initialize the LLM as a DynamicRouter by default in our GUI/CLI once we have a UI for switching models on the fly.
```diff
-class LLM(BaseModel, RetryMixin, NonNativeToolCallingMixin):
+class LLMBase(DiscriminatedUnionMixin, RetryMixin, NonNativeToolCallingMixin):
```
Could you please explain why LLM is a DiscriminatedUnionMixin?
We use it to distinguish router classes from normal LLMs when deserializing; without it we'd need an explicit kind field.
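The idea can be sketched with Pydantic's discriminated unions. Note that the class names and the `kind` field below are illustrative assumptions, not the SDK's actual API; the real DiscriminatedUnionMixin may derive the discriminator differently.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter


# Hypothetical sketch: a "kind" discriminator tells the deserializer which
# concrete class to build, so routers and plain LLMs round-trip correctly.
class PlainLLM(BaseModel):
    kind: Literal["llm"] = "llm"
    model: str


class DynamicRouter(BaseModel):
    kind: Literal["dynamic_router"] = "dynamic_router"
    llms_for_routing: dict[str, PlainLLM]


AnyLLM = Annotated[Union[PlainLLM, DynamicRouter], Field(discriminator="kind")]
adapter = TypeAdapter(AnyLLM)

# The discriminator routes each payload to the right class on deserialization.
restored = adapter.validate_python(
    {
        "kind": "dynamic_router",
        "llms_for_routing": {"fast": {"kind": "llm", "model": "gpt-4o-mini"}},
    }
)
assert type(restored) is DynamicRouter
```

A mixin that injects such a field automatically avoids declaring `kind` by hand on every subclass, which matches the motivation above.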
```python
message.function_calling_enabled = self.is_function_calling_active()
message.extended_thinking_enabled = get_features(
    self.model
).supports_extended_thinking
```
Sorry, is this a different issue from what the PR is about?
Ah, for trajectories with a single LLM we don't need this field, but it's required when using an extended-thinking LLM like Sonnet 4.5 alongside a non-thinking LLM. I ran into the error earlier when testing, and adding this field to Message seems to fix it.
```python
@model_validator(mode="after")
def make_immutable(self):
    object.__setattr__(
        self, "llms_for_routing", MappingProxyType(self.llms_for_routing)
    )
```
Why do we expressly make them immutable when... they weren't?
Do I understand correctly that this means the user cannot decide at runtime to just switch to another LLM they pick, say, from the openhands provider (where they already had an API key)?
Actually, my earlier implementation did allow the user to add a new LLM config to the router at runtime, but I remember @xingyaoww said it might lead to some weird issues if we modify configs like that; ideally we want immutability when possible. I think it boils down to what UX we want: whether to pre-define all LLMs in advance, or to add a new LLM during the session. After thinking a bit more, asking users to set the LLMs they prefer in advance (e.g. a set of Sonnet 4.5, Codex and GLM-4.6) and switching between them later seems more convenient and not too rigid. I imagine I'd also be too lazy to manually input LLM configs within sessions many times 🤔
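The "pre-define, then switch" UX discussed above could look roughly like this. This is a minimal sketch; none of these names are the SDK's real API, and the model strings are just placeholders:

```python
# Hypothetical sketch: the set of LLMs is fixed at construction time
# (immutable membership), while the active selection can change at runtime.
class SimpleRouter:
    def __init__(self, llms: dict[str, str], default: str):
        self._llms = dict(llms)  # fixed up front; no runtime additions
        self.active = default

    def switch(self, name: str) -> None:
        # Switching is only allowed among the pre-declared LLMs.
        if name not in self._llms:
            raise KeyError(f"unknown LLM {name!r}; declare it up front")
        self.active = name

    @property
    def model(self) -> str:
        return self._llms[self.active]


router = SimpleRouter(
    llms={"claude": "claude-sonnet-4-5", "glm": "glm-4.6"},
    default="claude",
)
router.switch("glm")
assert router.model == "glm-4.6"
```

This keeps the configs themselves immutable while still giving the user on-the-fly switching, which is the trade-off being weighed here.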
```python
persisted = self.__class__.model_validate(persisted_dict)

# Check classes match exactly
if type(persisted) is not type(self):
```
TBH I don't know; I don't really like that we use this code, which "sets things in stone", when what we'd like to implement in this PR is precisely flexibility. It seems a bit contradictory, and maybe it's getting too complex... I'm following up.
Thank you for the work on this, @ryanhoangt !
I'm not sure that this is the best way, though, and I'd like to think it over a bit. It seems that this PR:
- requires the user to declare a "DynamicRouter" ahead of time, otherwise they're stuck with what they started
- only lets them switch between two?
- seems to add quite a lot of complexity to the LLM class and its persistence, to work around the fact that we froze the LLM, if I understand correctly? Please correct me if this is wrong.
TBH I'm not sure about this, and I'd like to look at an alternative.
@xingyaoww I understand we want freeze for evals, but then maybe we could have an INLINE_CONVERSATIONS for evals? In that case, we store it with the conversation and enforce its restoration as-it-was.
I looked into what it implies to switch LLMs, and it seems relatively straightforward, if:
- we don't enforce the freeze for any LLM parameters
- it's very related to something like LLM profiles. Profiles are persisted on disk, just like any LLM, and it seems entirely possible to persist into the Conversation either the full LLM or only its profile reference; the LLM Registry will then provide the full LLM. Similarly, it can provide it for switching.
My thinking is, we could look into using that mechanism to switch LLMs on the fly, using the profiles registered in the registry.
What if we try to do it that way? WDYT?
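A rough sketch of the profile idea above. Everything here is hypothetical (the registry shape, function names, and model strings are illustrative, not the SDK's real LLM Registry):

```python
# Hypothetical sketch: the conversation persists only a profile *reference*,
# and a registry resolves it back to the full LLM config on restore.
PROFILE_REGISTRY: dict[str, dict] = {
    "sonnet": {"model": "claude-sonnet-4-5", "temperature": 0.0},
    "glm": {"model": "glm-4.6", "temperature": 0.2},
}


def persist_reference(profile_name: str) -> dict:
    # Store only the name, not the full config.
    return {"profile": profile_name}


def restore_llm(persisted: dict) -> dict:
    # The registry provides the full LLM; switching is just persisting
    # a different profile name.
    return PROFILE_REGISTRY[persisted["profile"]]


saved = persist_reference("sonnet")
assert restore_llm(saved)["model"] == "claude-sonnet-4-5"
```

Under this scheme, on-the-fly switching and persistence both reduce to choosing which profile name the Conversation stores, which is what makes the approach feel lighter than freezing the full LLM.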
[Automatic Post]: It has been a while since there was any activity on this PR. @ryanhoangt, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.
Closing since we haven't been able to find time to make progress on this.
This PR fixes #416.
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images:
- golang:1.21-bookworm
- eclipse-temurin:17-jdk
- nikolaik/python-nodejs:python3.12-nodejs22

All tags pushed for this build. The `df4aa57` tag is a multi-arch manifest (amd64/arm64); your client pulls the right arch automatically.