feat(llm): switch model profile on user message #2192

Open
VascoSch92 wants to merge 6 commits into main from command-model

Conversation

Contributor

@VascoSch92 VascoSch92 commented Feb 23, 2026

Summary

Implementation of the /model <MODEL_NAME> [prompt] command (ref #2018).

The command functions as follows:

  • /model: Returns a list of available LLM profiles.
  • /model <MODEL_NAME>: Switches the current LLM profile to the one specified.
  • /model <MODEL_NAME> <PROMPT>: Switches the LLM profile and immediately sends the provided prompt to the new model.
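
The three forms above can be captured by a tiny dispatcher; the following is only an illustrative sketch (parse_model_command and its return convention are hypothetical, not the PR's actual code):

```python
def parse_model_command(message: str):
    """Hypothetical sketch of /model dispatch.

    Returns a (profile, prompt) pair:
      (None, None)      -> list available profiles
      (profile, None)   -> switch profile only
      (profile, prompt) -> switch profile, then send the prompt
    Returns None for messages that are not /model commands.
    """
    parts = message.strip().split(maxsplit=2)
    if not parts or parts[0] != "/model":
        return None                   # not a /model command
    if len(parts) == 1:
        return (None, None)           # /model
    if len(parts) == 2:
        return (parts[1], None)       # /model <MODEL_NAME>
    return (parts[1], parts[2])       # /model <MODEL_NAME> <PROMPT>
```

The maxsplit=2 keeps the entire remaining prompt intact as a single string, including any spaces.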

Note:

  • I’m not certain this is the best approach, but it’s functional and keeps the LocalConversation class clean.
  • Happy to add an example if needed.

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
  • If there is an example, have you run the example to make sure that it works?
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
  • Is the GitHub CI passing?

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant  Architectures  Base Image                                  Docs / Tags
java     amd64, arm64   eclipse-temurin:17-jdk                      Link
python   amd64, arm64   nikolaik/python-nodejs:python3.12-nodejs22  Link
golang   amd64, arm64   golang:1.21-bookworm                        Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:ce2c80a-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-ce2c80a-python \
  ghcr.io/openhands/agent-server:ce2c80a-python

All tags pushed for this build

ghcr.io/openhands/agent-server:ce2c80a-golang-amd64
ghcr.io/openhands/agent-server:ce2c80a-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:ce2c80a-golang-arm64
ghcr.io/openhands/agent-server:ce2c80a-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:ce2c80a-java-amd64
ghcr.io/openhands/agent-server:ce2c80a-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:ce2c80a-java-arm64
ghcr.io/openhands/agent-server:ce2c80a-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:ce2c80a-python-amd64
ghcr.io/openhands/agent-server:ce2c80a-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:ce2c80a-python-arm64
ghcr.io/openhands/agent-server:ce2c80a-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:ce2c80a-golang
ghcr.io/openhands/agent-server:ce2c80a-java
ghcr.io/openhands/agent-server:ce2c80a-python

About Multi-Architecture Support

  • Each variant tag (e.g., ce2c80a-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., ce2c80a-python-amd64) are also available if needed

Contributor

github-actions bot commented Feb 23, 2026

API breakage checks (Griffe)

Result: Passed

Action log

all-hands-bot

This comment was marked as outdated.

Contributor

github-actions bot commented Feb 23, 2026

Coverage

File                                                                 Stmts  Miss  Cover  Missing
openhands-sdk/openhands/sdk/conversation/switch_model_handler.py        39     1    97%  40
openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py    356    24    93%  289, 294, 322, 387, 429, 460–461, 465–467, 618–619, 622, 768, 776, 778, 789, 791–793, 818, 980, 987–988
TOTAL                                                                18888  5690    69%

Collaborator

@all-hands-bot all-hands-bot left a comment


🟢 Good taste - Clean, pragmatic implementation

What works well:

  • Simple command syntax that solves a real user need
  • Registry caching pattern (checks cache before disk I/O) - previous performance concern resolved ✅
  • Clear separation: parse → handle → switch
  • Tests verify actual behavior, not mocks
  • No breaking changes (additive only)

Linus would say: "This is how you do it. Straightforward data flow, no special cases, solves the actual problem."

LGTM! 🚀

@VascoSch92 VascoSch92 requested review from enyst and neubig February 23, 2026 21:12
@VascoSch92 VascoSch92 marked this pull request as ready for review February 23, 2026 21:12

Copy link
Collaborator

@enyst enyst left a comment


Thank you for this!

I think plugging this behavior in the conversation may be easy enough to do (and undo), so IMHO that is fine.

I think maybe we don’t want to send this information to the LLM, it’s just a toggle for the user, isn’t it?

@VascoSch92
Contributor Author

Thank you for this!

I think plugging this behavior in the conversation may be easy enough to do (and undo), so IMHO that is fine.

I think maybe we don’t want to send this information to the LLM, it’s just a toggle for the user, isn’t it?

The most important thing was to make a first attempt, to better understand the problem.

I believe I can make it better :)

I will address your suggestion ;)

Collaborator

@all-hands-bot all-hands-bot left a comment


🟡 Acceptable - Solid architecture with a UX gap

What works well:

  • Clean separation of concerns (SwitchModelHandler is standalone and focused)
  • Registry caching pattern prevents redundant disk I/O (previous concern resolved ✅)
  • Tests verify actual behavior with real code paths (no mock spam)
  • Opt-in feature with no breaking changes
  • Solves a real user need pragmatically

Key issue:
User feedback disappears into logs instead of being shown in the conversation. See inline comments.

Linus would say: "The data flow is clean and the code does what it should, but users are left in the dark. When someone types /model, they expect to see something back. Logger output is not user output."

@VascoSch92
Contributor Author

Hey @enyst

I’ve made some changes based on your feedback. The goal was to pollute the LocalConversation class as little as possible, so I extracted the logic into a new class within a separate module.

Additionally, I've addressed your comments and added an example (give it a try, it’s fun!). I also added a flag to enable or disable model switching.

Let me know what you think!

state. If provided, secrets are encrypted when saving and
decrypted when loading. If not provided, secrets are redacted
(lost) on serialization.
allow_model_switching: Whether to allow switching between persisted
Collaborator


I wonder, is this necessary? 🤔

Contributor Author


It is very improbable that someone would type /model by mistake, but I wanted to give the possibility to disable it. It is just a flag; I can remove it if you wish.

new_llm = new_llm.model_copy(update={"usage_id": profile_name})
self.llm_registry.add(new_llm)

return agent.model_copy(update={"llm": new_llm})
Collaborator


There's a lot to like about the separation... 🤔

],
),
with self._state:
self._state.agent = self.agent
Collaborator


IIRC we do all changes to state under a lock, I think?

Contributor Author


Yes. But the lock is acquired when I write with self._state, right?

From the ConversationState class:

    def __enter__(self: Self) -> Self:
        """Context manager entry."""
        self._lock.acquire()
        return self
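
For readers following along: pairing that __enter__ with a matching __exit__ is what makes "with state:" hold the lock for the duration of the block. A minimal self-contained sketch of the pattern (not the SDK's actual ConversationState):

```python
import threading


class LockedState:
    """Sketch of the lock-on-with pattern discussed above."""

    def __init__(self):
        self._lock = threading.RLock()
        self.agent = None

    def __enter__(self) -> "LockedState":
        """Context manager entry: take the lock."""
        self._lock.acquire()
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        """Context manager exit: release, even if the body raised."""
        self._lock.release()
```

So a statement like "with self._state: self._state.agent = ..." does mutate the state under the lock.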

Collaborator

@enyst enyst Feb 24, 2026


Right. I think I was thinking also of a couple other things: base_state.json changes, and changing the agent because we change the LLM. For your context, when we last considered the "statelessness" of the agent, we came up with a design "rule", or perhaps desire, to have strict, maybe single, pathways in the codebase, where the components of the agent can be modified. Please take a look at this:

I'd love your thoughts!

@enyst
Collaborator

enyst commented Feb 24, 2026

Hey @enyst

I’ve made some changes based on your feedback. The goal was to pollute the LocalConversation class as little as possible, so I extracted the logic into a new class within a separate module.

Additionally, I've addressed your comments and added an example (give it a try, it’s fun!). I also added a flag to enable or disable model switching.

Let me know what you think!

Thank you! Left a few tiny comments

I think maybe it doesn't get saved, so it won't be restored, the initial profile is? As in, restoring a conversation from the saved base_state.json 🤔 It's just a quick question

Just a heads up, I won't be around for a bit, back later (some number of hours)

@VascoSch92
Contributor Author

Hey @enyst
I’ve made some changes based on your feedback. The goal was to pollute the LocalConversation class as little as possible, so I extracted the logic into a new class within a separate module.
Additionally, I've addressed your comments and added an example (give it a try, it’s fun!). I also added a flag to enable or disable model switching.
Let me know what you think!

Thank you! Left a few tiny comments

I think maybe it doesn't get saved, so it won't be restored, the initial profile is? As in, restoring a conversation from the saved base_state.json 🤔 It's just a quick question

Just a heads up, I won't be around for a bit, back later (some number of hours)

So I ran a small experiment to see what is happening.

If we save a convo and we restore it later: the last used LLM will be used. Which makes sense, right?

@enyst
Collaborator

enyst commented Feb 24, 2026

If we save a convo and we restore it later: the last used LLM will be used. Which makes sense, right?

Ahh. Because we always save/overwrite base_state.json, right?

@VascoSch92
Contributor Author

Ahh. Because we always save/overwrite base_state.json, right?

I think yes.

More generally, I think having this kind of machinery (command handlers injected into the conversation) could be useful for other features we want to implement. For example, #1929 is a great use case for this.
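
The handler-injection machinery suggested in that last comment might look roughly like this sketch; Conversation, Handler, and model_handler are hypothetical names for illustration, not the PR's code:

```python
from typing import Callable, List, Optional

# A handler inspects a user message; if it consumes it, it returns a reply
# string and the message is not forwarded to the LLM.
Handler = Callable[[str], Optional[str]]


class Conversation:
    """Sketch: a conversation with pluggable command handlers."""

    def __init__(self, handlers: Optional[List[Handler]] = None):
        self._handlers = handlers or []

    def send_message(self, text: str) -> str:
        for handler in self._handlers:
            reply = handler(text)
            if reply is not None:
                return reply                    # consumed as a command
        return f"LLM would answer: {text}"      # fall through to the model


def model_handler(text: str) -> Optional[str]:
    """Example handler: intercepts /model commands."""
    if text.startswith("/model"):
        return "switching profile..."
    return None
```

New commands would then just be new handlers appended to the list, without touching the conversation class itself.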
