
Conversation

@zpitroda (Contributor)

  • Updated models from Qwen 2.5 to Qwen 3 equivalents
  • Updated transformers and torch python packages

- Updated from qwen 2.5 to qwen 3 models
- Updated transformers and torch python packages
Updated llama.cpp for Qwen3 support
README.md (outdated)
For model deployment, we utilized [llama.cpp](https://github.com/ggml-org/llama.cpp), which provides efficient inference capabilities.

- Our base models primarily come from the [Qwen2.5](https://huggingface.co/Qwen) series.
+ Our base models primarily come from the [Qwen3](https://huggingface.co/Qwen) series.

I am not sure what Second Me's model update policy is. As a community user I definitely want to use Qwen3 given its SOTA capabilities (but I haven't tested it yet, so there may be some ins and outs).

Huge thanks for your work!

On top of that, it would be nice to add Qwen 3 alongside the existing Qwen 2.5 model support, i.e. adding it as a new supported model instead of replacing the existing Qwen 2.5 model.

@zpitroda (Contributor Author)

> I am not sure what Second Me's model update policy is. As a community user I definitely want to use Qwen3 given its SOTA capabilities (but I haven't tested it yet, so there may be some ins and outs).
>
> Huge thanks for your work!
>
> On top of that, it would be nice to add Qwen 3 alongside the existing Qwen 2.5 model support, i.e. adding it as a new supported model instead of replacing the existing Qwen 2.5 model.

I was wondering that as well. I'm testing right now to ensure Qwen 3 doesn't break anything, but I don't know if the Second Me team actually wants to update. I can also keep the 2.5 models and add the option for 3 as well, if that's preferable?

Updated convert_hf_to_gguf script and gguf-py package to support qwen3 models
Disabled thinking mode and updated backend dockerfile to work with new llama.cpp
@zpitroda (Contributor Author) commented Apr 30, 2025

Everything is working, except during inference it outputs the <think></think> blocks because llama.cpp hasn't updated the Qwen3 template to support the {"enable_thinking": false} kwargs. There is already a PR, so hopefully it'll be updated asap.
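
For context, a minimal sketch of the kind of workaround this implies (the helper name is hypothetical, not code from this PR): until the template honors enable_thinking, the empty <think></think> block can simply be stripped from the completion text in post-processing.

```python
import re

# Hypothetical post-processing helper (illustration only, not the actual change
# in this PR): remove any <think>...</think> block llama.cpp emits for Qwen3
# while the {"enable_thinking": false} template kwarg is unsupported.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def strip_think_blocks(completion: str) -> str:
    """Drop <think>...</think> sections from a model completion."""
    return THINK_BLOCK.sub("", completion).lstrip()

print(strip_think_blocks("<think>\n\n</think>\n\nHello there!"))  # -> "Hello there!"
```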

Added Qwen 2.5 models back along with 3
@kevinaimonster (Contributor)

Hi, I've tested it a bit; it fails when downloading models...

2025-05-06 14:30:21 [INFO] trainprocess_service.py:265 - Starting model download: Qwen3-1.7B
2025-05-06 14:30:21 [ERROR] trainprocess_service.py:285 - Download model failed: cannot access local variable 'base_dir' where it is not associated with a value
2025-05-06 14:30:21 [ERROR] trainprocess_service.py:1109 - Step model_download failed

@zpitroda (Contributor Author) commented May 6, 2025

> Hi, I've tested it a bit; it fails when downloading models...
>
> 2025-05-06 14:30:21 [INFO] trainprocess_service.py:265 - Starting model download: Qwen3-1.7B
> 2025-05-06 14:30:21 [ERROR] trainprocess_service.py:285 - Download model failed: cannot access local variable 'base_dir' where it is not associated with a value
> 2025-05-06 14:30:21 [ERROR] trainprocess_service.py:1109 - Step model_download failed

Sorry about that! The base_dir variable was accidentally indented into the "if" block above it, but it should be working now.
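
To make the failure mode concrete, here's a minimal sketch of that bug pattern (the paths and function names below are made up, not the real trainprocess_service.py code):

```python
import os

# With the assignment indented into the "if" branch, any model that misses the
# condition reaches os.path.join with base_dir unbound, raising exactly the
# UnboundLocalError from the log above ("cannot access local variable 'base_dir'").

def resolve_model_dir_buggy(model_name: str) -> str:
    if model_name.startswith("Qwen2.5"):
        base_dir = "resources/model"           # only bound on this branch
    return os.path.join(base_dir, model_name)  # fails for "Qwen3-1.7B"

def resolve_model_dir_fixed(model_name: str) -> str:
    base_dir = "resources/model"               # de-indented: always bound
    if model_name.startswith("Qwen2.5"):
        base_dir = "resources/model/qwen2.5"   # branch may still override it
    return os.path.join(base_dir, model_name)
```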

@kevin-mindverse (Contributor)

Hi, it works now. So once the conflict is resolved, it will be added to the develop branch :)

@zpitroda (Contributor Author)

@kevin-mindverse sounds good! I think I have a temporary solution for the extra block being output until the llama.cpp PR is merged; I'll try to have that and the conflict fixes pushed tomorrow.

zpitroda marked this pull request as ready for review May 10, 2025 04:24
@zpitroda (Contributor Author)

OK, I quickly pushed what I think should do it. It shouldn't break anything, but I haven't double-checked that it's fixed.

@zpitroda (Contributor Author)

@yingapple I'm away from my computer this week and unable to test, but it should hopefully now only add the no_think flag when using a Qwen3 model.
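
A rough sketch of what "only add the no_think flag for Qwen3" could look like (hypothetical helper, not the code actually pushed): Qwen3 honors a "/no_think" soft switch in the user message, so the flag should only be appended when the configured model is a Qwen3 variant.

```python
# Hypothetical helper, not the actual Second Me implementation: append Qwen3's
# "/no_think" soft switch only when the model name indicates a Qwen3 model.

def apply_no_think(prompt: str, model_name: str) -> str:
    if "qwen3" in model_name.lower():
        return f"{prompt} /no_think"
    return prompt  # Qwen2.5 and other models are left untouched

print(apply_no_think("Summarize my notes.", "Qwen3-1.7B"))            # gets the flag
print(apply_no_think("Summarize my notes.", "Qwen2.5-0.5B-Instruct"))  # unchanged
```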
