feat: Add support for toggling Qwen3.5 thinking tokens #9
base: `Add-support-for-Qwen3.5-models-and-option-to-additional-paramters-(to-enable/disable-thinking)`
```diff
@@ -64,7 +64,12 @@ class GGUFEngine {
             nativeLib.nativeSetSystemPrompt(inference.systemPrompt)
         }
         if (inference.chatTemplate.isNotEmpty()) {
-            nativeLib.nativeSetChatTemplate(inference.chatTemplate)
+            val template = if (!inference.enableThinking) {
+                "{% set enable_thinking = false %}\n" + inference.chatTemplate
+            } else {
+                inference.chatTemplate
+            }
+            nativeLib.nativeSetChatTemplate(template)
         }
```
**Comment on lines +67 to +72**

This logic for preparing the chat template is duplicated in the second hunk below. Consider extracting it into a shared helper. For example, you could add a private function to the class:

```kotlin
private fun prepareChatTemplate(chatTemplate: String, enableThinking: Boolean): String {
    return if (!enableThinking) {
        // Prepend the Jinja variable to disable thinking tokens
        "{% set enable_thinking = false %}\n" + chatTemplate
    } else {
        chatTemplate
    }
}
```

Then you can call this function from both sites:

```kotlin
if (inference.chatTemplate.isNotEmpty()) {
    val template = prepareChatTemplate(inference.chatTemplate, inference.enableThinking)
    nativeLib.nativeSetChatTemplate(template)
}
```

This will make the code cleaner and easier to modify in the future.
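To sanity-check the prepend behavior in isolation, here is a minimal Python sketch of the same helper. The function name mirrors the suggested Kotlin one, and the sample template is illustrative, not the project's actual Qwen template:

```python
def prepare_chat_template(chat_template: str, enable_thinking: bool) -> str:
    """Python mirror of the suggested Kotlin helper: prepend a Jinja 'set'
    statement that Qwen-style templates can check to suppress thinking tokens."""
    if not enable_thinking:
        return "{% set enable_thinking = false %}\n" + chat_template
    return chat_template

# Illustrative Qwen-style template that branches on enable_thinking
template = "{%- if enable_thinking %}<think>{%- endif %}{{ messages }}"

# Disabled: the override is prepended ahead of the original template
assert prepare_chat_template(template, False).startswith(
    "{% set enable_thinking = false %}\n"
)
# Enabled: the template is passed through unchanged
assert prepare_chat_template(template, True) == template
```

Keeping the prepend in one place also means a later change (for example, switching to a different override variable) touches a single function.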
```diff
@@ -115,7 +120,12 @@ class GGUFEngine {
             nativeLib.nativeSetSystemPrompt(inference.systemPrompt)
         }
         if (inference.chatTemplate.isNotEmpty()) {
-            nativeLib.nativeSetChatTemplate(inference.chatTemplate)
+            val template = if (!inference.enableThinking) {
+                "{% set enable_thinking = false %}\n" + inference.chatTemplate
+            } else {
+                inference.chatTemplate
+            }
+            nativeLib.nativeSetChatTemplate(template)
         }
```
**Comment on lines +123 to +128**
```diff
@@ -72,7 +72,8 @@ data class GgufInferenceParams(
     val maxTokens: Int = 4096,
     val systemPrompt: String = "",
     val chatTemplate: String = "",
-    val toolsJson: String = "" // JSON array of tool definitions
+    val toolsJson: String = "", // JSON array of tool definitions
+    val enableThinking: Boolean = true // Enable/Disable Qwen3.5 thinking tokens
```

Suggested change:

```diff
-    val enableThinking: Boolean = true // Enable/Disable Qwen3.5 thinking tokens
+    val enableThinking: Boolean = true // Enable/disable thinking tokens when supported by the chat template
```
```diff
@@ -463,6 +463,13 @@ private fun GgufConfigEditor(viewModel: ModelConfigEditorViewModel) {
             multiline = true,
             minLines = 3
         )
+
+        SwitchField(
+            label = "Enable Thinking",
+            description = "Enable reasoning tokens for supported models (e.g., Qwen3.5)",
+            checked = ggufConfig.inferenceParams.enableThinking,
+            onCheckedChange = { viewModel.updateGgufEnableThinking(it) }
+        )
```
**Comment on lines +467 to +472**

Suggested change:

```diff
-        SwitchField(
-            label = "Enable Thinking",
-            description = "Enable reasoning tokens for supported models (e.g., Qwen3.5)",
-            checked = ggufConfig.inferenceParams.enableThinking,
-            onCheckedChange = { viewModel.updateGgufEnableThinking(it) }
-        )
+        if (ggufConfig.inferenceParams.chatTemplate.isNotBlank()) {
+            SwitchField(
+                label = "Enable Thinking",
+                description = "Enable reasoning tokens for supported models (e.g., Qwen3.5)",
+                checked = ggufConfig.inferenceParams.enableThinking,
+                onCheckedChange = { viewModel.updateGgufEnableThinking(it) }
+            )
+        }
```
This logic only applies `enableThinking` inside the chat-template branch, so the new toggle is ignored whenever `inference.chatTemplate` is empty. In the normal GGUF flow, configs are created from `GgufEngineSchema()` (`ModelDownloadService.kt`), and `chatTemplate` defaults to `""` (`GgufEngineSchema.kt`), so users can disable "Enable Thinking" in the editor but Qwen-style thinking tokens still remain enabled at runtime. The feature effectively becomes a no-op unless a custom template was manually persisted.
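The no-op can be reproduced with a small simulation of the setup flow. This is a hedged Python sketch: `setup_engine` and its return value stand in for the real engine and its `nativeSetChatTemplate` call, and are not the project's actual API:

```python
def setup_engine(chat_template: str, enable_thinking: bool):
    """Simulate the GGUF setup path: return the template string that would be
    handed to the native layer, or None if no template is ever applied."""
    applied_template = None
    if chat_template:  # mirrors inference.chatTemplate.isNotEmpty()
        if not enable_thinking:
            applied_template = "{% set enable_thinking = false %}\n" + chat_template
        else:
            applied_template = chat_template
    return applied_template

# Default schema config: chatTemplate == "", so the toggle never takes effect
assert setup_engine("", enable_thinking=False) is None

# Only a manually persisted custom template makes the toggle meaningful
assert "enable_thinking = false" in setup_engine("{{ messages }}", enable_thinking=False)
```

One way to close the gap would be to fall back to the model's built-in template (or pass the flag to the native layer directly) when `chatTemplate` is empty, so the toggle works in the default flow as well.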