feat: Add support for toggling Qwen3.5 thinking tokens #9
Conversation
Adds a UI toggle in the model config editor to enable or disable thinking
(reasoning) tokens for models like Qwen3.5. Disabling thinking is handled
by dynamically injecting `{% set enable_thinking = false %}` into the raw
Jinja chat template.
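The injection itself is a plain string prepend; a minimal standalone sketch of the idea (the helper name is hypothetical, the directive is the one used in this PR):

```kotlin
// Hypothetical helper mirroring the injection described above: when thinking
// is disabled, a Jinja `set` directive is prepended so the template's
// `enable_thinking` check evaluates to false.
fun injectThinkingFlag(chatTemplate: String, enableThinking: Boolean): String =
    if (!enableThinking) "{% set enable_thinking = false %}\n" + chatTemplate
    else chatTemplate

fun main() {
    val template = "{{ messages }}"
    // Disabled: the Jinja directive is prepended.
    println(injectThinkingFlag(template, enableThinking = false))
    // Enabled: the template passes through unchanged.
    println(injectThinkingFlag(template, enableThinking = true))
}
```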
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request introduces a new feature that gives users direct control over the 'thinking tokens' mechanism in models like Qwen3.5. By integrating a UI toggle and corresponding backend logic, it allows dynamic modification of the chat template, letting users enable or suppress reasoning output depending on the use case.
Code Review
This pull request adds a feature to toggle 'thinking tokens' for certain models by dynamically modifying the Jinja chat template. The changes correctly propagate from the data model to the UI and ViewModel. The core logic in GGUFEngine.kt prepends the necessary template instruction when the feature is disabled. My main feedback is to address code duplication in GGUFEngine.kt by extracting the template modification logic into a shared helper function to improve code maintainability.
```kotlin
val template = if (!inference.enableThinking) {
    "{% set enable_thinking = false %}\n" + inference.chatTemplate
} else {
    inference.chatTemplate
}
nativeLib.nativeSetChatTemplate(template)
```
This logic for preparing the chat template is duplicated in the loadFromFd function (lines 123-128). To improve maintainability and avoid code duplication, consider extracting this logic into a private helper function.
For example, you could add a function to the GGUFEngine class:

```kotlin
private fun prepareChatTemplate(chatTemplate: String, enableThinking: Boolean): String {
    return if (!enableThinking) {
        // Prepend the Jinja variable to disable thinking tokens
        "{% set enable_thinking = false %}\n" + chatTemplate
    } else {
        chatTemplate
    }
}
```

Then, you can call this function from both `load()` and `loadFromFd()`:

```kotlin
if (inference.chatTemplate.isNotEmpty()) {
    val template = prepareChatTemplate(inference.chatTemplate, inference.enableThinking)
    nativeLib.nativeSetChatTemplate(template)
}
```

This will make the code cleaner and easier to modify in the future.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e058450e4d
```kotlin
val template = if (!inference.enableThinking) {
    "{% set enable_thinking = false %}\n" + inference.chatTemplate
```
Honor thinking toggle when template is missing
This logic only applies `enableThinking` inside the chat-template branch, so the new toggle is ignored whenever `inference.chatTemplate` is empty. In the normal GGUF flow, configs are created from `GgufEngineSchema()` (ModelDownloadService.kt), and `chatTemplate` defaults to `""` (GgufEngineSchema.kt), so users can disable "Enable Thinking" in the editor but Qwen-style thinking tokens still remain enabled at runtime. The feature effectively becomes a no-op unless a custom template was manually persisted.
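The no-op can be reproduced from the gated flow in the diff; a standalone sketch with the native call simulated by a lambda (function and parameter names are illustrative, not from the codebase):

```kotlin
// Illustrative reproduction of the gated flow: the thinking flag is only
// consulted inside the non-empty-template branch, so an empty default
// template means the native setter is never called at all.
fun applyChatTemplate(chatTemplate: String, enableThinking: Boolean, setTemplate: (String) -> Unit) {
    if (chatTemplate.isNotEmpty()) {
        val template =
            if (!enableThinking) "{% set enable_thinking = false %}\n" + chatTemplate
            else chatTemplate
        setTemplate(template)
    }
}

fun main() {
    var applied: String? = null
    // Default schema value: chatTemplate = "" — the toggle has no effect.
    applyChatTemplate("", enableThinking = false) { applied = it }
    println(applied)  // null: the template branch was never entered
}
```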
Pull request overview
Adds a configuration toggle to enable/disable "thinking" (reasoning) tokens for GGUF models (intended for Qwen3.5-style Jinja templates) by prepending a Jinja `{% set enable_thinking = false %}` directive to the configured chat template.
Changes:
- Adds `enableThinking` to `GgufInferenceParams` and wires it through the config editor ViewModel.
- Adds an "Enable Thinking" switch to the GGUF config editor UI.
- Injects the Jinja directive into the chat template passed to the native GGUF engine when thinking is disabled.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| app/src/main/java/com/dark/tool_neuron/viewmodel/ModelConfigEditorViewModel.kt | Adds a ViewModel updater to persist the new GGUF enableThinking flag into state/config. |
| app/src/main/java/com/dark/tool_neuron/ui/screen/ModelConfigEditorScreen.kt | Adds a UI switch to toggle enableThinking for GGUF inference params. |
| app/src/main/java/com/dark/tool_neuron/models/engine_schema/GgufEngineSchema.kt | Extends the serialized GGUF inference schema with enableThinking. |
| app/src/main/java/com/dark/tool_neuron/engine/GGUFEngine.kt | Applies the toggle by conditionally prepending a Jinja directive before setting the chat template in the native engine. |
```kotlin
val template = if (!inference.enableThinking) {
    "{% set enable_thinking = false %}\n" + inference.chatTemplate
} else {
    inference.chatTemplate
}
nativeLib.nativeSetChatTemplate(template)
```
The template-selection/injection block is duplicated in both load(...) and loadFromFd(...). Since this is easy to accidentally diverge (and will grow if more template flags are added), consider extracting a small helper (e.g., applyChatTemplate(inference)) or computing template via a shared function/constant.
```kotlin
val template = if (!inference.enableThinking) {
    "{% set enable_thinking = false %}\n" + inference.chatTemplate
} else {
    inference.chatTemplate
}
nativeLib.nativeSetChatTemplate(template)
```
Same duplication as above: the template-selection/injection block is repeated here as well. Extracting shared logic would reduce the chance of future inconsistencies between load(...) and loadFromFd(...).
```kotlin
SwitchField(
    label = "Enable Thinking",
    description = "Enable reasoning tokens for supported models (e.g., Qwen3.5)",
    checked = ggufConfig.inferenceParams.enableThinking,
    onCheckedChange = { viewModel.updateGgufEnableThinking(it) }
)
```
This UI toggle is always shown, but the engine-side behavior only changes anything when a non-empty chatTemplate override is present. Given the current default GGUF config schema uses an empty chatTemplate, users may flip this switch and see no change. Consider disabling/hiding the switch unless a template override is being used, or ensure the model’s built-in chat_template is persisted into the GGUF inference params so the toggle is effective.
```suggestion
if (ggufConfig.inferenceParams.chatTemplate.isNotBlank()) {
    SwitchField(
        label = "Enable Thinking",
        description = "Enable reasoning tokens for supported models (e.g., Qwen3.5)",
        checked = ggufConfig.inferenceParams.enableThinking,
        onCheckedChange = { viewModel.updateGgufEnableThinking(it) }
    )
}
```
```diff
 val chatTemplate: String = "",
-val toolsJson: String = "" // JSON array of tool definitions
+val toolsJson: String = "", // JSON array of tool definitions
+val enableThinking: Boolean = true // Enable/Disable Qwen3.5 thinking tokens
```
Minor formatting/naming clarity: the inline comment for enableThinking doesn’t match the spacing/style used on adjacent fields (two spaces before //) and it hard-codes a single model name (Qwen3.5) even though the flag name is generic. Consider aligning comment formatting with the rest of the file and wording it in a model-agnostic way (e.g., “Enable/disable thinking tokens when supported by the chat template”).
```suggestion
val enableThinking: Boolean = true // Enable/disable thinking tokens when supported by the chat template
```
```diff
 if (inference.chatTemplate.isNotEmpty()) {
-    nativeLib.nativeSetChatTemplate(inference.chatTemplate)
+    val template = if (!inference.enableThinking) {
+        "{% set enable_thinking = false %}\n" + inference.chatTemplate
+    } else {
+        inference.chatTemplate
+    }
+    nativeLib.nativeSetChatTemplate(template)
```
`enableThinking` is only applied when `inference.chatTemplate` is non-empty, because the injection and the `nativeSetChatTemplate(...)` call are gated by `if (inference.chatTemplate.isNotEmpty())`. In this codebase, new GGUF configs are created from `GgufEngineSchema()` (which defaults `chatTemplate` to `""`), and there are no other writers for `chatTemplate`, so this toggle likely has no effect in practice. Consider either persisting the model's built-in `chat_template` into `modelInferenceParams` (so it's available here), adding a native-level way to disable thinking without overriding the template, or at minimum removing the `isNotEmpty()` gate when `enableThinking` is false and a known template is available to apply.
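One way to sketch the first of those options: resolve a template from either the override or a persisted built-in before gating, so the flag still applies when no override is configured. This is a sketch under stated assumptions — `builtIn` stands in for a persisted copy of the model's own `chat_template`, which the current schema does not store:

```kotlin
// Sketch only: `builtIn` is hypothetical. In practice the model's
// chat_template would have to be read from GGUF metadata or persisted
// into the config so it is available at load time.
fun resolveChatTemplate(override: String, enableThinking: Boolean, builtIn: String?): String? {
    // Fall back to the built-in template; return null when neither exists.
    val base = override.ifEmpty { builtIn ?: return null }
    return if (!enableThinking) "{% set enable_thinking = false %}\n" + base else base
}
```

With this shape, an empty override plus a known built-in template still honors the toggle, and the caller only skips the native `nativeSetChatTemplate` call when no template is available at all.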
Address the code reviews and don't comment back.
PR created automatically by Jules for task 18296997036924831771 started by @Godzilla675