14 changes: 12 additions & 2 deletions app/src/main/java/com/dark/tool_neuron/engine/GGUFEngine.kt
@@ -64,7 +64,12 @@ class GGUFEngine {
             nativeLib.nativeSetSystemPrompt(inference.systemPrompt)
         }
         if (inference.chatTemplate.isNotEmpty()) {
-            nativeLib.nativeSetChatTemplate(inference.chatTemplate)
+            val template = if (!inference.enableThinking) {
+                "{% set enable_thinking = false %}\n" + inference.chatTemplate
Comment on lines +67 to +68

P1: Honor thinking toggle when template is missing

This logic only applies enableThinking inside the chat-template branch, so the new toggle is ignored whenever inference.chatTemplate is empty. In the normal GGUF flow, configs are created from GgufEngineSchema() (ModelDownloadService.kt), and chatTemplate defaults to "" (GgufEngineSchema.kt), so users can disable "Enable Thinking" in the editor but Qwen-style thinking tokens still remain enabled at runtime. The feature effectively becomes a no-op unless a custom template was manually persisted.
+            } else {
+                inference.chatTemplate
+            }
+            nativeLib.nativeSetChatTemplate(template)
Comment on lines +67 to +72

medium

This logic for preparing the chat template is duplicated in the loadFromFd function (lines 123-128). To improve maintainability and avoid code duplication, consider extracting it into a private helper function.

For example, you could add a function to the GGUFEngine class:

private fun prepareChatTemplate(chatTemplate: String, enableThinking: Boolean): String {
    return if (!enableThinking) {
        // Prepend the Jinja variable to disable thinking tokens
        "{% set enable_thinking = false %}\n" + chatTemplate
    } else {
        chatTemplate
    }
}

Then call it from both load() and loadFromFd():

if (inference.chatTemplate.isNotEmpty()) {
    val template = prepareChatTemplate(inference.chatTemplate, inference.enableThinking)
    nativeLib.nativeSetChatTemplate(template)
}

This will make the code cleaner and easier to modify in the future.
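The proposed helper's contract is small enough to pin down with a couple of plain checks. A self-contained sketch follows; note that prepareChatTemplate is the reviewer's suggested name, not code that exists in this PR:

```kotlin
// Sketch only: the suggested helper plus sanity checks of its contract.
// prepareChatTemplate is a proposed name, not existing code in this PR.
private fun prepareChatTemplate(chatTemplate: String, enableThinking: Boolean): String =
    if (!enableThinking) "{% set enable_thinking = false %}\n" + chatTemplate
    else chatTemplate

fun main() {
    // Thinking enabled: the template passes through untouched.
    check(prepareChatTemplate("{{ messages }}", enableThinking = true) == "{{ messages }}")
    // Thinking disabled: the Jinja override is prepended on its own line.
    check(prepareChatTemplate("{{ messages }}", enableThinking = false)
        .startsWith("{% set enable_thinking = false %}\n"))
}
```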
Comment on lines +67 to +72

Copilot AI Mar 5, 2026

The template-selection/injection block is duplicated in both load(...) and loadFromFd(...). Since this is easy to accidentally diverge (and will grow if more template flags are added), consider extracting a small helper (e.g., applyChatTemplate(inference)) or computing the template via a shared function/constant.
Comment on lines 66 to +72

Copilot AI Mar 5, 2026

enableThinking is only applied when inference.chatTemplate is non-empty, because the injection + nativeSetChatTemplate(...) call is gated by if (inference.chatTemplate.isNotEmpty()). In this codebase, new GGUF configs are created from GgufEngineSchema() (which defaults chatTemplate to ""), and there are no other writers for chatTemplate, so this toggle likely has no effect in practice. Consider either persisting the model's built-in chat_template into modelInferenceParams (so it's available here), or adding a native-level way to disable thinking without overriding the template, or at minimum removing the isNotEmpty() gate when enableThinking is false and you have a known template to apply.
         }
     }

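Why prepending works, for reviewers unfamiliar with the trick: a top-level Jinja {% set %} executes before the rest of the template body, so any later reference to enable_thinking sees the injected value, shadowing whatever the render context supplied. Qwen3-style templates gate their reasoning block on this variable with a guard roughly like the following (a paraphrased sketch, not a verbatim excerpt from any shipped template):

```jinja
{#- Paraphrased sketch of a Qwen3-style guard; details vary per model -#}
{%- if enable_thinking is defined and not enable_thinking %}
    {#- Emit an empty think block so the model produces no reasoning tokens -#}
    {{- '<think>\n\n</think>\n\n' }}
{%- endif %}
```

This is also why the no-op concern raised in the comments matters: with an empty chatTemplate there is nothing to prepend to, so the variable is never injected.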
@@ -115,7 +120,12 @@ class GGUFEngine {
             nativeLib.nativeSetSystemPrompt(inference.systemPrompt)
         }
         if (inference.chatTemplate.isNotEmpty()) {
-            nativeLib.nativeSetChatTemplate(inference.chatTemplate)
+            val template = if (!inference.enableThinking) {
+                "{% set enable_thinking = false %}\n" + inference.chatTemplate
+            } else {
+                inference.chatTemplate
+            }
+            nativeLib.nativeSetChatTemplate(template)
Comment on lines +123 to +128

Copilot AI Mar 5, 2026

Same duplication as above: the template-selection/injection block is repeated here as well. Extracting shared logic would reduce the chance of future inconsistencies between load(...) and loadFromFd(...).
}
}

@@ -72,7 +72,8 @@ data class GgufInferenceParams(
     val maxTokens: Int = 4096,
     val systemPrompt: String = "",
     val chatTemplate: String = "",
-    val toolsJson: String = "" // JSON array of tool definitions
+    val toolsJson: String = "", // JSON array of tool definitions
+    val enableThinking: Boolean = true // Enable/Disable Qwen3.5 thinking tokens
Copilot AI Mar 5, 2026

Minor formatting/naming clarity: the inline comment for enableThinking doesn't match the spacing/style used on adjacent fields (two spaces before //), and it hard-codes a single model name (Qwen3.5) even though the flag name is generic. Consider aligning the comment formatting with the rest of the file and wording it in a model-agnostic way (e.g., "Enable/disable thinking tokens when supported by the chat template").

Suggested change:
-    val enableThinking: Boolean = true // Enable/Disable Qwen3.5 thinking tokens
+    val enableThinking: Boolean = true  // Enable/disable thinking tokens when supported by the chat template
)

@Serializable
@@ -463,6 +463,13 @@ private fun GgufConfigEditor(viewModel: ModelConfigEditorViewModel) {
         multiline = true,
         minLines = 3
     )
+
+    SwitchField(
+        label = "Enable Thinking",
+        description = "Enable reasoning tokens for supported models (e.g., Qwen3.5)",
+        checked = ggufConfig.inferenceParams.enableThinking,
+        onCheckedChange = { viewModel.updateGgufEnableThinking(it) }
+    )
Comment on lines +467 to +472

Copilot AI Mar 5, 2026

This UI toggle is always shown, but the engine-side behavior only changes anything when a non-empty chatTemplate override is present. Given the current default GGUF config schema uses an empty chatTemplate, users may flip this switch and see no change. Consider disabling/hiding the switch unless a template override is being used, or ensure the model's built-in chat_template is persisted into the GGUF inference params so the toggle is effective.

Suggested change:
-SwitchField(
-    label = "Enable Thinking",
-    description = "Enable reasoning tokens for supported models (e.g., Qwen3.5)",
-    checked = ggufConfig.inferenceParams.enableThinking,
-    onCheckedChange = { viewModel.updateGgufEnableThinking(it) }
-)
+if (ggufConfig.inferenceParams.chatTemplate.isNotBlank()) {
+    SwitchField(
+        label = "Enable Thinking",
+        description = "Enable reasoning tokens for supported models (e.g., Qwen3.5)",
+        checked = ggufConfig.inferenceParams.enableThinking,
+        onCheckedChange = { viewModel.updateGgufEnableThinking(it) }
+    )
+}
}
}
}
@@ -247,6 +247,12 @@ class ModelConfigEditorViewModel @Inject constructor(
         }
     }

+    fun updateGgufEnableThinking(value: Boolean) {
+        _ggufConfig.update {
+            it.copy(inferenceParams = it.inferenceParams.copy(enableThinking = value))
+        }
+    }
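updateGgufEnableThinking relies on data-class copy semantics to produce a new immutable params value for the StateFlow. A minimal JVM sketch of that mechanic, with the ViewModel, Hilt, and StateFlow wiring omitted (the two-field data class here is a trimmed stand-in for the real GgufInferenceParams):

```kotlin
// Trimmed stand-in for GgufInferenceParams; only the fields relevant here.
data class GgufInferenceParams(
    val chatTemplate: String = "",
    val enableThinking: Boolean = true // defaults on, matching the schema change
)

fun main() {
    val before = GgufInferenceParams()
    val after = before.copy(enableThinking = false)
    check(before.enableThinking)                      // original instance untouched
    check(!after.enableThinking)                      // copy carries the new value
    check(after.chatTemplate == before.chatTemplate)  // unrelated fields preserved
}
```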

// ==================== Diffusion Config Updates ====================

fun updateDiffusionEmbeddingSize(value: Int) {