Skip to content

feat(renderers): add keep_thinking option#1278

Open
samsja wants to merge 4 commits intomainfrom
sami/renderers-keep-thinking
Open

feat(renderers): add keep_thinking option#1278
samsja wants to merge 4 commits intomainfrom
sami/renderers-keep-thinking

Conversation

@samsja
Copy link
Copy Markdown
Member

@samsja samsja commented May 3, 2026

Summary

  • Add a uniform keep_thinking: bool = False parameter to renderers that strip historical <think> blocks from assistant turns before the last user query. When True, prior-turn reasoning is preserved across subsequent user turns.
  • Replaces the existing per-renderer knobs preserve_thinking (Qwen3.6) and clear_thinking (GLM-5) for a single clear path, per AGENTS.md guidance.
  • Applies to: qwen3, qwen35, qwen36 (inherits), glm5/glm51, glm45, minimax_m2, nemotron3, kimi_k25. Renderers where the concept doesn't apply are untouched (qwen3_vl, kimi_k2, deepseek_v3, gpt_oss, default).

Test plan

  • uv run ruff check packages/renderers/ — clean
  • uv run pytest packages/renderers/tests/test_render_ids.py — 216 passed (existing parity tests; no regressions)
  • uv run pytest packages/renderers/tests/test_keep_thinking.py — 7 passed (new test, parameterized over all 8 affected renderers)
  • Re-run full suite in CI (4 pre-existing local failures are tiktoken env issues unrelated to this change)

🤖 Generated with Claude Code


Note

Medium Risk
Modifies prompt/token rendering across multiple model-specific renderers, which can change model behavior and training masks in multi-turn conversations. Risk is mitigated by defaulting to current behavior (keep_thinking=False) and adding coverage to verify the new mode.

Overview
Adds a uniform keep_thinking: bool = False option to several renderers so callers can preserve prior-turn assistant <think>/reasoning_content blocks across subsequent user turns instead of stripping them by default.

Replaces renderer-specific knobs (clear_thinking in GLM5Renderer, preserve_thinking in Qwen36Renderer) with the shared keep_thinking behavior, and introduces a Qwen3KeepThinkingRenderer plus a new registry name qwen3-keep-thinking.

Adds a new parametrized test (test_keep_thinking.py) that asserts prior-turn reasoning is absent in default renders but present when keep_thinking=True across the affected model families.

Reviewed by Cursor Bugbot for commit 31fae38. Bugbot is set up for automated code reviews on this repo. Configure here.

samsja and others added 3 commits May 3, 2026 02:06
Qwen3 subclass that defaults keep_thinking=True, registered as
"qwen3-keep-thinking" so it can be selected from a client config
(e.g. for Qwen3-30B-A3B-Thinking-2507) without plumbing the flag
through ClientConfig / create_renderer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 160094f. Configure here.

Comment thread packages/renderers/renderers/base.py Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants