fix: detect tool open tag within reasoning content in _consume_reasoning by dbsd11 · Pull Request #4580 · InternLM/lmdeploy

dbsd11 · 2026-05-11T10:29:19Z

The original _consume_reasoning method did not check for tool open tags before the reasoning close tag, causing tool
calls embedded in reasoning content to be incorrectly parsed as reasoning text. This fix adds proper detection and
priority handling for both tags.

Also drop unsupported add_bos parameter from test tokenizer encode calls.

Motivation

When a model emits tool open tags (like the Qwen3Coder XML tool syntax) directly after reasoning content — without
first emitting the reasoning close tag </think> — the original _consume_reasoning method would treat the entire
pending text (including the tool tag) as reasoning content. This caused tool calls to be lost and incorrectly appended
to reasoning_content instead of being routed to the tool parser.
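
To make the failure mode concrete, here is a minimal sketch of a splitter that looks only for the reasoning close tag. It is not the lmdeploy implementation: the function name `consume_close_tag_only`, the sample text, and the `<tool_call>` tag string are assumptions chosen for illustration.

```python
REASONING_CLOSE_TAG = "</think>"
TOOL_OPEN_TAG = "<tool_call>"  # assumed tool open tag, for illustration only


def consume_close_tag_only(pending):
    """Return (reasoning_text, remainder), looking only for the close tag."""
    idx = pending.find(REASONING_CLOSE_TAG)
    if idx < 0:
        # No close tag: the whole buffer is treated as reasoning text.
        return pending, ""
    return pending[:idx], pending[idx + len(REASONING_CLOSE_TAG):]


# Hypothetical model output: reasoning followed directly by a tool call,
# with no </think> in between.
sample = "I should call the weather tool.<tool_call>\n<function=get_weather>"
reasoning, remainder = consume_close_tag_only(sample)

# The tool open tag ends up inside the reasoning text and nothing is left
# over for a tool parser to consume.
leaked = TOOL_OPEN_TAG in reasoning
```

With this input, `leaked` is true and `remainder` is empty, which is the symptom described above: the tool call is absorbed into the reasoning text.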

Modification

- `lmdeploy/serve/parsers/response_parser.py`: Rewrote the tag detection logic in `_consume_reasoning` to handle
  the interplay between the reasoning close tag and the tool open tag:
  - Added an explicit search for both the `reasoning_close_tag` and `tool_open_tag` positions
  - When the reasoning close tag is found, checks whether the tool tag appears before it; if so, emits the
    reasoning prefix and switches to `MODE_TOOL`
  - When the reasoning close tag is not found but the tool tag exists, emits the reasoning prefix and switches
    to `MODE_TOOL`
  - Updated the docstring to reflect the new behavior
- `tests/test_lmdeploy/serve/parsers/test_qwen3_parser.py` and
  `tests/test_lmdeploy/serve/parsers/test_qwen3_5_parser.py`: Removed the unsupported `add_bos=False` argument
  from `tokenizer.encode()` calls to fix test compatibility with current transformers versions.
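
The tag-priority rules listed above can be sketched as a small streaming consumer. This is a toy stand-in written under stated assumptions, not the actual `_consume_reasoning` code: the class `ReasoningConsumer`, the helper `_partial_suffix_len`, the mode name `MODE_CONTENT`, and the `<tool_call>` tag string are all hypothetical. Only `MODE_TOOL` and the two tag roles come from the PR description.

```python
MODE_REASONING = "reasoning"
MODE_TOOL = "tool"
MODE_CONTENT = "content"  # hypothetical name for the mode after </think>


def _partial_suffix_len(text, tag):
    """Length of the longest suffix of text that is a proper prefix of tag."""
    for n in range(min(len(text), len(tag) - 1), 0, -1):
        if text.endswith(tag[:n]):
            return n
    return 0


class ReasoningConsumer:
    """Toy model of the reasoning-mode branch of a streaming parser."""

    def __init__(self, close_tag="</think>", tool_tag="<tool_call>"):
        self.close_tag = close_tag
        self.tool_tag = tool_tag  # assumed tag string; may be empty
        self.mode = MODE_REASONING
        self.pending = ""

    def feed(self, chunk):
        """Buffer chunk and return the reasoning text that is safe to emit."""
        self.pending += chunk
        if self.mode != MODE_REASONING:
            return ""  # other modes are out of scope for this sketch

        close_idx = self.pending.find(self.close_tag)
        tool_idx = self.pending.find(self.tool_tag) if self.tool_tag else -1

        # Tool tag comes first (with or without a later close tag):
        # emit the reasoning prefix and hand the rest to tool parsing.
        if tool_idx >= 0 and (close_idx < 0 or tool_idx < close_idx):
            out = self.pending[:tool_idx]
            self.pending = self.pending[tool_idx:]
            self.mode = MODE_TOOL
            return out

        # Close tag comes first: emit the prefix and drop the tag itself.
        if close_idx >= 0:
            out = self.pending[:close_idx]
            self.pending = self.pending[close_idx + len(self.close_tag):]
            self.mode = MODE_CONTENT
            return out

        # Neither tag is complete yet: hold back any suffix that could be
        # the start of either tag, and emit the rest.
        hold = _partial_suffix_len(self.pending, self.close_tag)
        if self.tool_tag:
            hold = max(hold, _partial_suffix_len(self.pending, self.tool_tag))
        cut = len(self.pending) - hold
        out, self.pending = self.pending[:cut], self.pending[cut:]
        return out


consumer = ReasoningConsumer()
emitted = [consumer.feed(c) for c in ["Let me check.", "<tool", "_call>\n<function=f>"]]
```

In this run the tool tag is split across two chunks. The partial `<tool` is held in the buffer rather than emitted as reasoning, and once the tag completes the consumer switches to `MODE_TOOL` with the tag still at the head of `pending`.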

BC-breaking (Optional)

No. This modification does not break backward compatibility. It only corrects the parsing behavior within the
reasoning mode — downstream users do not need to make any changes.

Use cases (Optional)

This PR fixes the streaming parsing for Qwen3-Coder / Qwen3.5 models when they:

- Emit reasoning content followed directly by tool calls without a closing </think> tag
- Embed tool XML structures (like <function=...>) within reasoning content

All 26 parser unit tests pass after this change.

Checklist

PR checklist:

- Pre-commit or other linting tools are used to fix the potential lint issues.
- The modification is covered by complete unit tests. If not, please add more unit tests to ensure the
  correctness.
- The documentation has been modified accordingly, like docstring or example tutorials.


lvhan028 · 2026-05-11T12:24:54Z

Could you provide an example of a "tool open tag within reasoning content"?

Copilot

Pull request overview

This PR fixes streaming parsing in BaseResponseParser._consume_reasoning() so that tool-call opening tags appearing inside reasoning output are detected and routed to tool parsing (instead of being appended to reasoning_content). It also updates Qwen parser tests to remove an unsupported add_bos argument from tokenizer encode() calls.

Changes:

- Update _consume_reasoning() to search for both reasoning_close_tag and tool_open_tag and switch to tool mode when the tool tag appears first.
- Remove add_bos=False from tokenizer.encode() in Qwen3 / Qwen3.5 parser streaming tests for transformers compatibility.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `lmdeploy/serve/parsers/response_parser.py` | Adjust reasoning-mode tag handling to prioritize tool-open tags that occur before the reasoning close tag. |
| `tests/test_lmdeploy/serve/parsers/test_qwen3_parser.py` | Remove unsupported `add_bos` argument from test tokenizer encoding helper. |
| `tests/test_lmdeploy/serve/parsers/test_qwen3_5_parser.py` | Remove unsupported `add_bos` argument from test tokenizer encoding helper. |


+        # No close tag and no tool tag found - emit safe reasoning prefix,
+        # keeping possible partial-tag suffix in buffer.
+        if not self._pending:
+            return None, False
+        out = self._pending


+        close_idx = self._pending.find(close_tag)
+        tool_tag = self.profile.tool_open_tag
+        tool_idx = self._pending.find(tool_tag) if tool_tag else -1
+
+        # If close tag is found, process it first.
+        if close_idx >= 0:
+            # Check if tool tag appears before close tag.


lvhan028 requested review from Copilot and lvhan028 May 11, 2026 12:25

Copilot started reviewing on behalf of lvhan028 May 11, 2026 12:26

Copilot AI reviewed May 11, 2026

dbsd11 wants to merge 1 commit into InternLM:main from dbsd11:fix-detect-tool-open-tag
