fix: detect tool open tag within reasoning content in _consume_reasoning #4580

Open
dbsd11 wants to merge 1 commit into InternLM:main from dbsd11:fix-detect-tool-open-tag

Conversation

@dbsd11 commented May 11, 2026

The original _consume_reasoning method did not check for tool open tags before the reasoning close tag, causing tool
calls embedded in reasoning content to be incorrectly parsed as reasoning text. This fix adds proper detection and
priority handling for both tags.

Also drop unsupported add_bos parameter from test tokenizer encode calls.

Motivation

When a model emits tool open tags (like the Qwen3Coder XML tool syntax) directly after reasoning content — without
first emitting the reasoning close tag </think> — the original _consume_reasoning method would treat the entire
pending text (including the tool tag) as reasoning content. This caused tool calls to be lost and incorrectly appended
to reasoning_content instead of being routed to the tool parser.

Modification

  1. lmdeploy/serve/parsers/response_parser.py: Rewrote the tag detection logic in _consume_reasoning to handle
    the interplay between the reasoning close tag and the tool open tag:

    • Added an explicit search for the positions of both reasoning_close_tag and tool_open_tag
    • When the reasoning close tag is found, checks whether the tool tag appears before it; if so, emits the
      reasoning prefix and switches to MODE_TOOL
    • When the reasoning close tag is not found but the tool tag exists, emits the reasoning prefix and switches
      to MODE_TOOL
    • Updated the docstring to reflect the new behavior
  2. tests/test_lmdeploy/serve/parsers/test_qwen3_parser.py &
    tests/test_lmdeploy/serve/parsers/test_qwen3_5_parser.py: Removed unsupported add_bos=False argument from
    tokenizer.encode() calls to fix test compatibility with current transformers versions.
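
The tag-priority rule described in item 1 can be sketched as a standalone function. This is a minimal illustration, not the code in response_parser.py: the function name split_reasoning, the MODE_PLAIN and MODE_REASONING constants, the string values of all three modes, and the "<tool_call>" tag value are assumptions made for the example. It also ignores the partial-tag buffering that a streaming parser needs.

```python
from typing import Optional, Tuple

# Hypothetical mode names; only MODE_TOOL is named in the PR description,
# and its value here is an assumption.
MODE_REASONING = "reasoning"
MODE_TOOL = "tool"
MODE_PLAIN = "plain"


def split_reasoning(pending: str, close_tag: str,
                    tool_tag: Optional[str]) -> Tuple[str, str, str]:
    """Return (reasoning_text, next_mode, remaining_buffer)."""
    close_idx = pending.find(close_tag)
    tool_idx = pending.find(tool_tag) if tool_tag else -1

    # The tool tag takes priority when it occurs and no close tag precedes it.
    if tool_idx >= 0 and (close_idx < 0 or tool_idx < close_idx):
        return pending[:tool_idx], MODE_TOOL, pending[tool_idx:]

    # Otherwise a close tag ends reasoning and is consumed.
    if close_idx >= 0:
        rest = pending[close_idx + len(close_tag):]
        return pending[:close_idx], MODE_PLAIN, rest

    # Neither tag found: everything seen so far is reasoning text.
    return pending, MODE_REASONING, ""


# Tool tag without a preceding </think>: the reasoning prefix is emitted and
# the tool tag stays in the buffer for the tool parser.
text, mode, rest = split_reasoning(
    "I should call the tool.<tool_call>{...}", "</think>", "<tool_call>")
```

Leaving the tool tag in the remaining buffer, rather than consuming it, lets the tool parser see the complete opening tag when it takes over.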

BC-breaking (Optional)

No. This modification does not break backward compatibility. It only corrects the parsing behavior within the
reasoning mode — downstream users do not need to make any changes.

Use cases (Optional)

This PR fixes the streaming parsing for Qwen3-Coder / Qwen3.5 models when they:

  • Emit reasoning content followed directly by tool calls without a closing </think> tag
  • Embed tool XML structures (like <function=...>) within reasoning content

All 26 parser unit tests pass after this change.

Checklist

PR checklist:

  • Pre-commit or other linting tools are used to fix the potential lint issues.
  • The modification is covered by complete unit tests. If not, please add more unit tests to ensure the
    correctness.
  • The documentation has been modified accordingly, like docstring or example tutorials.

@lvhan028 (Collaborator) commented

Could you provide an example of a "tool open tag within reasoning content"?

Copilot AI (Contributor) left a comment

Pull request overview

This PR fixes streaming parsing in BaseResponseParser._consume_reasoning() so that tool-call opening tags appearing inside reasoning output are detected and routed to tool parsing (instead of being appended to reasoning_content). It also updates Qwen parser tests to remove an unsupported add_bos argument from tokenizer encode() calls.

Changes:

  • Update _consume_reasoning() to search for both reasoning_close_tag and tool_open_tag and switch to tool mode when the tool tag appears first.
  • Remove add_bos=False from tokenizer.encode() in Qwen3 / Qwen3.5 parser streaming tests for transformers compatibility.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| lmdeploy/serve/parsers/response_parser.py | Adjust reasoning-mode tag handling to prioritize tool-open tags that occur before the reasoning close tag. |
| tests/test_lmdeploy/serve/parsers/test_qwen3_parser.py | Remove unsupported add_bos argument from test tokenizer encoding helper. |
| tests/test_lmdeploy/serve/parsers/test_qwen3_5_parser.py | Remove unsupported add_bos argument from test tokenizer encoding helper. |

Comment on lines +392 to +396

    # No close tag and no tool tag found - emit safe reasoning prefix,
    # keeping possible partial-tag suffix in buffer.
    if not self._pending:
        return None, False
    out = self._pending
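
The "partial-tag suffix" mentioned in the comment on lines +392 to +396 refers to a streaming concern: a chunk may end partway through a tag, so the tail of the buffer cannot yet be emitted as reasoning text. The excerpt does not show how the PR computes that suffix. The following is one possible sketch, with held_back_suffix_len and emit_safe_prefix as hypothetical helper names and "<tool_call>" as an assumed tag value.

```python
from typing import Sequence, Tuple


def held_back_suffix_len(pending: str, tags: Sequence[str]) -> int:
    """Length of the longest suffix of `pending` that is a proper prefix of a tag."""
    best = 0
    for tag in tags:
        # Only proper prefixes matter: a complete tag would already have been
        # found by a plain substring search.
        for k in range(min(len(tag) - 1, len(pending)), 0, -1):
            if pending.endswith(tag[:k]):
                best = max(best, k)
                break
    return best


def emit_safe_prefix(pending: str, tags: Sequence[str]) -> Tuple[str, str]:
    """Split `pending` into (text safe to emit now, suffix to keep buffered)."""
    keep = held_back_suffix_len(pending, tags)
    cut = len(pending) - keep
    return pending[:cut], pending[cut:]


TAGS = ["</think>", "<tool_call>"]  # "<tool_call>" is an assumed tag value
safe, held = emit_safe_prefix("let me call <tool", TAGS)
```

With both tags considered, a buffer ending in "<tool" is held back until the next chunk shows whether it completes the tool open tag.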

Comment on lines +360 to +366

    close_idx = self._pending.find(close_tag)
    tool_tag = self.profile.tool_open_tag
    tool_idx = self._pending.find(tool_tag) if tool_tag else -1

    # If close tag is found, process it first.
    if close_idx >= 0:
        # Check if tool tag appears before close tag.