fix: detect tool open tag within reasoning content in _consume_reasoning#4580
Open
dbsd11 wants to merge 1 commit into
Open
fix: detect tool open tag within reasoning content in _consume_reasoning#4580dbsd11 wants to merge 1 commit into
dbsd11 wants to merge 1 commit into
Conversation
The original _consume_reasoning method did not check for tool open tags before the reasoning close tag, causing tool calls embedded in reasoning content to be incorrectly parsed as reasoning text. This fix adds proper detection and priority handling for both tags. Also drop unsupported add_bos parameter from test tokenizer encode calls.
Collaborator
|
Could you help providing an example about "tool open tag within reasoning content"? |
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes streaming parsing in BaseResponseParser._consume_reasoning() so that tool-call opening tags appearing inside reasoning output are detected and routed to tool parsing (instead of being appended to reasoning_content). It also updates Qwen parser tests to remove an unsupported add_bos argument from tokenizer encode() calls.
Changes:
- Update
_consume_reasoning()to search for bothreasoning_close_tagandtool_open_tagand switch to tool mode when the tool tag appears first. - Remove
add_bos=Falsefromtokenizer.encode()in Qwen3 / Qwen3.5 parser streaming tests for transformers compatibility.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
lmdeploy/serve/parsers/response_parser.py |
Adjust reasoning-mode tag handling to prioritize tool-open tags that occur before the reasoning close tag. |
tests/test_lmdeploy/serve/parsers/test_qwen3_parser.py |
Remove unsupported add_bos argument from test tokenizer encoding helper. |
tests/test_lmdeploy/serve/parsers/test_qwen3_5_parser.py |
Remove unsupported add_bos argument from test tokenizer encoding helper. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+392
to
+396
| # No close tag and no tool tag found - emit safe reasoning prefix, | ||
| # keeping possible partial-tag suffix in buffer. | ||
| if not self._pending: | ||
| return None, False | ||
| out = self._pending |
Comment on lines
+360
to
+366
| close_idx = self._pending.find(close_tag) | ||
| tool_tag = self.profile.tool_open_tag | ||
| tool_idx = self._pending.find(tool_tag) if tool_tag else -1 | ||
|
|
||
| # If close tag is found, process it first. | ||
| if close_idx >= 0: | ||
| # Check if tool tag appears before close tag. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The original
_consume_reasoningmethod did not check for tool open tags before the reasoning close tag, causing toolcalls embedded in reasoning content to be incorrectly parsed as reasoning text. This fix adds proper detection and
priority handling for both tags.
Also drop unsupported
add_bosparameter from test tokenizer encode calls.Motivation
When a model emits tool open tags (like the Qwen3Coder XML tool syntax) directly after reasoning content — without
first emitting the reasoning close tag
</think>— the original_consume_reasoningmethod would treat the entirepending text (including the tool tag) as reasoning content. This caused tool calls to be lost and incorrectly appended
to
reasoning_contentinstead of being routed to the tool parser.Modification
lmdeploy/serve/parsers/response_parser.py: Rewrote the tag detection logic in_consume_reasoningto handlethe interplay between reasoning close tag and tool open tag:
reasoning_close_tagandtool_open_tagpositionsswitches to
MODE_TOOLMODE_TOOLtests/test_lmdeploy/serve/parsers/test_qwen3_parser.py&tests/test_lmdeploy/serve/parsers/test_qwen3_5_parser.py: Removed unsupportedadd_bos=Falseargument fromtokenizer.encode()calls to fix test compatibility with current transformers versions.BC-breaking (Optional)
No. This modification does not break backward compatibility. It only corrects the parsing behavior within the
reasoning mode — downstream users do not need to make any changes.
Use cases (Optional)
This PR fixes the streaming parsing for Qwen3-Coder / Qwen3.5 models when they:
</think>tag<function=...>) within reasoning contentAll 26 parser unit tests pass after this change.
Checklist
PR checklist:
correctness.