Fix reasoning parsing when streaming from AWS SageMaker AI #2265
Changes from all commits, in `async def stream` (indentation is approximate; the scraped diff table stripped the original leading whitespace):

```diff
@@ -369,7 +369,10 @@ async def stream(
                 )

                 # Handle reasoning content
-                if choice["delta"].get("reasoning_content"):
+                # NOTE: Both `reasoning` and `reasoning_content` need to be handled as vLLM v0.16.0 deprecated
+                # the `reasoning_content` in favour of `reasoning`
+                # See https://github.com/vllm-project/vllm/pull/33402
+                if any(k in choice["delta"] for k in {"reasoning_content", "reasoning"}):
                     if not reasoning_content_started:
                         yield self.format_chunk(
                             {"chunk_type": "content_start", "data_type": "reasoning_content"}
@@ -379,7 +382,7 @@ async def stream(
                         {
                             "chunk_type": "content_delta",
                             "data_type": "reasoning_content",
-                            "data": choice["delta"]["reasoning_content"],
+                            "data": choice["delta"].get("reasoning_content", choice["delta"].get("reasoning")),
                         }
                     )
```

**Review comment** on the new `"data": …` line:

> Suggestion: Use `"data": choice["delta"].get("reasoning_content") or choice["delta"].get("reasoning"),`
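The two fallback styles in this thread behave differently when a key is present but holds a falsy value. The following minimal sketch (the `delta_*` payloads are hypothetical examples, not taken from the PR) illustrates why the reviewer prefers `or` over a `dict.get` default:

```python
# Hypothetical streaming delta payloads for illustration.
delta_old = {"reasoning_content": "step 1"}            # pre-deprecation vLLM key
delta_new = {"reasoning": "step 1"}                    # newer vLLM key
delta_mixed = {"reasoning_content": None, "reasoning": "step 1"}


def fallback_get(delta: dict):
    # dict.get only uses the default when the *key* is missing,
    # so a present-but-None "reasoning_content" is returned as None.
    return delta.get("reasoning_content", delta.get("reasoning"))


def fallback_or(delta: dict):
    # `or` falls through whenever the first value is falsy (None or ""),
    # so the newer "reasoning" key is still consulted.
    return delta.get("reasoning_content") or delta.get("reasoning")


# fallback_get(delta_mixed) -> None, fallback_or(delta_mixed) -> "step 1"
```

Both variants agree on `delta_old` and `delta_new`; they diverge only on payloads like `delta_mixed`, which is exactly the streaming case the second review comment raises.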
**Review comment** on the new `if any(...)` condition:

> **Issue:** The condition checks for key existence but not value truthiness. If vLLM sends `"reasoning": None` or `"reasoning": ""` in the delta (common in streaming when the key is present but no reasoning content is being sent on that chunk), this will enter the block and emit an empty/None delta.
>
> Compare with the text content check on line 358, which uses `.get("content")` (a truthy check), and the OpenAI provider, which uses `choice.delta.reasoning_content` (also truthy).
>
> **Suggestion:** Use a truthiness check consistent with the rest of the codebase. This also simplifies line 385, since the value is already extracted.