Description
Checks
- I have updated to the latest minor and patch version of Strands
- I have checked the documentation and this is not expected behavior
- I have searched the existing issues and there are no duplicates of my issue
Strands Version
1.26.0
Python Version
3.11
Operating System
Amazon Linux 2023 (Bedrock AgentCore Runtime), also reproducible on macOS
Installation Method
pip
Steps to Reproduce
- Create a BedrockModel with extended thinking enabled:

from strands import Agent
from strands.models import BedrockModel

model = BedrockModel(
    model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",
    additional_request_fields={
        "thinking": {
            "type": "enabled",
            "budget_tokens": 4096,
        },
    },
)

- Create an agent with a session manager that persists conversation history (e.g., AgentCoreMemorySessionManager or any custom SessionManager implementation):

agent = Agent(
    model=model,
    tools=[some_tool],
    session_manager=my_session_manager,  # persists messages across turns
)

- Send a prompt that triggers a tool call:

agent("What is the status of zone X?")
# Agent calls some_tool, gets result, responds successfully.

- Send a second prompt in the same session:

agent("Now tell me about zone Y")
# FAILS with ValidationException

Expected Behavior
The second turn should work correctly. The session manager loads the persisted conversation history (including assistant messages from turn 1), and the model should be able to continue the conversation with extended thinking enabled.
Actual Behavior
The second turn fails with:
botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the ConverseStream operation: If an assistant message contains any thinking blocks, the first block must be thinking. Found text
Or, if the first turn's assistant messages had no thinking blocks at all:
botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the ConverseStream operation: A conversation must have alternating user and assistant messages...
Additional Context
Root Cause Analysis
When extended thinking is enabled, Bedrock/Claude requires every assistant message to start with a reasoningContent block. The Strands SDK event loop constructs assistant messages during tool-use cycles that do not include reasoningContent blocks.
Here's what happens step by step:
Turn 1 — First API call (works)
The SDK calls model.stream() with [user_message]. Thinking is enabled, so Claude responds with:
# Assistant message from Bedrock (has thinking block)
{
    "role": "assistant",
    "content": [
        {"reasoningContent": {"reasoningText": {"text": "Let me think..."}}},  # ✅ thinking block
        {"text": "I'll check the status."},
        {"toolUse": {"toolUseId": "tool_123", "name": "zoneStatus", "input": {...}}}
    ]
}

Turn 1 — Second API call (tool result follow-up)
The SDK's event loop constructs the messages for the next model.stream() call. The conversation now looks like:
messages = [
    {"role": "user", "content": [{"text": "What is the status of zone X?"}]},
    # ⚠️ SDK-constructed assistant message — NO reasoningContent block
    {"role": "assistant", "content": [
        {"text": "I'll check the status."},
        {"toolUse": {"toolUseId": "tool_123", "name": "zoneStatus", "input": {...}}}
    ]},
    {"role": "user", "content": [
        {"toolResult": {"toolUseId": "tool_123", "content": [{"text": "Zone is active"}]}}
    ]},
]

Notice the assistant message at index 1: the SDK stripped the reasoningContent block when it constructed this message for the next event loop cycle. If thinking is still enabled for this call, Bedrock rejects it because the assistant message doesn't start with a thinking block.
Turn 2 — Session reload (fails)
The session manager persisted the messages from turn 1 (without thinking blocks). On turn 2, it loads them back:
messages = [
    {"role": "user", "content": [{"text": "What is the status of zone X?"}]},
    {"role": "assistant", "content": [...]},  # no reasoningContent
    {"role": "user", "content": [{"toolResult": ...}]},
    {"role": "assistant", "content": [...]},  # no reasoningContent (final response)
    {"role": "user", "content": [{"text": "Now tell me about zone Y"}]},  # new prompt
]

With thinking enabled, Bedrock rejects this because the assistant messages lack reasoningContent blocks.
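To observe this from the application side, one can inspect the history after turn 1. This is a minimal sketch, assuming the agent's conversation history is exposed as agent.messages and that some_tool and my_session_manager are set up as in the reproduction steps:

# Sketch: after turn 1, check whether any persisted assistant message
# starts with a reasoningContent block.
agent("What is the status of zone X?")

for i, msg in enumerate(agent.messages):
    if msg["role"] == "assistant":
        first_block = msg["content"][0]
        # Per the behavior described above, this is False for every
        # assistant message, which is what trips Bedrock on turn 2.
        print(i, "starts with reasoningContent:", "reasoningContent" in first_block)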
Workaround
We currently work around this by subclassing BedrockModel and overriding stream() to detect incompatible assistant messages and temporarily disable thinking for that API call:
class _ThinkingCompatibleBedrockModel(BedrockModel):
    async def stream(self, messages, tool_specs=None, system_prompt=None, **kwargs):
        if self._has_any_incompatible_assistant(messages):
            # Disable thinking, pass full history
            original = self.config.get("additional_request_fields", {})
            patched = {k: v for k, v in original.items() if k != "thinking"}
            self.config["additional_request_fields"] = patched
            try:
                async for event in super().stream(messages, tool_specs, system_prompt, **kwargs):
                    yield event
            finally:
                self.config["additional_request_fields"] = original
        else:
            async for event in super().stream(messages, tool_specs, system_prompt, **kwargs):
                yield event

This preserves multi-turn context but sacrifices thinking for any session that has incompatible history.
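For reference, the _has_any_incompatible_assistant check used above can be a simple scan of the history. A minimal sketch of such a method on the subclass, assuming the Converse-style message dicts shown earlier:

    def _has_any_incompatible_assistant(self, messages):
        # An assistant message is incompatible with extended thinking if its
        # first content block is not a reasoningContent block.
        for msg in messages:
            if msg.get("role") != "assistant":
                continue
            content = msg.get("content") or []
            if not content or "reasoningContent" not in content[0]:
                return True
        return False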
Possible Solution
Proposed Fix
The SDK's event loop (in streaming.py / event_loop.py) should preserve reasoningContent blocks in the assistant messages it constructs when passing them back into the next model.stream() call. The thinking blocks are part of the model's response and are required by the API contract when thinking is enabled.
Specifically, when the event loop builds the assistant message from the streamed response to pass back with tool results, it should include the reasoningContent content blocks that were part of the original response.
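To make the intent concrete, here is a sketch of the desired assembly step. build_assistant_message is a hypothetical name used only for illustration, not the SDK's actual function:

# Hypothetical illustration of the fix, not the SDK's real code: when the
# event loop assembles the assistant message from the streamed content
# blocks, keep reasoningContent blocks first instead of dropping them.
def build_assistant_message(streamed_blocks):
    reasoning = [b for b in streamed_blocks if "reasoningContent" in b]
    others = [b for b in streamed_blocks if "reasoningContent" not in b]
    return {"role": "assistant", "content": reasoning + others}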
Related Issues
No response