[BUG] Extended thinking breaks multi-turn sessions: SDK-constructed assistant messages lack reasoningContent blocks #1698

@zhenyanghua

Description

Checks

  • I have updated to the latest minor and patch version of Strands
  • I have checked the documentation and this is not expected behavior
  • I have searched ./issues and there are no duplicates of my issue

Strands Version

1.26.0

Python Version

3.11

Operating System

Amazon Linux 2023 (Bedrock AgentCore Runtime), also reproducible on macOS

Installation Method

pip

Steps to Reproduce

  1. Create a BedrockModel with extended thinking enabled:
from strands import Agent
from strands.models import BedrockModel

model = BedrockModel(
    model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",
    additional_request_fields={
        "thinking": {
            "type": "enabled",
            "budget_tokens": 4096,
        },
    },
)
  2. Create an agent with a session manager that persists conversation history (e.g., AgentCoreMemorySessionManager or any custom SessionManager implementation):
agent = Agent(
    model=model,
    tools=[some_tool],
    session_manager=my_session_manager,  # persists messages across turns
)
  3. Send a prompt that triggers a tool call:
agent("What is the status of zone X?")
# Agent calls some_tool, gets result, responds successfully.
  4. Send a second prompt in the same session:
agent("Now tell me about zone Y")
# FAILS with ValidationException

Expected Behavior

The second turn should work correctly. The session manager loads the persisted conversation history (including assistant messages from turn 1), and the model should be able to continue the conversation with extended thinking enabled.

Actual Behavior

The second turn fails with:

botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the ConverseStream operation: If an assistant message contains any thinking blocks, the first block must be thinking. Found text

Or, if the first turn's assistant messages had no thinking blocks at all:

botocore.errorfactory.ValidationException: An error occurred (ValidationException) when calling the ConverseStream operation: A conversation must alternating user and assistant messages...

Additional Context

Root Cause Analysis

When extended thinking is enabled, Bedrock/Claude requires every assistant message to start with a reasoningContent block. The Strands SDK event loop constructs assistant messages during tool-use cycles that do not include reasoningContent blocks.

Here's what happens step by step:

Turn 1 — First API call (works)

The SDK calls model.stream() with [user_message]. Thinking is enabled, so Claude responds with:

# Assistant message from Bedrock (has thinking block)
{
    "role": "assistant",
    "content": [
        {"reasoningContent": {"reasoningText": {"text": "Let me think..."}}},  # ✅ thinking block
        {"text": "I'll check the status."},
        {"toolUse": {"toolUseId": "tool_123", "name": "zoneStatus", "input": {...}}}
    ]
}

Turn 1 — Second API call (tool result follow-up)

The SDK's event loop constructs the messages for the next model.stream() call. The conversation now looks like:

messages = [
    {"role": "user", "content": [{"text": "What is the status of zone X?"}]},
    # ⚠️ SDK-constructed assistant message — NO reasoningContent block
    {"role": "assistant", "content": [
        {"text": "I'll check the status."},
        {"toolUse": {"toolUseId": "tool_123", "name": "zoneStatus", "input": {...}}}
    ]},
    {"role": "user", "content": [
        {"toolResult": {"toolUseId": "tool_123", "content": [{"text": "Zone is active"}]}}
    ]},
]

Notice the assistant message at index 1: the SDK stripped the reasoningContent block when it constructed this message for the next event loop cycle. If thinking is still enabled for this call, Bedrock rejects it because the assistant message doesn't start with a thinking block.

Turn 2 — Session reload (fails)

The session manager persisted the messages from turn 1 (without thinking blocks). On turn 2, it loads them back:

messages = [
    {"role": "user", "content": [{"text": "What is the status of zone X?"}]},
    {"role": "assistant", "content": [...]},  # no reasoningContent
    {"role": "user", "content": [{"toolResult": ...}]},
    {"role": "assistant", "content": [...]},  # no reasoningContent (final response)
    {"role": "user", "content": [{"text": "Now tell me about zone Y"}]},  # new prompt
]

With thinking enabled, Bedrock rejects this because the assistant messages lack reasoningContent blocks.
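
To see which persisted messages would trip the validation, a reloaded history can be scanned before it is sent. The following is a minimal sketch, assuming messages are plain dicts in the Converse shape shown above. `classify_assistant_messages` and its status names are hypothetical and not part of Strands.

```python
from typing import Any

Message = dict[str, Any]


def classify_assistant_messages(messages: list[Message]) -> list[tuple[int, str]]:
    """Return (index, status) for every assistant message in the history.

    Status is one of:
      "ok"           - the first content block is reasoningContent
      "misplaced"    - a reasoningContent block exists, but it is not first
      "no_reasoning" - the message has no reasoningContent block at all
    """
    report = []
    for index, message in enumerate(messages):
        if message.get("role") != "assistant":
            continue
        content = message.get("content") or []
        # Positions of every reasoningContent block in this message.
        positions = [i for i, block in enumerate(content) if "reasoningContent" in block]
        if positions and positions[0] == 0:
            status = "ok"
        elif positions:
            status = "misplaced"
        else:
            status = "no_reasoning"
        report.append((index, status))
    return report
```

Running this on the history returned by the session manager shows whether the reasoning blocks were dropped entirely or merely ended up after a text block, which may help tell the two error messages apart.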

Workaround

We currently work around this by subclassing BedrockModel and overriding stream() to detect incompatible assistant messages and temporarily disable thinking for that API call:

from strands.models import BedrockModel


class _ThinkingCompatibleBedrockModel(BedrockModel):
    async def stream(self, messages, tool_specs=None, system_prompt=None, **kwargs):
        # _has_any_incompatible_assistant is a helper defined elsewhere in this
        # subclass (not shown); it flags histories Bedrock would reject under thinking.
        if self._has_any_incompatible_assistant(messages):
            # Disable thinking for this call only, but still pass the full history.
            original = self.config.get("additional_request_fields", {})
            patched = {k: v for k, v in original.items() if k != "thinking"}
            self.config["additional_request_fields"] = patched
            try:
                async for event in super().stream(messages, tool_specs, system_prompt, **kwargs):
                    yield event
            finally:
                # Restore the original fields so later calls can use thinking again.
                self.config["additional_request_fields"] = original
        else:
            async for event in super().stream(messages, tool_specs, system_prompt, **kwargs):
                yield event

This preserves multi-turn context but sacrifices thinking for any session that has incompatible history.
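
The workaround calls `_has_any_incompatible_assistant`, which is not shown. One way it might be written is sketched below as a standalone function. The rule it applies (every assistant message must begin with a `reasoningContent` block) is taken from the root cause analysis above, and treating an empty message as incompatible is an assumption, so the actual helper may differ.

```python
from typing import Any


def has_any_incompatible_assistant(messages: list[dict[str, Any]]) -> bool:
    """Return True if any assistant message does not start with reasoningContent."""
    for message in messages:
        if message.get("role") != "assistant":
            continue
        content = message.get("content") or []
        # Assumption: an assistant message with no content blocks is also
        # treated as incompatible, since it cannot start with a thinking block.
        if not content or "reasoningContent" not in content[0]:
            return True
    return False
```

Inside the subclass this could be exposed as a method that forwards to the function, so the `stream()` override stays unchanged.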

Possible Solution

Proposed Fix

The SDK's event loop (in streaming.py / event_loop.py) should preserve reasoningContent blocks in the assistant messages it constructs when passing them back into the next model.stream() call. The thinking blocks are part of the model's response and are required by the API contract when thinking is enabled.

Specifically, when the event loop builds the assistant message from the streamed response to pass back with tool results, it should include the reasoningContent content blocks that were part of the original response.
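
A regression check for this behavior could compare the blocks received from the model with the assistant message that is replayed on the next call. The sketch below assumes both are plain dicts and lists in the Converse shape used above. `missing_reasoning_blocks` is a hypothetical helper, and it does not reflect how the event loop is actually implemented.

```python
from typing import Any

ContentBlock = dict[str, Any]


def missing_reasoning_blocks(
    response_blocks: list[ContentBlock],
    replayed_message: dict[str, Any],
) -> list[ContentBlock]:
    """Return reasoningContent blocks from the response that the replayed message lacks."""
    replayed = replayed_message.get("content") or []
    return [
        block
        for block in response_blocks
        if "reasoningContent" in block and block not in replayed
    ]
```

An empty result means every thinking block from the original response survived into the message sent back with the tool results. A non-empty result reproduces the situation described in this issue.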

Related Issues

No response
