[Feature]: Expose intermediate assistant text during streaming in FunctionInvocationLayer #4868

@Danrong430

Description

When using Agent.run(stream=True) with tools, the FunctionInvocationLayer's streaming loop discards the LLM's reasoning text from intermediate iterations. Only function_call content is yielded during tool-calling iterations; the assistant text that accompanies those tool calls is silently dropped. Text content only streams from the final iteration (when no tool calls are present).

Current behavior:

Iteration 1 (LLM returns text + tool_call):
  Streamed to consumer: function_call chunks only
  Discarded: assistant reasoning text

Iteration 2 (LLM returns text + tool_call):
  Streamed to consumer: function_call chunks only
  Discarded: assistant reasoning text

Iteration 3 (LLM returns text only, no tool calls):
  Streamed to consumer: text chunks ✅

Expected behavior:

Iteration 1 (LLM returns text + tool_call):
  Streamed: text chunks ("I'll search the data lake...")
  Streamed: function_call chunks
  → framework executes tool

Iteration 2 (LLM returns text + tool_call):
  Streamed: text chunks ("Found the dataset. Loading it now...")
  Streamed: function_call chunks
  → framework executes tool

Iteration 3 (LLM returns text only):
  Streamed: text chunks (final response)
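
The difference between the two traces above can be reproduced with a small stand-alone simulation. This is only a sketch: `Chunk`, `SCRIPT`, `fake_llm_stream`, and `tool_loop` are hypothetical names invented for illustration, not the framework's API, and the "model" is a scripted list rather than a real client. The final response text is made up as well.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List


@dataclass
class Chunk:
    kind: str   # "text" or "function_call"
    value: str


# Hypothetical scripted model output: one list of chunks per loop iteration.
SCRIPT = [
    [Chunk("text", "I'll search the data lake..."), Chunk("function_call", "search")],
    [Chunk("text", "Found the dataset. Loading it now..."), Chunk("function_call", "load")],
    [Chunk("text", "Here is the summary.")],
]


async def fake_llm_stream(iteration: int) -> AsyncIterator[Chunk]:
    """Stand-in for one streaming LLM call (assumption, not a real client)."""
    for chunk in SCRIPT[iteration]:
        yield chunk


async def tool_loop(forward_intermediate_text: bool) -> AsyncIterator[Chunk]:
    """Toy tool-calling loop.

    With forward_intermediate_text=False it mimics the reported behavior:
    text from iterations that also contain a tool call is dropped.
    With True it mimics the requested behavior: every chunk is forwarded.
    """
    for i in range(len(SCRIPT)):
        # The simulation peeks at the script; a real loop would not know this up front.
        has_tool_call = any(c.kind == "function_call" for c in SCRIPT[i])
        async for chunk in fake_llm_stream(i):
            if chunk.kind == "text" and has_tool_call and not forward_intermediate_text:
                continue  # intermediate assistant text is discarded
            yield chunk
        if not has_tool_call:
            break  # final iteration: no tools left to execute


async def collect(forward_intermediate_text: bool) -> List[str]:
    return [f"{c.kind}:{c.value}" async for c in tool_loop(forward_intermediate_text)]


if __name__ == "__main__":
    print("current :", asyncio.run(collect(False)))
    print("expected:", asyncio.run(collect(True)))
```

In the requested mode the loop never needs to know whether a tool call is coming: it forwards each chunk as it arrives, and consumers can tell text from function calls by chunk kind.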

Real-time reasoning traces would provide transparency into why each tool is being called, similar to how ChatGPT and Claude show "thinking" text.

The ChatMiddleware documentation says it intercepts individual chat client calls. In practice, when used with AzureOpenAIChatClient + Agent.run(stream=True), the ChatMiddlewareLayer sits above FunctionInvocationLayer in the MRO, so it wraps the entire tool-calling loop as a single call rather than intercepting each individual LLM call within the loop.
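
The effect of layer order in the MRO can be shown with plain Python classes. This is a toy model under stated assumptions: `ToyBaseClient`, `ToyFunctionInvocationLayer`, `ToyChatMiddlewareLayer`, and the two combined client classes are hypothetical stand-ins, and the fixed three-iteration loop replaces real tool calling. It says nothing about whether the framework supports the second ordering.

```python
from typing import List


class ToyBaseClient:
    """Stand-in for the innermost layer that performs one LLM call."""

    def get_response(self, prompt: str, log: List[str]) -> str:
        log.append("llm call")
        return "ok"


class ToyFunctionInvocationLayer(ToyBaseClient):
    """Stand-in for the tool-calling loop: three LLM calls per request."""

    def get_response(self, prompt: str, log: List[str]) -> str:
        result = ""
        for _ in range(3):
            result = super().get_response(prompt, log)
        return result


class ToyChatMiddlewareLayer(ToyBaseClient):
    """Stand-in for middleware that wraps whatever comes next in the MRO."""

    def get_response(self, prompt: str, log: List[str]) -> str:
        log.append("middleware enter")
        result = super().get_response(prompt, log)
        log.append("middleware exit")
        return result


# Middleware listed first: it sits above the loop and wraps it as one call.
class OuterMiddlewareClient(ToyChatMiddlewareLayer, ToyFunctionInvocationLayer):
    pass


# Middleware listed second: the loop calls through it on every iteration.
class InnerMiddlewareClient(ToyFunctionInvocationLayer, ToyChatMiddlewareLayer):
    pass


def trace(client_cls: type) -> List[str]:
    log: List[str] = []
    client_cls().get_response("hi", log)
    return log


if __name__ == "__main__":
    print(trace(OuterMiddlewareClient))
    print(trace(InnerMiddlewareClient))
```

With the middleware above the loop, its enter and exit hooks run once around all three LLM calls, which matches the behavior described in this issue. With the middleware below the loop, they run once per LLM call.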

Code Sample

Language/SDK

Both
