Python: Orchestration output ADR #4799
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,186 @@ | ||
| --- | ||
| status: proposed | ||
| contact: taochen | ||
| date: 2026-03-19 | ||
| deciders: | ||
| consulted: | ||
| informed: | ||
| --- | ||
|
|
||
| # Orchestration Run Output Types | ||
|
|
||
| > Note: this document only applies to Python. .NET is out of scope for now since it does not have the same orchestration patterns or output model. | ||
| ## Context and Problem Statement | ||
|
|
||
| Python orchestrations (Concurrent, Sequential, Handoff, GroupChat, Magentic) currently all yield `list[Message]` — typically the full conversation history — as their final output via `ctx.yield_output(...)`. This creates several problems: | ||
|
|
||
| 1. **The final output is semantically wrong.** Dumping the entire conversation into the output means consumers receive every intermediate message (user prompts, multi-round exchanges) rather than the orchestration's actual "answer." For example, a Sequential orchestration with three agents returns all messages from all three agents, even though only the last agent's response is the meaningful result. | ||
|
|
||
| 2. **No clean distinction between intermediate and final outputs.** When `intermediate_outputs=True`, orchestrations surface `AgentResponse` / `AgentResponseUpdate` from individual agents as they run. The final output is also surfaced as an output event with the same `type='output'`. While callers could in theory distinguish them by inspecting the data type (e.g., intermediate outputs are `AgentResponse` while the final output is `list[Message]`), this is a fragile model — it couples control flow semantics to data representation and requires callers to know the internal output type of each orchestration pattern. More importantly, consumers like `WorkflowAgent` via `as_agent()` do not make this distinction: they convert all output events into the agent's response regardless of whether they are intermediate or final, producing a response that mixes progress updates with the actual answer. | ||
|
|
||
| 3. **Inconsistent output types across orchestrations.** Most orchestrations yield `list[Message]`, but the content semantics vary wildly: Concurrent yields `[user_prompt, agent1_reply, agent2_reply, ...]`, Sequential yields the full chain, GroupChat/Magentic yield all rounds plus a completion message. Handoff yields `list[Message]` representing the full conversation at two different call sites. There is no unified contract for what a caller should expect. | ||
|
|
||
| ## Orchestrations as Prebuilt Workflow Patterns | ||
|
|
||
| Orchestrations (Concurrent, Sequential, Handoff, GroupChat, Magentic) are not standalone features — they are prebuilt workflow patterns built on top of the workflow system APIs. They serve as both ready-to-use solutions and as reference implementations that demonstrate how to correctly compose agents using the workflow primitives (`Executor`, `WorkflowBuilder`, `yield_output`, `as_agent()`, etc.). | ||
|
|
||
| This dual role makes orchestrations critically important: the patterns they establish become the patterns that developers follow when building their own workflows. In practice, developers use the framework to build workflows that coordinate Foundry agents, and ultimately deploy those workflows as hosted agents on Azure AI Foundry. This path — from workflow definition to agent deployment — relies on a seamless integration between workflows and agents. If orchestrations model this integration poorly (e.g., producing outputs that don't compose cleanly with `as_agent()`), developers building custom workflows will inherit the same problems. | ||
TaoChenOSU marked this conversation as resolved.
|
||
|
|
||
| Getting the output contract right in orchestrations therefore has implications beyond the orchestrations themselves. It sets the standard for how any workflow should produce its final result, how that result flows through `as_agent()` to become an agent response, and how sub-workflows signal completion to their parent workflows. | ||
|
|
||
| ### Usage Scenarios | ||
|
|
||
| Orchestrations are used in three primary ways, each with different output expectations: | ||
|
|
||
| #### 1. As Workflows (Basic Usage) | ||
|
|
||
| The most common scenario. The caller runs the workflow and iterates over events: | ||
|
|
||
| ```python | ||
| workflow = SequentialBuilder(participants=[agent1, agent2, agent3]).build() | ||
| events = await workflow.run(message="Write a report") | ||
|
Contributor
Isn't this missing a |
||
| for event in events: | ||
| if event.type == "output": | ||
| # Currently: event.data is either AgentResponse or list[Message] with ALL messages from ALL agents | ||
| pass | ||
| ``` | ||
|
|
||
| When `intermediate_outputs=True`, the caller receives both intermediate agent outputs and the final output. While the data types differ (intermediate outputs are `AgentResponse` / `AgentResponseUpdate`, the final output is `list[Message]`), relying on type inspection to distinguish them is fragile and requires knowledge of each orchestration's internal output types. There is no explicit signal that says "the orchestration is done." | ||
|
|
||
| #### 2. As Agents via `as_agent()` | ||
|
|
||
| Orchestrations can be wrapped as agents using `workflow.as_agent()`. The `WorkflowAgent` processes workflow output events differently depending on the mode: | ||
|
|
||
| - **Non-streaming**: Collects all output events, then merges their data into a single `AgentResponse`. The full conversation dump from the orchestration's final output becomes `AgentResponse.messages` alongside any intermediate agent responses — producing a response that conflates progress with the actual answer. | ||
|
Contributor
This says when
Which approach? Buffering means memory overhead for long-running orchestrations. Filtering means you lose the ability to retroactively include intermediate outputs.
Contributor
Author
I am not sure if I understand the question. This line is a description of the current behavior, not the proposal. In the proposed solution, the
Member
This is what happens in single agents as well: there might be multiple turns and tool calls, and you wait until all of that is done, and then you return the full set of Messages in one AgentResponse. |
||
| - **Streaming**: Converts each output event into `AgentResponseUpdate` objects and yields them as they arrive. All updates — whether from intermediate agents or the final conversation dump — are yielded indiscriminately as streaming chunks. | ||
|
Member
This is also the same: regardless of where the Update originates, the user gets it back (there are really only 2 places in single agents, but still) |
||
|
|
||
| In both modes, `WorkflowAgent` processes all output events without distinguishing intermediate from final. When `intermediate_outputs=True`, this means intermediate agent responses and the final conversation dump are merged together. Even when `intermediate_outputs=False`, the final output is still the full conversation rather than the meaningful answer. | ||
|
Contributor
Is this true for single agents? I only get back one AgentResponse with one content that is the final answer?
Member
No, in a single agent, you get all contents from all model calls/tool results in a single AgentResponse (through get_final_response when streaming)
Contributor
Author
For workflows, you will still receive intermediate works by setting |
||
|
|
||
| #### 3. As Sub-Workflows | ||
|
|
||
| When an orchestration is embedded within a parent workflow, downstream executors receive output events from the orchestration. They must be able to: | ||
|
|
||
| - Process intermediate outputs (streaming chunks or agent responses) for real-time updates | ||
| - Identify when the orchestration has produced its final result | ||
| - Extract the meaningful answer from the final output | ||
|
|
||
| Currently, there is no mechanism to distinguish final from intermediate output events. | ||
|
|
||
| ### Current State | ||
|
|
||
| | Orchestration | Final Output | What It Contains | | ||
| |---|---|---| | ||
| | **Concurrent** | `list[Message]` | User prompt + final assistant message from each agent | | ||
| | **Sequential** | `list[Message]` | Full conversation chain from all agents | | ||
| | **Handoff** | `list[Message]` | Full conversation history | | ||
| | **GroupChat** | `list[Message]` | All rounds + completion message | | ||
| | **Magentic** | `list[Message]` | Chat history + final answer | | ||
|
|
||
| ## Decision Drivers | ||
|
|
||
| - The final output of an orchestration should be semantically meaningful — it should represent the orchestration's "answer," not a conversation dump. | ||
TaoChenOSU marked this conversation as resolved.
|
||
| - Consumers must be able to distinguish the orchestration's final output from intermediate progress updates without relying on data type inspection or positional heuristics. | ||
|
Member
The problem with this is deciding what the final answer is; if multiple agents can return the final answer, then this is hard anyway. |
||
| - The output type should be `AgentResponse` so that orchestrations compose naturally with the agent system — particularly `as_agent()` and nested agent patterns. | ||
|
Member
Why? A Workflow is not an Agent, unless you use
Contributor
Author
We are not talking about generic workflows here. Orchestrations are designed to work with agents, so it makes sense for these orchestrations to behave similarly to agents. However, these aren't the only ways to build workflows for multiple agents. Currently, we can't force output types on workflows. |
||
| - Different orchestration patterns have fundamentally different semantics for what constitutes the "answer," and the solution must accommodate this. | ||
|
|
||
| ## Considered Options | ||
|
|
||
| ### Option 1: Use data type as the discriminator | ||
|
|
||
| Introduce a wrapper type (e.g., `OrchestrationResult`) for the final output. Consumers check `isinstance(event.data, OrchestrationResult)` to identify the final output. | ||
|
|
||
| - Pro: No changes to the event system are needed. | ||
| - Con: | ||
| - Introduces a new type that callers must know about and unwrap. | ||
| - Conflates data representation with control flow semantics. | ||
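A minimal sketch of what Option 1 would look like from the consumer's side. The `Event` class is a stand-in for illustration, not the framework's event type, and the shape of `OrchestrationResult` is assumed since the ADR only names it. It shows the unwrapping step every caller would need:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class OrchestrationResult:
    """Hypothetical wrapper that marks a payload as the final output."""
    value: Any


@dataclass
class Event:
    """Stand-in for a workflow output event (not the framework's class)."""
    type: str
    data: Any


def final_output(events: list[Event]) -> Optional[Any]:
    """Return the unwrapped final payload, or None if no wrapper was seen."""
    for event in events:
        # The caller must know about the wrapper type and unwrap it.
        if event.type == "output" and isinstance(event.data, OrchestrationResult):
            return event.data.value
    return None


events = [
    Event("output", "draft from agent1"),
    Event("output", OrchestrationResult("final report")),
]
```

Here the control-flow signal ("this is the final output") lives inside the data itself, which is the coupling the second con describes.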
|
|
||
| ### Option 2: Add a new event type | ||
|
|
||
| Add a `"run_output"` event type alongside the existing `"output"` type. | ||
|
|
||
| - Pro: The distinction is clear and explicit. | ||
| - Con: | ||
| - Adds a new concept to the workflow framework (`run_output` vs `output`). | ||
| - `WorkflowAgent`, sub-workflow consumers, and event processing logic all need to handle two output event types. | ||
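As a rough illustration of Option 2, the sketch below uses a stand-in `Event` class (an assumption, not the framework's type) with the proposed `"run_output"` type next to the existing `"output"` type. Every consumer ends up branching on two event types:

```python
from dataclasses import dataclass
from typing import Any, Literal


@dataclass
class Event:
    """Stand-in event; "run_output" is the hypothetical new type."""
    type: Literal["output", "run_output"]
    data: Any


def split_outputs(events: list[Event]) -> tuple[list[Any], list[Any]]:
    """Separate intermediate payloads from final ones by event type."""
    intermediate = [e.data for e in events if e.type == "output"]
    final = [e.data for e in events if e.type == "run_output"]
    return intermediate, final
```

Code that today only matches on `"output"` would silently miss the final result until it is updated to handle the second type.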
|
|
||
| ### Option 3: Add `is_run_completed` flag to existing output event | ||
|
Contributor
When an Currently
Contributor
Author
Right, the agent executors don't know. Only the author of the workflow knows what makes up the run output. Our orchestration layer doesn't use the
There isn't an orchestration layer. The orchestrations are just like any workflow. When an
Member
nit: but is
Contributor
Author
Could you elaborate? |
||
|
|
||
| Add an optional `is_run_completed: bool` parameter to the existing `yield_output()` method and `WorkflowEvent`: | ||
|
Contributor
I'd say that nothing in the framework would enforce that exactly one event has
Member
I agree, this seems like a recipe for mistakes
Contributor
Author
Yeah, we can create a warning if there is more than one output event with |
||
|
|
||
| ```python | ||
| # WorkflowContext — existing API, new optional param | ||
| async def yield_output(self, output: W_OutT, *, is_run_completed: bool = False) -> None: | ||
| ... | ||
|
|
||
| # WorkflowEvent.output factory — new optional param | ||
| @classmethod | ||
| def output(cls, executor_id: str, data: DataT, *, is_run_completed: bool = False) -> WorkflowEvent[DataT]: | ||
| ... | ||
| ``` | ||
|
|
||
| - Pro: | ||
| - Minimal, backward-compatible extension of the existing API. | ||
| - The flag is on the event, not the data — separating control flow from representation. | ||
| - Consumers can simply check `event.is_run_completed` without knowing about special types. | ||
| - Con: | ||
| - Adds a new concept to the workflow framework, but it's a simple boolean flag rather than a whole new event type. | ||
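A consumer-side sketch of Option 3, assuming the flag is exposed as an attribute on the output event. `OutputEvent` below is a stand-in for `WorkflowEvent`, not the real class:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class OutputEvent:
    """Stand-in for WorkflowEvent carrying the proposed flag."""
    executor_id: str
    data: Any
    is_run_completed: bool = False
    type: str = "output"


def collect(events: list[OutputEvent]) -> tuple[list[Any], Any]:
    """Split payloads into intermediate progress and the final result."""
    progress: list[Any] = []
    final = None
    for event in events:
        if event.type != "output":
            continue
        if event.is_run_completed:
            final = event.data  # final result of the run
        else:
            progress.append(event.data)  # intermediate update
    return progress, final
```

The consumer never inspects the type of `event.data`, so the same loop works for every orchestration pattern.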
|
|
||
| ## Definition of a Run | ||
|
|
||
| A **run** represents a single invocation of the workflow — from receiving an initial request to the workflow returning to idle status. The `is_run_completed` flag on an output event signals that this output represents the final result of the current run. | ||
|
Member
How would this work? What if multiple executors set this at the same time?
Contributor
Author
This is just like outputs. Any executor in the workflow can create outputs. We can't prevent this because we don't know the internals of executors. Similar to the comment above, we can create a warning when we see two output events with |
||
|
|
||
| Important considerations: | ||
|
|
||
| - A workflow going back to **idle** status after processing a request typically means a run has completed. All orchestration patterns emit an output event with `is_run_completed=True` when their run finishes. In some cases (e.g., Handoff), the completion event may carry no data — it serves purely as a signal that the run is done. | ||
|
Contributor
We acknowledge that the Handoff pattern emits an empty |
||
| - A workflow entering **idle with pending requests** (e.g., waiting for human-in-the-loop approval) does **not** mean the run has completed. Rather, the run is suspended and will resume when the pending request is fulfilled. The `is_run_completed` flag should not be set on outputs emitted before or during a pending request pause. | ||
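Nothing in the proposal enforces that a run produces exactly one completion event. The review discussion suggests emitting a warning when more than one is seen. A hypothetical sketch of such a check follows; the function name and the `OutputEvent` stand-in are assumptions, not framework APIs:

```python
import warnings
from dataclasses import dataclass
from typing import Any


@dataclass
class OutputEvent:
    """Stand-in output event carrying the proposed flag."""
    data: Any
    is_run_completed: bool = False


def check_single_completion(events: list[OutputEvent]) -> int:
    """Count completion events in one run and warn if there is more than one."""
    count = sum(1 for e in events if e.is_run_completed)
    if count > 1:
        warnings.warn(
            f"{count} output events were marked is_run_completed=True; "
            "expected at most one per run",
            stacklevel=2,
        )
    return count
```

A count of zero is not treated as an error here, because a run that is suspended on a pending request has not completed yet.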
|
|
||
| ## Decision Outcome | ||
|
|
||
| Chosen option: **Option 3 — Add `is_run_completed` flag to existing output event**, because it is the most minimal and backward-compatible approach. It does not introduce new types or event categories, and the semantic intent is clear: `is_run_completed=True` means "this output represents the final result of the current run." | ||
|
|
||
| ### Per-Orchestration Output Changes | ||
|
|
||
| Each orchestration pattern changes what data it yields as the final output and sets `is_run_completed=True`: | ||
|
|
||
| | Orchestration | Current Final Output | New Final Output | Rationale | | ||
|
Contributor
Isn't there a breaking change here?
Member
Fundamentally it does not make sense to me that a Workflow defaults to returning an AgentResponse
Contributor
Author
Will mark this as breaking. And to @eavanvalkenburg's comment, we don't force the workflow to output anything. The orchestrations are just a way to build workflows, and we think it makes sense to have the orchestrations to output Workflows don't return anything. It's event based. A workflow can generate output events containing anything. |
||
| |---|---|---|---| | ||
| | **Concurrent** | `list[Message]` (user prompt + one reply per agent) | `AgentResponse` containing all sub-agent response messages | The combined responses from all parallel agents represent the orchestration's answer. Messages are copied from each sub-agent's `AgentResponse`. | | ||
|
Contributor
What happens when Here it says the final output is an
Contributor
Author
I think this makes sense, perhaps I need to reword this a bit. Concurrent allows custom aggregation strategies so not every concurrent will simply aggregate the messages into a list. Of course, the default one does simple aggregation. We can use the workflow name as the agent response in that case, and for the individual messages, they will be put into the list unchanged. A different container also makes sense. There is no correct answer here. I think AgentResponse is a natural integration, the prerequisite is that users need to know that they are running a concurrent orchestration. |
||
| | **Sequential** | `list[Message]` (full conversation chain) | `AgentResponse` from the last agent | The last agent in the chain produces the final answer. Earlier agents' outputs are intermediate steps. | | ||
| | **Handoff** | `list[Message]` (full conversation) | Empty output event with `is_run_completed=True` | Handoff emits agent responses as they become available (each agent's response is surfaced as an intermediate output). Since agents in handoff workflows are not sub-agents of a central orchestrator, all outputs are directly emitted — there is no separate "final answer." An empty completion event is emitted so consumers have a consistent signal that the run has finished. | | ||
| | **GroupChat** | `list[Message]` (all rounds + completion message) | `AgentResponse` containing the summary or completion message | The orchestrator's summary/end message is the meaningful result. Individual round messages are intermediate outputs visible when `intermediate_outputs=True`. | | ||
| | **Magentic** | `list[Message]` (chat history + final answer) | `AgentResponse` containing the synthesized final answer | The manager's synthesized final answer is the meaningful result. Individual agent work is intermediate. | | ||
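To make the Sequential and Concurrent rows concrete, here is a small sketch with stand-in `Message` and `AgentResponse` dataclasses. These are simplified assumptions, not the framework's types, and the Concurrent case models only the default aggregation:

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    """Simplified stand-in for a chat message."""
    role: str
    text: str


@dataclass
class AgentResponse:
    """Simplified stand-in for an agent response."""
    messages: list[Message] = field(default_factory=list)


def sequential_final(responses: list[AgentResponse]) -> AgentResponse:
    """Sequential: the last agent's response is the final answer."""
    return responses[-1]


def concurrent_final(responses: list[AgentResponse]) -> AgentResponse:
    """Concurrent: copy every sub-agent's messages into one response."""
    return AgentResponse(messages=[m for r in responses for m in r.messages])


r1 = AgentResponse([Message("assistant", "outline")])
r2 = AgentResponse([Message("assistant", "report")])
```

In both cases the user prompt and earlier turns no longer appear in the final output; they remain available as intermediate outputs.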
|
|
||
| ### Integration Points | ||
|
Contributor
If all orchestrations change their output from
Contributor
Author
We are not GAing orchestrations.
Contributor
The intent is to GA orchestrations, please check with Shawn. Even if that doesn't mean the same day as core GA, it will fast-follow. |
||
|
|
||
| #### WorkflowAgent (`as_agent()`) | ||
|
|
||
| When `WorkflowAgent` converts workflow events to an `AgentResponse`: | ||
|
|
||
| - Events with `is_run_completed=True` provide the `AgentResponse` that becomes the agent's response directly, with the name of the workflow as the author of the response. | ||
|
Member
This would make this inconsistent with single agents, because there you always get all intermediate work in an AgentResponse
Contributor
Author
You can still get intermediate work too with workflows as agents. |
||
| - Events with `is_run_completed=False` are intermediate updates — they are included as streaming updates when `stream=True`, or merged into the response in non-streaming mode. | ||
| - When `intermediate_outputs=False` (recommended for agent usage), only the `is_run_completed=True` event is surfaced, producing a clean agent response. | ||
|
|
||
| > `intermediate_outputs` is always `True` for `handoff` since it has no single final answer — all agent responses are surfaced as intermediate outputs, and the completion event is empty. | ||
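The non-streaming conversion described above could be sketched roughly as follows. This is not the `WorkflowAgent` implementation: the classes are stand-ins, the `author` field is hypothetical, and `intermediate_outputs` is modeled as a function argument only for illustration (in the framework it is set on the orchestration builder):

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AgentResponse:
    """Stand-in response; `author` is a hypothetical field."""
    messages: list[str] = field(default_factory=list)
    author: Optional[str] = None


@dataclass
class OutputEvent:
    """Stand-in output event carrying the proposed flag."""
    data: Optional[AgentResponse]
    is_run_completed: bool = False


def to_agent_response(
    events: list[OutputEvent],
    workflow_name: str,
    intermediate_outputs: bool = False,
) -> AgentResponse:
    """Merge output events into one response attributed to the workflow."""
    messages: list[str] = []
    for event in events:
        if event.data is None:
            continue  # empty completion event, as in Handoff
        if event.is_run_completed or intermediate_outputs:
            messages.extend(event.data.messages)
    return AgentResponse(messages=messages, author=workflow_name)
```

With `intermediate_outputs=False` the result contains only the final answer; with `True` the intermediate messages precede it in arrival order.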
| #### Sub-Workflows | ||
|
|
||
| Downstream executors in a parent workflow can check `event.is_run_completed` to determine if the orchestration has produced its final answer: | ||
|
|
||
| - `is_run_completed == False` → intermediate progress (streaming chunk, individual agent response) | ||
| - `is_run_completed == True` → the orchestration is done; `event.data` contains the final `AgentResponse` (or empty for handoff) and the executor can proceed using the answer or assemble the received data as needed. | ||
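A downstream executor's handling of these two cases might look like the sketch below. `OutputEvent` and `resolve_answer` are illustrative names, not framework APIs. It uses the final payload when present and falls back to the data received so far when the completion event is empty, as with Handoff:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class OutputEvent:
    """Stand-in output event carrying the proposed flag."""
    data: Any
    is_run_completed: bool = False


def resolve_answer(events: list[OutputEvent]) -> Any:
    """Return the final answer, or the assembled intermediate data if empty."""
    received: list[Any] = []
    for event in events:
        if not event.is_run_completed:
            received.append(event.data)  # intermediate progress
            continue
        # Completion event: use its payload, or what was received so far.
        return event.data if event.data is not None else received
    return None  # no completion event seen, so the run has not finished
```

Returning `None` when no completion event has arrived lets the caller tell an unfinished or suspended run apart from a finished one.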
|
|
||
| ### Consequences | ||
|
|
||
| - Pro: | ||
| - The final output is now semantically meaningful — consumers get the "answer" rather than a conversation dump. | ||
| - The `is_run_completed` flag provides a clear, type-agnostic signal for consumers to identify completion. | ||
| - `AgentResponse` as the output type means orchestrations compose naturally with the agent system. | ||
| - The `is_run_completed` flag itself is a backward-compatible addition: existing code that doesn't check it continues to run. The output type change is breaking, however, because such code now receives `AgentResponse` instead of `list[Message]`. | ||
| - Con: Workflow executors must remember to set `is_run_completed=True` on their final yield when appropriate. | ||
| - Neutral: The Handoff pattern emits an empty completion event (no data, just the flag) since it has no single "final answer." Consumers must handle the case where `is_run_completed=True` but `event.data` is empty. | ||
|
|
||
| ## More Information | ||
|
|
||
| - See [ADR-0001: Agent Run Responses Design](0001-agent-run-response.md) for the foundational design of `AgentResponse`, primary vs secondary output, and the streaming model. | ||
| - The `intermediate_outputs` parameter on orchestration builders controls whether intermediate agent outputs are surfaced. When `False` (default), only outputs from designated output executors are visible. The `is_run_completed` flag adds a second dimension: even among visible outputs, only those marked `is_run_completed=True` represent the orchestration's final answer. | ||