Sequential LlmAgents not returning audio with bidi streaming #2261

@ZachTB123

Describe the bug
I have two separate LlmAgents that are each instructed to say only a specific message. After the first agent completes, I want the next LlmAgent to say its message. When I use either a SequentialAgent or a custom agent to orchestrate this, I don't get an audio response.

To Reproduce
Steps to reproduce the behavior:

  1. Clone: https://github.com/google/adk-docs/tree/main/examples/python/snippets/streaming/adk-streaming-ws/app
  2. Update google_search_agent/agent.py with the following code:
from google.adk.agents import Agent, SequentialAgent

msg1_agent = Agent(
    name="message_agent_1",
    model="gemini-live-2.5-flash",
    description="Agent to say a message to the user.",
    instruction=(
        """
        Your goal is to say a specific message to the user. The message is:

        This is message one.

        Do not say anything else other than the message. You should only
        say the message exactly as it is.
        """
    ),
    include_contents="none"
)

msg2_agent = Agent(
    name="message_agent_2",
    model="gemini-live-2.5-flash",
    description="Agent to say a message to the user.",
    instruction=(
        """
        Your goal is to say a specific message to the user. The message is:

        This is message two.

        Do not say anything else other than the message. You should only
        say the message exactly as it is.
        """
    ),
    include_contents="none"
)

root_agent = SequentialAgent(
    name="MessagesAgent",
    sub_agents=[msg1_agent, msg2_agent],
    description="An agent that plays messages in sequence.",
)
  3. Use the directions here: https://google.github.io/adk-docs/streaming/custom-streaming-ws/#3.-interact-with-your-streaming-app to run the app.
  4. Say "hello".
  5. You should see "This is message one.This is message two." displayed in the page.

Expected behavior
I expect to hear those two messages said back to me as audio. Instead, I only see the text representations.
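
One way to narrow this down is to check whether any audio events leave the runner at all, or whether only text events are produced. The sketch below is a hypothetical helper (`classify_event` is not part of ADK). It assumes events expose `content.parts`, with audio carried in `part.inline_data` under an `audio/...` MIME type and text carried in `part.text`, which is how the streaming sample's server code appears to treat them. It relies on duck typing, so it can be tried without ADK installed.

```python
from types import SimpleNamespace


def classify_event(event) -> str:
    """Return "audio", "text", or "other" for an ADK-style event.

    Assumption: the event has `content.parts`, and each part carries either
    `inline_data` (with a `mime_type`) or `text`. This helper is hypothetical.
    """
    content = getattr(event, "content", None)
    parts = getattr(content, "parts", None) or []
    # Audio takes priority: any part with an audio MIME type marks the event.
    for part in parts:
        inline = getattr(part, "inline_data", None)
        mime = getattr(inline, "mime_type", None) or ""
        if mime.startswith("audio/"):
            return "audio"
    for part in parts:
        if getattr(part, "text", None):
            return "text"
    return "other"


# Stand-in events for illustration only; real events come from the live runner.
audio_event = SimpleNamespace(content=SimpleNamespace(parts=[
    SimpleNamespace(
        inline_data=SimpleNamespace(mime_type="audio/pcm", data=b"\x00"),
        text=None,
    )
]))
text_event = SimpleNamespace(content=SimpleNamespace(parts=[
    SimpleNamespace(inline_data=None, text="This is message one.")
]))
print(classify_event(audio_event), classify_event(text_event))
```

Logging this classification for every event on the server side would show whether the sub-agents emit only text events, or whether audio events are produced but never reach the browser.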

This seems like it is possibly due to the tool call that gets added here. If I instead use a custom agent as follows:

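from typing import AsyncGenerator

from typing_extensions import override

from google.adk.agents import BaseAgent
from google.adk.agents.invocation_context import InvocationContext
from google.adk.events import Event


# msg1_agent and msg2_agent are the two agents defined in agent.py above.
class CustomAgent(BaseAgent):

    def __init__(
        self,
        name: str,
    ):
        super().__init__(
            name=name,
            sub_agents=[msg1_agent, msg2_agent],
        )

    @override
    async def _run_live_impl(
        self, ctx: InvocationContext
    ) -> AsyncGenerator[Event, None]:
        # Stream every event from the first agent's live run.
        async for event in msg1_agent.run_live(ctx):
            yield event

        # Then start the second agent once the first generator is exhausted.
        async for event in msg2_agent.run_live(ctx):
            yield event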

I do hear the first message played back to me, but not the second. As soon as I add tool calls and update the instructions on my LlmAgents to match those added by the SequentialAgent, I no longer get audio, but both messages are displayed as text on the page.

Desktop:

  • OS: mac
  • Python version(python -V): 3.13.3
  • ADK version(pip show google-adk): 1.5.0

Model Information:

gemini-live-2.5-flash

Additional context

My goal is to create a custom agent that implements our deterministic requirements. Sometimes we may have to play multiple messages in a row. We will only know if we have another message to play after the previous has finished.
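
The control flow described here can be sketched without ADK as a chain of async generators, where the next message is chosen only after the previous one has been fully streamed. Everything in this sketch is hypothetical: `play_messages`, `fake_speak`, and `fake_next` are illustrative names, `fake_speak` stands in for a sub-agent's `run_live(ctx)`, and the sketch assumes each speaker's generator actually finishes. Whether `run_live` terminates after a single message in a live session is not confirmed here.

```python
import asyncio
from typing import AsyncGenerator, Awaitable, Callable, Optional

Speaker = Callable[[str], AsyncGenerator[str, None]]
NextMessage = Callable[[Optional[str]], Awaitable[Optional[str]]]


async def play_messages(
    speak: Speaker, next_message: NextMessage
) -> AsyncGenerator[str, None]:
    """Yield events for one message at a time until no message remains."""
    previous: Optional[str] = None
    while True:
        # Decide the next message only after the previous one has fully played.
        message = await next_message(previous)
        if message is None:
            return
        async for event in speak(message):
            yield event
        previous = message


async def fake_speak(message: str) -> AsyncGenerator[str, None]:
    # Stand-in for a sub-agent's live run; yields one "event" per word.
    for word in message.split():
        yield word


async def fake_next(previous: Optional[str]) -> Optional[str]:
    # Stand-in for the deterministic business logic that picks the next message.
    follow_ups = {
        None: "This is message one.",
        "This is message one.": "This is message two.",
    }
    return follow_ups.get(previous)


async def collect() -> list[str]:
    return [event async for event in play_messages(fake_speak, fake_next)]


events = asyncio.run(collect())
print(events)
```

In a real custom agent, the body of `play_messages` would live in `_run_live_impl`, and the same caveat from the report applies: the second speaker only starts once the first generator is exhausted.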

Labels

live: [Component] This issue is related to live, voice and video chat
