
Google TTS Plugin Timeout with Gemini Model & Chirp_3 Streaming Error (livekit-plugins-google==1.4.4) #5117

@sahil-axl

Bug Description

Hi Team,

I'm facing an issue with the Google TTS plugin while using LiveKit Agents.

I am currently using:

livekit-plugins-google==1.4.4

Issue 1 – chirp_3 Model Not Working with Streaming

When I try to use the chirp_3 model for TTS, I receive the following error:

{"message": "failed to synthesize speech: Streaming for this Gemini-TTS model is currently not supported. Please use another Gemini-TTS model or Chirp3: HD voices for streaming., retrying in 0.1s",
"level": "WARNING",
"name": "livekit.agents",
"tts": "livekit.plugins.google.tts.TTS",
"attempt": 1,
"streamed": true,
"timestamp": "2026-03-13T10:47:24.540323+00:00"}

The message says that streaming is not supported for this model, yet it also recommends Chirp3: HD voices for streaming. That is confusing, because I get this error while using the chirp_3 model.
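
One possible explanation, offered as a guess rather than a confirmed diagnosis: Google Cloud lists Chirp 3: HD voices under full names of the form `<locale>-Chirp3-HD-<voice>` (for example `en-US-Chirp3-HD-Achernar`), while the short name `Achernar` on its own is also a Gemini-TTS voice name, so the request may be treated as Gemini-TTS. The sketch below only builds such a full name. The helper `chirp3_hd_voice_name` is hypothetical and not part of the plugin, and I have not verified that passing the full name as `voice_name` avoids the error.

```python
def chirp3_hd_voice_name(locale: str, voice: str) -> str:
    """Build a full Chirp 3: HD voice name, e.g. 'en-US-Chirp3-HD-Achernar'.

    Assumption: the '<locale>-Chirp3-HD-<voice>' naming convention applies
    to the voice in question.
    """
    if not locale or not voice:
        raise ValueError("locale and voice must be non-empty")
    return f"{locale}-Chirp3-HD-{voice}"


# Hypothetical usage: derive the full voice name from the values in this report.
full_name = chirp3_hd_voice_name("en-US", "Achernar")
print(full_name)
```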

Issue 2 – Gemini Default Model Random Timeout

If I switch to the default Gemini TTS model, the agent initially starts speaking correctly. However, after some time, it randomly fails with a timeout error:

{"message": "failed to synthesize speech, retrying in 0.1s
Traceback (most recent call last):
File "/app/.venv/lib/python3.12/site-packages/livekit/agents/tts/tts.py", line 456, in _main_task
await self._run(output_emitter)
File "/app/.venv/lib/python3.12/site-packages/livekit/plugins/google/tts.py", line 398, in _run
await asyncio.gather(*tasks)
File "/app/.venv/lib/python3.12/site-packages/livekit/plugins/google/tts.py", line 391, in _run_segments
await self._run_stream(input_stream, output_emitter, streaming_config)
File "/app/.venv/lib/python3.12/site-packages/livekit/plugins/google/tts.py", line 443, in _run_stream
raise APITimeoutError() from None
livekit.agents._exceptions.APITimeoutError: Request timed out. (body=None, retryable=True)",
"level": "WARNING",
"name": "livekit.agents",
"tts": "livekit.plugins.google.tts.TTS",
"attempt": 1,
"streamed": true
}
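
The traceback ends in `raise APITimeoutError() from None` inside `_run_stream`, which suggests that a read on the streaming response exceeded some timeout. The sketch below illustrates that general pattern with plain asyncio so the failure mode is easier to discuss. It is not the plugin's code, and `StreamTimeoutError`, `read_with_timeout`, and `slow_stream` are hypothetical names.

```python
import asyncio
from collections.abc import AsyncIterator


class StreamTimeoutError(Exception):
    """Hypothetical stand-in for the APITimeoutError seen in the traceback."""


async def read_with_timeout(stream: AsyncIterator[bytes], timeout: float) -> list[bytes]:
    """Collect chunks, failing if any single chunk takes longer than `timeout` seconds."""
    chunks: list[bytes] = []
    iterator = stream.__aiter__()
    while True:
        try:
            # Each read gets its own deadline, so one stalled chunk fails the stream.
            chunk = await asyncio.wait_for(iterator.__anext__(), timeout)
        except StopAsyncIteration:
            return chunks
        except asyncio.TimeoutError:
            raise StreamTimeoutError("no audio received in time") from None
        chunks.append(chunk)


async def slow_stream(delays: list[float]) -> AsyncIterator[bytes]:
    """Fake audio stream that waits `delay` seconds before yielding each chunk."""
    for delay in delays:
        await asyncio.sleep(delay)
        yield b"audio"
```

With this pattern, a stream that starts fine but stalls on a later chunk fails only at that point, which would match an agent that speaks correctly for a while and then times out.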

Expected Behavior

The agent should speak seamlessly, without timeouts, when using the default Gemini model, and the chirp_3 model should also work. Please correct me if I'm wrong.

Reproduction Steps

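# Imports this excerpt relies on (inferred from the names used below)
import json
import os

from google.cloud import texttospeech
from livekit.agents import AgentSession, RoomInputOptions
from livekit.plugins import deepgram, google, noise_cancellation, silero
from livekit.plugins.openai import LLM

# The rest runs inside the agent's async entrypoint, where `ctx` and `agent` are defined.
    # -------------------------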
    # STT CONFIG
    # -------------------------
    stt = deepgram.STT(
        api_key=os.getenv("DEEPGRAM_API_KEY"),
        model="nova-3",
        language="en-US",
        smart_format=True,
        numerals=True
    )

    # -------------------------
    # LLM CONFIG
    # -------------------------
    llm = LLM.with_azure(
        azure_deployment=os.getenv("AZURE_OPENAI_MODEL"),
        azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
        api_key=os.getenv("AZURE_OPENAI_API_KEY"),
        api_version="2025-01-01-preview",
        temperature=0.3
    )

    # -------------------------
    # GOOGLE TTS CONFIG
    # -------------------------
    credentials_info = json.loads(os.getenv("GOOGLE_APPLICATION_CREDENTIALS"))

    tts = google.TTS(
        model_name="chirp_3",
        language="en-US",
        credentials_info=credentials_info,
        voice_name="Achernar",
        audio_encoding=texttospeech.AudioEncoding.PCM,
        use_streaming=True,
        location="eu"
    )

    # -------------------------
    # AGENT SESSION
    # -------------------------
    session = AgentSession(
        stt=stt,
        llm=llm,
        tts=tts,
        vad=silero.VAD.load(min_speech_duration=0.3),
        turn_detection=None,
        min_interruption_words=2,
        user_away_timeout=60,
    )

    await session.start(
        agent=agent,
        room=ctx.room,
        room_input_options=RoomInputOptions(
            noise_cancellation=noise_cancellation.NC(),
        ),
    )
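
Note that the snippet above calls `json.loads` on `GOOGLE_APPLICATION_CREDENTIALS`, so it assumes the variable holds the service-account JSON inline rather than a file path. A minimal sketch of a loader that fails with a clearer message when that assumption does not hold is shown below. The helper name `load_credentials_info` is hypothetical and is not part of LiveKit or the Google client libraries.

```python
import json
import os


def load_credentials_info(env_var: str = "GOOGLE_APPLICATION_CREDENTIALS") -> dict:
    """Return the service-account JSON stored inline in `env_var` as a dict."""
    raw = os.getenv(env_var)
    if not raw:
        raise RuntimeError(f"{env_var} is not set")
    try:
        info = json.loads(raw)
    except json.JSONDecodeError as exc:
        # A file path such as /secrets/key.json ends up here.
        raise RuntimeError(
            f"{env_var} does not contain inline JSON (is it a file path?)"
        ) from exc
    if not isinstance(info, dict):
        raise RuntimeError(f"{env_var} must hold a JSON object")
    return info
```

In the repro, `credentials_info = load_credentials_info()` would replace the direct `json.loads(os.getenv(...))` call.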

Operating System

Tried on both Windows and Linux

Models Used

Deepgram nova-3, Azure OpenAI GPT-4.1 mini, Google TTS (default Gemini model)

Package Versions

"livekit==1.1.2",
"livekit-api==1.1.0",
"livekit-agents[cartesia]==1.4.4",
"livekit-plugins-openai==1.4.4",
"livekit-plugins-deepgram==1.4.4",
"livekit-plugins-azure==1.4.4",
"livekit-plugins-elevenlabs==1.4.4",
"livekit-plugins-silero==1.4.4",
"livekit-plugins-turn-detector==1.4.4",
"livekit-plugins-langchain==1.4.4",
"livekit-plugins-neuphonic==1.4.4",
"livekit-plugins-google==1.4.4",
"livekit-plugins-noise-cancellation==0.2.5",
"livekit-protocol==1.1.2",

Session/Room/Call IDs

No response

Proposed Solution

Additional Context

No response

Screenshots and Recordings

No response

Labels

bug: Something isn't working
