feat(soniox): add real-time translation support and rewrite SpeechStream by MSameerAbbas · Pull Request #5111 · livekit/agents

MSameerAbbas · 2026-03-15T18:44:10Z

Summary

Adds real-time translation support to the Soniox STT plugin (#4943), along with a cleanup of SpeechStream to align with the patterns used by other plugins like Deepgram.

New features

Real-time translation (one-way and two-way) via TranslationConfig dataclass. Translations surface as alternatives[1] on SpeechEvent, fully backward-compatible since all consumers only read alternatives[0].
max_endpoint_delay_ms parameter (500-3000ms) for tuning endpoint detection latency.
models.py with Literal type aliases (SonioxRTModels, SonioxLanguages) for IDE autocomplete -- follows the same pattern as the Google STT plugin.
Flush sentinel handling: _FlushSentinel is now mapped to Soniox's documented end-of-stream signal for clean session shutdown. Previously it was not handled.
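For readers who want a concrete picture of the new options, here is a minimal sketch of what a translation config and the endpoint-delay bound could look like. The class name `TranslationConfigSketch`, its field names, and the helper `validate_endpoint_delay` are assumptions for illustration, not the plugin's actual definitions. Whether the plugin rejects or clamps out-of-range values is not stated in this PR; the sketch rejects them.

```python
from dataclasses import dataclass
from typing import Literal, Optional

# Hypothetical alias in the spirit of the Literal aliases added in models.py.
TranslationType = Literal["one_way", "two_way"]

# Range taken from the PR description (500-3000 ms).
MIN_ENDPOINT_DELAY_MS = 500
MAX_ENDPOINT_DELAY_MS = 3000


@dataclass
class TranslationConfigSketch:
    """Illustrative stand-in; the real TranslationConfig fields may differ."""

    type: TranslationType
    target_language: Optional[str] = None  # assumed field for one-way translation
    language_a: Optional[str] = None       # assumed fields for two-way translation
    language_b: Optional[str] = None

    def __post_init__(self) -> None:
        # Each mode needs its own language fields to be set.
        if self.type == "one_way" and not self.target_language:
            raise ValueError("one_way translation needs target_language")
        if self.type == "two_way" and not (self.language_a and self.language_b):
            raise ValueError("two_way translation needs language_a and language_b")


def validate_endpoint_delay(ms: int) -> int:
    """Return ms unchanged if it lies in the documented range, else raise."""
    if not MIN_ENDPOINT_DELAY_MS <= ms <= MAX_ENDPOINT_DELAY_MS:
        raise ValueError(
            f"max_endpoint_delay_ms must be between {MIN_ENDPOINT_DELAY_MS} "
            f"and {MAX_ENDPOINT_DELAY_MS}, got {ms}"
        )
    return ms
```

With these assumed names, a two-way en/ur setup would be `TranslationConfigSketch(type="two_way", language_a="en", language_b="ur")`.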

SpeechStream cleanup

While adding translation, I noticed a few things in SpeechStream that could be simplified to match how other plugins (Deepgram, Google) structure their streaming:

Simplified connection lifecycle: Consolidated into a single _run() that connects, runs tasks, and cleans up. The base class _main_task() already handles retry logic, so the plugin doesn't need its own retry loop.
Reduced task count (4 -> 3): The intermediate audio_queue between _prepare_audio_task and _send_audio_task was consolidated into a single _send_task that reads _input_ch directly.
ws as parameter: Subtasks receive the WebSocket as a parameter rather than reading self._ws, similar to how the Deepgram plugin passes connection state.
Error propagation: Server errors now raise APIConnectionError (5xx) or APIStatusError (4xx) so the base class can decide whether to retry. Unexpected WebSocket closure raises instead of silently returning.
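The error-propagation rule above can be sketched as a small mapping function. The two exception classes below are local stand-ins named after the framework's errors, not imports from livekit, and their constructors are simplified. The helper name `raise_for_server_error` and its `error_code` / `error_message` parameters are assumptions about how a server error might be represented.

```python
class APIConnectionError(Exception):
    """Local stand-in: a failure the base class may treat as retryable."""


class APIStatusError(Exception):
    """Local stand-in: a request-level failure carrying a status code."""

    def __init__(self, message: str, status_code: int) -> None:
        super().__init__(message)
        self.status_code = status_code


def raise_for_server_error(error_code: int, error_message: str) -> None:
    """Map a server-reported error code to the exception type described above.

    5xx -> APIConnectionError, 4xx -> APIStatusError.
    Codes outside both ranges are ignored in this sketch.
    """
    if 500 <= error_code < 600:
        raise APIConnectionError(f"server error {error_code}: {error_message}")
    if 400 <= error_code < 500:
        raise APIStatusError(error_message, status_code=error_code)
```

Raising distinct types, rather than logging and returning, leaves the retry decision to whatever wraps the stream, which is the stated goal of this change.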

Translation design

When translation is enabled, Soniox returns tokens with a translation_status field. The plugin routes tokens into dict-keyed accumulators:

translation_status in ("none", "original", absent) -> final["original"]
translation_status == "translation" -> final["translation"]

At endpoint: alternatives[0] = original text, alternatives[1] = translation (if present). When translation is off, all tokens route to "original" and the event has a single alternative -- identical to the previous behavior. One code path handles both cases.
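The routing described above can be illustrated with plain dictionaries. This is a sketch, not the plugin's code: the function names are hypothetical, tokens are assumed to be dicts with a `text` key, and token texts are assumed to carry their own spacing so they can be joined directly.

```python
from typing import Dict, List


def route_tokens(tokens: List[dict]) -> Dict[str, List[str]]:
    """Split final tokens into 'original' and 'translation' buckets."""
    final: Dict[str, List[str]] = {"original": [], "translation": []}
    for tok in tokens:
        # A missing translation_status is treated like "none".
        status = tok.get("translation_status", "none")
        # Simplification: anything that is not "translation" goes to "original".
        key = "translation" if status == "translation" else "original"
        final[key].append(tok["text"])
    return final


def build_alternatives(final: Dict[str, List[str]]) -> List[str]:
    """alternatives[0] is the original text; alternatives[1] only if translated."""
    alternatives = ["".join(final["original"])]
    translation = "".join(final["translation"])
    if translation:
        alternatives.append(translation)
    return alternatives


tokens = [
    {"text": "Hello", "translation_status": "original"},
    {"text": " world", "translation_status": "original"},
    {"text": "Hola", "translation_status": "translation"},
    {"text": " mundo", "translation_status": "translation"},
]
alts = build_alternatives(route_tokens(tokens))
```

When no token carries `translation_status == "translation"`, the same two functions yield a single-element list, which mirrors the single-code-path claim.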

What was NOT changed

_TokenAccumulator class -- already clean, kept as-is.
STT class -- kept as-is.
All STTOptions defaults preserved (model, sample_rate, num_channels, etc.).
Context dataclasses (ContextObject, ContextGeneralItem, ContextTranslationTerm) -- unchanged.

Files changed

livekit-plugins/livekit-plugins-soniox/livekit/plugins/soniox/stt.py -- SpeechStream rewrite, added TranslationConfig + STTOptions fields
livekit-plugins/livekit-plugins-soniox/livekit/plugins/soniox/__init__.py -- export TranslationConfig, SonioxLanguages, SonioxRTModels
livekit-plugins/livekit-plugins-soniox/livekit/plugins/soniox/models.py -- new file with Literal type aliases

Test plan

Two-way translation (en/ur) -- verified both directions produce correct alternatives[0] and alternatives[1]
One-way translation (to ur) -- verified single target language translation
No translation (backward compat) -- verified single alternative, identical to previous behavior
max_endpoint_delay_ms -- verified API accepts the parameter
Ruff format and lint -- all checks passed
mypy strict -- 0 new errors (1 pre-existing across all STT plugins)
Unit test suite (294 passed, 2 skipped, 9 errors from missing LiveKit server -- pre-existing)

Refs: #4943

Rewrite the Soniox STT plugin to support all WebSocket API features and fix structural issues in the streaming implementation.

New features:
- Real-time translation (one-way and two-way) via TranslationConfig
- Configurable max_endpoint_delay_ms (500-3000ms)
- Typed Literal autocomplete for models, languages, and translation type
- Flush sentinel mapped to FINALIZE_MSG for clean session shutdown

Structural fixes:
- Remove dead reconnect machinery (_reconnect_event was never set)
- Eliminate unnecessary intermediate audio queue (2 tasks -> 1)
- Pass ws as parameter to subtasks instead of mutable self._ws
- Single connection lifecycle in _run(); base class handles retries
- Proper error semantics (5xx -> APIConnectionError, 4xx -> APIStatusError)
- Raise on unexpected WS closure instead of silent hang
- Handle _FlushSentinel (was silently dropped)
- Remove unreachable except clause

Translation design:
- alternatives[0] = original text (always present)
- alternatives[1] = translated text (when translation is enabled)
- Fully backward-compatible: all consumers read alternatives[0]
- Dict-keyed accumulators with no special cases

Refs: livekit#4943

devin-ai-integration

Devin Review found 1 potential issue.

livekit-plugins/livekit-plugins-soniox/livekit/plugins/soniox/stt.py

MSameerAbbas · 2026-03-15T19:14:41Z

Hey @tinalenguyen, I saw this was assigned to you - hope it's helpful! Would love your review.

devin-ai-integration bot reviewed Mar 15, 2026

MSameerAbbas wants to merge 1 commit into livekit:main from MSameerAbbas:feat/soniox-full-feature-support
