v3:stt and tts models #4603

dhruvladia-sarvam · 2026-01-23T12:22:40Z

Added:

"saaras:v3" for STT
"bulbul:v3-beta" for TTS

Summary by CodeRabbit

New Features
- Added support for "saaras:v3" speech-to-text model
- Introduced "bulbul:v3-beta" text-to-speech model with 25 new voice options, including voices designed for customer service, content creation, and international applications, providing greater flexibility in voice selection and speaker diversity

coderabbitai · 2026-01-23T12:23:02Z

📝 Walkthrough

This pull request extends Sarvam plugin model support by adding "saaras:v3" to the STT models and introducing "bulbul:v3-beta" for TTS with comprehensive speaker support and compatibility mappings.

Changes

Cohort / File(s)	Summary
STT Model Expansion `livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py`	Added "saaras:v3" to `SarvamSTTModels` Literal type alongside existing model versions.
TTS Model and Speaker Expansion `livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py`	Added "bulbul:v3-beta" to `SarvamTTSModels` Literal type; extended `SarvamTTSSpeakers` with 25 new voice names across Customer Care, Content Creation, and International categories; created compatibility mapping in `MODEL_SPEAKER_COMPATIBILITY` categorizing speakers as female, male, or all; updated `update_options` validation to accept the new model version.
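
To illustrate the pattern the table describes, here is a minimal sketch of a `Literal` model type paired with a per-model speaker mapping. It is not the plugin's code. The model names, the speaker "anushka", and the female/male/all categories come from this PR; the speaker `example_v3_voice`, the gender placement of each entry, and the helper `is_compatible` are placeholders assumed for illustration.

```python
from typing import Dict, List, Literal

# Model names as listed in this PR.
TTSModel = Literal["bulbul:v2", "bulbul:v3-beta"]

# Illustrative contents only: "example_v3_voice" is a placeholder, not a real voice,
# and the female/male placement of each speaker is assumed.
MODEL_SPEAKER_COMPATIBILITY: Dict[str, Dict[str, List[str]]] = {
    "bulbul:v2": {"female": ["anushka"], "male": [], "all": ["anushka"]},
    "bulbul:v3-beta": {
        "female": [],
        "male": ["example_v3_voice"],
        "all": ["example_v3_voice"],
    },
}


def is_compatible(model: str, speaker: str) -> bool:
    """Hypothetical helper: True if the speaker appears under the model's "all" key."""
    allowed = MODEL_SPEAKER_COMPATIBILITY.get(model, {}).get("all", [])
    return speaker.lower() in allowed
```

Keeping an "all" list per model lets a validator answer the compatibility question with a single lookup, while the gendered lists remain available for voice selection.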

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A model update hops into view,
Saaras:v3 and bulbul too!
Twenty-five new voices sing,
With speakers we bring spring! 🎤✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'v3:stt and tts models' directly summarizes the main changes: adding v3 models to both STT and TTS components in the Sarvam plugin.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py (1)

628-638: Gate pitch/loudness by model in streaming config.

The HTTP path explicitly omits pitch and loudness for v3-beta (see lines 489-492: "not supported in v3-beta"), but the streaming config sends them unconditionally. This inconsistency will cause v3-beta streaming sessions to fail with API rejection. Apply the same model check as the HTTP path.

♻️ Suggested adjustment

-                config_msg = {
-                    "type": "config",
-                    "data": {
-                        "target_language_code": self._opts.target_language_code,
-                        "speaker": self._opts.speaker,
-                        "pitch": self._opts.pitch,
-                        "pace": self._opts.pace,
-                        "loudness": self._opts.loudness,
-                        "enable_preprocessing": self._opts.enable_preprocessing,
-                        "model": self._opts.model,
-                    },
-                }
+                config_data = {
+                    "target_language_code": self._opts.target_language_code,
+                    "speaker": self._opts.speaker,
+                    "pace": self._opts.pace,
+                    "enable_preprocessing": self._opts.enable_preprocessing,
+                    "model": self._opts.model,
+                }
+                if self._opts.model == "bulbul:v2":
+                    config_data["pitch"] = self._opts.pitch
+                    config_data["loudness"] = self._opts.loudness
+                config_msg = {"type": "config", "data": config_data}
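
The adjustment above can be exercised in isolation with a small standalone sketch. The `Opts` dataclass and `build_config_msg` function are stand-ins assumed for illustration (the plugin builds this message inline from `self._opts`); the field names and the `"bulbul:v2"` gate mirror the suggested diff.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Opts:
    """Stand-in for the plugin's options object; field names follow the diff above."""

    target_language_code: str
    speaker: str
    pitch: float
    pace: float
    loudness: float
    enable_preprocessing: bool
    model: str


def build_config_msg(opts: Opts) -> Dict[str, Any]:
    """Build the streaming config message, sending pitch/loudness only for bulbul:v2."""
    data: Dict[str, Any] = {
        "target_language_code": opts.target_language_code,
        "speaker": opts.speaker,
        "pace": opts.pace,
        "enable_preprocessing": opts.enable_preprocessing,
        "model": opts.model,
    }
    if opts.model == "bulbul:v2":
        # Per the review comment, these two fields are not supported in v3-beta.
        data["pitch"] = opts.pitch
        data["loudness"] = opts.loudness
    return {"type": "config", "data": data}
```

Pulling the message construction into a pure function like this would also make the gating easy to unit-test without opening a streaming connection.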

🤖 Fix all issues with AI agents

In `@livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py`:
- Around line 399-405: When update_options sets a new model (the block that
assigns self._opts.model), also revalidate the currently set speaker
(self._opts.speaker) if no new speaker is passed: check compatibility of the
existing speaker with the requested model and raise a ValueError if
incompatible. Implement this by adding a compatibility check (e.g., call a
helper like is_speaker_supported(model, self._opts.speaker) or inline logic)
immediately after setting model in update_options (the same scope where model,
speaker and self._opts.model are handled) so switching to "bulbul:v3-beta" with
an incompatible current speaker (e.g., "anushka") fails early rather than
causing runtime API errors.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7fe642d and 6f83e4a.

📒 Files selected for processing (2)

livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: unit-tests
GitHub Check: type-check (3.9)
GitHub Check: type-check (3.13)

🔇 Additional comments (4)

livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (1)

56-57: LGTM — model literal updated cleanly.

livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py (3)

50-50: LGTM — model enum expanded for v3-beta.

77-105: LGTM — speaker list expansion aligns with lowercased validation.

108-171: LGTM — compatibility mapping looks consistent with new speakers.

coderabbitai · 2026-01-23T12:28:10Z

livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py

        if model is not None:
            if not model.strip():
                raise ValueError("Model cannot be empty")
-            if model not in ["bulbul:v2"]:
+            if model not in ["bulbul:v2", "bulbul:v3-beta"]:
                raise ValueError(f"Unsupported model: {model}")
            self._opts.model = model



⚠️ Potential issue | 🟠 Major

Revalidate the existing speaker when switching models.
update_options(model=...) can set bulbul:v3-beta while keeping an incompatible current speaker (e.g., "anushka"), and the mismatch isn’t checked unless speaker is also passed. This can surface as runtime API errors later.

🐛 Proposed fix

        if model is not None:
            if not model.strip():
                raise ValueError("Model cannot be empty")
            if model not in ["bulbul:v2", "bulbul:v3-beta"]:
                raise ValueError(f"Unsupported model: {model}")
            self._opts.model = model
+            if speaker is None and not validate_model_speaker_compatibility(
+                model, self._opts.speaker
+            ):
+                compatible = MODEL_SPEAKER_COMPATIBILITY.get(model, {}).get("all", [])
+                raise ValueError(
+                    f"Speaker '{self._opts.speaker}' is not compatible with model '{model}'. "
+                    "Please choose a compatible speaker from: "
+                    f"{', '.join(compatible)}"
+                )
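
As a variation on the proposed fix, the sketch below validates the resulting model/speaker pair before mutating any state, so a rejected call leaves the options unchanged. `FakeTTS`, `SUPPORTED_MODELS`, `COMPAT`, and the speaker `example_v3_voice` are hypothetical names assumed for illustration; only the model names and "anushka" come from this PR.

```python
from typing import Dict, List, Optional

SUPPORTED_MODELS = ("bulbul:v2", "bulbul:v3-beta")

# Placeholder compatibility table; "example_v3_voice" is not a real voice name.
COMPAT: Dict[str, List[str]] = {
    "bulbul:v2": ["anushka"],
    "bulbul:v3-beta": ["example_v3_voice"],
}


class FakeTTS:
    """Toy stand-in for the TTS class, holding only a model and a speaker."""

    def __init__(self, model: str, speaker: str) -> None:
        self.model = model
        self.speaker = speaker

    def update_options(
        self, *, model: Optional[str] = None, speaker: Optional[str] = None
    ) -> None:
        # Resolve the values that would be in effect after this call.
        new_model = self.model if model is None else model
        new_speaker = self.speaker if speaker is None else speaker
        if new_model not in SUPPORTED_MODELS:
            raise ValueError(f"Unsupported model: {new_model}")
        if new_speaker not in COMPAT[new_model]:
            raise ValueError(
                f"Speaker '{new_speaker}' is not compatible with model '{new_model}'. "
                f"Please choose a compatible speaker from: {', '.join(COMPAT[new_model])}"
            )
        # Commit only after every check has passed.
        self.model, self.speaker = new_model, new_speaker
```

Checking the effective pair covers both cases at once: a new model with the old speaker, and a new speaker with the old model.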

