-
Notifications
You must be signed in to change notification settings - Fork 2.7k
v3:stt and tts models #4603
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
v3:stt and tts models #4603
Conversation
📝 WalkthroughWalkthroughThis pull request extends Sarvam plugin model support by adding "saaras:v3" to the STT models and introducing "bulbul:v3-beta" for TTS with comprehensive speaker support and compatibility mappings. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py (1)
628-638: Gate pitch/loudness by model in streaming config.The HTTP path explicitly omits
pitchandloudnessfor v3-beta (see lines 489-492: "not supported in v3-beta"), but the streaming config sends them unconditionally. This inconsistency will cause v3-beta streaming sessions to fail with API rejection. Apply the same model check as the HTTP path.♻️ Suggested adjustment
- config_msg = { - "type": "config", - "data": { - "target_language_code": self._opts.target_language_code, - "speaker": self._opts.speaker, - "pitch": self._opts.pitch, - "pace": self._opts.pace, - "loudness": self._opts.loudness, - "enable_preprocessing": self._opts.enable_preprocessing, - "model": self._opts.model, - }, - } + config_data = { + "target_language_code": self._opts.target_language_code, + "speaker": self._opts.speaker, + "pace": self._opts.pace, + "enable_preprocessing": self._opts.enable_preprocessing, + "model": self._opts.model, + } + if self._opts.model == "bulbul:v2": + config_data["pitch"] = self._opts.pitch + config_data["loudness"] = self._opts.loudness + config_msg = {"type": "config", "data": config_data}
🤖 Fix all issues with AI agents
In `@livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py`:
- Around line 399-405: When update_options sets a new model (the block that
assigns self._opts.model), also revalidate the currently set speaker
(self._opts.speaker) if no new speaker is passed: check compatibility of the
existing speaker with the requested model and raise a ValueError if
incompatible. Implement this by adding a compatibility check (e.g., call a
helper like is_speaker_supported(model, self._opts.speaker) or inline logic)
immediately after setting model in update_options (the same scope where model,
speaker and self._opts.model are handled) so switching to "bulbul:v3-beta" with
an incompatible current speaker (e.g., "anushka") fails early rather than
causing runtime API errors.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.pylivekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.pylivekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: unit-tests
- GitHub Check: type-check (3.9)
- GitHub Check: type-check (3.13)
🔇 Additional comments (4)
livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/stt.py (1)
56-57: LGTM — model literal updated cleanly.livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py (3)
50-50: LGTM — model enum expanded for v3-beta.
77-105: LGTM — speaker list expansion aligns with lowercased validation.
108-171: LGTM — compatibility mapping looks consistent with new speakers.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| if model is not None: | ||
| if not model.strip(): | ||
| raise ValueError("Model cannot be empty") | ||
| if model not in ["bulbul:v2"]: | ||
| if model not in ["bulbul:v2", "bulbul:v3-beta"]: | ||
| raise ValueError(f"Unsupported model: {model}") | ||
| self._opts.model = model | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revalidate the existing speaker when switching models.
update_options(model=...) can set bulbul:v3-beta while keeping an incompatible current speaker (e.g., "anushka"), and the mismatch isn’t checked unless speaker is also passed. This can surface as runtime API errors later.
🐛 Proposed fix
if model is not None:
if not model.strip():
raise ValueError("Model cannot be empty")
if model not in ["bulbul:v2", "bulbul:v3-beta"]:
raise ValueError(f"Unsupported model: {model}")
self._opts.model = model
+ if speaker is None and not validate_model_speaker_compatibility(
+ model, self._opts.speaker
+ ):
+ compatible = MODEL_SPEAKER_COMPATIBILITY.get(model, {}).get("all", [])
+ raise ValueError(
+ f"Speaker '{self._opts.speaker}' is not compatible with model '{model}'. "
+ "Please choose a compatible speaker from: "
+ f"{', '.join(compatible)}"
+ )🤖 Prompt for AI Agents
In `@livekit-plugins/livekit-plugins-sarvam/livekit/plugins/sarvam/tts.py` around
lines 399 - 405, When update_options sets a new model (the block that assigns
self._opts.model), also revalidate the currently set speaker
(self._opts.speaker) if no new speaker is passed: check compatibility of the
existing speaker with the requested model and raise a ValueError if
incompatible. Implement this by adding a compatibility check (e.g., call a
helper like is_speaker_supported(model, self._opts.speaker) or inline logic)
immediately after setting model in update_options (the same scope where model,
speaker and self._opts.model are handled) so switching to "bulbul:v3-beta" with
an incompatible current speaker (e.g., "anushka") fails early rather than
causing runtime API errors.
Added:
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.