[VoiceLive]Add MAI and additional transcription model support to live tests#45779

Open
xitzhang wants to merge 1 commit into main from xitzhang/supportmaimodel

@xitzhang
Member

  • Add mai-transcribe-1, whisper-1, and azure-speech to transcription model parametrize list
  • Switch realtime model from gpt-4o-realtime-preview to gpt-realtime-mini
  • Increase flaky reruns from 1 to 3 for stability
  • Simplify session config: remove language param, turn_detection, and shorten instructions

Description

Please add an informative description that covers the changes made by the pull request and link all relevant issues.

If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

Copilot AI review requested due to automatic review settings March 18, 2026 23:06
Contributor

Copilot AI left a comment

Pull request overview

Updates the VoiceLive live realtime service tests to broaden transcription model coverage while aiming to improve test stability and simplify session configuration.

Changes:

  • Expand transcription_model parametrization to include whisper-1, azure-speech, and mai-transcribe-1.
  • Switch the realtime model under test to gpt-realtime-mini.
  • Increase flaky reruns and simplify the session config (remove language, turn_detection, and shorten instructions).
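
The three changes listed above can be sketched as a single parametrized test. This is only an illustration, not the PR's actual test code: `build_session_config`, the config keys, and the instruction string are hypothetical stand-ins, and the `flaky(reruns=3)` marker assumes the pytest-rerunfailures plugin is what provides the reruns. The model names are the ones quoted in the PR.

```python
import pytest

# Transcription model names as listed in the PR's parametrize list.
TRANSCRIPTION_MODELS = [
    "whisper-1",
    "gpt-4o-transcribe",
    "gpt-4o-mini-transcribe",
    "gpt-4o-transcribe-diarize",
    "azure-speech",
    "mai-transcribe-1",
]

# Realtime model the PR switches the live tests to.
REALTIME_MODEL = "gpt-realtime-mini"


def build_session_config(transcription_model: str) -> dict:
    """Hypothetical helper: a pared-down session config.

    It deliberately omits a language setting and turn_detection,
    mirroring the simplification the PR describes.
    """
    return {
        "model": REALTIME_MODEL,
        "instructions": "You are a helpful assistant.",  # placeholder text
        "input_audio_transcription": {"model": transcription_model},
    }


@pytest.mark.flaky(reruns=3)  # assumes pytest-rerunfailures is installed
@pytest.mark.parametrize("transcription_model", TRANSCRIPTION_MODELS)
def test_session_config_is_minimal(transcription_model: str) -> None:
    config = build_session_config(transcription_model)
    assert "turn_detection" not in config
    assert "language" not in config["input_audio_transcription"]
    assert config["input_audio_transcription"]["model"] == transcription_model
```

Each model name becomes its own test case, so a failure report names the transcription model that failed, and a flaky case is retried up to three times before it counts as a failure.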

Comment on lines +813 to +820
"transcription_model", [
    "whisper-1",
    "gpt-4o-transcribe",
    "gpt-4o-mini-transcribe",
    "gpt-4o-transcribe-diarize",
    "azure-speech",
    "mai-transcribe-1",
]
  test_data_dir: Path,
  model: str,
- transcription_model: Literal["gpt-4o-transcribe", "gpt-4o-mini-transcribe", "gpt-4o-transcribe-diarize"],
+ transcription_model: Literal["whisper-1", "gpt-4o-transcribe", "gpt-4o-mini-transcribe", "gpt-4o-transcribe-diarize", "azure-speech", "mai-transcribe-1"],
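
The signature change repeats the six model names that already appear in the parametrize list, so the two can drift apart. One possible way to keep them in sync, shown here only as a sketch and not as something the PR does, is to define the `Literal` once and derive the runtime list from it with `typing.get_args`. The names `TranscriptionModel`, `TRANSCRIPTION_MODELS`, and `is_supported_transcription_model` are hypothetical.

```python
from typing import Literal, get_args

# Hypothetical alias: single source of truth for the supported names.
TranscriptionModel = Literal[
    "whisper-1",
    "gpt-4o-transcribe",
    "gpt-4o-mini-transcribe",
    "gpt-4o-transcribe-diarize",
    "azure-speech",
    "mai-transcribe-1",
]

# get_args returns the Literal's values as a tuple, in declaration order,
# so the same names can feed a parametrize list at runtime.
TRANSCRIPTION_MODELS = get_args(TranscriptionModel)


def is_supported_transcription_model(name: str) -> bool:
    """Return True if name is one of the values in TranscriptionModel."""
    return name in TRANSCRIPTION_MODELS
```

With this in place, the test parameter could be annotated as `transcription_model: TranscriptionModel`, and adding a model would be a one-line change.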