Feature Request: Add SenseVoice as STT option (faster than Whisper) #42

@LauraGPT

Feature Request

Verbi is a great modular voice assistant for experimenting with STT models. I'd like to suggest adding SenseVoice as an additional STT option.

Why SenseVoice?

  • 5x faster than Whisper — lower conversation latency
  • Non-autoregressive — constant latency
  • 234M params — lightweight
  • 50+ languages — multilingual
  • Emotion detection — could enable emotion-aware assistant responses
  • OpenAI-compatible API: funasr-server serves /v1/audio/transcriptions
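
To illustrate the last point, here is a minimal sketch of how Verbi could call such an endpoint using only the Python standard library. The base URL (http://localhost:8000), the model name sent in the form, and the helper names are assumptions for illustration; the `file` and `model` form fields and the `text` response field follow the OpenAI transcription API, which funasr-server is described as being compatible with.

```python
import json
import urllib.request
import uuid


def build_transcription_request(base_url, audio_bytes,
                                filename="audio.wav",
                                model="iic/SenseVoiceSmall"):
    """Build a multipart POST for an OpenAI-style /v1/audio/transcriptions endpoint.

    base_url and model are assumptions; adjust them to the running server.
    """
    boundary = uuid.uuid4().hex
    parts = [
        # Text field: which model the server should use
        (f'--{boundary}\r\n'
         f'Content-Disposition: form-data; name="model"\r\n\r\n'
         f'{model}\r\n').encode(),
        # File field header, followed by the raw audio bytes
        (f'--{boundary}\r\n'
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         f'Content-Type: application/octet-stream\r\n\r\n').encode(),
        audio_bytes,
        f'\r\n--{boundary}--\r\n'.encode(),
    ]
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/transcriptions",
        data=b"".join(parts),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def transcribe(base_url, audio_bytes):
    """Send the request and return the 'text' field of the JSON response."""
    req = build_transcription_request(base_url, audio_bytes)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["text"]


# Building the request does not contact the server; transcribe() does.
req = build_transcription_request("http://localhost:8000", b"fake-audio-bytes")
print(req.get_method(), req.full_url)
```

Because the endpoint mirrors the OpenAI API shape, an existing OpenAI-based STT path could likely be reused by pointing it at a different base URL, though that depends on how Verbi configures its clients.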

Integration

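from funasr import AutoModel

# Load SenseVoiceSmall; fsmn-vad splits long audio into speech segments
model = AutoModel(model="iic/SenseVoiceSmall", vad_model="fsmn-vad")
# audio: the recording to transcribe, e.g. a path to a WAV file
result = model.generate(input=audio)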
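
For the emotion-aware use case, the raw text returned by the model would need some post-processing. The sketch below assumes that `result` is a list of dicts with a `text` key and that SenseVoice prefixes the transcript with tags of the form `<|en|><|HAPPY|><|Speech|><|woitn|>`; both should be checked against the installed funasr version. The `parse_sensevoice_text` helper and the sample string are hypothetical and only show how the tags could be separated from the transcript.

```python
import re
from typing import NamedTuple

# Matches SenseVoice-style tags such as <|en|> or <|HAPPY|> (assumed format)
TAG = re.compile(r"<\|([^|]*)\|>")


class Transcript(NamedTuple):
    text: str        # transcript with all tags removed
    tags: list[str]  # tag names in the order they appeared


def parse_sensevoice_text(raw: str) -> Transcript:
    """Split raw model output into plain text and its leading tags."""
    tags = TAG.findall(raw)
    text = TAG.sub("", raw).strip()
    return Transcript(text=text, tags=tags)


# Hypothetical sample shaped like the output of model.generate(...)
sample = [{"key": "demo", "text": "<|en|><|HAPPY|><|Speech|><|woitn|>hello there"}]
parsed = parse_sensevoice_text(sample[0]["text"])
print(parsed.tags, parsed.text)
```

Verbi could pass `parsed.text` to the LLM as usual and optionally include the emotion tag in the prompt.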
