Minimal Whisper-based speech-to-text HTTP server.
Pre-built Docker container: `rctl/stt-server:latest`

```
docker run -p 8081:8000 rctl/stt-server:latest
```

(The `-p` flag must precede the image name; this maps host port 8081 to the server's port 8000 inside the container.)
- Loads an OpenAI Whisper model on startup (`turbo` by default).
- Exposes a single transcription endpoint: `POST /transcribe`.
- Accepts raw PCM `s16le` mono audio in the request body.
- Returns Whisper JSON output (including `text` and `segments` timing).
- `GET /`: returns `ok`
- `POST /transcribe`
  - Headers:
    - `X-Sample-Rate`: sample rate (default `16000`)
    - `X-Lang`: language code (default `en`)
  - Body: raw `int16` PCM bytes
  - Response: Whisper transcription JSON
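The endpoint above can be called from plain Python. The sketch below uses only the standard library and assumes the server is running locally on port 8000; `build_request` and `transcribe` are hypothetical helper names, not part of this project.

```python
# Minimal client sketch for the API described above (stdlib only).
# Assumes a local server at http://localhost:8000; helper names are ours.
import json
import urllib.request


def build_request(pcm_bytes: bytes,
                  url: str = "http://localhost:8000/transcribe",
                  sample_rate: int = 16000,
                  lang: str = "en") -> urllib.request.Request:
    """Build a POST request carrying raw s16le mono PCM bytes."""
    return urllib.request.Request(
        url,
        data=pcm_bytes,
        method="POST",
        headers={
            "X-Sample-Rate": str(sample_rate),
            "X-Lang": lang,
            "Content-Type": "application/octet-stream",
        },
    )


def transcribe(pcm_bytes: bytes, **kwargs) -> dict:
    """Send the audio and return the parsed Whisper JSON response."""
    with urllib.request.urlopen(build_request(pcm_bytes, **kwargs)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a server running, `transcribe(open("sample.pcm", "rb").read())["text"]` would return the recognized text.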
```
pip install -r requirements.txt
python main.py --host 0.0.0.0 --port 8000 --model turbo
```

Optional flags:

- `--device cuda|cpu`
- `--no-fp16`
- `--debug` (writes incoming audio samples as `*.wav`)
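For orientation, here is a sketch of how the documented flags could be declared with `argparse`. This mirrors only the defaults stated above; `main.py`'s actual parser may differ in details.

```python
# Hypothetical argparse declaration matching the flags documented above.
# Not the project's actual code; defaults follow the README.
import argparse


def make_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(description="Whisper STT server")
    p.add_argument("--host", default="0.0.0.0")
    p.add_argument("--port", type=int, default=8000)
    p.add_argument("--model", default="turbo")
    p.add_argument("--device", choices=["cuda", "cpu"], default=None)
    p.add_argument("--no-fp16", action="store_true",
                   help="disable fp16 inference")
    p.add_argument("--debug", action="store_true",
                   help="write incoming audio samples as *.wav")
    return p
```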
```
curl -X POST "http://localhost:8000/transcribe" \
  -H "X-Sample-Rate: 16000" \
  -H "X-Lang: en" \
  --data-binary @sample.pcm
```
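The curl example posts a `sample.pcm` file, which must be raw `s16le` mono audio. One way to synthesize a compatible test clip with the Python standard library (here a one-second 440 Hz tone; the filename and tone are just an example, not something the server requires):

```python
# Generate 1 s of raw s16le mono PCM at 16 kHz: a 440 Hz sine test tone.
import math
import struct

SAMPLE_RATE = 16000   # Hz, matches the X-Sample-Rate default
DURATION_S = 1.0
FREQ_HZ = 440.0
AMPLITUDE = 0.5       # fraction of int16 full scale

n = int(SAMPLE_RATE * DURATION_S)
samples = (
    int(AMPLITUDE * 32767 * math.sin(2 * math.pi * FREQ_HZ * i / SAMPLE_RATE))
    for i in range(n)
)
# "<h" packs each sample as little-endian signed 16-bit, i.e. s16le.
pcm = struct.pack(f"<{n}h", *samples)

with open("sample.pcm", "wb") as f:
    f.write(pcm)
```

Note that a pure tone will not produce meaningful transcription text; substitute real speech audio to exercise Whisper properly.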