Voice-To-Text

A voice-to-text tool built on the ChatGPT Web API and designed for Linux (GNOME) environments. A single hotkey toggles a "Record -> Transcribe -> Beautify -> Copy to Clipboard" workflow.

Prerequisites

This project uses uv for Python environment and dependency management.

1. Install System Dependencies

Ensure your system has portaudio (for recording) and wl-clipboard (for clipboard operations on Wayland) installed:

# For Debian/Ubuntu
sudo apt install libportaudio2 libportaudiocpp0 portaudio19-dev wl-clipboard
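
To confirm the packages above are visible before the first run, a quick check along these lines may help. This is a minimal sketch and not part of the project: the function name check_system_dependencies is hypothetical, and it only assumes that the PortAudio shared library can be located by the system loader and that wl-clipboard provides the wl-copy command.

```python
import ctypes.util
import shutil


def check_system_dependencies() -> dict[str, bool]:
    """Report whether the recording and clipboard dependencies look available."""
    return {
        # find_library returns None when no shared PortAudio library is found.
        "portaudio": ctypes.util.find_library("portaudio") is not None,
        # wl-copy is the command-line tool shipped by the wl-clipboard package.
        "wl-copy": shutil.which("wl-copy") is not None,
    }


if __name__ == "__main__":
    for name, found in check_system_dependencies().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If either entry is reported as missing, re-run the apt command above before continuing.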

2. Configure API Headers

To call the ChatGPT transcription API, you need to manually capture the Request Headers after logging in:

Open your browser and log in to ChatGPT.
Open Developer Tools (F12) and go to the Network tab.
Record a short voice command on the page to trigger the transcribe API.
Find the transcribe request and copy its Request Headers, excluding HTTP/2 pseudo-headers (those starting with :, such as :authority or :path).
Create a file named .request-header.txt in the project root and paste the headers into it. You can use the provided template (.request-header.txt.example) as a reference; make sure the file includes required fields such as Authorization and Cookie.
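
The steps above can be sanity-checked with a small parser. This is a sketch under assumptions, not the project's actual loader: it assumes the file holds one "Name: value" pair per line, and the helper names parse_request_headers and missing_required are hypothetical.

```python
def parse_request_headers(raw: str) -> dict[str, str]:
    """Parse 'Name: value' lines into a dict, skipping HTTP/2 pseudo-headers."""
    headers: dict[str, str] = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and pseudo-headers such as ":authority" or ":path".
        if not line or line.startswith(":"):
            continue
        # Split on the first colon only, so values containing ":" stay intact.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # Not a "Name: value" line; ignore it.
        headers[name.strip()] = value.strip()
    return headers


def missing_required(headers: dict[str, str]) -> list[str]:
    """Return the required header names that are absent (case-insensitive)."""
    present = {name.lower() for name in headers}
    required = ("Authorization", "Cookie")
    return [name for name in required if name.lower() not in present]
```

For example, running missing_required(parse_request_headers(text)) on the contents of .request-header.txt should return an empty list when both required fields are present.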

3. Install Python Dependencies

uv sync

Usage

Run Directly

uv run src/main.py

First Run: Starts recording; a notification will appear.
Second Run: Stops recording; the system begins transcription and beautification.
Completion: The result is automatically saved to output/output.txt and copied to your clipboard.
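
The start/stop behavior of repeated runs can be pictured as a toggle on a marker file. The following is only an illustrative sketch: how src/main.py actually tracks state is not documented here, and the toggle function and the marker path are hypothetical.

```python
from pathlib import Path


def toggle(marker: Path) -> str:
    """Return "start" on one call and "stop" on the next, using a marker file."""
    if marker.exists():
        # A marker left by the previous run means a recording is in progress.
        marker.unlink()
        return "stop"
    # No marker: this run begins a new recording.
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()
    return "start"
```

Calling toggle(Path("output/.recording")) twice in a row would return "start" and then "stop", mirroring the first and second runs described above.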

Recommended: Bind to GNOME Hotkey

It is recommended to bind uv run /path/to/Audio-To-Text/src/main.py to a system shortcut (e.g., Ctrl + Alt + T) for the smoothest experience.

Project Structure

src/main.py: Main script for toggling recording states and orchestration.
src/transcribe.py: Handles communication with the ChatGPT API.
src/beautify.py: Logic for text beautification and punctuation correction.
src/record.py: Underlying recording implementation.
output/: Directory for temporary audio files and final transcription results.
