VoxFlow AI is a desktop application designed to streamline the creation of audiobooks and audio content. Built with modern web and Rust technologies, it allows you to visually script, synthesize natural-sounding speech using Alibaba Cloud Bailian, and manage your audio projects with ease.
Note: This application uses the Alibaba Cloud Bailian (DashScope) platform for its Text-to-Speech (TTS) capabilities. You will need a valid API Key from the platform to use the synthesis features.
- 🎙️ Alibaba Cloud Bailian Integration — High-quality TTS synthesis powered by models like Qwen3 TTS Flash.
- ⚙️ User-Configurable Settings — Easily input your API Key, select models, and adjust default voice settings (speed, pitch, voice) directly from the application UI.
- ⏱️ Adjustable Intervals — Fine-tune the silence between script lines to ensure natural pacing.
- 💾 Robust Data Management — Built-in SQLite database ensures data integrity and prevents conflicts during manual script creation.
- 🖥️ Modern Desktop Experience — Powered by Tauri 2.0 for a lightweight, secure, and native feel.
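
To illustrate what the adjustable interval means in practice, here is a minimal sketch of inserting silence between synthesized lines. It is not VoxFlow's actual implementation: the function names `silence_samples` and `join_with_silence` are hypothetical, and it assumes each line is already decoded to 16-bit mono PCM at a known sample rate.

```rust
/// Number of PCM samples needed for `interval_ms` of silence.
/// Hypothetical helper; assumes mono audio.
fn silence_samples(interval_ms: u32, sample_rate_hz: u32) -> usize {
    // Widen to u64 so large intervals cannot overflow the multiplication.
    (interval_ms as u64 * sample_rate_hz as u64 / 1000) as usize
}

/// Concatenate clips, inserting silence between consecutive clips
/// (no silence is added before the first clip or after the last one).
fn join_with_silence(clips: &[Vec<i16>], interval_ms: u32, sample_rate_hz: u32) -> Vec<i16> {
    let gap = silence_samples(interval_ms, sample_rate_hz);
    let mut out = Vec::new();
    for (i, clip) in clips.iter().enumerate() {
        if i > 0 {
            // A zero-valued sample is silence in signed 16-bit PCM.
            out.extend(std::iter::repeat(0i16).take(gap));
        }
        out.extend_from_slice(clip);
    }
    out
}
```

Under these assumptions, a 500 ms interval at an assumed 24 kHz sample rate corresponds to 12,000 silent samples between two lines.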
Follow these instructions to get a local copy of the project up and running.
1. Clone the repository:

   ```bash
   git clone https://github.com/iMyth/VoxFlow.git
   cd VoxFlow
   ```

2. Install dependencies:

   ```bash
   # Install frontend dependencies
   pnpm install
   # Or: npm install
   ```

3. Configure Bailian API:

   Unlike many CLI tools, VoxFlow does not require you to set environment variables manually.

   - Launch the application (see Development below).
   - Click the Settings (gear icon ⚙️).
   - Enter your DashScope API Key in the designated field.

To start the application in development mode with hot-reloading:

```bash
npm run tauri:dev
```

This command starts both the Vite development server and the Tauri backend simultaneously.

To build the application for your platform:

```bash
npm run tauri:build
```

| Layer | Technologies |
|---|---|
| Frontend | React 19, TypeScript, Vite, Tailwind CSS, Zustand |
| Backend | Rust, Tauri 2.0, Tokio, Rusqlite (SQLite) |
| AI Services | Alibaba Cloud Bailian (DashScope) |
This project is licensed under the MIT License.
