LEMAS‑Edit: Multilingual Speech Editing System

LEMAS‑Edit is a multilingual speech editing system that supports 10 languages:

Chinese
English
Spanish
Russian
French
German
Italian
Portuguese
Indonesian
Vietnamese

It bundles:

the multilingual flow-matching backend (lemas_tts)
the decoder-only edit backend (lemas_edit)
pretrained checkpoints, vocabs and demo data (pretrained_models/)
an end‑to‑end Gradio web UI (gradio_mix.py)

Compared to the original LEMAS‑TTS repo, this project focuses on speech editing instead of pure TTS, and integrates both backends into a single interface.

1. Features

Autoregressive codec speech editing backend
- Supports 7 languages (zh / en / de / fr / pt / es / it)
- Integrated with WhisperX + MMS alignment for “edit by text + span”
- Optionally uses UVR5 and DeepFilterNet for denoising
Multilingual speech editing (flow-matching backend)
- Based on the LEMAS‑TTS models (multilingual_grl, multilingual_prosody)
- Supports the same languages as LEMAS‑TTS (zh / en / es / ru / fr / de / it / pt / id / vi)
One Gradio UI for both backends
- Edit Model selector: multilingual_grl, multilingual_prosody, autoregressive
- Shared transcription, alignment, denoise and visualization components
- All required models are expected under pretrained_models/
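
The language coverage of the two backends differs, so it can help to check which Edit Model values apply to a given language before choosing one. The sketch below simply transcribes the language codes and model names from the list above into a lookup table; the constant and function names are illustrative and are not part of the LEMAS‑Edit code base.

```python
# Language codes per backend, transcribed from the feature list above.
AUTOREGRESSIVE_LANGS = {"zh", "en", "de", "fr", "pt", "es", "it"}
FLOW_MATCHING_LANGS = {"zh", "en", "es", "ru", "fr", "de", "it", "pt", "id", "vi"}

# Edit Model names as shown in the Gradio selector.
# This table is a hypothetical helper, not something the repo exports.
SUPPORTED_LANGS = {
    "multilingual_grl": FLOW_MATCHING_LANGS,
    "multilingual_prosody": FLOW_MATCHING_LANGS,
    "autoregressive": AUTOREGRESSIVE_LANGS,
}


def edit_models_for(lang):
    """Return the Edit Model names whose backend lists `lang` as supported."""
    return [name for name, langs in SUPPORTED_LANGS.items() if lang in langs]


print(edit_models_for("ru"))  # only the two flow-matching models
print(edit_models_for("en"))  # all three models
```

Russian, Indonesian and Vietnamese are only covered by the flow-matching models, so for those languages the autoregressive option does not apply.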

2. Installation

2.1 Environment

git clone https://github.com/LEMAS-Project/LEMAS-Edit.git
cd ./LEMAS-Edit

conda create -n lemas-edit python=3.10
conda activate lemas-edit

2.2 System Dependencies

You can install system dependencies via apt or conda:

sudo apt-get update
sudo apt-get install -y ffmpeg

or

conda install -c conda-forge ffmpeg

2.3 Python Dependencies

pip install -r requirements.txt

Install PyTorch + Torchaudio according to your device (CUDA / ROCm / CPU / MPS) following the official PyTorch instructions.
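
After installing PyTorch, a quick check of which device the installed build can actually see may save debugging time later. The following is a minimal sketch that uses only standard PyTorch calls; the function name is made up for illustration, and it is not how LEMAS‑Edit itself selects a device.

```python
def detect_torch_device():
    """Return "cuda", "mps", "cpu", or "missing" if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return "missing"
    # torch.cuda.is_available() is True on CUDA builds and also on ROCm builds,
    # which expose AMD GPUs through the same torch.cuda interface.
    if torch.cuda.is_available():
        return "cuda"
    # The MPS backend only exists in newer PyTorch releases, so guard the lookup.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(detect_torch_device())
```

If this prints "cpu" on a machine with a GPU, the installed wheel is probably a CPU-only build and should be reinstalled following the official PyTorch instructions.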

2.4 Download Pretrained Models

Download the pretrained models for both backends from https://huggingface.co/LEMAS-Project/LEMAS-Edit and place the pretrained_models/ directory in the repository root, next to the lemas_edit/ folder.

Once pretrained_models/ is in place, both lemas_tts and lemas_edit will automatically find the checkpoints and vocabs.
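
Since both backends rely on this location, a small pre-flight check can catch a misplaced download early. The sketch below only verifies that pretrained_models/ sits next to lemas_edit/; it does not inspect the directory contents, and the helper name is hypothetical rather than part of the repo.

```python
from pathlib import Path


def find_pretrained_dir(repo_root="."):
    """Return the pretrained_models/ path, or raise if the layout looks wrong."""
    root = Path(repo_root).resolve()
    # pretrained_models/ is expected next to lemas_edit/, so check both.
    if not (root / "lemas_edit").is_dir():
        raise FileNotFoundError(f"no lemas_edit/ folder found in {root}")
    models = root / "pretrained_models"
    if not models.is_dir():
        raise FileNotFoundError(f"no pretrained_models/ folder found in {root}")
    return models
```

Calling find_pretrained_dir() from the repository root returns the resolved path when the layout matches, and raises FileNotFoundError with the offending location otherwise.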

3. Usage

All commands below assume:

cd ./LEMAS-Edit
export PYTHONPATH="$PWD:${PYTHONPATH}"
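
To confirm that the PYTHONPATH export took effect, one option is to ask Python whether it can locate the two packages without importing them. This is a minimal sketch using the standard library only; the function name is illustrative.

```python
import importlib.util


def missing_packages(names=("lemas_tts", "lemas_edit")):
    """Return the top-level package names that Python cannot locate on sys.path."""
    # find_spec returns None for a top-level name that cannot be found,
    # and it does not execute the package's code.
    return [name for name in names if importlib.util.find_spec(name) is None]


print(missing_packages())
```

An empty list means both packages are discoverable; any name in the output suggests the commands are being run from the wrong directory or PYTHONPATH was not exported in the current shell.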

3.1 Gradio Web UI (Integrated Editing Demo)

To launch the full editing UI locally:

python gradio_mix.py

You can customize host/port and sharing:

python gradio_mix.py --host 0.0.0.0 --port 7861 --share
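
For readers wondering what these flags control, the sketch below shows one common way such options are mapped onto Gradio's launch() keyword arguments (server_name, server_port, share). It is an assumption about the general pattern, not a copy of gradio_mix.py; the default values shown are Gradio's usual defaults and may differ from the script's own.

```python
import argparse


def parse_launch_args(argv=None):
    """Parse --host/--port/--share into keyword arguments for Gradio's launch()."""
    # Hypothetical parser: the real defaults in gradio_mix.py may differ.
    parser = argparse.ArgumentParser(description="Illustrative launch options")
    parser.add_argument("--host", default="127.0.0.1", help="address to bind to")
    parser.add_argument("--port", type=int, default=7860, help="port to listen on")
    parser.add_argument("--share", action="store_true", help="create a public share link")
    args = parser.parse_args(argv)
    # The keys below match the parameter names of gradio's Blocks.launch().
    return {"server_name": args.host, "server_port": args.port, "share": args.share}


print(parse_launch_args(["--host", "0.0.0.0", "--port", "7861", "--share"]))
```

Binding to 0.0.0.0 makes the UI reachable from other machines on the network, while --share asks Gradio to create a temporary public link, so both are best used only when remote access is actually needed.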

3.2 CLI: Multilingual TTS and CFM Speech Editing

The lemas_tts.scripts entrypoints are kept for convenience and behave as in the original LEMAS‑TTS repo:

TTS from text:
- Python: lemas_tts.scripts.tts_multilingual
- Shell: lemas_tts/scripts/tts_multilingual.sh
Speech editing:
- Python: lemas_tts.scripts.speech_edit_multilingual
- Shell: lemas_tts/scripts/speech_edit_multilingual.sh

See those scripts for detailed CLI options (model choice, checkpoint paths, speed / NFE / CFG / Sway, etc.).

3.3 CLI: Autoregressive Codec Speech Editing (WIP)

A direct CLI for the autoregressive codec backend is provided as a starting point:

Python entry: lemas_edit.scripts.inference_lemas_editing
Shell helper: lemas_edit/scripts/inference_lemas_editing.sh

This script is a port of the original VoiceCraft/inference_lemas_editing.py and is currently being adapted to the lemas_edit namespace. Its interface may change; please refer to the script source for up‑to‑date arguments and usage.

3.4 Subjective Evaluation

We provide simple setups for subjective listening tests (MUSHRA and ABX preference) under ./eval.

To install the extra dependencies for evaluation, run:

pip install git+https://github.com/descriptinc/audiotools
pip install joypy pandas

Once the extra dependencies are installed, start the ABX preference test with:

cd ./eval/abx
python abx.py      # launch Gradio ABX preference test UI
python plot.py     # aggregate results and plot preference distributions

Once the extra dependencies are installed, start the MUSHRA listening test with:

cd ./eval/mushra
python mushra.py      # launch Gradio MUSHRA listening test UI

4. Acknowledgements

This project builds on, and reuses code from, several open‑source projects:

VoiceCraft – Autoregressive speech editing model.
F5‑TTS – Flow Matching based TTS.
Vocos – Fourier-based neural vocoder.
Seamless-Expressive – Prosody encoder.
UVR5 – Separate an audio file into various stems, using multiple models.
DeepFilterNet – Noise suppression using deep filtering.
audiotools – Audio tools for subjective evaluation.

If you use LEMAS‑Edit in your work, please also consider citing and acknowledging these upstream projects.

5. License

This repository is released under the CC‑BY‑NC‑4.0 license.
See https://creativecommons.org/licenses/by-nc/4.0/ for more details.
