S2S-Arena

Evaluating Paralinguistic Instruction Following in Speech-to-Speech Models

Overview

S2S-Arena is the official repository for our ACL 2026 main conference paper:

S2S-Arena: Evaluating Paralinguistic Instruction Following in Speech-to-Speech Models

Recent speech-to-speech (S2S) systems are becoming increasingly natural spoken agents, but existing benchmarks still rely heavily on text-based evaluation. They often miss key paralinguistic cues such as prosody, emotion, speaking style, and speaker traits, which are essential for expressive and human-like communication.

S2S-Arena is a speech-native benchmark for evaluating instruction-following S2S models. It explicitly assesses both semantic understanding and paralinguistic expression through a multi-level interaction protocol and an arena-style pairwise evaluation framework directly in the speech modality.
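To make the arena-style pairwise idea concrete, here is a minimal sketch of how pairwise preference judgments could be aggregated into per-model win rates. It is an illustration only, not the evaluation toolkit of this repository (which has not been released yet): the function name `win_rates`, the `(model_a, model_b, winner)` record format, the half-point treatment of ties, and the model names are all assumptions made for this example.

```python
from collections import defaultdict


def win_rates(judgments):
    """Aggregate pairwise judgments into a win rate per model.

    `judgments` is an iterable of (model_a, model_b, winner) tuples, where
    `winner` is "a", "b", or "tie". This record format is hypothetical and
    is not taken from the S2S-Arena paper or code.
    """
    wins = defaultdict(float)
    games = defaultdict(int)
    for model_a, model_b, winner in judgments:
        if winner not in ("a", "b", "tie"):
            raise ValueError(f"unknown winner label: {winner!r}")
        games[model_a] += 1
        games[model_b] += 1
        if winner == "a":
            wins[model_a] += 1.0
        elif winner == "b":
            wins[model_b] += 1.0
        else:
            # Assumption: a tie counts as half a win for each side.
            wins[model_a] += 0.5
            wins[model_b] += 0.5
    return {model: wins[model] / games[model] for model in games}


# Placeholder model names and judgments, for illustration only.
example = [
    ("model_x", "model_y", "a"),
    ("model_y", "model_z", "tie"),
    ("model_x", "model_z", "b"),
]
print(win_rates(example))
```

Win rate is the simplest possible aggregate; arena-style leaderboards often use rating systems instead, and the method used by S2S-Arena may differ from this sketch.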

This repository is under active development. We are currently preparing the public release of the dataset, automatic evaluation scripts, and a continuously updated leaderboard.

News

2026-05: S2S-Arena was accepted to the ACL 2026 main conference.
Coming soon: Seed and Augment dataset release.
Coming soon: Automatic evaluation toolkit.
Coming soon: Live leaderboard for S2S model comparison.

Contact

For questions, feedback, or collaboration, please contact:

Feng Jiang: jiangfeng@suat-sz.edu.cn
Benyou Wang: wangbenyou@cuhk.edu.cn

Issues and pull requests are also welcome.

Citation

If you find S2S-Arena useful, please cite our paper. The entry below will be updated with the final ACL proceedings details once they are available.

@inproceedings{jiang2026s2sarena,
  title     = {S2S-Arena: Evaluating Paralinguistic Instruction Following in Speech-to-Speech Models},
  author    = {Jiang, Feng and Lin, Zhiyu and Bu, Fan and Du, Yuhao and Wang, Benyou and Li, Haizhou},
  booktitle = {Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics},
  year      = {2026}
}

License

The license terms will be announced soon. Please contact the authors before using any unreleased dataset or evaluation resources for commercial purposes.

Acknowledgement

We thank the open-source speech and language model communities for their work. S2S-Arena builds on the progress of speech-to-speech modeling and aims to support more reliable, expressive, and human-aligned spoken interaction.
