feat(voice): auto-convert WAV → OGG/Opus in telegram_send_voice by konard · Pull Request #484 · xlabtg/teleton-agent

konard · 2026-05-16T01:15:02Z

Description

Closes #483.

The telegram_send_voice tool now automatically converts WAV files to OGG/Opus before sending them to Telegram. Previously this conversion happened only in TTS mode (via wavToOggOpus in the Groq provider). The voicePath branch sent the file to Telegram as is, so a WAV input arrived as a generic document rather than a voice message, which surfaced as the "unsupported audio format" error.

What changed

src/agent/tools/telegram/media/send-voice.ts
- After validateReadPath(voicePath), the first 12 bytes of the file are read and checked for the WAV signature RIFF…WAVE.
- On a match, the file is run through the existing wavToOggOpus (a pure-JS Opus encoder on WebAssembly, with no dependency on a system ffmpeg).
- The result is written to tmpdir()/teleton-tts/<uuid>.ogg; the path is swapped in and marked for cleanup in the existing finally block.
- OGG/Opus, MP3, and other formats are still sent unchanged (backward compatibility).
- The tool description and the voicePath description were updated to mention WAV and auto-conversion.
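
The signature check in the first bullet can be illustrated with a minimal sketch. It is not the code from this PR: the helper names isWavSignature and matchesAt are hypothetical, and the sketch assumes the caller has already read the first 12 bytes of the file into a Uint8Array. A RIFF/WAVE header carries the ASCII tag "RIFF" at bytes 0-3, a chunk size at bytes 4-7, and the tag "WAVE" at bytes 8-11.

```typescript
// Hypothetical helpers for illustration; the PR's actual implementation may differ.
const RIFF_TAG = [0x52, 0x49, 0x46, 0x46]; // ASCII "RIFF"
const WAVE_TAG = [0x57, 0x41, 0x56, 0x45]; // ASCII "WAVE"

// Returns true when `bytes` contains `expected` starting at `offset`.
function matchesAt(bytes: Uint8Array, offset: number, expected: number[]): boolean {
  for (let i = 0; i < expected.length; i++) {
    if (bytes[offset + i] !== expected[i]) return false;
  }
  return true;
}

// Checks the 12-byte RIFF/WAVE header; bytes 4-7 (the chunk size) are ignored.
function isWavSignature(header: Uint8Array): boolean {
  return (
    header.length >= 12 &&
    matchesAt(header, 0, RIFF_TAG) &&
    matchesAt(header, 8, WAVE_TAG)
  );
}
```

Checking the content rather than the file extension means a WAV saved under a misleading name would still be detected, while an OGG file (which starts with "OggS") falls through untouched.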

Test coverage

src/agent/tools/telegram/media/__tests__/send-voice.test.ts:

Existing test: text → TTS → sent as a voice note (no regressions).
New: a WAV passed via voicePath is transcoded to OGG/Opus and reaches GramJS as .ogg; the temporary file is deleted after sending.
New: an OGG passed via voicePath is sent without calling wavToOggOpus (backward compatibility).
New: if transcoding fails, an explicit error Failed to convert WAV voice file to OGG/Opus: … is returned and sendFile is not called.
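
The control flow these tests pin down can be sketched roughly as follows. This is a simplified model under stated assumptions, not the tool's real code: sendVoiceSketch, VoiceResult, and the injected transcode, send, and remove functions are hypothetical stand-ins for wavToOggOpus, the GramJS sendFile call, and temp-file deletion, and the result shape is invented for the example.

```typescript
// All names and types here are hypothetical; they only model the described behaviour.
interface VoiceResult {
  success: boolean;
  error?: string;
  sentPath?: string;
}

interface VoiceDeps {
  transcode: (wavPath: string) => Promise<string>; // resolves to a temporary .ogg path
  send: (filePath: string) => Promise<void>;
  remove: (filePath: string) => Promise<void>;
}

async function sendVoiceSketch(
  voicePath: string,
  isWav: boolean,
  deps: VoiceDeps
): Promise<VoiceResult> {
  let audioPath = voicePath;
  let generatedFile: string | null = null;
  try {
    if (isWav) {
      try {
        // Swap the path so the sender sees the OGG file instead of the WAV.
        generatedFile = await deps.transcode(voicePath);
        audioPath = generatedFile;
      } catch (err) {
        const detail = err instanceof Error ? err.message : String(err);
        // Fail early: the sender is never called when conversion fails.
        return {
          success: false,
          error: `Failed to convert WAV voice file to OGG/Opus: ${detail}`,
        };
      }
    }
    await deps.send(audioPath);
    return { success: true, sentPath: audioPath };
  } finally {
    // Only files created by this call are removed; the caller's input is left alone.
    if (generatedFile !== null) await deps.remove(generatedFile);
  }
}
```

Injecting the three operations keeps the sketch testable without Telegram or a real encoder, which mirrors how the listed tests assert on calls to wavToOggOpus and sendFile.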

Acceptance criteria

telegram_send_voice accepts voicePath or text and automatically returns a valid OGG/Opus voice message.
No additional codecs or audio libraries are needed: conversion goes through the already bundled opusscript. A system ffmpeg is not required either.
Existing functionality (text, ready-made media) is not broken: see the unchanged test and the explicit OGG-file test.
Tests cover the new conversion path and verify that the resulting file is OGG.
If voicePath is already OGG/Opus, conversion is skipped.

Local checks

npm run lint: no errors.
npx tsc --noEmit: no errors (after npm run build:sdk).
npm run format:check: all files conform to Prettier.
npx vitest run: 210 files, 3522 tests, all green.

Reproducing the bug and verifying the fix

Before the fix:

await telegramSendVoiceExecutor(
  { chatId: "123", voicePath: "downloads/groq-output.wav" },
  ctx
);
// → Telegram rejects the WAV: "unsupported audio format"

After the fix, the WAV is transparently re-encoded to OGG/Opus and Telegram accepts it as a voice note. See the test auto-converts a WAV file passed via voicePath into OGG/Opus before sending.

Fixes #483

Adding .gitkeep for PR creation (default mode). This file will be removed when the task is complete. Issue: xlabtg#483

…voice

Groq TTS already transcodes WAV → OGG/Opus through wavToOggOpus, but the voicePath branch of telegram_send_voice forwarded the file as-is. A WAV forwarded to GramJS reaches Telegram as a generic document instead of a voice note, which surfaced as the "unsupported audio format" failure.

Detect the RIFF/WAVE magic bytes after path validation. When the input is a WAV, transcode in-process via the existing pure-JS Opus encoder (no ffmpeg needed), point GramJS at the OGG temp file, and mark it for the existing cleanup hook. OGG/Opus and other formats fall through unchanged so existing callers stay unaffected.

Closes xlabtg#483

konard · 2026-05-16T01:25:11Z

Working session summary

Done. What was done:

Implementation: src/agent/tools/telegram/media/send-voice.ts

After voicePath is validated, the first bytes of the file are read and checked for the WAV signature RIFF…WAVE.
On a match, the file is run through the existing wavToOggOpus (a pure-JS Opus encoder, with no dependency on a system ffmpeg); the result is written to tmpdir()/teleton-tts/<uuid>.ogg and substituted as audioPath.
The temporary file is recorded in generatedFile so that the existing finally block cleans it up.
OGG/Opus, MP3, and other formats are left unchanged (backward compatibility).
The descriptions of the tool and of the voicePath parameter were updated.

Tests: src/agent/tools/telegram/media/__tests__/send-voice.test.ts

The existing TTS test is kept.
Three tests were added: WAV→OGG conversion and cleanup; OGG passes through without re-encoding; an explicit error when the transcoder fails, with sendFile not called.

Checks: npm run lint, npx tsc --noEmit, npm run format:check, npx vitest run (210 files / 3522 tests), all green.

PR xlabtg/teleton-agent#484 has been updated and moved to ready: #484

This summary was automatically extracted from the AI working session output.

konard · 2026-05-16T01:25:21Z

🤖 Solution Draft Log

This log file contains the complete execution trace of the AI solution draft process.

💰 Cost: $3.145880

📊 Context and tokens usage:

98.1K / 1M (10%) input tokens, 19.2K / 128K (15%) output tokens

Total: (74 new + 94.1K cache writes + 4.2M cache reads) input tokens, 19.2K output tokens, $3.145880 cost

🤖 Models used:

Tool: Anthropic Claude Code
Requested: opus
Model: Claude Opus 4.7 (claude-opus-4-7)

📎 Log file uploaded as Gist (1644KB)

View complete solution draft log

The working session has now ended; feel free to review and add any feedback on the solution draft.

konard · 2026-05-16T01:27:40Z

✅ Ready to merge

This pull request is now ready to be merged:

All CI checks have passed
No merge conflicts
No pending changes

Monitored by hive-mind with --auto-restart-until-mergeable flag


Initial commit with task details

df48231

Adding .gitkeep for PR creation (default mode). This file will be removed when the task is complete. Issue: xlabtg#483

konard temporarily deployed to pr-preview May 16, 2026 01:15 — with GitHub Actions Inactive

konard changed the title ~~[WIP] Add automatic WAV → OGG/Opus conversion for Groq-TTS voice messages in telegram_send_voice,~~ feat(voice): auto-convert WAV → OGG/Opus in telegram_send_voice May 16, 2026

konard marked this pull request as ready for review May 16, 2026 01:24

konard temporarily deployed to pr-preview May 16, 2026 01:25 — with GitHub Actions Inactive

Revert "Initial commit with task details"

e1e1183

This reverts commit df48231.

konard temporarily deployed to pr-preview May 16, 2026 01:28 — with GitHub Actions Inactive

xlabtg merged commit d72cbe6 into xlabtg:main May 16, 2026
18 checks passed

xlabtg merged 3 commits into xlabtg:main from konard:issue-483-078403745758
