QVAC-19254 tts-cpp: Supertonic + Chatterbox/S3Gen GPU sched for Adreno OpenCL by pratiknarola-t · Pull Request #36 · tetherto/qvac-ext-lib-whisper.cpp

pratiknarola-t · 2026-06-01T06:45:59Z

QVAC-19254 — TTS Adreno GPU support

Supersedes #35 — the original PR was auto-closed when its branch was renamed to the correct ticket (QVAC-19213-tts-adreno-gpu → QVAC-19254-tts-adreno-gpu). Same signed commit (5205428e), identical content.

Enables Chatterbox + Supertonic TTS on Adreno (OpenCL / Vulkan) by routing GPU-unsupported ops to CPU via ggml_backend_sched, with a supporting backend-tiering fix for Adreno's OpenCL device-string format.

Commits

Supertonic GPU correctness via ggml_backend_sched — tts-cpp/src/supertonic_*.{cpp,h}. Routes the CPU-only GGML_OP_CUSTOM kernels (depthwise/pointwise Conv1D, LayerNorm, dense matmul) to CPU via sched; everything else runs on the GPU primary. Lifts the prior "GPU rejected because customs are CPU-only" guard. Verified corr ≈ 0.998 vs CPU on Adreno 740 (Vulkan) and macOS (Metal).
backend_selection: parse_adreno_version handles the OpenCL device string — tts-cpp/src/backend_selection.cpp. The OpenCL string is "QUALCOMM Adreno(TM) (OpenCL 3.0 Adreno(TM) 740)" — parsing only the first "Adreno" marker yielded 3 (from "OpenCL 3.0") and mis-tiered the GPU below Vulkan. The fix scans every marker and keeps the largest ≥ 100 (3-digit model). Recovers Adreno 740.
Route S3Gen CONV_TRANSPOSE_1D to CPU via ggml_backend_sched — tts-cpp/src/chatterbox_tts.cpp. The HiFT vocoder uses CONV_TRANSPOSE_1D, which neither ggml-opencl nor ggml-vulkan supports yet. The sched routes that op to CPU while keeping the rest on GPU. Includes the USAGE_WEIGHTS marking + per-call graph rebuild required by sched's GPU↔CPU copy machinery (mutates node->src[]).
--dump-mel-path CLI flag — tts-cpp/src/chatterbox_cli.cpp. Wires the CLI through to the existing opts.dump_mel_path field (the npy dump hooks are already on master), so a debug user can compare CPU vs GPU intermediates via --dump-mel-path /path/to/prefix.
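The marker-scanning fix described for parse_adreno_version can be sketched roughly as below. This is a minimal illustration only: the function name `parse_adreno_version_sketch`, its signature, and the "return 0 when nothing is found" convention are assumptions, not the code in tts-cpp/src/backend_selection.cpp.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for parse_adreno_version; the real signature and
// return convention may differ. Returns 0 when no model number is found.
int parse_adreno_version_sketch(const std::string & desc) {
    static const std::string marker = "Adreno";
    int best = 0;
    for (std::size_t pos = desc.find(marker); pos != std::string::npos;
         pos = desc.find(marker, pos + marker.size())) {
        std::size_t i = pos + marker.size();
        // Skip ahead to the first digit that follows this marker.
        while (i < desc.size() && !std::isdigit(static_cast<unsigned char>(desc[i]))) {
            ++i;
        }
        // Read the contiguous digit run (capped to avoid overflow).
        int value = 0;
        while (i < desc.size() && std::isdigit(static_cast<unsigned char>(desc[i]))) {
            if (value < 100000) {
                value = value * 10 + (desc[i] - '0');
            }
            ++i;
        }
        // Keep only plausible 3-digit-or-larger model numbers, so the "3"
        // from "OpenCL 3.0" is ignored.
        if (value >= 100 && value > best) {
            best = value;
        }
    }
    return best;
}
```

With the OpenCL string quoted above, the first marker yields 3 (discarded) and the second yields 740, which is kept.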

Verification

On-device smoke against the just-synced qvac-ext-ggml/speech (ggml v0.10.2) + the matching Adreno OpenCL/Vulkan PRs (the QVAC-19253 ggml-vulkan PR + the QVAC-19254 ggml-opencl kernels PR):

Smoke	Result
Chatterbox-OpenCL	✅ EXIT=0, 3.44 s WAV, RTF 37.6 (consistent with prior baseline)
Supertonic-OpenCL	✅ EXIT=0, 3.57 s WAV
Supertonic-Vulkan	✅ EXIT=0, 3.57 s WAV — Adreno 740 detected, Qualcomm-gated guards active, no crashes
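For comparing dumped CPU and GPU intermediates numerically, a correlation score like the corr ≈ 0.998 quoted under Commits could be computed along these lines. This sketch assumes the figure is a Pearson correlation over flattened float buffers; the PR does not say which metric was used, and `pearson_corr` is a hypothetical helper, not part of tts-cpp.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Pearson correlation between two equally sized float buffers.
// Returns 0.0 for mismatched, empty, or constant inputs (assumed convention).
double pearson_corr(const std::vector<float> & a, const std::vector<float> & b) {
    if (a.size() != b.size() || a.empty()) {
        return 0.0;
    }
    const double n = static_cast<double>(a.size());
    double mean_a = 0.0, mean_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        mean_a += a[i];
        mean_b += b[i];
    }
    mean_a /= n;
    mean_b /= n;

    double cov = 0.0, var_a = 0.0, var_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double da = a[i] - mean_a;
        const double db = b[i] - mean_b;
        cov   += da * db;
        var_a += da * da;
        var_b += db * db;
    }
    const double denom = std::sqrt(var_a * var_b);
    return denom > 0.0 ? cov / denom : 0.0;
}
```

Loading the .npy dumps into the two vectors is left out here; any NumPy-compatible reader would do.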

Hygiene

All source comments scrubbed of QVAC-#### ticket refs + internal hypothesis-log IDs (H016/H017).
The verbose model_ctx / s3gen_sched_alloc blocks were compressed from 8/6 lines to 5/2 while preserving the essential SIGSEGV-prevention + threading-race rationale.
Diff confirms only comments changed in the cleanup (apart from the one trailing-comment edit on the dump_mel_path field declaration).

…atterbox/S3Gen)

Route Supertonic and Chatterbox/S3Gen GPU graphs through ggml_backend_sched so ops the GPU backend cannot run (CONV_TRANSPOSE_1D in the HiFT vocoder; the CPU-only GGML_OP_CUSTOM kernels in the Supertonic vector estimator/vocoder) are routed to CPU instead of asserting.

Capability-gate the Chatterbox HiFT scheduler: a backend that runs every op in the graph (Metal, CUDA, CPU) computes directly on the primary backend; only a backend missing an op (Adreno OpenCL / Vulkan) uses the [GPU,CPU] scheduler. The gate queries ggml_backend_supports_op per node, so it is generic and does not regress iOS Metal (which supports CONV_TRANSPOSE_1D natively and otherwise aborts in the scheduler's graph-split).

Gate Android GPU selection to Qualcomm Adreno: other Android GPU vendors are unvalidated and at least one (ARM Mali) aborts the host process uncatchably from graph compute, so non-Adreno devices fall through to CPU.

parse_adreno_version handles the OpenCL device-name string (e.g. 'OpenCL 3.0 Adreno(TM) 740') by scanning every marker for the real model number.

Also expose the pre-existing S3Gen mel/encoder/CFM intermediate dump via the --dump-mel-path CLI flag.
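The capability gate in the commit message amounts to "use the scheduler only if some node is unsupported on the primary backend". A minimal sketch of that decision follows, using stand-in types rather than ggml's: `Node`, `SupportsOp`, `ComputePath`, and `choose_compute_path` are all hypothetical names, and in the real code the predicate would be the per-node ggml_backend_supports_op query the message mentions.

```cpp
#include <functional>
#include <string>
#include <vector>

// Stand-in for a graph node; ggml's real tensor/graph types are not used here.
struct Node {
    std::string op;
};

// Stand-in for "does the primary backend support this node?".
using SupportsOp = std::function<bool(const Node &)>;

enum class ComputePath {
    Direct,    // every op runs on the primary backend
    Scheduler  // at least one op must fall back to CPU via a [GPU,CPU] scheduler
};

// Walk the graph once; a single unsupported node is enough to require the scheduler.
ComputePath choose_compute_path(const std::vector<Node> & graph,
                                const SupportsOp & primary_supports) {
    for (const Node & node : graph) {
        if (!primary_supports(node)) {
            return ComputePath::Scheduler;
        }
    }
    return ComputePath::Direct;
}
```

Because the decision depends only on what the backend reports, a backend that supports every op takes the direct path without any platform-specific check.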

pratiknarola-t requested review from a team as code owners June 1, 2026 06:46

pratiknarola-t mentioned this pull request Jun 1, 2026: QVAC-19213 tts-cpp: Supertonic + Chatterbox/S3Gen GPU sched for Adreno OpenCL #35 (Closed)

pratiknarola-t wants to merge 1 commit into master from QVAC-19254-tts-adreno-gpu
