fix(stable_audio): align batched initial audio with prompts (#13629) by Anai-Guo · Pull Request #13659 · huggingface/diffusers

Anai-Guo · 2026-04-30T07:26:54Z

Summary

Fixes Issue 1 in #13629.

StableAudioPipeline.prepare_latents() expanded batched initial audio with encoded_audio.repeat((num_waveforms_per_prompt, 1, 1)), which tiles the whole batch: for two prompts and num_waveforms_per_prompt = 2 the layout is [audio0, audio1, audio0, audio1]. encode_prompt() expands the corresponding text and duration embeds per prompt, giving [prompt0, prompt0, prompt1, prompt1]. For batched audio-to-audio with num_waveforms_per_prompt > 1, prompts and initial audio therefore became misaligned, so a generation could be conditioned on another prompt's initial audio. Existing tests assert only the output shape, so they did not catch this.
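To make the misalignment concrete, here is a minimal sketch that uses plain tensors only, not the pipeline itself. The values 10 and 20 are arbitrary labels for the two prompts' initial audio, and the per-prompt order of the embeds is modelled with index tensors as an assumption based on the description above.

```python
import torch

# Stand-in for encoded initial audio: 2 prompts, each a (1, 1) latent.
# 10.0 and 20.0 are arbitrary labels, not real latents.
encoded_audio = torch.tensor([10.0, 20.0]).reshape(2, 1, 1)
num_waveforms_per_prompt = 2

# Old expansion: repeat tiles the whole batch along dim 0.
tiled = encoded_audio.repeat((num_waveforms_per_prompt, 1, 1))
tiled_order = tiled[:, 0, 0].tolist()

# Source prompt index of each audio row after tiling: 0, 1, 0, 1.
audio_ids = torch.arange(2).repeat(num_waveforms_per_prompt)

# Assumed order of the prompt embeds (per-prompt expansion): 0, 0, 1, 1.
prompt_ids = torch.arange(2).repeat_interleave(num_waveforms_per_prompt)

# Rows where the audio belongs to a different prompt than the embeds.
mismatched_rows = (audio_ids != prompt_ids).nonzero().flatten().tolist()

print(tiled_order)      # [10.0, 20.0, 10.0, 20.0]
print(mismatched_rows)  # [1, 2]
```

In this toy case, two of the four generations would be paired with the other prompt's audio, while the output shape is identical to the correct one.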

Fix

Switch repeat to repeat_interleave(..., dim=0) so batched initial audio expands as [audio0, audio0, audio1, audio1], matching the prompt expansion.
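A content-based check along the following lines might catch a regression that a shape-only assertion misses. This is only a sketch: expand_per_prompt is a hypothetical helper written for illustration, not a function in diffusers, and the tensor sizes are arbitrary.

```python
import torch


def expand_per_prompt(encoded_audio: torch.Tensor, num_waveforms_per_prompt: int) -> torch.Tensor:
    """Hypothetical helper mirroring the fix: repeat each batch row in place."""
    return encoded_audio.repeat_interleave(num_waveforms_per_prompt, dim=0)


# Arbitrary sizes for illustration; random values make every prompt's audio distinct.
batch_size, channels, length = 3, 4, 5
num_waveforms_per_prompt = 2
torch.manual_seed(0)
encoded_audio = torch.randn(batch_size, channels, length)

expanded = expand_per_prompt(encoded_audio, num_waveforms_per_prompt)

# Row i of the expanded batch should come from prompt i // num_waveforms_per_prompt.
aligned = all(
    torch.equal(expanded[i], encoded_audio[i // num_waveforms_per_prompt])
    for i in range(expanded.shape[0])
)

print(tuple(expanded.shape), aligned)
```

Comparing row contents rather than shapes is the design choice here: the tiled and per-prompt layouts have the same shape, so only a value-level check distinguishes them.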

Verification

The reproduction snippet from #13629 now prints the expected order:

import torch
from diffusers import StableAudioPipeline
# (same DummyVAE / SimpleNamespace scaffolding as in the issue)
print(latents[:, 0, 0].tolist())  # [10.0, 10.0, 20.0, 20.0]

num_waveforms_per_prompt == 1 is unchanged because repeat_interleave and repeat are equivalent in that case.
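The equivalence claim can be spot-checked on a small tensor. The shape below is an arbitrary choice for illustration, not one taken from the pipeline.

```python
import torch

# Arbitrary (batch, channels, length) tensor with distinct values per row.
encoded_audio = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4)

# With a repeat count of 1, both calls return a copy of the input.
same_for_one = torch.equal(
    encoded_audio.repeat((1, 1, 1)),
    encoded_audio.repeat_interleave(1, dim=0),
)

# With a repeat count of 2, the row orders diverge.
differ_for_two = not torch.equal(
    encoded_audio.repeat((2, 1, 1)),
    encoded_audio.repeat_interleave(2, dim=0),
)

print(same_for_one, differ_for_two)
```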

🤖 Generated with Claude Code

fix(stable_audio): align batched initial audio with prompts in prepar…

f2f294f

…e_latents

github-actions bot added the pipelines and size/S (PR with diff < 50 LOC) labels on Apr 30, 2026

Anai-Guo wants to merge 1 commit into huggingface:main from Anai-Guo:fix-stable-audio-prompt-alignment
