
Add rolling KV cache hook #13569 (Draft)

gueraf wants to merge 5 commits into huggingface:main from gueraf:sf-rolling-kv-minimal

Conversation


@gueraf gueraf commented Apr 27, 2026

Summary

  • Add a Self Forcing/Wan-focused (yet extensible) rolling self-attention KV cache hook for autoregressive video generation.
  • Scope this first upstream step to rolling self-attention caching; cross-attention caching and pinned/sink-frame retention are not included yet.
  • Add tests for cache append/overwrite/window behavior, context isolation, CacheMixin wiring, and Wan attention hook selection.
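The rolling-window behavior described above (append new key/value entries, overwrite the oldest once the window is full) can be sketched in plain Python. This is an illustrative toy, not the PR's actual hook: the class name, method names, and the use of string placeholders instead of attention tensors are all invented for the example.

```python
from collections import deque


class RollingKVCache:
    """Toy sketch of a rolling self-attention KV cache: retains at most
    `window` of the most recent key/value entries, evicting the oldest
    entry when a new one is appended to a full cache."""

    def __init__(self, window: int):
        self.window = window
        # deque(maxlen=...) drops the oldest item automatically when full,
        # which is exactly the rolling overwrite behavior we want.
        self.keys = deque(maxlen=window)
        self.values = deque(maxlen=window)

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)

    def get(self):
        # Return the current window contents, oldest first.
        return list(self.keys), list(self.values)


# Append five steps into a window of three: steps 0 and 1 get evicted.
cache = RollingKVCache(window=3)
for step in range(5):
    cache.append(f"k{step}", f"v{step}")

keys, values = cache.get()
print(keys)  # ['k2', 'k3', 'k4']
```

In the real hook the entries would be per-frame key/value tensors for each self-attention layer, but the eviction policy tested here (append, overwrite, fixed window) is the same.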

Motivation

This is a scoped follow-up to #12773 and a first step toward #12600. The previous draft explored similar functionality but also included Krea-specific experiments and broader integration work. This PR keeps the reusable self-attention cache mechanism isolated so it can be reviewed independently.

As for practical use, we (https://odyssey.ml/) would like to rely on the Hugging Face Diffusers ecosystem to ship Self-Forcing-like models without having to ship many custom modules, ideally none.

Bitwise-equivalence results against the original Self-Forcing implementation, along with example videos, are tracked at:
https://github.com/gueraf/self-forcing-diffusers/releases/tag/parity-artifacts-long-equivalence-7chunks-20260428

@github-actions github-actions Bot added models, tests, hooks, size/L (PR with diff > 200 LOC) labels Apr 27, 2026
@gueraf gueraf marked this pull request as ready for review April 28, 2026 09:09
@gueraf gueraf marked this pull request as draft May 5, 2026 10:49

Labels

hooks · models · size/L (PR with diff > 200 LOC) · tests
