Draft
Commits
29 commits
bb70cab
initial
kashif Dec 16, 2025
f4990a9
use the scheduler
kashif Dec 16, 2025
c74d66d
added Block-wise sampling
kashif Dec 16, 2025
9a9470c
add hybrid
kashif Dec 16, 2025
886a76d
hybrid sample training
kashif Dec 16, 2025
6f94f4f
initial trainer
kashif Dec 20, 2025
c1bf227
added llada2
kashif Jan 1, 2026
f2b9223
add sample_llada2.py
kashif Jan 1, 2026
cf18acd
added api
kashif Jan 4, 2026
9209924
fix llada2 sampling
kashif Jan 4, 2026
5835d2c
formatting
kashif Jan 4, 2026
264886a
Merge branch 'main' into diff-d2
kashif Jan 6, 2026
541c03b
Merge branch 'main' into diff-d2
kashif Jan 11, 2026
1b3ffd7
make fix-copies
kashif Jan 11, 2026
c263f16
fix docs
kashif Jan 11, 2026
b97e11c
add dflash pipeline
kashif Jan 11, 2026
471bfd3
added SDAR JET pipeline and scheduler
kashif Jan 11, 2026
ec4b2a4
Merge branch 'main' into diff-d2
kashif Jan 12, 2026
2f8a48b
Merge branch 'main' into diff-d2
kashif Jan 18, 2026
3772ca1
Merge branch 'main' into diff-d2
kashif Jan 19, 2026
be16390
Merge branch 'main' into diff-d2
kashif Feb 4, 2026
3a4ea4d
Merge branch 'main' into diff-d2
kashif Feb 9, 2026
88fe29d
initial review changes
kashif Feb 9, 2026
61fdfda
add discrete diffusion mixin
kashif Feb 9, 2026
939c7ef
Pre-compute alpha schedule and Restructure step() with if/elif/else f…
kashif Feb 9, 2026
597060d
merge BlockTokenDiffusionScheduler into TokenDiffusionScheduler
kashif Feb 9, 2026
20b1134
support for llada 2.1
kashif Feb 10, 2026
70fc4c1
fix sampling example to use pipeline
kashif Feb 10, 2026
0432b80
formatting
kashif Feb 10, 2026
63 changes: 63 additions & 0 deletions docs/source/en/api/pipelines/block_refinement.md
@@ -0,0 +1,63 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Block Refinement

`BlockRefinementPipeline` performs block-wise iterative refinement over a masked token template, sampling and
committing tokens based on confidence.
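
The idea of confidence-based commitment can be illustrated with a toy sketch. Everything below is hypothetical and written only for illustration: `toy_model`, `refine_block`, `generate`, and the `MASK` id are not part of the pipeline, and the commit rule (an equal share of the block per step, most confident first) is an assumption rather than the pipeline's actual rule.

```python
import random

MASK = -1  # hypothetical mask token id used only in this sketch


def toy_model(tokens, rng):
    """Hypothetical denoiser: returns a (token, confidence) guess per position."""
    return [(rng.randrange(100), rng.random()) for _ in tokens]


def refine_block(tokens, start, end, steps, rng):
    """Fill tokens[start:end] over at most `steps` rounds, most confident guesses first."""
    for step in range(steps):
        masked = [i for i in range(start, end) if tokens[i] == MASK]
        if not masked:
            break
        guesses = toy_model(tokens, rng)
        # Commit an equal share per step; the final step commits whatever is left.
        remaining_steps = steps - step
        n_commit = -(-len(masked) // remaining_steps)  # ceiling division
        best = sorted(masked, key=lambda i: guesses[i][1], reverse=True)[:n_commit]
        for i in best:
            tokens[i] = guesses[i][0]
    return tokens


def generate(prompt_ids, gen_length, block_length, steps, seed=0):
    """Append a fully masked template to the prompt and refine it block by block."""
    rng = random.Random(seed)
    tokens = list(prompt_ids) + [MASK] * gen_length
    for start in range(len(prompt_ids), len(tokens), block_length):
        end = min(start + block_length, len(tokens))
        refine_block(tokens, start, end, steps, rng)
    return tokens
```

With `steps >= 1`, every position in the template is committed by the time its block finishes, so the result contains no mask tokens.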

## Config defaults

You can set default sampling parameters when creating the pipeline. Any parameter left as `None` in `__call__`
falls back to the value stored in `pipe.config`.

```py
from diffusers import BlockRefinementPipeline

# `model` and `tokenizer` are assumed to be loaded beforehand.
pipe = BlockRefinementPipeline(
    model=model,
    tokenizer=tokenizer,
    gen_length=256,
    block_length=32,
    steps=16,
    temperature=0.8,
    sampling_method="multinomial",
)

out = pipe(prompt="Explain gradient descent.")
print(out.texts[0])
```
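
The fallback behavior can be pictured with a small stand-in. This is a sketch of the general pattern only: `ToyPipeline` and its `resolve` method are hypothetical names invented here, not part of `BlockRefinementPipeline`.

```python
class ToyPipeline:
    """Hypothetical stand-in that mimics the None-falls-back-to-config pattern."""

    def __init__(self, gen_length=128, steps=32, temperature=0.0):
        self.config = {"gen_length": gen_length, "steps": steps, "temperature": temperature}

    def resolve(self, **overrides):
        # A value of None means "not provided", so the stored default wins.
        resolved = {}
        for name, default in self.config.items():
            value = overrides.get(name)
            resolved[name] = default if value is None else value
        return resolved


pipe = ToyPipeline(gen_length=256, steps=16, temperature=0.8)
print(pipe.resolve(steps=None, temperature=0.2))
# {'gen_length': 256, 'steps': 16, 'temperature': 0.2}
```

Checking `is None` rather than truthiness matters here: an explicit `temperature=0.0` is a valid override and should not be replaced by the default.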

## Callbacks

Callbacks run after each refinement step and can inspect or override the current tokens.

```py
def on_step_end(pipe, step, timestep, callback_kwargs):
    cur_x = callback_kwargs["cur_x"]
    # Inspect or modify `cur_x` here.
    return {"cur_x": cur_x}

out = pipe(
    prompt="Write a short poem.",
    callback_on_step_end=on_step_end,
    callback_on_step_end_tensor_inputs=["cur_x"],
)
```
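
To see how a returned value can override the loop state, here is a minimal sketch of a refinement loop with a callback hook. `run_steps` and `clamp_tokens` are hypothetical, and the "refinement" is a placeholder increment. The hook mirrors the callback convention used by other Diffusers pipelines, but it is an assumption that this pipeline implements it the same way.

```python
def run_steps(num_steps, cur_x, callback_on_step_end=None, callback_on_step_end_tensor_inputs=("cur_x",)):
    """Hypothetical refinement loop showing where the callback hooks in."""
    for step in range(num_steps):
        cur_x = [t + 1 for t in cur_x]  # stand-in for one refinement step
        if callback_on_step_end is not None:
            local_values = {"cur_x": cur_x}
            # Only the requested names are handed to the callback.
            callback_kwargs = {name: local_values[name] for name in callback_on_step_end_tensor_inputs}
            outputs = callback_on_step_end(None, step, step, callback_kwargs)
            # Anything the callback returns replaces the loop's current value.
            cur_x = outputs.pop("cur_x", cur_x)
    return cur_x


def clamp_tokens(pipe, step, timestep, callback_kwargs):
    """Example callback: cap every token id at 2."""
    return {"cur_x": [min(t, 2) for t in callback_kwargs["cur_x"]]}


print(run_steps(3, [0, 1]))                # [3, 4]
print(run_steps(3, [0, 1], clamp_tokens))  # [2, 2]
```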

## BlockRefinementPipeline
[[autodoc]] BlockRefinementPipeline
    - all
    - __call__

## BlockRefinementPipelineOutput
[[autodoc]] pipelines.BlockRefinementPipelineOutput
23 changes: 23 additions & 0 deletions docs/source/en/api/pipelines/block_token_diffusion.md
@@ -0,0 +1,23 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Block Token Diffusion

`BlockTokenDiffusionPipeline` performs token diffusion by iterating over fixed-size blocks of the sequence.
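
As a rough illustration of what "fixed-size blocks" means for a sequence whose length is not a multiple of the block size, the helper below lists the index ranges that would be visited. `block_ranges` is a hypothetical name, and letting the last block be shorter is an assumption; the pipeline may pad or reject such inputs instead.

```python
def block_ranges(seq_len, block_size):
    """Return half-open (start, end) index pairs covering range(seq_len) in order."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [(start, min(start + block_size, seq_len)) for start in range(0, seq_len, block_size)]


print(block_ranges(10, 4))  # [(0, 4), (4, 8), (8, 10)]
```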

## BlockTokenDiffusionPipeline
[[autodoc]] BlockTokenDiffusionPipeline
    - all
    - __call__

## BlockTokenDiffusionPipelineOutput
[[autodoc]] pipelines.BlockTokenDiffusionPipelineOutput
23 changes: 23 additions & 0 deletions docs/source/en/api/pipelines/hybrid_token_diffusion.md
@@ -0,0 +1,23 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Hybrid Token Diffusion

`HybridTokenDiffusionPipeline` is an alias of `TokenDiffusionPipeline` for hybrid-transition schedulers.

## HybridTokenDiffusionPipeline
[[autodoc]] HybridTokenDiffusionPipeline
    - all
    - __call__

## TokenDiffusionPipelineOutput
[[autodoc]] pipelines.TokenDiffusionPipelineOutput
23 changes: 23 additions & 0 deletions docs/source/en/api/pipelines/llada2.md
@@ -0,0 +1,23 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# LLaDA2

`LLaDA2Pipeline` adapts block refinement sampling for LLaDA2-style token diffusion models.

## LLaDA2Pipeline
[[autodoc]] LLaDA2Pipeline
    - all
    - __call__

## LLaDA2PipelineOutput
[[autodoc]] pipelines.LLaDA2PipelineOutput
7 changes: 7 additions & 0 deletions docs/source/en/api/pipelines/overview.md
@@ -34,6 +34,8 @@ The table below lists all the pipelines currently available in 🤗 Diffusers an
| [AudioLDM2](audioldm2) | text2audio |
| [AuraFlow](aura_flow) | text2image |
| [BLIP Diffusion](blip_diffusion) | text2image |
| [Block Refinement](block_refinement) | text2text |
| [Block Token Diffusion](block_token_diffusion) | text2text |
| [Bria 3.2](bria_3_2) | text2image |
| [CogVideoX](cogvideox) | text2video |
| [Consistency Models](consistency_models) | unconditional image generation |
@@ -47,11 +49,14 @@ The table below lists all the pipelines currently available in 🤗 Diffusers an
| [Dance Diffusion](dance_diffusion) | unconditional audio generation |
| [DDIM](ddim) | unconditional image generation |
| [DDPM](ddpm) | unconditional image generation |
| [DFlash](dflash) | text2text |
| [SDAR](sdar) | text2text |
| [DeepFloyd IF](deepfloyd_if) | text2image, image2image, inpainting, super-resolution |
| [DiffEdit](diffedit) | inpainting |
| [DiT](dit) | text2image |
| [Flux](flux) | text2image |
| [Hunyuan-DiT](hunyuandit) | text2image |
| [Hybrid Token Diffusion](hybrid_token_diffusion) | text2text |
| [I2VGen-XL](i2vgenxl) | image2video |
| [InstructPix2Pix](pix2pix) | image editing |
| [Kandinsky 2.1](kandinsky) | text2image, image2image, inpainting, interpolation |
@@ -62,6 +67,7 @@ The table below lists all the pipelines currently available in 🤗 Diffusers an
| [Latent Diffusion](latent_diffusion) | text2image, super-resolution |
| [Latte](latte) | text2image |
| [LEDITS++](ledits_pp) | image editing |
| [LLaDA2](llada2) | text2text |
| [Lumina-T2X](lumina) | text2image |
| [Marigold](marigold) | depth-estimation, normals-estimation, intrinsic-decomposition |
| [MultiDiffusion](panorama) | text2image |
@@ -83,6 +89,7 @@ The table below lists all the pipelines currently available in 🤗 Diffusers an
| [T2I-Adapter](stable_diffusion/adapter) | text2image |
| [Text2Video](text_to_video) | text2video, video2video |
| [Text2Video-Zero](text_to_video_zero) | text2video |
| [Token Diffusion](token_diffusion) | text2text |
| [unCLIP](unclip) | text2image, image variation |
| [UniDiffuser](unidiffuser) | text2image, image2text, image variation, text variation, unconditional image generation, unconditional audio generation |
| [Value-guided planning](value_guided_sampling) | value guided sampling |
24 changes: 24 additions & 0 deletions docs/source/en/api/pipelines/token_diffusion.md
@@ -0,0 +1,24 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Token Diffusion

`TokenDiffusionPipeline` provides a generic token-space diffusion sampler for discrete denoising over token IDs. It
pairs a token denoiser model with a token diffusion scheduler.

## TokenDiffusionPipeline
[[autodoc]] TokenDiffusionPipeline
    - all
    - __call__

## TokenDiffusionPipelineOutput
[[autodoc]] pipelines.TokenDiffusionPipelineOutput
21 changes: 21 additions & 0 deletions docs/source/en/api/schedulers/block_token_diffusion.md
@@ -0,0 +1,21 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# BlockTokenDiffusionScheduler

`BlockTokenDiffusionScheduler` extends `TokenDiffusionScheduler` with block-wise updates over token positions.

## BlockTokenDiffusionScheduler
[[autodoc]] BlockTokenDiffusionScheduler

## TokenDiffusionSchedulerOutput
[[autodoc]] schedulers.scheduling_token_diffusion.TokenDiffusionSchedulerOutput
22 changes: 22 additions & 0 deletions docs/source/en/api/schedulers/hybrid_token_diffusion.md
@@ -0,0 +1,22 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# HybridTokenDiffusionScheduler

`HybridTokenDiffusionScheduler` defines hybrid discrete token diffusion updates with separate transitions for
masked and unmasked tokens.

## HybridTokenDiffusionScheduler
[[autodoc]] HybridTokenDiffusionScheduler

## HybridTokenDiffusionSchedulerOutput
[[autodoc]] schedulers.scheduling_hybrid_token_diffusion.HybridTokenDiffusionSchedulerOutput
22 changes: 22 additions & 0 deletions docs/source/en/api/schedulers/overview.md
@@ -54,6 +54,28 @@ Many schedulers are implemented from the [k-diffusion](https://github.com/crowso
| exponential | init with `timestep_spacing="linspace"`, `use_exponential_sigmas=True` |
| beta | init with `timestep_spacing="linspace"`, `use_beta_sigmas=True` |

## Token diffusion schedulers

These schedulers operate over categorical token IDs instead of continuous latents. They are designed for discrete
token diffusion models and expose the same `set_timesteps`/`step` interface as other schedulers.

Differences between the discrete token schedulers:
- `TokenDiffusionScheduler`: token-level diffusion with per-token corruption (e.g. mask/uniform) and a `step` method that performs one denoising update from the model's logits.
- `BlockTokenDiffusionScheduler`: block-wise token diffusion that updates fixed-size blocks in parallel.
- `HybridTokenDiffusionScheduler`: hybrid transitions that combine token- and block-wise updates in the same schedule.
- `DFlashTokenDiffusionScheduler`: block diffusion scheduler specialized for speculative decoding with a draft model and target acceptance.
- `SDARTokenDiffusionScheduler`: block diffusion scheduler with remasking strategies (sequential/low-confidence/entropy-bounded) per step.
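
The shared `set_timesteps`/`step` shape can be sketched with a toy absorbing-state scheduler. `ToyMaskScheduler` is hypothetical and is not any of the schedulers listed above: it takes already-decoded token IDs instead of logits, and its reveal rule (each masked position is revealed with probability `1 / t`, which corresponds to a linear schedule) is an assumption chosen for brevity.

```python
import random


class ToyMaskScheduler:
    """Toy absorbing-state scheduler; NOT the TokenDiffusionScheduler from this PR."""

    def __init__(self, mask_token_id, seed=0):
        self.mask_token_id = mask_token_id
        self.timesteps = []
        self._rng = random.Random(seed)

    def set_timesteps(self, num_inference_steps):
        # Count down from the noisiest step to 1.
        self.timesteps = list(range(num_inference_steps, 0, -1))

    def step(self, predicted_ids, timestep, sample):
        # Reveal each masked position with probability 1/t; at t == 1 everything is revealed.
        out = list(sample)
        for i, tok in enumerate(sample):
            if tok == self.mask_token_id and self._rng.random() < 1.0 / timestep:
                out[i] = predicted_ids[i]
        return out


scheduler = ToyMaskScheduler(mask_token_id=-1)
scheduler.set_timesteps(4)
sample = [-1] * 6
for t in scheduler.timesteps:
    predicted = [7] * len(sample)  # stand-in for an argmax over model logits
    sample = scheduler.step(predicted, t, sample)
print(sample)  # [7, 7, 7, 7, 7, 7]
```

Positions that were already revealed are never re-masked in this sketch, which is the main way it differs from schedulers with remasking strategies.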

[[autodoc]] TokenDiffusionScheduler

[[autodoc]] BlockTokenDiffusionScheduler

[[autodoc]] HybridTokenDiffusionScheduler

[[autodoc]] DFlashTokenDiffusionScheduler

[[autodoc]] SDARTokenDiffusionScheduler

All schedulers are built from the base [`SchedulerMixin`] class which implements low level utilities shared by all schedulers.

## SchedulerMixin
22 changes: 22 additions & 0 deletions docs/source/en/api/schedulers/token_diffusion.md
@@ -0,0 +1,22 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# TokenDiffusionScheduler

`TokenDiffusionScheduler` defines discrete token diffusion updates over categorical token IDs and supports multiple
forward processes and alpha schedules.
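
To make "forward process" and "alpha schedule" concrete, the sketch below corrupts a token sequence under two common choices. All names (`alpha_linear`, `alpha_cosine`, `corrupt`) are hypothetical, and the specific schedules are assumptions; the scheduler's actual options may differ. Here `alpha(t)` is the probability that a token survives uncorrupted at time `t` in `[0, 1]`.

```python
import math
import random


def alpha_linear(t):
    """Survival probability that falls linearly from 1 at t=0 to 0 at t=1."""
    return 1.0 - t


def alpha_cosine(t):
    """Survival probability that follows a quarter cosine wave from 1 towards 0."""
    return math.cos(0.5 * math.pi * t)


def corrupt(tokens, t, alpha, rng, mask_token_id=None, vocab_size=None):
    """Keep each token with probability alpha(t); otherwise replace it."""
    keep = alpha(t)
    out = []
    for tok in tokens:
        if rng.random() < keep:
            out.append(tok)
        elif mask_token_id is not None:
            out.append(mask_token_id)  # absorbing ("mask") process
        else:
            out.append(rng.randrange(vocab_size))  # uniform process
    return out


rng = random.Random(0)
print(corrupt([1, 2, 3], 0.0, alpha_linear, rng, mask_token_id=-1))  # [1, 2, 3]
print(corrupt([1, 2, 3], 1.0, alpha_linear, rng, mask_token_id=-1))  # [-1, -1, -1]
```

Passing `vocab_size` instead of `mask_token_id` switches the same function to the uniform process, where corrupted positions receive a random token ID rather than a mask.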

## TokenDiffusionScheduler
[[autodoc]] TokenDiffusionScheduler

## TokenDiffusionSchedulerOutput
[[autodoc]] schedulers.scheduling_token_diffusion.TokenDiffusionSchedulerOutput