Open
44 commits
687a81f
chore: make libwebp optional and support system libwebp (#1387)
wbruna Apr 5, 2026
9369ab7
feat: inpaint improvements (#1357)
stduhpf Apr 5, 2026
7397dda
feat: add webm support (#1391)
leejet Apr 5, 2026
359eb8b
refactor: apply RAII ownership to examples (#1392)
leejet Apr 6, 2026
5bf438d
refactor: split examples common into header and source (#1393)
leejet Apr 6, 2026
8afbeb6
chore: normalize text files to utf-8 without bom (#1394)
leejet Apr 6, 2026
dd75372
fix: correct double increment on flow denoisers sigma calculations (#…
wbruna Apr 8, 2026
e8323ca
feat: add flux2 small decoder support (#1402)
leejet Apr 8, 2026
be9f51b
refactor: simplify DiscreteFlowDenoiser (#1405)
wbruna Apr 11, 2026
118489e
chore: harden safetensors and gguf loading code (#1404)
wbruna Apr 11, 2026
7ade90e
feat: add sdcpp api support (#1407)
leejet Apr 11, 2026
fd35047
feat: use sdcpp-webui as embedded webui (#1408)
leejet Apr 11, 2026
12a369c
docs: update readme
leejet Apr 11, 2026
6b675a5
docs: update readme
leejet Apr 11, 2026
ee5bf95
chore: allow building the embedded UI header separately (#1415)
wbruna Apr 15, 2026
9ac7b67
refactor: introduce shared tokenizer abstraction and split implementa…
leejet Apr 15, 2026
c41c5de
feat: add left padding support to tokenizers (#1424)
leejet Apr 15, 2026
5c243db
feat: add ernie image support (#1427)
leejet Apr 16, 2026
d73b419
feat: SDXS-09 support and update doc (#1356)
akleine Apr 16, 2026
1b4e9be
feat: add er_sde sampler (#1403)
rmatif Apr 16, 2026
84fc544
fix: skip empty prompt segments around attention range (#1429)
leejet Apr 16, 2026
a564fdf
refactor: remove is_xl guard wrapper in get_sd_version (#1430)
leejet Apr 16, 2026
2bcff67
fix: correct dpm++2s_a second model call (#1435)
wbruna Apr 18, 2026
6a9cb31
fix: tune ernie-image default flow shift (#1433)
Green-Sky Apr 18, 2026
f3f69e2
feat: add DPM++ (2S) Ancestral implementation for flow models (#1428)
wbruna Apr 18, 2026
4d626d2
feat(server): implement vid_gen async API and mode-aware capabilities…
leejet Apr 18, 2026
3c99f70
ci: skip docker image build job on pull requests (#1439)
leejet Apr 18, 2026
7d33d4b
chore: enable MSVC parallel compilation with /MP (#1438)
leejet Apr 18, 2026
e77e4c4
feat: adapt LCM for flow models (#1413)
wbruna Apr 19, 2026
7023fc4
fix: correct image to image DDIM and TCD (#1410)
wbruna Apr 19, 2026
6614334
refactor: move model file IO into dedicated module (#1442)
leejet Apr 19, 2026
0a7ae07
feat: add restricted torch legacy checkpoint loading (#1443)
leejet Apr 19, 2026
44cca3d
feat: support safetensors export in convert mode (#1444)
leejet Apr 19, 2026
c97702e
feat: add sd-webui style Hires. fix support (#1451)
leejet Apr 22, 2026
b8bdffc
feat: add more built-in highres upscalers (#1456)
leejet Apr 23, 2026
970c4a3
chore: replace some NULL with nullptr + use "%zu" for printing some s…
akleine Apr 27, 2026
f40a707
feat: add sdcpp-specific generation metadata to image outputs (#1462)
leejet Apr 27, 2026
a81677f
docs: performance tips markup (#1460)
dwgrth Apr 27, 2026
331cfa5
fix: release VAE compute buffer after tiled encoding (#1465)
wbruna Apr 29, 2026
b8079e2
feat: transition from compile-time to runtime backend discovery (#1448)
wbruna Apr 29, 2026
3d6064b
perf: speed up tensor_to_sd_image conversion (#1466)
leejet Apr 29, 2026
9097ce5
fix: skip empty MultiLoraAdapter when no LoRAs target a model (#1469)
fszontagh May 6, 2026
586b6f1
feat: adapt res samplers for flow models for eta > 0 (#1436)
wbruna May 6, 2026
90e87bc
feat: add max-vram based segmented param offload (#1476)
leejet May 6, 2026
4 changes: 4 additions & 0 deletions .github/workflows/build.yml
@@ -21,6 +21,7 @@ on:
"**/*.c",
"**/*.cpp",
"**/*.cu",
"examples/server/frontend",
"examples/server/frontend/**",
]
pull_request:
@@ -35,6 +36,7 @@ on:
"**/*.c",
"**/*.cpp",
"**/*.cu",
"examples/server/frontend",
"examples/server/frontend/**",
]

@@ -174,6 +176,7 @@ jobs:

build-and-push-docker-images:
name: Build and push container images
if: ${{ github.event_name != 'pull_request' }}
runs-on: ubuntu-latest

permissions:
@@ -239,6 +242,7 @@ jobs:
id: build-push
uses: docker/build-push-action@v6
with:
context: .
platforms: linux/amd64
push: ${{ ( github.event_name == 'push' && github.ref == 'refs/heads/master' ) || github.event.inputs.create_release == 'true' }}
file: Dockerfile.${{ matrix.variant }}
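
As a reading aid for the two conditions in this workflow, here is a minimal Python sketch of how they evaluate. It only mirrors the `if:` and `push:` expressions shown above. The function names are hypothetical, and treating an absent `github.event.inputs.create_release` as an empty string is an assumption, not something the workflow states.

```python
def docker_job_runs(event_name: str) -> bool:
    """Mirror of `if: github.event_name != 'pull_request'` on the job."""
    return event_name != "pull_request"


def image_is_pushed(event_name: str, ref: str, create_release: str = "") -> bool:
    """Mirror of the `push:` expression on the build-push step.

    `create_release` stands in for github.event.inputs.create_release,
    assumed here to be an empty string when the input is absent.
    """
    on_master_push = event_name == "push" and ref == "refs/heads/master"
    return on_master_push or create_release == "true"


if __name__ == "__main__":
    # Walk through a few representative events and print both decisions.
    cases = [
        ("pull_request", "refs/pull/1/merge"),
        ("push", "refs/heads/master"),
        ("push", "refs/heads/feature"),
    ]
    for event, ref in cases:
        print(event, ref, docker_job_runs(event), image_is_pushed(event, ref))
```

Under this model a pull request never reaches the Docker job at all, while a push to a non-master branch still builds the image but does not publish it.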
5 changes: 4 additions & 1 deletion .gitmodules
@@ -3,7 +3,10 @@
url = https://github.com/ggml-org/ggml.git
[submodule "examples/server/frontend"]
path = examples/server/frontend
url = https://github.com/leejet/stable-ui.git
url = https://github.com/leejet/sdcpp-webui.git
[submodule "thirdparty/libwebp"]
path = thirdparty/libwebp
url = https://github.com/webmproject/libwebp.git
[submodule "thirdparty/libwebm"]
path = thirdparty/libwebm
url = https://github.com/webmproject/libwebm.git
95 changes: 82 additions & 13 deletions CMakeLists.txt
@@ -11,6 +11,10 @@ endif()
if (MSVC)
add_compile_definitions(_CRT_SECURE_NO_WARNINGS)
add_compile_definitions(_SILENCE_CXX17_CODECVT_HEADER_DEPRECATION_WARNING)
add_compile_options(
$<$<COMPILE_LANGUAGE:C>:/MP>
$<$<COMPILE_LANGUAGE:CXX>:/MP>
)
endif()

set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)
@@ -22,14 +26,37 @@ else()
set(SD_STANDALONE OFF)
endif()

set(SD_SUBMODULE_WEBP FALSE)
if(EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/thirdparty/libwebp/CMakeLists.txt")
set(SD_SUBMODULE_WEBP TRUE)
endif()
if(SD_SUBMODULE_WEBP)
set(SD_WEBP_DEFAULT ON)
else()
set(SD_WEBP_DEFAULT ${SD_USE_SYSTEM_WEBP})
endif()

set(SD_SUBMODULE_WEBM FALSE)
if(EXISTS "${CMAKE_CURRENT_SOURCE_DIR}/thirdparty/libwebm/CMakeLists.txt")
set(SD_SUBMODULE_WEBM TRUE)
endif()
if(SD_SUBMODULE_WEBM)
set(SD_WEBM_DEFAULT ON)
else()
set(SD_WEBM_DEFAULT ${SD_USE_SYSTEM_WEBM})
endif()

#
# Option list
#

# general
#option(SD_BUILD_TESTS "sd: build tests" ${SD_STANDALONE})
option(SD_BUILD_EXAMPLES "sd: build examples" ${SD_STANDALONE})
option(SD_WEBP "sd: enable WebP image I/O support" ON)
option(SD_WEBP "sd: enable WebP image I/O support" ${SD_WEBP_DEFAULT})
option(SD_USE_SYSTEM_WEBP "sd: link against system libwebp" OFF)
option(SD_WEBM "sd: enable WebM video output support" ${SD_WEBM_DEFAULT})
option(SD_USE_SYSTEM_WEBM "sd: link against system libwebm" OFF)
option(SD_CUDA "sd: cuda backend" OFF)
option(SD_HIPBLAS "sd: rocm backend" OFF)
option(SD_METAL "sd: metal backend" OFF)
@@ -45,51 +72,94 @@ option(SD_USE_SYSTEM_GGML "sd: use system-installed GGML library" OFF
if(SD_CUDA)
message("-- Use CUDA as backend stable-diffusion")
set(GGML_CUDA ON)
add_definitions(-DSD_USE_CUDA)
endif()

if(SD_METAL)
message("-- Use Metal as backend stable-diffusion")
set(GGML_METAL ON)
add_definitions(-DSD_USE_METAL)
endif()

if (SD_VULKAN)
message("-- Use Vulkan as backend stable-diffusion")
set(GGML_VULKAN ON)
add_definitions(-DSD_USE_VULKAN)
endif ()

if (SD_OPENCL)
message("-- Use OpenCL as backend stable-diffusion")
set(GGML_OPENCL ON)
add_definitions(-DSD_USE_OPENCL)
endif ()

if (SD_HIPBLAS)
message("-- Use HIPBLAS as backend stable-diffusion")
set(GGML_HIP ON)
add_definitions(-DSD_USE_CUDA)
endif ()

if(SD_MUSA)
message("-- Use MUSA as backend stable-diffusion")
set(GGML_MUSA ON)
add_definitions(-DSD_USE_CUDA)
endif()

if(SD_WEBP)
add_compile_definitions(SD_USE_WEBP)
if(NOT SD_SUBMODULE_WEBP AND NOT SD_USE_SYSTEM_WEBP)
message(FATAL_ERROR "WebP support enabled but no source found.
Either initialize the submodule:\n git submodule update --init thirdparty/libwebp\n\n"
"Or link against system library:\n cmake (...) -DSD_USE_SYSTEM_WEBP=ON")
endif()
if(SD_USE_SYSTEM_WEBP)
find_package(WebP REQUIRED)
add_library(webp ALIAS WebP::webp)
# libwebp CMake target naming is not consistent across versions/distros.
# Some export WebP::libwebpmux, others export WebP::webpmux.
if(TARGET WebP::libwebpmux)
add_library(libwebpmux ALIAS WebP::libwebpmux)
elseif(TARGET WebP::webpmux)
add_library(libwebpmux ALIAS WebP::webpmux)
else()
message(FATAL_ERROR
"Could not find a compatible webpmux target in system WebP package. "
"Expected WebP::libwebpmux or WebP::webpmux."
)
endif()
endif()
endif()

if(SD_WEBM)
if(NOT SD_WEBP)
message(FATAL_ERROR "SD_WEBM requires SD_WEBP because WebM output reuses libwebp VP8 encoding.")
endif()
if(NOT SD_SUBMODULE_WEBM AND NOT SD_USE_SYSTEM_WEBM)
message(FATAL_ERROR "WebM support enabled but no source found.
Either initialize the submodule:\n git submodule update --init thirdparty/libwebm\n\n"
"Or link against system library:\n cmake (...) -DSD_USE_SYSTEM_WEBM=ON")
endif()
if(SD_USE_SYSTEM_WEBM)
find_path(WEBM_INCLUDE_DIR
NAMES mkvmuxer/mkvmuxer.h mkvparser/mkvparser.h common/webmids.h
PATH_SUFFIXES webm
REQUIRED)
find_library(WEBM_LIBRARY
NAMES webm libwebm
REQUIRED)

add_library(webm UNKNOWN IMPORTED)
set_target_properties(webm PROPERTIES
IMPORTED_LOCATION "${WEBM_LIBRARY}"
INTERFACE_INCLUDE_DIRECTORIES "${WEBM_INCLUDE_DIR}")
endif()
endif()

set(SD_LIB stable-diffusion)

file(GLOB SD_LIB_SOURCES
file(GLOB SD_LIB_SOURCES CONFIGURE_DEPENDS
"src/*.h"
"src/*.cpp"
"src/*.hpp"
"src/vocab/*.h"
"src/vocab/*.cpp"
"src/model_io/*.h"
"src/model_io/*.cpp"
"src/tokenizers/*.h"
"src/tokenizers/*.cpp"
"src/tokenizers/vocab/*.h"
"src/tokenizers/vocab/*.cpp"
)

find_program(GIT_EXE NAMES git git.exe NO_CMAKE_FIND_ROOT_PATH)
@@ -146,7 +216,6 @@ if(SD_SYCL)
message("-- Use SYCL as backend stable-diffusion")
set(GGML_SYCL ON)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-narrowing -fsycl")
add_definitions(-DSD_USE_SYCL)
# disable fast-math on host, see:
# https://www.intel.com/content/www/us/en/docs/cpp-compiler/developer-guide-reference/2021-10/fp-model-fp.html
if (WIN32)
@@ -182,7 +251,7 @@ endif()
add_subdirectory(thirdparty)

target_link_libraries(${SD_LIB} PUBLIC ggml zip)
target_include_directories(${SD_LIB} PUBLIC . include)
target_include_directories(${SD_LIB} PUBLIC . src include)
target_include_directories(${SD_LIB} PUBLIC . thirdparty)
target_compile_features(${SD_LIB} PUBLIC c_std_11 cxx_std_17)
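
The WebP/WebM option handling added in this file can be summarized as a small decision table. The following is a minimal Python sketch of that logic, not part of the build: `resolve_codec` and `resolve_all` are hypothetical names, and the "submodule" outcome assumes that `thirdparty/` builds the bundled library when the system one is not requested, which this diff does not show.

```python
from typing import Optional, Tuple


def resolve_codec(submodule_present: bool,
                  use_system: bool,
                  enabled: Optional[bool] = None) -> str:
    """Model one codec (SD_WEBP or SD_WEBM).

    enabled=None means the option was not set explicitly, so the
    computed default (SD_WEBP_DEFAULT / SD_WEBM_DEFAULT) applies.
    """
    # Default is ON when the submodule is checked out; otherwise it
    # follows the SD_USE_SYSTEM_* value given at configure time.
    default = True if submodule_present else use_system
    if enabled is None:
        enabled = default
    if not enabled:
        return "disabled"
    if not submodule_present and not use_system:
        return "fatal: no source found"
    # The system branch wins whenever SD_USE_SYSTEM_* is ON.
    return "system" if use_system else "submodule"


def resolve_all(webp_sub: bool, webp_sys: bool,
                webm_sub: bool, webm_sys: bool,
                webp: Optional[bool] = None,
                webm: Optional[bool] = None) -> Tuple[str, str]:
    """Model both codecs in the order the CMake file checks them."""
    webp_state = resolve_codec(webp_sub, webp_sys, webp)
    if webp_state.startswith("fatal"):
        # Configuration stops at the first FATAL_ERROR.
        return webp_state, "not evaluated"
    webm_state = resolve_codec(webm_sub, webm_sys, webm)
    # The SD_WEBP requirement is checked before the WebM source check.
    if webm_state != "disabled" and webp_state == "disabled":
        webm_state = "fatal: SD_WEBM requires SD_WEBP"
    return webp_state, webm_state


if __name__ == "__main__":
    # Fresh clone without submodules and without system libraries.
    print(resolve_all(False, False, False, False))
    # Submodules present, WebP explicitly disabled.
    print(resolve_all(True, False, True, False, webp=False))
```

Modeled this way, a checkout without the submodules and without `-DSD_USE_SYSTEM_*=ON` configures with both codecs off rather than failing, and the fatal paths are only reached when a codec is switched on explicitly without a source.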

9 changes: 8 additions & 1 deletion README.md
@@ -15,6 +15,9 @@ API and command-line option may change frequently.***

## 🔥Important News

* **2026/04/11** 🚀 stable-diffusion.cpp now uses a brand-new embedded web UI.
👉 Details: [PR #1408](https://github.com/leejet/stable-diffusion.cpp/pull/1408)

* **2026/01/18** 🚀 stable-diffusion.cpp now supports **FLUX.2-klein**
👉 Details: [PR #1193](https://github.com/leejet/stable-diffusion.cpp/pull/1193)

@@ -54,6 +57,7 @@ API and command-line option may change frequently.***
- [Z-Image](./docs/z_image.md)
- [Ovis-Image](./docs/ovis_image.md)
- [Anima](./docs/anima.md)
- [ERNIE-Image](./docs/ernie_image.md)
- Image Edit Models
- [FLUX.1-Kontext-dev](./docs/kontext.md)
- [Qwen Image Edit series](./docs/qwen_image_edit.md)
@@ -73,9 +77,10 @@ API and command-line option may change frequently.***
- OpenCL
- SYCL
- Supported weight formats
- Pytorch checkpoint (`.ckpt` or `.pth`)
- Pytorch checkpoint (`.ckpt` or `.pth` or `.pt`)
- Safetensors (`.safetensors`)
- GGUF (`.gguf`)
- Convert mode supports converting model weights to `.gguf` or `.safetensors`
- Supported platforms
- Linux
- Mac OS
@@ -93,6 +98,7 @@ API and command-line option may change frequently.***
- `DPM++ 2M`
- [`DPM++ 2M v2`](https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/8457)
- `DPM++ 2S a`
- `ER-SDE`
- [`LCM`](https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/13952)
- Cross-platform reproducibility
- `--rng cuda`, default, consistent with the `stable-diffusion-webui GPU RNG`
@@ -141,6 +147,7 @@ If you want to improve performance or reduce VRAM/RAM usage, please refer to [pe
- [🔥Z-Image](./docs/z_image.md)
- [Ovis-Image](./docs/ovis_image.md)
- [Anima](./docs/anima.md)
- [ERNIE-Image](./docs/ernie_image.md)
- [LoRA](./docs/lora.md)
- [LCM/LCM-LoRA](./docs/lcm.md)
- [Using PhotoMaker to personalize image generation](./docs/photo_maker.md)
Binary file added assets/ernie_image/example.png
Binary file added assets/ernie_image/turbo_example.png
16 changes: 12 additions & 4 deletions docs/build.md
@@ -16,15 +16,23 @@ git submodule init
git submodule update
```

## WebP Support in Examples
## WebP and WebM Support in Examples

The example applications (`examples/cli` and `examples/server`) use `libwebp` to support WebP image I/O. This is enabled by default.
The example applications (`examples/cli` and `examples/server`) use `libwebp` to support WebP image I/O, and `examples/cli` can also use `libwebm` for `.webm` video output. Both are enabled by default. WebM output currently reuses `libwebp` to encode each frame as VP8 before muxing with `libwebm`.

If you do not want WebP support, you can disable it at configure time:
If you do not want WebP/WebM support, you can disable them at configure time:

```shell
mkdir build && cd build
cmake .. -DSD_WEBP=OFF
cmake .. -DSD_WEBP=OFF -DSD_WEBM=OFF
cmake --build . --config Release
```

If the submodules are not available, you can also link against system packages instead:

```shell
mkdir build && cd build
cmake .. -DSD_USE_SYSTEM_WEBP=ON -DSD_USE_SYSTEM_WEBM=ON
cmake --build . --config Release
```

2 changes: 0 additions & 2 deletions docs/caching.md
@@ -131,8 +131,6 @@ sd-cli -m model.safetensors -p "a cat" --cache-mode spectrum
| `warmup` | Steps to always compute before caching starts | 4 |
| `stop` | Stop caching at this fraction of total steps | 0.9 |

```
### Performance Tips

- Start with default thresholds and adjust based on output quality
51 changes: 16 additions & 35 deletions docs/distilled_sd.md
@@ -87,51 +87,32 @@ pipe.save_pretrained("segmindtiny-sd", safe_serialization=True)
```bash
python convert_diffusers_to_original_stable_diffusion.py \
--model_path ./segmindtiny-sd \
--checkpoint_path ./segmind_tiny-sd.ckpt --half
--checkpoint_path ./segmind_tiny-sd.safetensors --half --use_safetensors
```

The file segmind_tiny-sd.ckpt will be generated and is now ready for use with sd.cpp. You can follow a similar process for the other models mentioned above.
The file segmind_tiny-sd.safetensors will be generated and is now ready for use with sd.cpp. You can follow a similar process for the other models mentioned above.


##### Another available .ckpt file:

* https://huggingface.co/ClashSAN/small-sd/resolve/main/tinySDdistilled.ckpt

To use this file, you must first adjust its non-contiguous tensors:

```python
import torch
ckpt = torch.load("tinySDdistilled.ckpt", map_location=torch.device('cpu'))
for key, value in ckpt['state_dict'].items():
if isinstance(value, torch.Tensor):
ckpt['state_dict'][key] = value.contiguous()
torch.save(ckpt, "tinySDdistilled_fixed.ckpt")
```


### SDXS-512
### SDXS-512-DreamShaper

Another very tiny and **incredibly fast** model is SDXS by IDKiro et al. The authors refer to it as *"Real-Time One-Step Latent Diffusion Models with Image Conditions"*. For details read the paper: https://arxiv.org/pdf/2403.16627 . Once again the authors removed some more blocks of U-Net part and unlike other SD1 models they use an adjusted _AutoEncoderTiny_ instead of default _AutoEncoderKL_ for the VAE part.
##### Some ready-to-run SDXS-512 model files are available online, such as:

##### 1. Download the diffusers model from Hugging Face using Python:

```python
from diffusers import StableDiffusionPipeline
pipe = StableDiffusionPipeline.from_pretrained("IDKiro/sdxs-512-dreamshaper")
pipe.save_pretrained(save_directory="sdxs")
```
##### 2. Create a safetensors file

```bash
python convert_diffusers_to_original_stable_diffusion.py \
--model_path sdxs --checkpoint_path sdxs.safetensors --half --use_safetensors
```

##### 3. Run the model as follows:
* https://huggingface.co/akleine/sdxs-512
* https://huggingface.co/concedo/sdxs-512-tinySDdistilled-GGUF

##### Run the model as follows:
```bash
~/stable-diffusion.cpp/build/bin/sd-cli -m sdxs.safetensors -p "portrait of a lovely cat" \
--cfg-scale 1 --steps 1
```
Both options: ``` --cfg-scale 1 ``` and ``` --steps 1 ``` are mandatory here.

### SDXS-512-0.9

Even though the name "SDXS-512-0.9" is similar to "SDXS-512-DreamShaper", it is *completely different* but also **incredibly fast**. Sometimes it is preferred, so try it yourself.
##### Download a ready-to-run file from here:

* https://huggingface.co/akleine/sdxs-09

Both options: ``` --cfg-scale 1 ``` and ``` --steps 1 ``` are mandatory here.
For the use of this model, both options ``` --cfg-scale 1 ``` and ``` --steps 1 ``` are again absolutely necessary.
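
Because both SDXS variants require `--cfg-scale 1` and `--steps 1`, a wrapper script can pin those flags so they are never forgotten. This is a minimal sketch that only assembles the argument list. `sdxs_command` is a hypothetical helper, the binary name `sd-cli` is assumed to be on `PATH`, and only the flags already shown in the documentation above are used.

```python
import shlex
from typing import List


def sdxs_command(model_path: str, prompt: str, sd_cli: str = "sd-cli") -> List[str]:
    """Build an sd-cli argument list for a one-step SDXS run.

    The two trailing flag pairs are fixed because the docs describe
    them as mandatory for SDXS models.
    """
    return [
        sd_cli,
        "-m", model_path,
        "-p", prompt,
        "--cfg-scale", "1",
        "--steps", "1",
    ]


if __name__ == "__main__":
    cmd = sdxs_command("sdxs.safetensors", "portrait of a lovely cat")
    # Print a shell-quoted form; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Returning a list instead of a single string keeps the prompt as one argument regardless of spaces or quotes in it.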