feat(ltx2): LTX-2.3 video generation — conversion script, ltx2.h C API, CPU VAE fix #7
Open
64johnlee wants to merge 142 commits into
* feat: add support for the eta parameter to ancestral samplers
* feat: Euler Ancestral sampler implementation for flow models
* refine flow ancestral sampling and normalize eta defaults

---------

Co-authored-by: leejet <leejet714@gmail.com>
* Temporal tile size + overlap
* add --extra-tiling-args support

---------

Co-authored-by: leejet <leejet714@gmail.com>
…eejet#1564) Co-authored-by: Serge F. Chirik <s.chirik@timbel.info>
…ol crash

- script/convert_ltx2.py: safetensors → GGUF at Q4_0/Q5_1/Q8_0/F16 with selective F16 preservation for norms, biases, and embeddings
- include/ltx2.h: focused public C API for LTX-2 T2V and I2V inference, wrapping stable-diffusion.h with ltx2_new_ctx / ltx2_generate_t2v / ltx2_generate_i2v helpers
- fix(ggml_ext_conv_3d): fall back to explicit im2col+mul_mat when weight type is not F16/F32, fixing assertion crash in ggml_compute_forward_im2col_f16 on CPU with quantized VAE weights (upstream issue leejet#1577)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous element-wise loop ran in pure Python and paid interpreter overhead on every element, which is too slow for the tensors of a 14B-parameter model. Replace it with a vectorized numpy byte copy: write the two BF16 bytes into byte positions [2] and [3] of each little-endian uint32 word (a BF16 value is the upper 16 bits of a float32, so the low 16 bits stay zero), then reinterpret the buffer as float32.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
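The byte-copy widening described in this commit can be sketched as follows. This is a minimal illustration rather than the code in `script/convert_ltx2.py`: the function name `bf16_to_fp32_sketch` is hypothetical, the input is assumed to hold BF16 bit patterns as `uint16`, and the byte positions assume little-endian storage.

```python
import numpy as np

def bf16_to_fp32_sketch(raw: np.ndarray) -> np.ndarray:
    """Widen BF16 bit patterns (stored as uint16) to float32 without a Python loop."""
    shape = raw.shape
    # Flatten first so the byte view is 1-D and the strided slices below line up.
    src = np.ascontiguousarray(raw, dtype="<u2").ravel().view(np.uint8)
    out = np.zeros(src.size * 2, dtype=np.uint8)  # 4 output bytes per 2 input bytes
    # Little-endian float32: bytes 0 and 1 (low mantissa bits) stay zero,
    # bytes 2 and 3 receive the low and high byte of the BF16 value.
    out[2::4] = src[0::2]
    out[3::4] = src[1::2]
    return out.view("<f4").reshape(shape)

# BF16 bit patterns for 1.0, -2.0, 0.5 and 0.0, arranged as a 2x2 tensor.
bits = np.array([[0x3F80, 0xC000], [0x3F00, 0x0000]], dtype=np.uint16)
widened = bf16_to_fp32_sketch(bits)
```

An equivalent formulation shifts each value left by 16 bits, `(bits.astype(np.uint32) << 16).view(np.float32)`; both produce the same float32 values.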
Three jobs on every push to ltx2-video-generation and on PRs to master:

- build-linux: cmake + Ninja on ubuntu-22.04, asserts vid_gen / embeddings-connectors / diffusion-fa flags present in sd-cli --help
- convert-script: syntax check + --help + two synthetic GGUF round-trips (F32→Q8_0 and BF16→F16 via KEEP_F16_PATTERNS)
- build-macos-arm64: cmake + Metal on macos-14 (ARM64), uploads sd-cli artifact for 7 days

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…f16_to_fp32

safe_open(framework="numpy") doesn't support BF16 tensors because numpy has no bfloat16 dtype. Replace with a hand-rolled parser (_iter_safetensors) that reads the safetensors binary format directly (8-byte LE header size + JSON metadata + raw tensor bytes), eliminating the torch/safetensors dep.

Also fix bf16_to_fp32: calling .view(uint8) on a multi-dimensional array gives a multi-dim byte array whose [0::2] slice has the wrong shape. Flatten to 1D first with .ravel() so the byte interleaving works correctly.

CI: drop safetensors from pip install since it is no longer imported. Both round-trips (F32→Q8_0 and BF16→F16) verified locally.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
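The file layout named in this commit (8-byte little-endian header size, JSON header, raw tensor bytes) can be read with the standard library alone. The sketch below is an illustration under that layout, not the PR's `_iter_safetensors`; the name `iter_safetensors_sketch` and the tuple it yields are assumptions made for the example.

```python
import io
import json
import struct
from typing import BinaryIO, Iterator, List, Tuple

def iter_safetensors_sketch(f: BinaryIO) -> Iterator[Tuple[str, str, List[int], bytes]]:
    """Yield (name, dtype, shape, raw_bytes) for each tensor in a safetensors stream."""
    # First 8 bytes: unsigned little-endian length of the JSON header.
    (header_len,) = struct.unpack("<Q", f.read(8))
    header = json.loads(f.read(header_len))
    data_start = 8 + header_len  # data_offsets are relative to this position
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata, not a tensor
            continue
        begin, end = info["data_offsets"]
        f.seek(data_start + begin)
        yield name, info["dtype"], info["shape"], f.read(end - begin)

# Demo: a tiny in-memory file holding one F32 tensor of shape [2].
payload = struct.pack("<2f", 1.0, 2.0)
header_bytes = json.dumps({
    "__metadata__": {"format": "pt"},
    "w": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]},
}).encode("utf-8")
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + payload
tensors = list(iter_safetensors_sketch(io.BytesIO(blob)))
```

Because the dtype arrives as a plain string such as `"BF16"`, the caller can hand the raw bytes to its own decoder instead of relying on numpy having a matching dtype.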
GitHub is forcing Node 24 as default on June 16; set FORCE_JAVASCRIPT_ACTIONS_TO_NODE24 at workflow level to adopt it now.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
sd-cli appends .avi to the -o path unconditionally; update the results ls check to match the actual filenames produced.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds **/*.sh plus explicit test_m2.sh, test_*.sh, and .github/test_*.sh to the on.push paths filter so test scripts (like the recently-added test_m2.sh that didn't trigger CI on commit 259b7ad) participate in the CI gating cycle. The wildcard alone would suffice; the explicit entries are kept as documentation of which scripts we specifically care about.
Previously, data_offsets that did not match the tensor's shape and dtype silently produced wrong tensor data. The parser now raises ValueError with the tensor name, expected bytes, shape, dtype, and actual bytes for fast diagnosis.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
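A check of this kind compares the byte span given by data_offsets with the size implied by shape and dtype. The following is a minimal sketch, assuming a hypothetical helper `check_tensor_bytes` and a deliberately partial dtype table; the PR's actual error message and dtype coverage may differ.

```python
import math
from typing import Sequence

# Bytes per element for a few safetensors dtype strings (partial table, for illustration).
DTYPE_SIZES = {"F32": 4, "F16": 2, "BF16": 2}

def check_tensor_bytes(name: str, dtype: str, shape: Sequence[int], begin: int, end: int) -> int:
    """Return the expected byte count, or raise ValueError if data_offsets disagree."""
    expected = math.prod(shape) * DTYPE_SIZES[dtype]  # math.prod(()) == 1 covers scalars
    actual = end - begin
    if actual != expected:
        raise ValueError(
            f"tensor {name!r}: expected {expected} bytes for shape {list(shape)} "
            f"dtype {dtype}, got {actual} bytes"
        )
    return expected
```

Running the check before decoding each tensor turns a silent misread into an error that names the offending tensor.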
- ltx2_ctx_params_set_defaults: remove schedule/sample_method/cfg_scale which do not exist on sd_ctx_params_t (they live on sd_sample_params_t)
- Add ltx2_vid_params_set_defaults() to set LTX-2 sample defaults on sd_vid_gen_params_t.sample_params where they actually belong
- Call ltx2_vid_params_set_defaults() in both generate_t2v and generate_i2v
- Fix typo: embeddings_connector_path -> embeddings_connectors_path

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Adds LTX-2.3 (14B DiT, Gemma 3 text encoder, spatiotemporal Video-VAE) video generation support to the fork, plus M1 deliverables for the Tether LTX-2 bounty.
Changes
Sync upstream — merges 133 commits from `leejet/stable-diffusion.cpp` master, including all LTX-2.3 work (transformer, VAE, temporal upscaler, FLF2V, TAE support, Vulkan/Metal backends).
`script/convert_ltx2.py` — Python conversion script (safetensors → GGUF):
`include/ltx2.h` — focused public C API for LTX-2 consumers:
Bug fix: CPU VAE im2col assertion crash (upstream leejet#1577)
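The fallback named in the fix commit replaces the fused convolution path with an explicit im2col followed by a matrix multiply. The numpy sketch below only illustrates why those two steps reproduce a 3D convolution; it is not the ggml code, it assumes stride 1 with no padding, it ignores quantization, and the function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv3d_im2col(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """x: (C_in, D, H, W); w: (C_out, C_in, kD, kH, kW); stride 1, no padding."""
    c_out, c_in, kd, kh, kw = w.shape
    # All kD x kH x kW windows: shape (C_in, oD, oH, oW, kD, kH, kW).
    windows = sliding_window_view(x, (kd, kh, kw), axis=(1, 2, 3))
    od, oh, ow = windows.shape[1:4]
    # im2col: one row per output position, one column per (c_in, kd, kh, kw) tap.
    cols = windows.transpose(1, 2, 3, 0, 4, 5, 6).reshape(od * oh * ow, -1)
    # mul_mat: weights flattened to (C_out, C_in*kD*kH*kW) in the same tap order.
    out = cols @ w.reshape(c_out, -1).T
    return out.T.reshape(c_out, od, oh, ow)

def conv3d_direct(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Reference: slide the kernel over every output position explicitly."""
    c_out, _, kd, kh, kw = w.shape
    od, oh, ow = x.shape[1] - kd + 1, x.shape[2] - kh + 1, x.shape[3] - kw + 1
    out = np.zeros((c_out, od, oh, ow), dtype=x.dtype)
    for d in range(od):
        for h in range(oh):
            for v in range(ow):
                patch = x[:, d:d + kd, h:h + kh, v:v + kw]
                out[:, d, h, v] = np.tensordot(w, patch, axes=4)
    return out

rng = np.random.default_rng(0)
x_demo = rng.standard_normal((2, 4, 5, 6))
w_demo = rng.standard_normal((3, 2, 2, 3, 3))
same = np.allclose(conv3d_im2col(x_demo, w_demo), conv3d_direct(x_demo, w_demo))
```

Splitting the work this way leaves the weights as an operand of an ordinary matrix multiply, which, per the commit message, is what avoids the F16-only im2col kernel when the VAE weights are quantized.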
Test plan
Related
🤖 Generated with Claude Code