@kitsune-hash

Summary

Fixes #12009

When loading models from single-file checkpoints (e.g., GGUF format), some parameters or buffers may be absent from the checkpoint and therefore remain on the meta device after load_model_dict_into_meta runs. dispatch_model then fails with a "Cannot copy out of meta tensor" error.

Root Cause

In from_single_file(), models are created under init_empty_weights(), which initializes all tensors on the meta device. After loading checkpoint weights via load_model_dict_into_meta, any tensors not present in the checkpoint remain on meta. This includes:

  1. Non-persistent buffers like RoPE sinusoidal embeddings (freqs_cos, freqs_sin in WanTransformer3DModel) — these are computed deterministically during __init__() and intentionally excluded from checkpoints
  2. Missing parameters due to incomplete key mapping in checkpoint converters

Fix

Three layers of handling before dispatch_model():

  1. Non-persistent buffer re-materialization: Identifies submodules that hold non-persistent buffers still on meta and re-creates those submodules outside the init_empty_weights() context, so their deterministic buffers are recomputed with real values
  2. Persistent buffer fallback: Zero-initializes any persistent buffers still on meta with a warning
  3. Parameter safety net: Zero-initializes remaining meta parameters with a warning (indicates incomplete key mapping)
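The three layers above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in this PR: materialize_meta_tensors and its rebuild callback are hypothetical names, the callback assumes a submodule can be reconstructed from the existing instance, and the sketch reads nn.Module's private _non_persistent_buffers_set attribute to tell the two kinds of buffers apart.

```python
import logging
from typing import Callable

import torch
from torch import nn

logger = logging.getLogger(__name__)


def materialize_meta_tensors(
    model: nn.Module,
    rebuild: Callable[[nn.Module], nn.Module],  # hypothetical factory
    device: str = "cpu",
) -> nn.Module:
    for mod_name, module in list(model.named_modules()):
        # Layer 1: recompute non-persistent buffers from a fresh submodule
        # built outside any meta-device context.
        stale = [
            name
            for name in module._non_persistent_buffers_set  # private attribute
            if module._buffers[name] is not None and module._buffers[name].is_meta
        ]
        if stale:
            logger.info("Re-creating submodule %r to materialize buffers.", mod_name)
            fresh = rebuild(module)
            for name in stale:
                value = fresh._buffers[name].to(device)
                module.register_buffer(name, value, persistent=False)

        # Layer 2: persistent buffers still on meta fall back to zeros.
        for name, buf in list(module._buffers.items()):
            if buf is not None and buf.is_meta:
                logger.warning("Zero-initializing buffer %s.%s", mod_name, name)
                module._buffers[name] = torch.zeros(
                    buf.shape, dtype=buf.dtype, device=device
                )

        # Layer 3: parameters still on meta fall back to zeros.
        for name, param in list(module._parameters.items()):
            if param is not None and param.is_meta:
                logger.warning("Zero-initializing parameter %s.%s", mod_name, name)
                zeros = torch.zeros(param.shape, dtype=param.dtype, device=device)
                module._parameters[name] = nn.Parameter(
                    zeros, requires_grad=param.requires_grad
                )
    return model


class Toy(nn.Module):
    """Hypothetical module with one tensor of each kind."""

    def __init__(self):
        super().__init__()
        table = torch.arange(4, dtype=torch.float32).cos()
        self.register_buffer("table", table, persistent=False)
        self.register_buffer("scale", torch.ones(1))
        self.weight = nn.Parameter(torch.ones(2, 2))


with torch.device("meta"):
    toy = Toy()

# Toy takes no constructor arguments, so rebuilding is trivial here.
materialize_meta_tensors(toy, rebuild=lambda m: type(m)())
```

The ordering matters: layer 1 runs first so that recomputable buffers get their real values, and only what is still on meta afterwards is zero-filled. Zero-filled tensors let the model load but carry no trained values, hence the warnings.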

Testing

Validated with:

  • WanTransformer3DModel.from_single_file() with GGUF checkpoints (Q2_K, Q4_K_M, Q8_0)
  • Confirmed RoPE buffers are correctly recomputed (cos/sin values in [-1, 1], 100% match with reference model)
  • Full T2V inference produces valid video output
  • 0 meta tensors remaining after fix
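A check like "0 meta tensors remaining" can be scripted with a small helper. The sketch below is one possible way to do it; find_meta_tensors is a hypothetical name, not a diffusers utility, and the nn.Linear model is only a placeholder for demonstration.

```python
import torch
from torch import nn


def find_meta_tensors(model: nn.Module) -> list[str]:
    """Return the names of all parameters and buffers still on meta."""
    named = list(model.named_parameters()) + list(model.named_buffers())
    return [name for name, tensor in named if tensor.is_meta]


# Placeholder model created directly on the meta device.
layer = nn.Linear(2, 2, device="meta")
before = find_meta_tensors(layer)

# to_empty() allocates real (uninitialized) storage on the target device.
layer.to_empty(device="cpu")
after = find_meta_tensors(layer)

print(before, after)
```

named_buffers() includes non-persistent buffers, so the helper also reports tensors that never appear in a state dict.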

Specific models tested:

  • QuantStack/Wan2.2-T2V-A14B-GGUF (1095 params, 2 non-persistent buffers recomputed)
  • QuantStack/Wan2.2-I2V-A14B-GGUF (safety net correctly catches 150+ missing I2V-specific params from incomplete key mapping)

Before this fix

NotImplementedError: Cannot copy out of meta tensor; no data!

After this fix

INFO: Re-creating submodule 'rope' (WanRotaryPosEmbed) to materialize non-persistent buffers.
✅ 0 meta tensors remaining, model loads and runs correctly

Linked issue

WanTransformer3DModel.from_single_file wont load Wan2.2 GGUF (NotImplementedError: Cannot copy out of meta tensor; no data)
