## Environment
- LightX2V version: latest main branch
- Python: 3.11
- PyTorch: 2.x
- GPU: NVIDIA CUDA
## Bug Description
When running WAN 2.2 I2V with an FP8-quantized T5 encoder and CPU offload at phase granularity, the `_get_actual_bias()` method in `mm_weight.py` raises an `AttributeError` because it accesses `self.bias` directly without checking whether the attribute exists.
## Configuration
```json
{
  "t5_cpu_offload": true,
  "t5_offload_granularity": "phase",
  "t5_quantized": true,
  "t5_quant_scheme": "fp8-q8f"
}
```
## Error Traceback
File "lightx2v/models/input_encoders/hf/wan/t5/model.py", line 524, in forword_attn_with_offload
q = attn_phase.attn_q.apply(x.squeeze(0)).view(b, -1, n, c)
File "lightx2v/common/ops/mm/mm_weight.py", line 1319, in apply
self._get_actual_bias(),
File "lightx2v/common/ops/mm/mm_weight.py", line 152, in _get_actual_bias
if self.bias is None:
^^^^^^^^^
AttributeError: 'MMWeightWfp8channelAfp8channeldynamicQ8F' object has no attribute 'bias'
## Root Cause
In `MMWeightTemplate._get_actual_bias()` (line 152), the code accesses `self.bias` directly without first checking whether the attribute exists:
```python
def _get_actual_bias(self, bias=None):
    if bias is not None:
        ...
    else:
        if self.bias is None:  # <-- AttributeError if self.bias doesn't exist!
            return None
```
When using `create_cpu_buffer=True` with phase-level offload, the `load_quantized()` method only initializes `self.bias = None` if `bias` is in `base_attrs`. However, `_update_base_attrs()` only adds `bias` to `base_attrs` if `bias_name is not None`. For attention layers without a bias term (such as the T5 attention Q/K/V projections), the `bias` attribute is therefore never created.
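
A minimal, self-contained sketch of this failure pattern (the class and attribute bookkeeping below are simplified stand-ins based on the description above, not the real `MMWeightTemplate` internals):

```python
# Simplified stand-in for the MMWeightTemplate bookkeeping described above;
# names mirror the report, but the real logic lives in mm_weight.py.
class MMWeightSketch:
    def __init__(self, bias_name=None):
        # Mirrors _update_base_attrs(): "bias" is only registered
        # when the layer actually has a bias tensor.
        self.base_attrs = {"weight": "W"}
        if bias_name is not None:
            self.base_attrs["bias"] = None

    def load_quantized(self):
        # Mirrors the create_cpu_buffer=True path: attributes are created
        # only for keys present in base_attrs, so bias-free layers
        # (e.g. T5 attention Q/K/V projections) never get self.bias.
        for name, value in self.base_attrs.items():
            setattr(self, name, value)

    def _get_actual_bias(self, bias=None):
        if bias is not None:
            return bias
        if self.bias is None:  # fails when "bias" was never registered
            return None
        return self.bias


layer = MMWeightSketch(bias_name=None)  # a bias-free projection
layer.load_quantized()
try:
    layer._get_actual_bias()
except AttributeError as e:
    print(e)  # 'MMWeightSketch' object has no attribute 'bias'
```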
## Proposed Fix
Change line 152 in `lightx2v/common/ops/mm/mm_weight.py`:
```python
# Before
if self.bias is None:

# After
if not hasattr(self, "bias") or self.bias is None:
```
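
An equivalent, slightly tighter form is `getattr(self, "bias", None) is None`, which covers both the missing-attribute and explicit-`None` cases in one expression. A standalone demonstration of the idiom (not project code):

```python
class _Layer:
    pass

layer = _Layer()
# Attribute never created (the phase-offload case):
print(getattr(layer, "bias", None) is None)  # True

layer.bias = None
# Attribute explicitly set to None:
print(getattr(layer, "bias", None) is None)  # True
```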
## Workarounds
- Use `t5_offload_granularity: "block"` instead of `"phase"` (see the config sketch after this list)
- Use a different T5 quant scheme (e.g., `fp8-vllm` or `int8-vllm`)
- Disable T5 quantization: `t5_quantized: false`
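
For instance, the first workaround is a one-key change to the configuration above (all other keys kept as in the report):

```json
{
  "t5_cpu_offload": true,
  "t5_offload_granularity": "block",
  "t5_quantized": true,
  "t5_quant_scheme": "fp8-q8f"
}
```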