@zhangtao0408
Contributor

@zhangtao0408 zhangtao0408 commented Jan 25, 2026

What does this PR do?

Adds attention mask support to the NPU flash attention (FA) backend.

Fixes: huggingface/diffusers#13016
Related PR: huggingface/diffusers#13017
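
For readers unfamiliar with what an attention mask changes, here is a minimal pure-Python sketch of masked scaled dot-product attention. It is an illustration only, not the NPU kernel or the code in this PR. The function name `masked_attention` is hypothetical, and the mask convention (`True` means "attend") is an assumption borrowed from the boolean-mask convention of PyTorch's `scaled_dot_product_attention`. The sketch also assumes every query row has at least one unmasked key.

```python
import math


def masked_attention(q, k, v, mask):
    """Masked scaled dot-product attention on nested lists (illustrative only).

    q: [Lq][d], k: [Lk][d], v: [Lk][dv]
    mask: [Lq][Lk] of bool, True = this key may be attended to (assumed convention).
    """
    d = len(q[0])
    out = []
    for q_row, mask_row in zip(q, mask):
        # Masked positions get -inf so they receive zero weight after softmax.
        scores = [
            sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d) if keep else -math.inf
            for k_row, keep in zip(k, mask_row)
        ]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(
            [
                sum(w * v_row[c] for w, v_row in zip(weights, v)) / total
                for c in range(len(v[0]))
            ]
        )
    return out


# One query, two keys; the second key is masked out (e.g. a padding token).
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
result = masked_attention(q, k, v, mask=[[True, False]])
print(result)
```

With the second key masked, the output equals the first value row, which is the behaviour a padded text prompt relies on.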

Test Command

export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export MODEL_PATH=/PATH/TO/Qwen-Image-Edit-2509
export LORA_PATH=/PATH/TO/Qwen-Image-Edit-2509/lora

torchrun --nproc_per_node=4 generate.py qwen_image_edit_lightning \
	--warmup 1 \
	--model-path $MODEL_PATH \
	--lora-path $LORA_PATH \
	--height 1024 \
	--width 1024 \
	--steps 4 \
	--parallel ulysses \
	--attn _native_npu \
	--ulysses-anything \
	--parallel-text-encoder \
	--parallel-vae \
	--ulysses-async

Results

Stage        E2E Time
Before PR    3.64s
After PR     3.72s
  • Before this PR
INFO 01-23 12:54:04 [base.py:630] ----------------------------------------------------------------------------------------------------
INFO 01-23 12:54:04 [base.py:395] 🤖 Example Init Config Summary:
INFO 01-23 12:54:04 [base.py:418] - Model: 
INFO 01-23 12:54:04 [base.py:418]     - /PATH/TO/Qwen-Image-Edit-2509
INFO 01-23 12:54:04 [base.py:418]     - lightx2v/Qwen-Image-Lightning
INFO 01-23 12:54:04 [base.py:418] - Task Type: IE2I - Image Editing to Image
INFO 01-23 12:54:04 [base.py:418] - Torch Dtype: torch.bfloat16
INFO 01-23 12:54:04 [base.py:418] - LoRA Weights: /PATH/TO/Qwen-Image-Edit-2509-Lightning-4steps-V1.0-bf16.safetensors
[rank2]:[W123 12:54:04.983532073 ProcessGroup.hpp:916] Warning: No backend of type 0 found for Process Group with name undefined. Assuming no hooks are registered. (function hasHooks)
INFO 01-23 12:54:04 [base.py:212] 🤖 Example Input Summary:
INFO 01-23 12:54:04 [base.py:212] - prompt: The magician bear is on the left, the alchemist bear is on the right, facing each other in the central park square.
INFO 01-23 12:54:04 [base.py:212] - negative_prompt:  
INFO 01-23 12:54:04 [base.py:212] - height: 1024
INFO 01-23 12:54:04 [base.py:212] - width: 1024
INFO 01-23 12:54:04 [base.py:212] - true_cfg_scale: 1.0
INFO 01-23 12:54:04 [base.py:212] - num_inference_steps: 4
INFO 01-23 12:54:04 [base.py:212] - image: List Images (2 images)
INFO 01-23 12:54:04 [base.py:212]     - Image 0: (1228x1228)
INFO 01-23 12:54:04 [base.py:212]     - Image 1: (1226x1228)
INFO 01-23 12:54:04 [base.py:212] - generator: device cpu, seed 0
INFO 01-23 12:54:04 [base.py:307] 🤖 Example Output Summary:
INFO 01-23 12:54:04 [base.py:323] - Model: qwen_image_edit_lightning
INFO 01-23 12:54:04 [base.py:323] - Optimization: 1024x1024_C0_Q0_NONE_Ulysses4_TEP_VAEP_ulysses_anything_ulysses_async_native_npu
INFO 01-23 12:54:04 [base.py:323] - Device: Ascend910B3 x 4
INFO 01-23 12:54:04 [base.py:323] - Load Time: 146.50s
INFO 01-23 12:54:04 [base.py:323] - Warmup Time: 26.16s
INFO 01-23 12:54:04 [base.py:323] - Inference Time: 3.64s
INFO 01-23 12:54:05 [base.py:246] Image saved to qwen_image_edit_lightning.1024x1024_C0_Q0_NONE_Ulysses4_TEP_VAEP_ulysses_anything_ulysses_async_native_npu.png
INFO 01-23 12:54:05 [base.py:641] ----------------------------------------------------------------------------------------------------
  • After this PR
INFO 01-23 13:04:47 [base.py:630] ----------------------------------------------------------------------------------------------------
INFO 01-23 13:04:47 [base.py:395] 🤖 Example Init Config Summary:
INFO 01-23 13:04:47 [base.py:418] - Model: 
INFO 01-23 13:04:47 [base.py:418]     - /PATH/TO/Qwen-Image-Edit-2509
INFO 01-23 13:04:47 [base.py:418]     - lightx2v/Qwen-Image-Lightning
INFO 01-23 13:04:47 [base.py:418] - Task Type: IE2I - Image Editing to Image
INFO 01-23 13:04:47 [base.py:418] - Torch Dtype: torch.bfloat16
INFO 01-23 13:04:47 [base.py:418] - LoRA Weights: /PATH/TO/Qwen-Image-Edit-2509-Lightning-4steps-V1.0-bf16.safetensors
[rank1]:[W123 13:04:47.961069787 ProcessGroup.hpp:916] Warning: No backend of type 0 found for Process Group with name undefined. Assuming no hooks are registered. (function hasHooks)
INFO 01-23 13:04:47 [base.py:212] 🤖 Example Input Summary:
INFO 01-23 13:04:47 [base.py:212] - prompt: The magician bear is on the left, the alchemist bear is on the right, facing each other in the central park square.
INFO 01-23 13:04:47 [base.py:212] - negative_prompt:  
INFO 01-23 13:04:47 [base.py:212] - height: 1024
INFO 01-23 13:04:47 [base.py:212] - width: 1024
INFO 01-23 13:04:47 [base.py:212] - true_cfg_scale: 1.0
INFO 01-23 13:04:47 [base.py:212] - num_inference_steps: 4
INFO 01-23 13:04:47 [base.py:212] - image: List Images (2 images)
INFO 01-23 13:04:47 [base.py:212]     - Image 0: (1228x1228)
INFO 01-23 13:04:47 [base.py:212]     - Image 1: (1226x1228)
INFO 01-23 13:04:47 [base.py:212] - generator: device cpu, seed 0
INFO 01-23 13:04:47 [base.py:307] 🤖 Example Output Summary:
INFO 01-23 13:04:47 [base.py:323] - Model: qwen_image_edit_lightning
INFO 01-23 13:04:47 [base.py:323] - Optimization: 1024x1024_C0_Q0_NONE_Ulysses4_TEP_VAEP_ulysses_anything_ulysses_async_native_npu
INFO 01-23 13:04:47 [base.py:323] - Device: Ascend910B3 x 4
INFO 01-23 13:04:47 [base.py:323] - Load Time: 150.44s
INFO 01-23 13:04:47 [base.py:323] - Warmup Time: 17.23s
INFO 01-23 13:04:47 [base.py:323] - Inference Time: 3.72s
INFO 01-23 13:04:48 [base.py:246] Image saved to qwen_image_edit_lightning.1024x1024_C0_Q0_NONE_Ulysses4_TEP_VAEP_ulysses_anything_ulysses_async_native_npu.png
INFO 01-23 13:04:48 [base.py:641] ----------------------------------------------------------------------------------------------------
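
To compare the two runs, the timings can be pulled out of the log lines and turned into relative changes. The following is a small sketch; the helper names `parse_times` and `relative_change` are hypothetical, and the regular expression assumes the `- <Name> Time: <seconds>s` format seen in the logs above.

```python
import re

# Matches e.g. "- Inference Time: 3.64s" in the output summary lines.
TIME_RE = re.compile(r"- (Load|Warmup|Inference) Time: ([0-9.]+)s")


def parse_times(log_text):
    """Return {"Load": ..., "Warmup": ..., "Inference": ...} in seconds."""
    return {name: float(sec) for name, sec in TIME_RE.findall(log_text)}


def relative_change(before, after):
    """Percentage change from before to after (positive = slower)."""
    return (after - before) / before * 100.0


before = parse_times(
    "INFO 01-23 12:54:04 [base.py:323] - Load Time: 146.50s\n"
    "INFO 01-23 12:54:04 [base.py:323] - Warmup Time: 26.16s\n"
    "INFO 01-23 12:54:04 [base.py:323] - Inference Time: 3.64s\n"
)
after = parse_times(
    "INFO 01-23 13:04:47 [base.py:323] - Load Time: 150.44s\n"
    "INFO 01-23 13:04:47 [base.py:323] - Warmup Time: 17.23s\n"
    "INFO 01-23 13:04:47 [base.py:323] - Inference Time: 3.72s\n"
)

for name in before:
    delta = relative_change(before[name], after[name])
    print(f"{name}: {before[name]:.2f}s -> {after[name]:.2f}s ({delta:+.1f}%)")
```

On these numbers the inference time is about 2.2% higher after the PR. Each figure comes from a single run, so the difference may be within run-to-run variation.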

@DefTruth
Member

Please fix the pre-commit check.

@zhangtao0408
Contributor Author

  • Pre-Commit Check passed
# pre-commit run --all-files
[INFO] Initializing environment for https://github.com/pre-commit/pre-commit-hooks.
[WARNING] repo `https://github.com/pre-commit/pre-commit-hooks` uses deprecated stage names (commit, push) which will be removed in a future version.  Hint: often `pre-commit autoupdate --repo https://github.com/pre-commit/pre-commit-hooks` will fix this.  if it does not -- consider reporting an issue to that repo.
[INFO] Initializing environment for https://github.com/PyCQA/flake8.
[INFO] Initializing environment for https://github.com/PyCQA/pydocstyle.
[INFO] Initializing environment for https://github.com/psf/black.
[INFO] Initializing environment for https://github.com/psf/black:.[jupyter].
[INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
[INFO] Installing environment for https://github.com/PyCQA/flake8.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
[INFO] Installing environment for https://github.com/PyCQA/pydocstyle.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
[INFO] Installing environment for https://github.com/psf/black.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
Check docstring is first.................................................Passed
Check Toml...............................................................Passed
Check Yaml...............................................................Passed
Mixed line ending........................................................Passed
Fix End of Files.........................................................Passed
flake8...................................................................Passed
pydocstyle...............................................................Passed
black-jupyter............................................................Passed

@DefTruth DefTruth changed the title from "Feat. NPU FA support attention mask." to "feat: NPU FA support attention mask" on Jan 26, 2026
Member

@DefTruth DefTruth left a comment

LGTM~ Thanks for working on this!

@DefTruth DefTruth merged commit 9baf4b8 into vipshop:main Jan 26, 2026
4 checks passed

Development

Successfully merging this pull request may close these issues.

[Feature] Support QwenImageEditPlus series attention mask for NPU

3 participants