[qwenimage] add image_area kwarg to QwenImageEditPlusPipeline #13660

Open
david-PHR wants to merge 1 commit into huggingface:main from Photoroom:qwenimage-edit-plus-image-area-kwarg

@david-PHR (Contributor)

What does this PR do?

Adds an image_area keyword argument to QwenImageEditPlusPipeline.__call__, letting callers control the target pixel area used for generation. Previously this was hardcoded to 1024 * 1024 in two places, with no way to override it short of computing and passing both height and width manually — and even then the input images were still re-encoded by the VAE at ~1MP regardless.

With this PR a user can now do:

pipe(image, prompt, image_area=768 * 768)   # smaller, faster
pipe(image, prompt, image_area=1280 * 1280) # larger

image_area controls two things consistently:

  1. The default height/width derived from the input image's aspect ratio when those are not explicitly passed.
  2. The resolution at which the input image(s) are encoded by the VAE.

When height and width are both passed explicitly they continue to override (1); image_area still drives (2).
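The two roles described above can be sketched roughly as follows. This is a standalone illustration, not the pipeline's code: `sketch_dimensions` and `resolve_sizes` are hypothetical names, and the rounding of each side to a multiple of 32 is an assumption about how `calculate_dimensions` behaves.

```python
import math


def sketch_dimensions(target_area: int, ratio: float) -> tuple[int, int]:
    """Hypothetical stand-in for calculate_dimensions.

    Fits roughly `target_area` pixels at aspect ratio `ratio` (width / height).
    """
    width = math.sqrt(target_area * ratio)
    height = width / ratio
    # Assumption: each side is snapped to the nearest multiple of 32.
    return round(width / 32) * 32, round(height / 32) * 32


def resolve_sizes(image_size, image_area=1024 * 1024, height=None, width=None):
    """Return ((out_width, out_height), (vae_width, vae_height)) for one input image."""
    in_width, in_height = image_size
    derived = sketch_dimensions(image_area, in_width / in_height)
    # (1) Output size: derived from image_area unless passed explicitly.
    out_width = width if width is not None else derived[0]
    out_height = height if height is not None else derived[1]
    # (2) VAE-encode size: driven by image_area regardless of height/width.
    return (out_width, out_height), derived


print(resolve_sizes((1920, 1080)))
# ((1376, 768), (1376, 768))
print(resolve_sizes((1024, 1024), image_area=768 * 768, height=1024, width=1024))
# ((1024, 1024), (768, 768))
```

The second call shows the case from the paragraph above: explicit height and width fix the output size, while the input image would still be encoded at the smaller area.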

Motivation

The pipeline previously offered no way to dial generation/encoding resolution up or down from the ~1MP default. That's restrictive for users who want to trade fidelity for speed (lower) or push for higher fidelity (higher). This PR exposes the existing internal sizing knob as a public kwarg, with no behavior change at the default.
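To give a rough sense of the speed side of that trade-off, the sketch below estimates how many latent tokens a square image contributes at a few areas. It assumes an 8x VAE spatial downsample followed by 2x2 patch packing; both factors are assumptions made for illustration, not values stated in this PR, and `latent_tokens` is a hypothetical helper.

```python
import math


def latent_tokens(image_area: int, vae_downsample: int = 8, patch: int = 2) -> int:
    """Approximate token count for a square image of `image_area` pixels.

    Assumes the side length is divided by vae_downsample * patch (both
    hypothetical defaults) and that tokens form a square grid.
    """
    side = math.isqrt(image_area)
    tokens_per_side = side // (vae_downsample * patch)
    return tokens_per_side**2


baseline = latent_tokens(1024 * 1024)
for area in (768 * 768, 1024 * 1024, 1280 * 1280):
    tokens = latent_tokens(area)
    print(f"{area:>8} px -> {tokens} tokens ({tokens / baseline:.2f}x of default)")
#   589824 px -> 2304 tokens (0.56x of default)
#  1048576 px -> 4096 tokens (1.00x of default)
#  1638400 px -> 6400 tokens (1.56x of default)
```

Under these assumptions the token count scales linearly with image_area, and attention cost grows faster than that, which is why a modest reduction in area can noticeably shorten a run.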

Changes

All changes are confined to src/diffusers/pipelines/qwenimage/pipeline_qwenimage_edit_plus.py:

  • New image_area: int = 1024 * 1024 kwarg on __call__, placed between width and num_inference_steps to keep resolution-related kwargs grouped.
  • New docstring entry documenting the kwarg's dual effect (default-derivation + VAE-encoding size).
  • calculate_dimensions(1024 * 1024, ...) → calculate_dimensions(image_area, ...) at the default-dimension call site.
  • calculate_dimensions(VAE_IMAGE_SIZE, ...) → calculate_dimensions(image_area, ...) at the VAE-encode call site.
  • Removed the now-unused module-level constant VAE_IMAGE_SIZE = 1024 * 1024. The companion constant CONDITION_IMAGE_SIZE = 384 * 384 is kept (it controls VLM token budget for the conditioning branch — a different concern, out of scope here).

Backward compatibility

Fully backward-compatible. The default value is exactly the previously-hardcoded 1024 * 1024, so existing call sites get identical behavior.

Fixes # N/A — small standalone enhancement, no associated issue.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you read our philosophy doc (important for complex PRs)?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? (docstring entry added for the new kwarg)
  • Did you write any new necessary tests? — N/A: no behavior change at default; this is a passthrough kwarg whose default exactly preserves prior behavior.

Local checks: make fix-copies clean, ruff format / ruff check clean.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

github-actions bot added the pipelines and size/S (PR with diff < 50 LOC) labels on Apr 30, 2026