
Conversation

@adi776borate (Contributor)

What does this PR do?

Fixes #13005
This PR adds the Flux2KleinInpaintPipeline for image inpainting using the FLUX.2 [Klein] model with optional reference image conditioning.

Examples

Basic Inpainting

| Image | Mask | Prompt | Result |
| --- | --- | --- | --- |
|  |  | Face of a yellow cat, high resolution, sitting on a park bench |  |
|  |  | A young college boy, high resolution, sitting on a park bench |  |
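For reference, a rough usage sketch (not taken from the PR itself): it assumes the pipeline follows the usual diffusers inpainting call convention with `image`/`mask_image` inputs and the `strength`/`num_inference_steps` knobs discussed later in this thread; model id and hyperparameter values are illustrative.

```python
# Hedged usage sketch; assumes the standard diffusers inpainting signature.
import torch
from diffusers import Flux2KleinInpaintPipeline
from diffusers.utils import load_image

pipe = Flux2KleinInpaintPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-klein-4B", torch_dtype=torch.bfloat16
).to("cuda")

init_image = load_image("cat_on_bench.png")   # image to edit
mask_image = load_image("face_mask.png")      # white = area to repaint

result = pipe(
    prompt="Face of a yellow cat, high resolution, sitting on a park bench",
    image=init_image,
    mask_image=mask_image,
    strength=0.85,
    num_inference_steps=28,
).images[0]
result.save("inpainted.png")
```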

Inpainting with Reference Image

| Image | Mask | Reference | Prompt | Result |
| --- | --- | --- | --- | --- |
|  |  |  | Replace this ball |  |
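A sketch of the reference-conditioned call, reusing the pipeline loaded in the snippet above; the `image_reference` argument name comes from the PR description, everything else is illustrative.

```python
# Sketch only: image_reference is the optional conditioning input added by this PR.
reference = load_image("new_ball.png")

result = pipe(
    prompt="Replace this ball",
    image=init_image,
    mask_image=mask_image,
    image_reference=reference,
    strength=1.0,
    num_inference_steps=28,
).images[0]
```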

Known Limitations

  1. Generation quality may vary: some outputs may contain artifacts. This can often be mitigated with better prompts and by tuning hyperparameters (strength, guidance_scale, num_inference_steps).

  2. Reference image conditioning is experimental: inpainting with image_reference may not consistently produce the desired results.

| Image | Mask | Reference | Prompt | Result |
| --- | --- | --- | --- | --- |
|  |  |  | Replace puppy with panda |  |

These limitations may stem either from a bug in my pipeline implementation or from inherent constraints of the model. Feedback is appreciated.

Before submitting

Who can review?

@asomoza @sayakpaul
Anyone in the community is free to review the PR once the tests have passed.

@Natans8

Natans8 commented Jan 28, 2026

I have to be honest, I am not getting good results at all so far, especially when working with bounding box masks, even without a reference.
With black-forest-labs/FLUX.2-klein-4B, since it's a distilled model, guidance_scale is ignored, so I essentially have only strength and num_inference_steps to tweak.

At strength=1.0, as is often the case with editing models, the entire inpainting area gets overridden edge to edge instead of just the prompted subject, usually with a lot of artefacts.
At lower strengths, my prompt is simply ignored.

On the other hand, I don't have the same issues with the regular Flux2KleinPipeline using the same parameter values and prompt: it changes the prompted subject and leaves the rest of the image untouched. The same goes for other pipelines like FluxKontextInpaintPipeline.

If you ask, I can provide examples of the results I get once my GPU frees up.

Copilot AI left a comment

Pull request overview

This PR adds the Flux2KleinInpaintPipeline to enable image inpainting capabilities for the FLUX.2 Klein model. The pipeline supports both basic text-guided inpainting and experimental reference image conditioning, addressing issue #13005.

Changes:

  • Implements a new inpainting pipeline for FLUX.2 Klein with masking support
  • Adds optional reference image conditioning for more controlled inpainting
  • Extends Flux2ImageProcessor with do_binarize and do_convert_grayscale parameters for mask processing
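
A minimal sketch of how those flags could be used for mask preprocessing, assuming Flux2ImageProcessor mirrors the diffusers VaeImageProcessor convention where they are constructor arguments (the `do_normalize` flag and the exact defaults are assumptions, not taken from the diff):

```python
# Minimal sketch, assuming Flux2ImageProcessor follows the VaeImageProcessor
# convention where do_binarize / do_convert_grayscale are constructor flags.
from PIL import Image
from diffusers.pipelines.flux2.image_processor import Flux2ImageProcessor

mask_processor = Flux2ImageProcessor(
    do_normalize=False,         # keep mask values in [0, 1] instead of [-1, 1]
    do_binarize=True,           # threshold the mask to hard 0 / 1 values
    do_convert_grayscale=True,  # collapse an RGB mask to a single channel
)

mask = mask_processor.preprocess(Image.open("mask.png"), height=1024, width=1024)
print(mask.shape)  # expected: torch.Size([1, 1, 1024, 1024])
```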

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/diffusers/pipelines/flux2/pipeline_flux2_klein_inpaint.py | Main pipeline implementation with inpainting logic, mask handling, and reference image support |
| tests/pipelines/flux2/test_pipeline_flux2_klein_inpaint.py | Test suite covering basic inpainting functionality including different prompts, output shapes, and strength variations |
| src/diffusers/pipelines/flux2/image_processor.py | Enhanced image processor with binarization and grayscale conversion options |
| src/diffusers/pipelines/flux2/__init__.py | Export declarations for the new pipeline |
| src/diffusers/pipelines/__init__.py | Top-level export declarations |
| src/diffusers/__init__.py | Main package export declarations |
| src/diffusers/utils/dummy_torch_and_transformers_objects.py | Dummy object for missing dependencies |


Comment on lines +714 to +717
```python
height is not None
and height % (self.vae_scale_factor * 2) != 0
or width is not None
and width % (self.vae_scale_factor * 2) != 0
```
Copilot AI Jan 29, 2026

This condition has a logical error due to operator precedence. The and operators bind more tightly than or, causing incorrect evaluation. This should be:

```python
if (
    (height is not None and height % (self.vae_scale_factor * 2) != 0)
    or (width is not None and width % (self.vae_scale_factor * 2) != 0)
):
```

Otherwise, when height is None, the condition height is not None and height % (self.vae_scale_factor * 2) != 0 evaluates to False, but then it will evaluate width is not None and width % (self.vae_scale_factor * 2) != 0, which could raise a TypeError if width is None when trying to perform the modulo operation.

Alternatively, following the pattern in FluxInpaintPipeline (line 515), you can remove the None checks entirely if height and width are guaranteed to be set by this point.

Suggested change

```diff
-height is not None
-and height % (self.vae_scale_factor * 2) != 0
-or width is not None
-and width % (self.vae_scale_factor * 2) != 0
+(height is not None and height % (self.vae_scale_factor * 2) != 0)
+or (width is not None and width % (self.vae_scale_factor * 2) != 0)
```

@adi776borate (Contributor Author) Jan 29, 2026

I have followed the pattern from Flux2KleinPipeline, and it seems correct to me.
In Python, `and` binds more tightly than `or`, so Python will evaluate this as:

```python
(height is not None and height % (self.vae_scale_factor * 2) != 0) or
(width is not None and width % (self.vae_scale_factor * 2) != 0)
```

So there is no precedence bug here.

As for:

> could raise a TypeError if width is None when trying to perform the modulo operation.

This cannot happen because of short-circuiting.
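
A quick self-contained check of both claims (standalone snippet, not from the PR):

```python
# `and` binds tighter than `or`, and short-circuiting prevents the modulo
# from ever running on a None value.
vae_scale_factor = 8

def bad_shape(height, width):
    return (
        height is not None
        and height % (vae_scale_factor * 2) != 0
        or width is not None
        and width % (vae_scale_factor * 2) != 0
    )

print(bad_shape(None, None))   # False, no TypeError raised
print(bad_shape(1024, None))   # False (1024 is divisible by 16)
print(bad_shape(1000, None))   # True  (left clause triggers)
print(bad_shape(None, 1000))   # True  (right clause triggers, still no TypeError)
```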


```python
class Flux2KleinInpaintPipelineFastTests(PipelineTesterMixin, unittest.TestCase):
    pipeline_class = Flux2KleinInpaintPipeline
    params = frozenset(["prompt", "height", "width", "guidance_scale", "prompt_embeds"])
```
Copilot AI Jan 29, 2026

The params frozenset should include "image" and "mask_image" since these are required parameters for an inpainting pipeline. Following the pattern in other inpainting pipelines (see tests/pipelines/pipeline_params.py TEXT_GUIDED_IMAGE_INPAINTING_PARAMS), the params should be:

```python
params = frozenset(["prompt", "image", "mask_image", "height", "width", "guidance_scale", "prompt_embeds"])
```

Without including these parameters, the standard pipeline tests from PipelineTesterMixin may not properly validate these critical inputs.

Comment on lines +175 to +177
```python
    @unittest.skip("Needs to be revisited")
    def test_encode_prompt_works_in_isolation(self):
        pass
```
Copilot AI Jan 29, 2026

The PR description highlights the reference image conditioning feature (image_reference parameter), which is a key differentiating feature of this pipeline. However, there are no tests for this functionality. Consider adding a test like:

```python
def test_flux2_klein_inpaint_with_reference_image(self):
    pipe = self.pipeline_class(**self.get_dummy_components()).to(torch_device)
    inputs = self.get_dummy_inputs(torch_device)

    # Add a reference image
    image_reference = floats_tensor((1, 3, 32, 32), rng=random.Random(0)).to(torch_device)
    inputs["image_reference"] = image_reference

    output_with_ref = pipe(**inputs).images[0]

    # Test without reference
    inputs_no_ref = self.get_dummy_inputs(torch_device)
    output_no_ref = pipe(**inputs_no_ref).images[0]

    # Outputs should be different
    max_diff = np.abs(output_with_ref - output_no_ref).max()
    assert max_diff > 1e-6
```

This would help ensure the reference image functionality works as intended.

@adi776borate (Contributor Author)

I'd missed patchifying the mask latents earlier. With that corrected, here are some new example generations (a rough sketch of the latent packing step follows the tables below):

| Image | Mask | Prompt | Result |
| --- | --- | --- | --- |
|  |  | Face of a yellow cat, high resolution, sitting on a park bench | Using 4B model |
|  |  |  | Using 9B model |
|  |  | A young college boy, high resolution, sitting on a park bench | Using 4B model |
|  |  |  | Using 9B model |
|  |  | A young woman standing with bouquet, natural pose, realistic proportions, consistent lighting and environment | Using 4B model |
|  |  | A young woman standing with bouquet, natural pose, realistic proportions, consistent lighting and environment | Using 4B model |

All of the examples below use the 9B model with strength=1.0:

| Image | Mask | Reference | Prompt | Result |
| --- | --- | --- | --- | --- |
|  |  |  | Replace this ball |  |
|  |  |  | Replace puppy with panda |  |
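
For context on the fix mentioned above, here is a rough illustration of what "patchifying" the latents means in FLUX-style pipelines, mirroring the `_pack_latents` pattern used by other Flux pipelines in diffusers; the exact helper used in this PR may differ.

```python
# Illustrative only: 2x2 spatial patches are folded into the channel dimension
# so the transformer sees a flat token sequence. The mask latents must go through
# the same packing as the image latents, which is the step that was missing.
import torch

def pack_latents(latents: torch.Tensor) -> torch.Tensor:
    b, c, h, w = latents.shape
    latents = latents.view(b, c, h // 2, 2, w // 2, 2)
    latents = latents.permute(0, 2, 4, 1, 3, 5)
    return latents.reshape(b, (h // 2) * (w // 2), c * 4)

mask_latents = torch.randn(1, 16, 64, 64)
print(pack_latents(mask_latents).shape)  # torch.Size([1, 1024, 64])
```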

@Natans8 this might improve what you observed, but I'd suggest waiting for the maintainers' review.
Thanks!
