diffmat

Remove backgrounds from AI-generated images by comparing multiple renders on different solid colours.

Most AI image generators (Nano Banana, Midjourney, DALL·E, etc.) cannot output true transparency. Diffmat works around this: give it 2–5 renders of the same subject that differ only in background colour, and it isolates the subject pixel by pixel, producing a clean RGBA PNG.

No AI, no chroma key — just pixel math.

Why This Exists

If you've used AI to generate images and wanted a true transparent background, you know it isn't simple. Asking the AI for transparency often yields a fake checkered pattern baked into the image — which makes removal harder, not easier. Existing AI matting tools and Photoshop techniques work, but results vary wildly. Diffmat is another strategy. I wrote it for my own use and paid for my own AI tokens; I'm happy with the results, so I'm sharing the code.

How It Works

Start by generating your source images. The following prompt works well with most AI image tools:

Act as a professional, expert image manipulation AI. Your task is to take the provided source image and generate five distinct, high-resolution versions.

Mandatory Constraints:

Subject Isolation: The subject must be perfectly and completely isolated from the original background.

Fidelity: The subject must retain 100% of its original details, poses, lighting, and appearance across all five outputs. No details should be lost or altered.

Background Constraint: The background must be replaced entirely with a solid, opaque colour. Absolutely no transparency, gradients, or checkered patterns are allowed.

Output Requirements: Generate five separate image files, strictly adhering to the following specifications:

Pure Black — RGB (0, 0, 0)

Pure White — RGB (255, 255, 255)

Pure Red — RGB (255, 0, 0)

Pure Green — RGB (0, 255, 0)

Pure Blue — RGB (0, 0, 255)

Once you have the renders, diffmat inspects every pixel position (x, y) across all N input images:

Situation	Result
Every image shows its own background at (x, y)	→ transparent (alpha = 0)
All images show a similar colour at (x, y)	→ opaque subject (alpha = 255)
Mixed: some show background, some don't	→ edge (alpha between 0 and 255)

The exact way edge alpha is calculated depends on the method you choose.
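The three cases in the table can be sketched per pixel as follows. This is a minimal illustration rather than diffmat's actual code: the function name `classify_pixel`, the use of Euclidean RGB distance, and the way the two tolerances are applied are assumptions made for the example.

```python
import math

def classify_pixel(observed, backgrounds, tol=50.0, sim_tol=50.0):
    """Label one pixel position as 'background', 'subject', or 'edge'.

    observed    -- list of (r, g, b) tuples, one per input image
    backgrounds -- list of (r, g, b) background colours, in the same order
    """
    # Count the renders that show (approximately) their own background here.
    bg_hits = sum(
        math.dist(px, bg) <= tol for px, bg in zip(observed, backgrounds)
    )
    if bg_hits == len(observed):
        return "background"  # alpha = 0

    # Largest colour difference between any two renders at this position.
    spread = max(
        math.dist(a, b)
        for i, a in enumerate(observed)
        for b in observed[i + 1:]
    )
    if spread <= sim_tol:
        return "subject"  # alpha = 255
    return "edge"  # alpha somewhere in between
```

Because the background test runs first, a white subject pixel on the white render is still labelled `subject`: only one render matches its background, and the colours agree across all renders.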

Choosing a Method

Three algorithms are built in. They all produce a transparent PNG — they differ in how they decide the alpha value at edge pixels and whether they reconstruct the true subject colour.

	`simple`	`variance`	`decomposition`
Approach	Binary classification	Statistical variance	Linear unmixing
Speed	Fastest (~30 s / 1 MP)	Fast (~40 s / 1 MP)	Slowest (~60 s / 1 MP)
Edge quality	Hard or lightly smoothed	Smooth gradient	Physically modelled
Re-colours subject?	No (uses reference image)	No (uses reference image)	Yes (estimates true colour)
Best for	Crisp icons, logos, solids	Anti-aliased edges, vector art	Semi-transparent, shadows, glass
Weakness	Jagged on fine detail	Can over-smooth thin features	Sensitive to JPEG noise

`simple` — Binary Classification

Each pixel gets one of three labels: background, subject, or edge. Edge alpha is either a hard 0/255 majority vote or a smooth blend based on average distance to each assigned background colour.
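The two edge rules described above might look roughly like this for a single pixel. The sketch assumes Euclidean RGB distance and a linear ramp from distance to alpha; the `ramp` parameter and the function name `edge_alpha` are hypothetical and exist only for this example.

```python
import math

def edge_alpha(observed, backgrounds, tol=50.0, antialias=True, ramp=None):
    """Return an alpha value (0-255) for one edge pixel."""
    # Distance of each render's pixel from that render's own background colour.
    dists = [math.dist(px, bg) for px, bg in zip(observed, backgrounds)]

    if not antialias:
        # Hard mode: transparent if a strict majority of renders match
        # their background, opaque otherwise.
        bg_votes = sum(d <= tol for d in dists)
        return 0 if 2 * bg_votes > len(dists) else 255

    # Smooth mode: map the average distance linearly onto 0..255.
    # ASSUMPTION: the pixel counts as fully opaque once the average
    # distance reaches `ramp` (twice the tolerance by default).
    ramp = 2.0 * tol if ramp is None else ramp
    mean_dist = sum(dists) / len(dists)
    return round(255 * min(1.0, mean_dist / ramp))
```

A pixel that sits close to every background gets a low alpha, and one that is far from all of them saturates at 255.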

Pros

Fastest — processes every pixel once.
Predictable and easy to reason about.
Great for icons, logos, and objects with hard, well-defined edges.

Cons

Binary "similar or not" test produces jagged silhouettes on anti-aliased inputs.
Cannot distinguish why pixels differ — a shadow looks the same as a background edge.

Use when: your subject has crisp, hard edges and speed matters. Avoid when: the subject contains hair, fur, feathers, drop shadows, glass, smoke, or translucent materials.

`variance` — Statistical Variance Matting

Instead of thresholding, this method computes the variance of pixel values across all images at each position. Low variance + far from background → opaque. Low variance + near background → transparent. Medium variance → partial alpha proportional to 1 − normalised_variance.
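As a rough sketch of the partial-alpha rule only, the snippet below computes variance across renders and normalises it by the variance of the background colours themselves. That choice of normaliser is an assumption, as are the helper names; the opaque/transparent special cases described above are not modelled here.

```python
def channel_variance(colors):
    """Population variance across images, summed over the R, G, B channels."""
    n = len(colors)
    total = 0.0
    for channel in zip(*colors):  # iterate over R values, then G, then B
        mean = sum(channel) / n
        total += sum((v - mean) ** 2 for v in channel) / n
    return total

def variance_alpha(observed, backgrounds):
    """Alpha (0-255) proportional to 1 - normalised variance."""
    bg_var = channel_variance(backgrounds)
    if bg_var == 0:
        raise ValueError("backgrounds must differ from one another")
    # ASSUMPTION: a pixel that shows pure background in every render has
    # the same variance as the background colours, so that is the maximum.
    normalised = min(1.0, channel_variance(observed) / bg_var)
    return round(255 * (1.0 - normalised))
```

With white and black backgrounds, a pixel identical in both renders gives 255, and a pixel equal to each background gives 0.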

Pros

Smoother transitions — alpha is driven by a continuous statistic, not a threshold.
No hard cut-off — the gradient reflects how consistently the subject appears.
Good middle ground between speed and quality.

Cons

Variance alone cannot distinguish edge blur from colour noise. Noisy JPEGs can produce false edges.
Can over-smooth very fine detail (individual hairs, thin strokes).

Use when: source images have anti-aliased edges and you want smoother silhouettes than simple gives. Avoid when: the subject has semi-transparent parts (use decomposition) or inputs are heavily compressed.

`decomposition` — Linear Unmixing

Models each observed pixel as a physical blend:

I_i = α × S + (1 − α) × B_i

where I_i is the observed colour in image i, B_i is the known background colour, S is the true subject colour, and α is the opacity. The method solves iteratively for both α and S at every pixel using least-squares.
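To make the model concrete, here is a closed-form variant for a single pixel. It is not the iterative solve described above: substituting P = α × S turns the model into I_i = P + (1 − α) × B_i, which is an ordinary linear regression with slope (1 − α) and can be solved in one step. The function name `unmix_pixel` and the clamping choices are assumptions for illustration.

```python
def unmix_pixel(observed, backgrounds):
    """Estimate (alpha, subject_rgb) for one pixel by least squares.

    observed    -- list of (r, g, b) tuples, one per input image
    backgrounds -- list of known (r, g, b) background colours, same order
    """
    n = len(observed)
    mean_i = [sum(ch) / n for ch in zip(*observed)]
    mean_b = [sum(ch) / n for ch in zip(*backgrounds)]

    # Regression slope t = 1 - alpha, shared across the three channels.
    num = sum(
        (px[c] - mean_i[c]) * (bg[c] - mean_b[c])
        for px, bg in zip(observed, backgrounds)
        for c in range(3)
    )
    den = sum((bg[c] - mean_b[c]) ** 2 for bg in backgrounds for c in range(3))
    if den == 0:
        raise ValueError("backgrounds must differ from one another")

    alpha = min(1.0, max(0.0, 1.0 - num / den))
    if alpha == 0.0:
        return 0.0, (0.0, 0.0, 0.0)  # fully transparent: colour is undefined

    # Intercept P = alpha * S, so S = (mean_I - (1 - alpha) * mean_B) / alpha.
    subject = tuple(
        min(255.0, max(0.0, (mean_i[c] - (1.0 - alpha) * mean_b[c]) / alpha))
        for c in range(3)
    )
    return alpha, subject
```

On noise-free synthetic input this recovers α and S exactly, which also shows why compression noise matters: any error in I_i feeds straight into the slope estimate.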

Pros

Best quality for semi-transparent edges, drop shadows, glass, smoke, and reflections.
Recovers the true subject colour by subtracting each image's background contribution.
Physically meaningful — α directly represents "how much of this pixel is the subject."

Cons

Slowest — runs 3 iterations of a solve loop at every pixel.
Sensitive to compression artefacts: JPEG noise can look like background bleed, causing α to be underestimated.
Reconstructs RGB values — output colours may differ slightly from input.

Use when: the subject has semi-transparent regions (hair strands, glass, smoke, drop shadows) and you want the cleanest possible edge. Avoid when: inputs are heavily compressed JPEGs or you need exact input RGB preservation.

Decision Flowchart

Does your subject have semi-transparent parts? (hair, glass, shadow)
  ├─ Yes → decomposition
  └─ No
      ├─ Are edges clean and hard? (icons, logos)
      │   └─ Yes → simple
      └─ Are edges anti-aliased / slightly soft?
          └─ Yes → variance  (or decomposition for best quality)
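For scripted batch jobs, the flowchart reduces to a small helper. The function `choose_method` is hypothetical and not part of diffmat; it only returns one of the three documented `--method` values.

```python
def choose_method(semi_transparent, hard_edges):
    """Pick a --method value following the flowchart above."""
    if semi_transparent:  # hair, glass, shadow
        return "decomposition"
    # Clean, hard edges suit simple; soft or anti-aliased edges suit variance.
    return "simple" if hard_edges else "variance"
```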

Installation

git clone https://github.com/YOUR_USERNAME/diffmat.git
cd diffmat
python3 -m venv venv
source venv/bin/activate          # Windows: venv\Scripts\activate
pip install -r requirements.txt

Requirements: Python 3.8+, Pillow, NumPy.

Usage

Basic

python diffmat.py path/to/folder/
# Output: path/to/folder/output.png

Choose a method

python diffmat.py path/to/folder/ --method decomposition
python diffmat.py path/to/folder/ --method variance --tolerance 30

Hard edges (no anti-aliasing)

python diffmat.py path/to/folder/ --no-antialiasing

Custom colours and tolerance

python diffmat.py path/to/folder/ \
  --colors 255,255,255 0,0,0 255,0,0 \
  --tolerance 30

Explicit file list

python diffmat.py --images white.png black.png red.png green.png blue.png -o result.png

All options

  folder                       Folder with 2-5 images
  --images IMG [IMG ...]       2-5 image files
  -o, --output PATH            Output PNG (default: folder/output.png)
  --colors R,G,B [R,G,B ...]   Background colours per image
  --tolerance FLOAT            RGB distance for bg matching (default: 50)
  --similarity-tolerance FLOAT Max inter-image distance for subject (default: --tolerance)
  --reference INT              Reference image index for RGB output (default: 0)
  --no-antialiasing            Hard 0/255 alpha only
  --method {simple,variance,decomposition}   (default: simple)
  --debug                      Print border colours and assignment distances

Input Requirements

Requirement	Detail
Count	2–5 images
Dimensions	All identical (same width × height)
Subject	Same subject, same pose — only the background changes
Backgrounds	Solid colours (no gradients, no textures)
Format	PNG recommended (lossless). JPEG works but may add noise.
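The count and dimension rules are easy to check before a long run. The helper below is a sketch, not part of diffmat, and the name `check_inputs` is made up for the example. It works on plain `(width, height)` tuples.

```python
def check_inputs(sizes):
    """Return a list of problems for a set of (width, height) image sizes."""
    problems = []
    if not 2 <= len(sizes) <= 5:
        problems.append(f"expected 2-5 images, got {len(sizes)}")
    if len(set(sizes)) > 1:
        problems.append(f"images differ in size: {sorted(set(sizes))}")
    return problems
```

With Pillow, the sizes could be collected from each file's `Image.open(path).size` attribute before calling the helper.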

Filename Convention (Recommended)

Include one of these keywords in each filename so the script can auto-match image → background:

Background	Keywords
White	`white`, `whit`
Black	`black`, `blac`
Red	`red`
Green	`green`, `gree`
Blue	`blue`, `blu`

Example: logo_white.png, logo_black.png, logo_red.png.

If filenames don't contain keywords, the script falls back to detecting each image's dominant border colour. This usually works, but is less predictable than keyword matching.
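Keyword matching of this kind can be sketched in a few lines. The keywords and colours are the ones listed in this README; the plain substring test, the lookup order, and the name `background_from_filename` are assumptions, and diffmat's own matching may differ.

```python
from pathlib import Path

# (keywords, RGB colour) pairs taken from the tables in this README.
BACKGROUND_KEYWORDS = [
    (("white", "whit"), (255, 255, 255)),
    (("black", "blac"), (0, 0, 0)),
    (("red",), (255, 0, 0)),
    (("green", "gree"), (0, 255, 0)),
    (("blue", "blu"), (0, 0, 255)),
]

def background_from_filename(path):
    """Return the RGB background implied by a filename, or None."""
    stem = Path(path).stem.lower()  # ignore the folder and the extension
    for keywords, colour in BACKGROUND_KEYWORDS:
        if any(keyword in stem for keyword in keywords):
            return colour
    return None  # caller would fall back to border-colour detection
```

A substring test is deliberately naive: a name such as `colored_blue.png` would match `red` first, so unambiguous filenames are the safer choice.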

Default Background Colours

Colour	RGB
White	255, 255, 255
Black	0, 0, 0
Red	255, 0, 0
Green	0, 255, 0
Blue	0, 0, 255

Tolerance Tuning

Parameter	Default	Increase when…	Decrease when…
`--tolerance`	50	Input is compressed (JPEG artefacts), or backgrounds aren't pure	Backgrounds are exact solid colours and you see halos
`--similarity-tolerance`	= `--tolerance`	Subject has subtle shading differences across renders	Subject is truly identical across all backgrounds

Rule of thumb: start at 50. If you see background leftovers, raise to 70–80. If subject areas are going transparent, lower to 30–40.
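A short worked example shows why compressed inputs need a higher tolerance. It assumes the "RGB distance" is Euclidean, which this README does not state explicitly; `matches_background` is a hypothetical helper.

```python
import math

def matches_background(pixel, background, tolerance=50.0):
    """True if the pixel is within `tolerance` of the background colour."""
    return math.dist(pixel, background) <= tolerance

# A white background degraded to (225, 225, 225) is about 52 units from
# pure white: sqrt(3 * 30**2) is roughly 51.96.
degraded_white = (225, 225, 225)
print(matches_background(degraded_white, (255, 255, 255), tolerance=50))
print(matches_background(degraded_white, (255, 255, 255), tolerance=70))
```

Under that assumption the pixel is missed at the default of 50 and caught at 70, which is the "background leftovers" case described above.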

How Antialiasing Works

Antialiasing applies only to the silhouette edge — the transition from transparent to opaque. It smooths the alpha channel, not the RGB colours.

What it does: at boundary pixels, alpha is set to a gradient value (0–255) for a smooth transition instead of a jagged step.
What it doesn't do: it doesn't blend or soften colours within the subject. Red next to blue stays sharp.
Disable it: use --no-antialiasing for hard 0/255 alpha only.
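The split between alpha and colour can be illustrated by hardening an already-computed alpha channel. This is only a sketch of the idea: the cut-off of 128 and the name `harden_alpha` are assumptions, and diffmat may decide hard alpha differently (for example by majority vote).

```python
def harden_alpha(pixels, cutoff=128):
    """Snap alpha to 0 or 255 and leave R, G, B untouched.

    pixels -- iterable of (r, g, b, a) tuples
    """
    return [(r, g, b, 255 if a >= cutoff else 0) for r, g, b, a in pixels]
```

Only the fourth component of each tuple changes, so neighbouring subject colours stay exactly as they were.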

Performance

All three methods use nested Python loops over pixels (full vectorisation is limited by the per-pixel decision logic). Rough timings on a modern laptop:

Image size	`simple`	`variance`	`decomposition`
512 × 512	~8 s	~10 s	~15 s
1024 × 1024	~30 s	~40 s	~60 s

decomposition is slowest because it runs an iterative least-squares solve at every pixel. For batch processing, consider downscaling first or running on a subset.

Limitations

JPEG compression noise — lossy artefacts can mimic background bleed, causing holes or fringes in the output. Use PNG inputs when possible, or increase --tolerance to 70–100.
Non-solid backgrounds — gradients, patterns, or vignettes break the assumption that "background = single solid colour."
Subject changes between renders — if the AI changes the subject's pose, expression, or detail, those differences will be classified as edge or transparent.
Very large images — pixel-by-pixel Python loops are slow on multi-megapixel inputs. Downscale first if speed matters.
Distinct backgrounds required — two images with very similar backgrounds (e.g. white + light grey) produce poor results. Backgrounds must be far apart in RGB space.
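The last point can be checked up front by measuring how far apart the chosen backgrounds are. The helper below is a sketch using Euclidean RGB distance; `min_background_separation` is a hypothetical name, and what counts as "far enough" is left to the reader.

```python
import math
from itertools import combinations

def min_background_separation(backgrounds):
    """Smallest pairwise RGB distance among the background colours."""
    return min(math.dist(a, b) for a, b in combinations(backgrounds, 2))

defaults = [(255, 255, 255), (0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)]
print(min_background_separation(defaults))
print(min_background_separation([(255, 255, 255), (211, 211, 211)]))
```

The five default colours are at least 255 units apart, while white and light grey are separated by less than 80, not far above the default tolerance of 50.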

Samples

The repo includes ready-to-run samples:

# Feather icon (3 backgrounds) — compare edge quality between methods
python diffmat.py samples/feather-icon2/ --method simple
python diffmat.py samples/feather-icon2/ --method variance
python diffmat.py samples/feather-icon2/ --method decomposition

# Sparkdown icon (2 backgrounds) — quick test
python diffmat.py samples/sparkdown-icon/ --method decomposition

Each sample folder contains 2–5 source images. Run any of the commands above to generate the output.

Contributing

Contributions are welcome.

Fork the repo and create a feature branch.
Test your changes against samples/cat/ (5 images) — it's the most comprehensive sample.
Open a PR with a clear description of what changed and why.

Adding a new method: implement a function with the signature _method_yourname(images, bg_colors, tol, sim_tol, ref_idx, aa) -> np.ndarray and register it in the _METHODS dict at the top of the file.

Adding samples: create a folder under samples/ with 2–5 images following the filename convention. Keep total repo size under 100 MB — use Git LFS for larger sets.

Bug reports: please include the method used, tolerance value, number of input images, and whether inputs are PNG or JPEG.

License

MIT. See LICENSE.

Reference

Inspired by the difference-matting approach described in "Generating transparent background images with Nano Banana Pro 2".

Author

Edward Tsang — blockchain & AI engineer. Open to consulting → Email · LinkedIn
