Reshape per-channel ImageType.scale to broadcast over channel-first input (fixes #2461) #2709

Open
LeSingh1 wants to merge 1 commit into apple:main from LeSingh1:fix/imagetype-per-channel-scale

@LeSingh1
Contributor

Summary

ImageType.scale is documented as

scale: float or list of floats

The scaling factor for all values in the image channels.

…matching bias. But when a user supplied a per-channel list (e.g. [1/127.5, 1/127.5, 1/127.5] for an RGB MobileNetV2 input), the mil_backend::insert_image_preprocessing_ops pass converted it with np.array(...) and emitted an mb.mul whose scale tensor still had shape (3,). Broadcasting aligns trailing dimensions, so that tensor was treated as (1, 1, 1, 3) against the channel-first input (N, 3, H, W), and conversion failed in backend_mlprogram with:

ValueError: Incompatible dim 3 in shapes (1, 3, 224, 224) vs. (1, 1, 1, 3)
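
The shape mismatch can be reproduced outside coremltools with plain NumPy, which follows the same trailing-dimension alignment rule. This is only an illustrative sketch: `broadcastable` is a hypothetical helper written for this example, not part of coremltools or of this PR.

```python
import numpy as np


def broadcastable(a_shape, b_shape):
    """Return True if two shapes broadcast under NumPy's rules (hypothetical helper)."""
    try:
        np.broadcast(np.empty(a_shape), np.empty(b_shape))
    except ValueError:
        return False
    return True


image_shape = (1, 3, 224, 224)  # channel-first (N, C, H, W)

# A flat per-channel scale lines up with the last axis (W), not the channel axis.
flat_ok = broadcastable(image_shape, (3,))

# Reshaped to (1, 3, 1, 1), the three values line up with the channel axis.
reshaped_ok = broadcastable(image_shape, (1, 3, 1, 1))

# Caution: if W happened to be 3, the flat shape would broadcast in NumPy
# without an error, but along the width axis rather than the channels.
silent_wrong_axis = broadcastable((1, 3, 3, 3), (3,))
```

Here `flat_ok` is False and `reshaped_ok` is True, which is the motivation for reshaping the constant before the multiply.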

Fix

The neighbouring per-channel bias path already reshapes the constant to (3, 1, 1) for rank-3 inputs and (1, 3, 1, 1) for rank-4 inputs. Apply the same reshape to scale:

scale_arr = np.array(input_type.scale, dtype=input_nptype)
if scale_arr.ndim > 0 and input_type.color_layout not in (
    _input_types.ColorLayout.GRAYSCALE,
    _input_types.ColorLayout.GRAYSCALE_FLOAT16,
):
    if len(last_output.shape) == 3:
        scale_arr = scale_arr.reshape([3, 1, 1])
    elif len(last_output.shape) == 4:
        scale_arr = scale_arr.reshape([1, 3, 1, 1])
    else:
        raise TypeError("Unsupported rank for image input type.")
last_output = mb.mul(x=last_output, y=scale_arr, name=input_var.name + "__scaled__")

Scalar scales (scale_arr.ndim == 0) and grayscale layouts skip the reshape — no other code paths change.
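
As a rough standalone illustration of the behaviour described above, the same decision logic can be exercised with NumPy alone. The function name `reshape_scale_for_channel_first` and its `input_rank` / `is_grayscale` parameters are assumptions made for this sketch; the actual pass works on `input_type` and `last_output` as shown in the snippet above.

```python
import numpy as np


def reshape_scale_for_channel_first(scale, input_rank, is_grayscale=False):
    """Mirror the reshape rule from the fix (hypothetical standalone helper)."""
    scale_arr = np.array(scale, dtype=np.float32)
    # Scalars (ndim == 0) and grayscale layouts keep their original shape.
    if scale_arr.ndim > 0 and not is_grayscale:
        if input_rank == 3:
            scale_arr = scale_arr.reshape([3, 1, 1])       # (C, H, W) input
        elif input_rank == 4:
            scale_arr = scale_arr.reshape([1, 3, 1, 1])    # (N, C, H, W) input
        else:
            raise TypeError("Unsupported rank for image input type.")
    return scale_arr


# Each channel of a small channel-first image gets its own factor.
image = np.ones((1, 3, 4, 4), dtype=np.float32)
per_channel = reshape_scale_for_channel_first([0.5, 1.0, 2.0], input_rank=4)
scaled = image * per_channel

# A scalar scale is left untouched and still broadcasts everywhere.
scalar = reshape_scale_for_channel_first(0.5, input_rank=4)
```

With the reshape in place, channel 0 of `scaled` is multiplied by 0.5, channel 1 by 1.0, and channel 2 by 2.0, while the scalar case keeps shape `()`.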

Tests

Two new regression tests under TestImagePreprocessingPass, one for rank-3 and one for rank-4 RGB input, cover the (3,) → (3, 1, 1) / (1, 3, 1, 1) reshape.

Both pass locally. The full TestImagePreprocessingPass suite (14 tests including the existing scalar-scale and per-channel-bias cases) also continues to pass:

$ pytest coremltools/converters/mil/backend/mil/passes/test_passes.py::TestImagePreprocessingPass -v
============================== 14 passed in 0.05s ==============================

Issue

Fixes #2461

`ImageType.scale` is documented as "float or list of floats", but when a
user supplied a list (e.g. `[1/127.5, 1/127.5, 1/127.5]` for RGB) the
preprocessing pass dropped it through `np.array(...)` and emitted an
`mb.mul` whose scale tensor still had shape `(3,)`. That then tried to
broadcast against the channel-first input `(N, 3, H, W)` and failed in
`backend_mlprogram` with:

    ValueError: Incompatible dim 3 in shapes (1, 3, 224, 224) vs. (1, 1, 1, 3)

The neighbouring per-channel `bias` path already reshapes to `(3, 1, 1)`
for rank-3 inputs and `(1, 3, 1, 1)` for rank-4 inputs; scale should do
the same. Scalar scales and grayscale layouts keep the existing shape so
no other paths change.

Adds two regression tests (rank-3 and rank-4) covering the `(3,) →
(3,1,1)` / `(1,3,1,1)` reshape on RGB images. Both tests pass; existing
`TestImagePreprocessingPass` cases continue to pass.

Fixes apple#2461

Development

Successfully merging this pull request may close these issues.

ImageType causes an error in TensorFlow 2 model conversion with a scale parameter provided as list