Skip to content

Fix matmul type_inference for 1-D inputs#2719

Open
adityasingh2400 wants to merge 1 commit into
apple:mainfrom
adityasingh2400:fix-matmul-1d-type-inference
Open

Fix matmul type_inference for 1-D inputs#2719
adityasingh2400 wants to merge 1 commit into
apple:mainfrom
adityasingh2400:fix-matmul-1d-type-inference

Conversation

@adityasingh2400
Copy link
Copy Markdown

Summary

matmul.type_inference in iOS15/linear.py does not match numpy.matmul (and the op's own value_inference) when either input is 1-D. This is issue #2263.

For the reported case, x.shape == (4,), y.shape == (10, 4, 3):

import numpy as np
import coremltools as ct
from coremltools.converters.mil import Builder as mb

@mb.program(input_specs=[mb.TensorSpec(shape=(4,)), mb.TensorSpec(shape=(10, 4, 3))],
            opset_version=ct.target.iOS15)
def prog(x, y):
    return mb.matmul(x=x, y=y)
# inferred output shape was (1, 3); numpy gives (10, 3)

Root cause

Two separate problems:

  • 1-D x: the code promotes x to [1, K], computes the broadcast result, then returns ret_shape[1:] to drop the inserted dimension. The inserted 1 lands at index -2 of the result (the result is [*batch, 1, N]), not index 0, so the slice drops the leading batch dimension instead. (4,) x (10, 4, 3) returned (1, 3) rather than (10, 3).

  • 1-D y: never handled. The contraction check reads y_shape[-2] before any promotion, so any 1-D y (e.g. (10, 3, 4) x (4,)) raised IndexError, and the both-1-D scalar case crashed the same way. This is the second symptom reported in the issue thread.

Fix

Follow numpy semantics directly: prepend a 1 to a 1-D x, append a 1 to a 1-D y, run the existing batch broadcast, then remove exactly the inserted axes (x's prepended dim at index -2, y's appended dim at index -1, dropping the higher index first so the lower one stays valid). transpose_y on a 1-D y is now rejected to mirror the existing transpose_x guard. 2-D and batched cases are untouched.

The inferred shape now matches numpy.matmul for every combination of {x 1-D, y 1-D, both, neither} x {transpose_x, transpose_y}, including the scalar both-1-D case and symbolic batch dims.

Test

Added TestMatMul::test_builder_1d_input, a parametrized regression test in iOS14/test_linear.py covering 1-D x, 1-D y, both 1-D, and the transpose variants. It fails on the current code (wrong shapes and IndexErrors) and passes with the fix. Existing matmul tests are unaffected.

Fixes #2263

matmul.type_inference did not match numpy.matmul (and the op's own
value_inference) when either input is 1-D.

For a 1-D x it promoted x to [1, K] and then sliced ret_shape[1:] to
drop the inserted dimension. The inserted 1 sits at index -2 of the
result, not index 0, so this dropped the leading batch dimension
instead. For x=(4,), y=(10, 4, 3) it returned (1, 3) where numpy
gives (10, 3).

A 1-D y was never handled at all: the contraction check reads
y_shape[-2] before any promotion, so any 1-D y (e.g. x=(10, 3, 4),
y=(4,)) raised IndexError, and the both-1-D case meant to return a
scalar crashed the same way.

Follow numpy semantics directly: prepend a 1 to a 1-D x, append a 1
to a 1-D y, run the existing batch broadcast, then remove exactly the
inserted axes (x's at index -2, y's at index -1, dropping the higher
index first). transpose_y on a 1-D y is now rejected to match the
existing transpose_x guard. 2-D and batched cases are unchanged.

Add a parametrized regression test covering the 1-D x, 1-D y, both
1-D, and transpose combinations.

Fixes apple#2263
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ML Program matmul not handling 1D input correctly

1 participant