When implementing and testing maxPool2d with roundingType: "ceil", we encountered a case where different backends produce inconsistent results for windows that cover only padding elements. The WebNN specification does not seem to explicitly define the expected behavior for this edge case.
Test Case
Input Shape: [1, 2, 5, 5]
Window Dimensions: [3, 3]
Strides: [3, 3]
Padding: [1, 1, 1, 1] (Symmetric)
Rounding Type: 'ceil'
With these parameters, the output shape is [1, 2, 3, 3]. The windows in the last row and last column cover only padding elements: the 5x5 input is padded to 7x7 (padded indices 0 to 6), and the stride of 3 places the last window at padded index 6, so it covers indices 6, 7, and 8. Index 6 is padding, and indices 7 and 8 lie outside the padded input.
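The arithmetic above can be checked with a small sketch. It assumes the common pooling output-size formula, `rounding((input + pad_begin + pad_end - window) / stride) + 1`, with no dilation. The helper names `pool_output_size` and `valid_indices` are hypothetical and are not part of WebNN or any backend.

```python
import math


def pool_output_size(input_size, window, stride, pad_begin, pad_end, rounding="floor"):
    """Output length along one spatial axis (assumed formula, no dilation)."""
    span = input_size + pad_begin + pad_end - window
    rounder = math.ceil if rounding == "ceil" else math.floor
    return rounder(span / stride) + 1


def valid_indices(out_index, input_size, window, stride, pad_begin):
    """Unpadded input indices that the window at out_index actually covers."""
    start = out_index * stride - pad_begin
    return [i for i in range(start, start + window) if 0 <= i < input_size]


# Parameters from the test case, along one spatial axis.
size = pool_output_size(5, 3, 3, 1, 1, rounding="ceil")
coverage = [valid_indices(o, 5, 3, 3, 1) for o in range(size)]
print(size)      # 3
print(coverage)  # [[0, 1], [2, 3, 4], []]
```

The empty list for the last output index is the window in question: it overlaps no real input element. With floor rounding the same formula gives an output size of 2, so that window does not exist at all.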
Observations
- TFLite Backend: Returns 0 for these empty windows.
- CoreML Backend: Returns -Infinity for these empty windows.
If padding elements are strictly ignored for maxPool2d (to prevent padding values from becoming the maximum), then a window covering only padding elements contains no valid elements.
By convention, the maximum (supremum) of an empty set is $-\infty$, the identity element of the max operation. CoreML's behavior is consistent with this interpretation.
TFLite's behavior suggests that it either treats padding as 0 or defaults to 0 when no valid elements are present.
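To make the two readings concrete, here is a minimal reference sketch in pure Python. It is not how either backend is implemented. It pools one 2-D plane, ignores padding, and uses a hypothetical `empty_value` parameter to pick the result for windows with no valid elements. It models only the "default to 0" reading of the TFLite result, not the "padding is 0" reading, and it assumes symmetric padding and ceil rounding as in the test case.

```python
import math


def max_pool_2d(plane, window, stride, pad, empty_value=-math.inf):
    """Max pool one 2-D plane, ignoring padding; empty windows yield empty_value."""
    h, w = len(plane), len(plane[0])
    # Ceil rounding, symmetric padding (assumed to match the test case).
    out_h = math.ceil((h + 2 * pad - window) / stride) + 1
    out_w = math.ceil((w + 2 * pad - window) / stride) + 1
    out = []
    for oy in range(out_h):
        row = []
        for ox in range(out_w):
            y0, x0 = oy * stride - pad, ox * stride - pad
            # Keep only positions that fall inside the real input.
            values = [
                plane[y][x]
                for y in range(y0, y0 + window)
                for x in range(x0, x0 + window)
                if 0 <= y < h and 0 <= x < w
            ]
            row.append(max(values) if values else empty_value)
        out.append(row)
    return out


# One 5x5 channel holding the values 1..25 in row-major order.
plane = [[float(5 * r + c + 1) for c in range(5)] for r in range(5)]
strict = max_pool_2d(plane, window=3, stride=3, pad=1)
fallback = max_pool_2d(plane, window=3, stride=3, pad=1, empty_value=0.0)
for row in strict:
    print(row)
for row in fallback:
    print(row)
```

The two results agree on the four windows that contain real input and differ only in the last row and last column, which hold `-inf` in `strict` and `0.0` in `fallback`.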
Questions for the Working Group
- What is the expected behavior when a pooling window covers only padding elements for maxPool2d?
- Should the spec clarify whether padding elements are treated as $-\infty$ (ignored), or whether a fallback value such as 0 applies when no valid input elements are in the window?