Fix ReplicationPad3d failing at CoreML ML Program runtime by tritolol · Pull Request #2697 · apple/coremltools

tritolol · 2026-05-14T08:55:16Z

Summary

Fixes torch.nn.ReplicationPad3d fails at runtime when there is padding #2571: torch.nn.ReplicationPad3d converted silently but failed at prediction time with "Padding for more than two dimensions only supports constant mode"
Adds a new MIL backend pass split_non_constant_pads that decomposes non-constant pad ops (reflect/replicate) covering >2 dimensions into sequential pads of ≤2 dimensions each, which CoreML supports
The pass runs in _BACKEND_MIL_PASSES after merge_consecutive_paddings so splits aren't merged back

Test plan

TestPad::test_replication_pad3d — regression test for symmetric 3D replication padding
TestPad::test_replication_pad3d_asymmetric — regression test for asymmetric 3D replication padding
Full TestPad suite passes (26 passed, 2 skipped for NN backend which doesn't support this case)
All mil/passes/tests/test_passes.py -k pad tests pass (215 passed)

🤖 Generated with Claude Code

LeSingh1 · 2026-05-20T00:32:04Z

@TobyRoseman, @tritolol -- I went through this carefully (was sent here from #2701). The approach looks right to me and is strictly better than what I had in #2701 (which only raised earlier); this one actually makes the case work.

Why the split is semantically safe

Sequential 1D reflect/replicate along independent axes is provably equivalent to a single multi-axis op, because each axis operation is index-independent:

replicate is just per-axis clamp; replicating an already-replicated edge value still gives the edge value.
reflect: O[i,j,k] = I[reflect_idx(i,D), reflect_idx(j,H), reflect_idx(k,W)] factors cleanly into sequential per-axis reflection over the previous result. Corner cells come out identical either way.

So the chain in _split_pads_block preserves semantics for both modes -- no corner-region drift.

Pipeline placement

Confirmed by reading pass_pipeline.py: common::merge_consecutive_paddings lives in the main _PASSES block (line 63), which runs well before _BACKEND_MIL_PASSES. The new mil_backend::split_non_constant_pads sits inside _BACKEND_MIL_PASSES (before sanitize_name_strings), so nothing later in the pipeline re-merges the chain. The placement comment in the diff is accurate.

Tests look right, plus a couple of suggestions

The two new ReplicationPad3d tests are good. Possible additions, in priority order:

ReflectionPad3d regression test -- the pass handles both modes but only replicate is exercised end-to-end. A symmetric and/or asymmetric torch.nn.ReflectionPad3d case would close the loop. Issue ReflectionPad3D fails at runtime #2576 specifically mentions reflect as one of the affected modes.
torch.nn.functional.pad rank-5 reflect/replicate -- the existing test_pad_reflect_replicate only parametrizes rank in {3, 4}. Extending it (or adding a sibling) to rank 5 with a >2D pad would catch any future regression in the functional path.
(Optional) Negative test for constant mode -- a quick assertion that the pass is a no-op when mode == "constant" (e.g., a 5D F.pad(..., mode="constant") test that completes without splitting). test_pad_constant already exercises rank 5 constants and would catch any breakage, so this is nice-to-have rather than required.

Minor

The pass skips op.inputs["pad"].val is None (dynamic pad values). Worth a one-line docstring note that models with a runtime-computed pad still hit the original runtime error -- not a blocker, just a known limit.
Naming: last chunk reuses op.name, intermediates use {op.name}_split_{i} -- good, preserves identifiability and the post-pass sanitize_name_strings handles the new names.

#2701

Happy to close #2701 once this lands -- they cover the same issues (#2576 / #2571) and this is the cleaner fix.

Nice work, @tritolol.

CoreML's ML Program runtime rejects non-constant padding modes (reflect, replicate) when more than 2 dimensions are padded, with: "Padding for more than two dimensions only supports constant mode". Previously, torch.nn.ReplicationPad3d converted silently but failed at prediction time. Add a new mil_backend pass `split_non_constant_pads` that decomposes such pad ops into sequential pads each covering at most 2 dimensions, which CoreML supports. The pass runs in _BACKEND_MIL_PASSES after merge_consecutive_paddings to prevent splits from being merged back. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Extend test_pad_reflect_replicate parametrization to rank 5 to cover the functional path for both reflect and replicate modes on >2 dims - Add test_reflection_pad3d and test_reflection_pad3d_asymmetric regression tests (reflect mode was handled by the pass but untested) - Add docstring note to split_non_constant_pads about the known limitation with dynamic (runtime-computed) pad values Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Explain why the skip applies only to rank 5: the NN backend's add_padding (op_mapping.py:2604) requires all padding outside the last 2 dimensions to be zero. Rank-5 with 3 spatial dims padded violates this; ranks 3 and 4 pad only 1-2 spatial dims and pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

tritolol · 2026-05-20T14:52:03Z

Thanks for the thorough review @LeSingh1! Addressed all points:

Tests added

Extended test_pad_reflect_replicate to rank 5 (range(3, 6)) — covers the functional F.pad path for both reflect and replicate with 3 spatial dims padded
Added test_reflection_pad3d and test_reflection_pad3d_asymmetric — closes the gap where reflect mode was handled by the pass but not exercised end-to-end

Docstring

Added a **Limitation** note to split_non_constant_pads explaining that pad ops with runtime-computed (dynamic) pad values are skipped and will still fail at CoreML runtime

Clarified the NN backend skip

The skip in test_pad_reflect_replicate now has an inline comment pointing to op_mapping.py:2604 and explaining why it only applies to rank 5: the NN backend's add_padding requires all(i == 0 for i in pad[:-4]), which holds for ranks 3–4 (1–2 spatial dims padded) but fails for rank 5 (3 spatial dims padded, so the D-dimension leaks outside the pad[-4:] window)

Also rebased onto the latest main.

TobyRoseman · 2026-05-22T16:43:56Z

@tritolol - can you please explain, in your own words (i.e. not output from Claude), why this is implemented as a pass. Why can't this be implemented at op conversion time? That would be a much more self-contained and straightforward change.

tritolol force-pushed the fix/replication-pad3d-runtime-failure branch from 413106c to 608c8dd Compare May 14, 2026 09:04

TobyRoseman mentioned this pull request May 18, 2026

Raise NotImplementedError at conversion time for reflect/replicate pad >2D (fixes #2576, #2571) #2701

Open

Adrian Bauer and others added 3 commits May 20, 2026 16:42

tritolol force-pushed the fix/replication-pad3d-runtime-failure branch from 608c8dd to 847ad7e Compare May 20, 2026 14:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix ReplicationPad3d failing at CoreML ML Program runtime#2697

Fix ReplicationPad3d failing at CoreML ML Program runtime#2697
tritolol wants to merge 3 commits into
apple:mainfrom
tritolol:fix/replication-pad3d-runtime-failure

tritolol commented May 14, 2026 •

edited

Loading

Uh oh!

LeSingh1 commented May 20, 2026

Uh oh!

tritolol commented May 20, 2026 •

edited

Loading

Uh oh!

TobyRoseman commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

tritolol commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

LeSingh1 commented May 20, 2026

Why the split is semantically safe

Pipeline placement

Tests look right, plus a couple of suggestions

Minor

#2701

Uh oh!

tritolol commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

TobyRoseman commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

tritolol commented May 14, 2026 •

edited

Loading

tritolol commented May 20, 2026 •

edited

Loading