Arm backend: Eliminate redundant NCHW↔NHWC permute_copy and NHWC-safe view_copy transposes in ToTosaMemoryFormatPass (#18167)
3l1 wants to merge 1 commit into pytorch:main
Dr. CI status (automatically generated, updates every 15 minutes)

Artifacts and rendered test results: hud.pytorch.org/pr/pytorch/executorch/18167

As of commit 79951e1 with merge base bb8318d: 8 new failures, 2 cancelled jobs, 2 pending, 3 unrelated failures. Cancelled jobs need a retry. The unrelated failures are either flaky on trunk or already present on the merge base; rebasing onto the `viable/strict` branch avoids the latter.
digantdesai left a comment:
Review automatically exported from Phabricator review in Meta.
…ansposes in ToTosaMemoryFormatPass (pytorch#18167)

Summary:
Pull Request resolved: pytorch#18167

Two optimizations in ToTosaMemoryFormatPass to reduce TOSA TRANSPOSE nodes:

1. NHWC-safe reshape detection: When a 4D→4D view_copy has monotonic shape_indices on the raw shapes and preserves the last dimension (NHWC channel), skip inserting input/output transposes. The view_copy can operate directly on NHWC data.
2. Redundant permute_copy elimination: Model-level permute_copy ops whose permutation matches channels_last_order (NCHW→NHWC) or its inverse (NHWC→NCHW) are redundant with the tosa_dim_order annotation that already handles format conversion. Replace them with view_copy (identity reshape) to avoid generating TOSA TRANSPOSE nodes. Handles both 4D (rank>=4, sr>=2) and 3D (rank>=3, sr>=1) permutations.

This reduces Vela Transpose entries from 75→33 (-56%), Transpose op cycles from 33.4K→6.1K (-82%), and NPU operators from 367→329 (-38).

Reviewed By: digantdesai
Differential Revision: D96432610
…ansposes in ToTosaMemoryFormatPass (pytorch#18167)

Summary: Repeats the two optimizations from the commit message above, attributes the measurements to a specific model, and notes one removal:

For the CC EMG model, this reduces Vela Transpose entries from 75→33 (-56%), Transpose op cycles from 33.4K→6.1K (-82%), and NPU operators from 367→329 (-38).

Also removes the failed ReorderToNHWCPass, which targeted permute_copy→view_copy→permute_copy patterns that don't exist in the Edge IR graph.

Reviewed By: digantdesai
Differential Revision: D96432610
I'm investigating the failing tests...
Summary:
Two optimizations in ToTosaMemoryFormatPass to reduce TOSA TRANSPOSE nodes:

1. **NHWC-safe reshape detection:** When a 4D→4D view_copy has monotonic shape_indices on the raw shapes and preserves the last dimension (NHWC channel), skip inserting input/output transposes. The view_copy can operate directly on NHWC data.
2. **Redundant permute_copy elimination:** Model-level permute_copy ops whose permutation matches channels_last_order (NCHW→NHWC) or its inverse (NHWC→NCHW) are redundant with the tosa_dim_order annotation that already handles format conversion. Replace them with view_copy (identity reshape) to avoid generating TOSA TRANSPOSE nodes.
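
The intuition behind the second optimization can be sketched in plain Python. This is not the code from the pass: `NHWC_ORDER`, `is_layout_permute`, `row_major_offset`, and the reference `permute_copy` below are hypothetical helpers written for illustration, and the sketch assumes the rank-4 channels-last order is `(0, 2, 3, 1)`. It checks that an explicit NCHW→NHWC permute produces the same buffer that channels-last storage already holds, which is why such a permute can reduce to an identity reshape.

```python
from itertools import product

# Assumed rank-4 channels-last order (NCHW -> NHWC); illustrative constant.
NHWC_ORDER = (0, 2, 3, 1)


def inverse_permutation(perm):
    """Return the permutation that undoes `perm`."""
    inverse = [0] * len(perm)
    for out_axis, in_axis in enumerate(perm):
        inverse[in_axis] = out_axis
    return tuple(inverse)


def is_layout_permute(perm, channels_last=NHWC_ORDER):
    """Hypothetical check: does `perm` equal the layout order or its inverse?"""
    return tuple(perm) in (channels_last, inverse_permutation(channels_last))


def row_major_offset(index, shape):
    """Flat offset of `index` in a row-major buffer of the given shape."""
    offset = 0
    for i, size in zip(index, shape):
        offset = offset * size + i
    return offset


def permute_copy(buffer, shape, perm):
    """Reference permute on a row-major buffer: output axis k is input axis perm[k]."""
    out_shape = tuple(shape[axis] for axis in perm)
    out = [None] * len(buffer)
    for out_index in product(*(range(size) for size in out_shape)):
        in_index = [0] * len(shape)
        for out_axis, in_axis in enumerate(perm):
            in_index[in_axis] = out_index[out_axis]
        out[row_major_offset(out_index, out_shape)] = buffer[row_major_offset(in_index, shape)]
    return out, out_shape


# Logical NCHW tensor, given as a row-major buffer.
shape = (2, 3, 4, 5)
n, c, h, w = shape
nchw = list(range(n * c * h * w))

# The same tensor as it would sit in memory when stored channels-last.
nhwc_storage = [
    nchw[row_major_offset((i, k, j, l), shape)]
    for i in range(n) for j in range(h) for l in range(w) for k in range(c)
]

# An explicit NCHW -> NHWC permute yields exactly that buffer, so on data that
# is already stored channels-last only the shape changes (an identity reshape).
permuted, permuted_shape = permute_copy(nchw, shape, NHWC_ORDER)
assert permuted == nhwc_storage
assert permuted_shape == (n, h, w, c)
assert is_layout_permute((0, 3, 1, 2)) and not is_layout_permute((0, 1, 3, 2))
```

The sketch covers only the data-layout argument. How the pass matches nodes and rewrites them to view_copy is not shown here.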
Reviewed By: digantdesai
Differential Revision: D96432610