Summary:
Port the existing convert_conv_to_fc optimization from Vela's TFLite
graph optimiser to the TOSA graph optimiser (Python), and add detection of
1x1 Conv2D with a 1x1 spatial IFM to Regor's RewriteFullyConnected (C++).
Targets Vela 5.0.0 (5.0.0._git_0f8edf7), the current active default.
ExecuTorch's DecomposeLinearPass converts nn.Linear to a 1x1 Conv2D plus
views, which runs on the ConvolutionMxN NPU block at 0.67% MAC utilization.
Converting it to FullyConnected uses the VectorProduct block instead and
reaches ~5% MAC utilization (7.35x fewer cycles: 9,858 -> 1,341 per layer).
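The rewrite preserves values because, with a 1x1 kernel and a 1x1 spatial IFM, each output channel is just a dot product over the input channels, which is what FullyConnected computes. A minimal pure-Python sketch of that equivalence follows. The function names are hypothetical and are not Vela or Regor APIs; it assumes an NHWC IFM, HWIO weights, stride 1, no padding, and ignores bias and quantization.

```python
def conv2d_1x1(ifm, weights):
    """Naive NHWC Conv2D restricted to a 1x1 kernel, stride 1, no padding.

    ifm:     [N][H][W][C_in] nested lists
    weights: [1][1][C_in][C_out] nested lists (HWIO layout, assumed here)
    """
    kernel = weights[0][0]  # the single kernel tap: [C_in][C_out]
    c_out = len(kernel[0])
    return [
        [
            [
                [sum(px[ci] * kernel[ci][co] for ci in range(len(px)))
                 for co in range(c_out)]
                for px in row
            ]
            for row in batch
        ]
        for batch in ifm
    ]


def fully_connected(vectors, matrix):
    """vectors: [N][C_in], matrix: [C_in][C_out] -> [N][C_out]."""
    return [
        [sum(v[ci] * matrix[ci][co] for ci in range(len(v)))
         for co in range(len(matrix[0]))]
        for v in vectors
    ]


# 1x1 spatial IFM with 3 input channels and 2 output channels.
ifm = [[[[1, 2, 3]]]]                     # shape [1][1][1][3]
weights = [[[[1, 0], [0, 1], [2, -1]]]]   # shape [1][1][3][2]

conv_out = conv2d_1x1(ifm, weights)
# Dropping the 1x1 spatial dims gives the FullyConnected operands.
fc_out = fully_connected([ifm[0][0][0]], weights[0][0])
```

Here conv_out[0][0][0] and fc_out[0] hold the same values, so only the NPU block used to compute them changes.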
Python change: Add convert_conv_to_fc to op_rewrite_list in tosa_graph_optimiser.py.
C++ change: Add is1x1Conv && isSpatial1x1 detection in graphir_optimiser.cpp.
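The detection condition can be sketched as a small predicate. This is an illustration only: the function name and argument layout are assumptions and do not mirror the actual code in graphir_optimiser.cpp or tosa_graph_optimiser.py. It assumes the kernel is given as (height, width) and the IFM shape as NHWC.

```python
def is_conv_rewritable_as_fc(kernel_hw, ifm_nhwc):
    """Return True when a Conv2D is a candidate for the FullyConnected rewrite.

    kernel_hw: (kernel_height, kernel_width)
    ifm_nhwc:  (batch, height, width, channels)
    """
    kernel_h, kernel_w = kernel_hw
    _, ifm_h, ifm_w, _ = ifm_nhwc
    is_1x1_conv = kernel_h == 1 and kernel_w == 1      # mirrors is1x1Conv
    is_spatial_1x1 = ifm_h == 1 and ifm_w == 1         # mirrors isSpatial1x1
    return is_1x1_conv and is_spatial_1x1
```

Both conditions are needed: a 1x1 kernel over a larger spatial IFM still produces one output vector per pixel, so it is not a single vector product.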
Privacy: This diff is a compile-time compiler optimization only. No user
data is collected, shared, processed, or logged. It transforms graph nodes
(Conv2D -> FullyConnected) in ARM's Vela/Regor NPU compiler at build time.
Classification: None of the privacy categories apply.
Differential Revision: D96932767