[IE] Guard ConvertFCToConv against zero-sized channel dimensions #265
…galOp

Replace addIllegalOp<IE::FullyConnectedOp>() with addDynamicallyLegalOp using a predicate that returns true (legal/exempt) for FC ops with non-rank-2 shapes or any dimension <= 0. Zero-dim FC ops produced by per-group INT4 quantization decomposition now survive the pass unchanged instead of triggering a SIGABRT in ConvolutionOp type inference.

Follows the existing addDynamicallyLegalOp pattern established in AdjustNCEOpsWithI32InputsPass::safeRunOnFunc(). Retains a defense-in-depth guard in matchAndRewrite() as a belt-and-suspenders safety net (unreachable under normal conditions).

Fixes: openvinotoolkit/openvino#34450
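The predicate described in the commit message can be sketched in standalone C++ as follows. This is a minimal illustration, not the code from the PR: plain `std::vector<int64_t>` shapes stand in for MLIR tensor types, the names `hasUnsupportedShape`, `isFullyConnectedLegal`, and `toConvShape` are hypothetical, and checking both the input and the weights operand is an assumption, since the description does not say which operands the real predicate inspects.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <vector>

// Stand-in for an MLIR ranked tensor shape (assumption for illustration).
using Shape = std::vector<int64_t>;

// Hypothetical helper: true when a shape must not be converted,
// i.e. it is not rank 2 or has at least one dimension <= 0.
bool hasUnsupportedShape(const Shape& shape) {
    if (shape.size() != 2) {
        return true;
    }
    return std::any_of(shape.begin(), shape.end(),
                       [](int64_t dim) { return dim <= 0; });
}

// Mirrors the legality predicate from the description:
// true  -> "legal": the FC op is exempt and survives the pass unchanged;
// false -> "illegal": the FC op must be converted to a convolution.
// Checking both operands is an assumption of this sketch.
bool isFullyConnectedLegal(const Shape& input, const Shape& weights) {
    return hasUnsupportedShape(input) || hasUnsupportedShape(weights);
}

// Mirrors the defense-in-depth guard: refuse to build the 4-D
// {N, C, 1, 1} shape when the 2-D operand is not safe to reshape.
std::optional<Shape> toConvShape(const Shape& shape) {
    if (hasUnsupportedShape(shape)) {
        return std::nullopt;
    }
    return Shape{shape[0], shape[1], 1, 1};
}
```

In this sketch, an FC op with a `{1, 0}` input is reported as legal and left alone, while a `{1, 128}` input with `{64, 128}` weights is reported as illegal and would be reshaped to `{1, 128, 1, 1}` for conversion.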
956d5e6 to ebf5603
|
Rebased against current develop. This is the primary fix for the SIGABRT crash reported in openvinotoolkit/openvino#34450 (PSE, assigned). The addDynamicallyLegalOp approach follows the existing pattern in AdjustNCEOpsWithI32InputsPass. LIT test included. |
andrey-golubev
left a comment
Hi, thanks for your contribution! I believe the root cause here stems from the fact that an operation with a zero-dim tensor exists at all. This should be prohibited by the NPU compiler (as in the other PR opened, #266).
A similar rationale applies: if the problem originates in OpenVINO, I think we have to speak with the OpenVINO maintainers to fix it. If the problem originates in the compiler, it has to be fixed in the place where such a zero-dim tensor appears.
|
@andrey-golubev, I've posted a detailed response on #266 covering the root cause investigation and what I can and cannot determine about the zero-dim origin. Short version: I accept the architectural direction. This guard is defense-in-depth, not the long-term fix. The zero-dim shape is not present at … This PR (#265) is the primary crash prevention at the … |
[IE] Guard ConvertFCToConv against zero-sized channel dimensions
Summary
ConvertFCToConvPass::safeRunOnFunc() unconditionally marks IE::FullyConnectedOp as illegal via target.addIllegalOp, forcing every FC op through conversion to IE::ConvolutionOp. FullyConnectedOpConverter::matchAndRewrite() reshapes 2-D FC operands to 4-D {N, C, 1, 1} without validating that the channel dimension is positive. When per-group INT4 quantization decomposition (e.g. group_size=128) produces IE::FullyConnectedOp nodes whose channel dimension is zero, the reshape yields tensor<Nx0x1x1>, which causes IE::ConvolutionOp type inference to abort with an error at location as_convolution. The SIGABRT kills the process immediately and cannot be caught by the caller.

This PR replaces addIllegalOp<IE::FullyConnectedOp>() with addDynamicallyLegalOp<IE::FullyConnectedOp>() using a predicate that exempts zero-dim and non-rank-2 FC ops from the illegality constraint. Exempt FC ops are considered "legal" and survive the pass unchanged. Valid FC ops remain "illegal" and are converted to convolutions as before.

A defense-in-depth guard in matchAndRewrite() is also retained as a belt-and-suspenders safety net.

Reproduction
Observed when compiling Qwen3-0.6B (28 layers, INT4 grouped quantization, openvino-int4-npu format) for the NPU as a speculative decoding draft model in OpenVINO GenAI heterogeneous mode (GPU target + NPU draft).

The VPUX compiler aborts during model compilation (before inference) at the self_attn.v_proj linear layer in transformer block 0.

Environment: Intel Core Ultra 7 258V (Lunar Lake), NPU driver 32.0.100.4514, OpenVINO 2026.0, Windows 11 Pro.
Root Cause
The location trail ["fc_decomposed", "matmul_0", "as_convolution"] reveals a multi-pass interaction:

1. GroupWisePatternRewriter (decompose multi-ZP quantization) decomposes per-group INT4 FCs into main/correction branches, producing IE::FullyConnectedOp with the fc_decomposed tag.
2. UnrollFullyConnected unrolls the decomposed FC, producing sub-FCs with the matmul_0 tag, one of which has a zero-dim input.
3. ConvertFCToConv attempts to convert the zero-dim FC to IE::ConvolutionOp, hitting the SIGABRT.

The zero-dim FC is a valid intermediate IR artifact from the multi-pass interaction. ConvertFCToConv must tolerate it.

Changes
src/vpux_compiler/src/dialect/IE/transforms/passes/convert_fc_to_conv.cpp:

- safeRunOnFunc(): Replaced target.addIllegalOp<IE::FullyConnectedOp>() with target.addDynamicallyLegalOp<IE::FullyConnectedOp>(predicate). The predicate returns true (legal/exempt) for FC ops with non-rank-2 shapes or any dimension ≤ 0, and false (illegal/must-convert) for valid 2-D FC ops. This follows the existing pattern in AdjustNCEOpsWithI32InputsPass::safeRunOnFunc().
- matchAndRewrite(): Retained the dimension guard as defense-in-depth (unreachable under normal conditions due to the predicate, but protects against future changes).

tests/lit/NPU/dialect/IE/passes/convert_fc_to_conv_zero_dim_guard.mlir:

- Changed from a negative test (not vpux-opt, checking failed to legalize) to a positive test: the zero-dim IE.FullyConnected now survives the pass unchanged.
- CHECK-LABEL: @PreserveZeroDimFC / CHECK: IE.FullyConnected confirms the op is preserved.

Testing
- @PreserveZeroDimFC: vpux-opt exits with code 0, and FileCheck confirms IE.FullyConnected survives in the output IR.
- The convert_fc_to_conv.mlir positive test is unaffected: valid FC ops are still converted to convolutions.

AI Usage
- This PR was developed with AI assistance (GitHub Copilot) for source code analysis, fix implementation, and PR description drafting. All code was reviewed and validated by the contributor.
Related