Skip to content

Signal when SPIR-V legalization needs loop unroll#15

Draft
AnastaZIuk wants to merge 6 commits intomainfrom
unroll
Draft

Signal when SPIR-V legalization needs loop unroll#15
AnastaZIuk wants to merge 6 commits intomainfrom
unroll

Conversation

@AnastaZIuk
Copy link
Copy Markdown
Member

@AnastaZIuk AnastaZIuk commented Mar 20, 2026

Summary

  • add dedicated SpirvEmitter bits for legalization-time loop unroll and legalization-time SSA rewrite
  • set those bits only for the known lowering paths that still need those expensive transforms for correctness
  • pass the bits into the narrowed SPIR-V legalization overload
  • update a few CodeGenSPIRV checks so they validate the semantically relevant SPIR-V shape instead of the older inflated IR shape
  • update the nested SPIR-V submodule pointers so this branch materializes the validated companion optimizer state
  • use the DXC trunk Godbolt reproducer, which is the preprocessed output of our path tracer at about 58k LoC

Root cause

The expensive blanket policy lives in the SPIR-V optimizer recipe, not in HLSL parsing or the DXC CLI front-end.

DXC is still the producer that knows when a specific lowering path genuinely needs those expensive legalization transforms for correctness, so the optimizer should not have to guess that globally.

Two confirmed correctness-sensitive cases are:

  • variable image sample offsets, which still need materialized loop unroll to legalize into forms such as ConstOffset
  • combined image sampler conversion, which does not need loop unroll but still needs legalize-time SSA rewrite cleanup to avoid invalid SPIR-V such as storing sampled image types

This patch provides that producer-side signal. The companion optimizer change in KhronosGroup/SPIRV-Tools#6612 then uses it to avoid the broader blanket defaults.

Validation

  • reproducer: godbolt.org/z/o5xf1hq36 (note: Compiler Explorer cache can make repeated runs look much faster than a cold compile)
  • shader payload: preprocessed output of our path tracer at about 58k LoC
  • local machine: AMD Ryzen 5 5600G with Radeon Graphics, 6 physical cores, 12 logical processors, Windows-reported max clock 3901 MHz
  • on the same payload and the same machine, SPIRV-Tools@487ff843bd8a + DXC@bd9a8b1c5365 reduced the workload from 19.161 s to 6.042 s
  • with SPIRV-Tools@57007cf46bb4 + DXC@b02b772e0b50, the same payload measured 4.702 s
  • with the current branch pair SPIRV-Tools@7134be5024ff + DXC@55112e338fd2, the same payload now measures 2.464 s
  • full local CodeGenSPIRV lit/FileCheck passes: 1403 expected passes, 2 expected failures, 0 unexpected

Companion SPIR-V optimizer PR:
KhronosGroup/SPIRV-Tools#6612

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 20, 2026

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 54afd2c6a5f1a5bcbef66967515fa69f5a3650fd 899afef2a8665fa9c861b6b7be8562bf7efb850e -- include/dxc/Support/SPIRVOptions.h lib/DxcSupport/HLSLOptions.cpp tools/clang/lib/Frontend/Rewrite/RewriteObjC.cpp tools/clang/lib/SPIRV/SpirvEmitter.cpp tools/clang/lib/SPIRV/SpirvEmitter.h
View the diff from clang-format here.
diff --git a/lib/DxcSupport/HLSLOptions.cpp b/lib/DxcSupport/HLSLOptions.cpp
index 0c43b9ff..bda90be7 100644
--- a/lib/DxcSupport/HLSLOptions.cpp
+++ b/lib/DxcSupport/HLSLOptions.cpp
@@ -792,9 +792,8 @@ int ReadDxcOpts(const OptTable *optionTable, unsigned flagsToInclude,
   }
 
   opts.DisableOptimizations = false;
-  if (Arg *A =
-          Args.getLastArg(OPT_O0, OPT_O1, OPT_O1experimental, OPT_O2, OPT_O3,
-                          OPT_Od)) {
+  if (Arg *A = Args.getLastArg(OPT_O0, OPT_O1, OPT_O1experimental, OPT_O2,
+                               OPT_O3, OPT_Od)) {
     if (A->getOption().matches(OPT_O0))
       opts.OptLevel = 0;
     if (A->getOption().matches(OPT_O1))
@@ -1267,8 +1266,7 @@ int ReadDxcOpts(const OptTable *optionTable, unsigned flagsToInclude,
       errors << "-Oconfig should not be specified more than once";
       return 1;
     }
-    if (Args.getLastArg(OPT_O0, OPT_O1, OPT_O1experimental, OPT_O2,
-                        OPT_O3)) {
+    if (Args.getLastArg(OPT_O0, OPT_O1, OPT_O1experimental, OPT_O2, OPT_O3)) {
       errors << "-Oconfig should not be used together with -O";
       return 1;
     }
diff --git a/tools/clang/lib/SPIRV/SpirvEmitter.cpp b/tools/clang/lib/SPIRV/SpirvEmitter.cpp
index 82b53abb..f569d42a 100644
--- a/tools/clang/lib/SPIRV/SpirvEmitter.cpp
+++ b/tools/clang/lib/SPIRV/SpirvEmitter.cpp
@@ -593,10 +593,9 @@ SpirvEmitter::SpirvEmitter(CompilerInstance &ci)
       constEvaluator(astContext, spvBuilder), entryFunction(nullptr),
       curFunction(nullptr), curThis(nullptr), seenPushConstantAt(),
       isSpecConstantMode(false), needsLegalization(false),
-      needsLegalizationLoopUnroll(false),
-      needsLegalizationSsaRewrite(false),
-      sawExplicitUnrollHint(false),
-      beforeHlslLegalization(false), mainSourceFile(nullptr) {
+      needsLegalizationLoopUnroll(false), needsLegalizationSsaRewrite(false),
+      sawExplicitUnrollHint(false), beforeHlslLegalization(false),
+      mainSourceFile(nullptr) {
 
   // Get ShaderModel from command line hlsl profile option.
   const hlsl::ShaderModel *shaderModel =
@@ -972,8 +971,7 @@ void SpirvEmitter::HandleTranslationUnit(ASTContext &context) {
       !dsetbindingsToCombineImageSampler.empty() ||
       spirvOptions.signaturePacking;
   needsLegalizationSsaRewrite =
-      needsLegalizationSsaRewrite ||
-      !dsetbindingsToCombineImageSampler.empty();
+      needsLegalizationSsaRewrite || !dsetbindingsToCombineImageSampler.empty();
 
   // Run legalization passes
   if (spirvOptions.codeGenHighLevel) {
  • Check this box to apply formatting changes to this branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant