[SM6.10][HLK] Restructure LinAlg Tests by V-FEXrt · Pull Request #8323 · microsoft/DirectXShaderCompiler

V-FEXrt · 2026-04-01T02:15:41Z

There are several infra improvements in this PR

Use std::variant to not duplicate every variable for every type in the test
Extract common code into several helper functions
Set ELEM_SIZE on every test
BUGFIX: Use ELEM_SIZE as the align parameter for Load/Store
Add a flag that runs an alternative "emulated" shader that has the expected side effects of the real shader. Useful for exercising the infra even though the shader itself can't run
Use F16 type instead of F32/I32 as F16 is the T1 type

Co-authored-by: Alex Sepkowski alexsepkowski@gmail.com

github-actions · 2026-04-01T02:16:30Z

✅ With the latest revision this PR passed the C/C++ code formatter.

alsepkow

LGTM. A few minor comments.

tools/clang/unittests/HLSLExec/LinAlgTests.cpp

damyanp

Superficial review - LGTM.

tools/clang/unittests/HLSLExec/HlslTestDataTypes.h

tools/clang/unittests/HLSLExec/LinAlgTests.cpp

tex3d

There are a lot of changes in this PR.

Some are (mostly) code moves, which I had to do an external comparison on to see changes made during the move (there were some). I wish pure moves and formatting changes could have been made as a separate NFC PR, or at least as a separate commit.

There are also new classes: HLSLNorm_t (and variants), F8E4M3_t, and F8E5M2_t, which are not used anywhere except in isFloatingPointType (which is changed to include these).

I think the new classes should be kept out of this PR, since they are not needed/used by it. Doing so would save some required extra careful review, since these can be tricky.

I reviewed moved code by copy-pasting it into the original locations on a local clone and looking at the diff. This reveals several types of changes:

Fixed spelling in comment: constuctor -> constructor.
inline required for moving to header
Parameter and member name changes:
- ValidationType -> VType
  - Only for some overloads, why?
- ValidationConfig -> Validation
  - If you wanted to avoid the type name collision, why not go with Config, which seems more intuitive?
Deleted comments: // Tolerance is in ULPs. Convert to (int, int64_t) for the comparison.
Deleted extraneous ;
Modified HLSLHalf_t constructor (see comment)

I had some more comments too.

tex3d · 2026-04-01T19:43:31Z

tools/clang/unittests/HLSLExec/HlslTestDataTypes.h

+    return static_cast<bool>(Val) != static_cast<bool>(Other.Val);
+  }
+
+  bool operator<(const HLSLBool_t &Other) const { return Val < Other.Val; }


Whenever we use Val in these operators, shouldn't we cast to bool, since we only ever want to interpret the value as either true or false? Also, where are these comparison operators used?

I realize now that this code has simply been moved. Is there a way to separate moved code changes from other changes?

Moved code yeah. imo the comment should be addressed separately

I don't think the LT, GT, LTE, or GTE are used with HLSLBool_t anymore. Can probably remove them. But I can do that in a subsequent PR.

tex3d · 2026-04-01T20:56:17Z

tools/clang/unittests/HLSLExec/LinAlgTests.cpp

-    WEX::Logging::Log::Result(WEX::Logging::TestResults::Skipped);
+  compileShader(DxcSupport, LoadStoreShader, Target.c_str(), Args, Verbose);
+
+  if (shouldSkipBecauseSM610Unsupported(Device))


Shouldn't we always end the test here, since if we don't have a device in HLK mode, we still can't proceed, and we must have emitted an error earlier when trying to create the device, right?

I think its helpful to talk about when we intend to skip. (Specifically, skip, not end)

Right now there is a useful flow through the test that is basically

try to get a SM6.10 device

fail because non exist yet

keep going anyway and use dxc to compile the SM6.10 shader

if it compiles exit with skip (setup was okay, but we can't run yet)

if it doesn't, exit with fail (can't compile the SM6.10 shader, that's a bug that shouldn't pass CI)

This function is in service of that flow. If we have a device or if we are in HLK mode we should always just continue the test and let something else be the cause of failure

Generally, it's odd to proceed with a null Device with any other operations. That's a setup failure, and if we're on HLK, we should abort at that point, otherwise, we should skip at that point. I understand the desire to still try a shader compile at this time because you don't have a device that supports 6.10 at all, and that's fine for now, but this is a test-development-time convenience, and it doesn't justify executing test code beyond this point.

if we are in HLK mode we should always just continue the test and let something else be the cause of failure

I don't think so. I think we should abort the test with a failure if we failed to create the device (which would have logged an error). You return at this point when not HLK because the usefulness of proceeding has ended. The same goes for HLK, but usefulness of proceeding ended even earlier (there's no point in going through with anything else in the HLK if device creation failed).

Any other failure that happens down the line as a result of a null Device will just add noise to the failing test result that needs to be ignored (the real failure was the device creation failure).

I think you want to return from this point if (!Device) whether or not you're in HLK mode. The difference is whether you log a comment and call this a skip or log an error and abort.

tools/clang/unittests/HLSLExec/HlslTestDataTypes.h

tex3d · 2026-04-01T22:12:08Z

tools/clang/unittests/HLSLExec/LinAlgTests.cpp


+  if (EmulateTest) {
+    hlsl_test::LogWarningFmt(L"EmulateTest flag set. Tests are NOT REAL");
+    return D3D12SDK->createDevice(&D3DDevice, D3D_SHADER_MODEL_6_8, false);


We shouldn't create the device twice when EmulateTest is specified. This will break if the first device create succeeded (device reports SM 6.10 support).

My thoughts on this were that the EmulateTest flag takes precedent. Even if we successfully create the SM6.10 device the downstream tests are still going to run the emulated shaders. If we don't take this branch then it'll look like the tests pass when they didn't actually run at all. Maybe the fix is to put this branch above the one that attempts to create the SM6.10 device.

Thoughts?

Yes, you should put it above the other device creation. Otherwise, the non-null D3DDevice CComPtr will cause the second call to fail because the pointer is already initialized.

I would normally recommend you move this into setupClass and call this createDevice() method instead of embedding the logic inside this createDevice() method, but I see that it's really a convenient way for you to keep going when you have no valid device so you can perform compilation before skipping/aborting, and to use the same emulation path when recreating a device in setupMethod. So perhaps it's ok to just move it earlier.

It's unexpected that your createDevice always returns true, even when it fails, unless it's in HLK mode. That's counterintuitive. I get that you return this from setupClass, which means it would fail on device creation failure, but that specific decision probably belongs in setupClass, not inside createDevice. And by extension, in setupMethod when recreating the device. That way it's more obvious you've chosen to proceed with a device creation failure in this code.

Your createDevice seems to invert the SkipUnsupported parameter when calling D3D12SDK->createDevice. If this flag is true, D3D12SDK->createDevice will register a skip. You do not want that in HLK mode, yet you initialize it to FailIfRequirementsNotMet which is true in HLK mode. If you'd rather control test skips separately, you should just pass false here instead.

Also, regarding FailIfRequirementsNotMet, when in HLK mode, your log message says it's failing because this FailIfRequirementsNotMet is set. But this is a minute internal code detail of this particular method, not visible to the testing interface. I think it should suffice to say that it is a device creation failure for the required shader model support.

Also, in setupMethod, you assume/log that the device was lost if you have a null D3DDevice, or it's been removed. But you also explicitly proceed if device creation fails, resulting in a null D3DDevice, just so you can try compilation before aborting. So you're going to get a bunch of these misleading "device lost" messages, unless you update the method to account for this condition.

V-FEXrt · 2026-04-01T22:35:45Z

I wish pure moves and formatting changes could have been made as a separate NFC PR, or at least as a separate commit.

The vast majority of the code moves were triggered at the 11th hour of a pending deadline to satisfy your requirement of the use of F16 for the smoke test. There wasn't time to split this up into multiple PRs and get the necessary round of reviews and merge by deadlines. Multiple commits may have been possible, but I had a bunch of other pending changes before the last minute request came through. I'll break this up into several PRs now and adjust expectations on when things will land

tex3d · 2026-04-01T22:39:58Z

I wish pure moves and formatting changes could have been made as a separate NFC PR, or at least as a separate commit.

The vast majority of the code moves were triggered at the 11th hour of a pending deadline to satisfy your requirement of the use of F16 for the smoke test. There wasn't time to split this up into multiple PRs and get the necessary round of reviews and merge by deadlines. Multiple commits may have been possible, but I had a bunch of other pending changes before the last minute request came through. I'll break this up into several PRs now and adjust expectations on when things will land

I'm not asking you to break it up now. But I do have some feedback about removing unused additions and a few other changes.

V-FEXrt · 2026-04-02T01:11:19Z

@tex3d

It turns out it was easiest start from "scratch" on the code moves and just move the exact minimal set of code vs trying to go back and handle each case. You can now review by commit where 3eb0d28 does the moves and 70fefce does the rest.

I'm going to pull in a couple quick fixes (align => 128, fix a messy switch I wasn't happy with) but all the changes in the moved code has been reverted.

If there are any of those changes you'd like to see kept, please not them and I'll get them included in a separate PR

tex3d · 2026-04-02T22:03:38Z

tools/clang/unittests/HLSLExec/LinAlgTests.cpp

+                             bool Verbose, HLSLHalf_t Tolerance = 0.0f) {
+  bool Success = true;
+  for (size_t I = 0; I < Count; I++) {
+    HLSLHalf_t Diff = Actual[I] - Expected[I];


Why aren't these using comparison methods from HlslTestUtils.h?

tex3d · 2026-04-02T22:24:53Z

tools/clang/unittests/HLSLExec/LinAlgTests.cpp

-    WEX::Logging::Log::Result(WEX::Logging::TestResults::Skipped);
+  compileShader(DxcSupport, LoadStoreShader, Target.c_str(), Args, Verbose);
+
+  if (shouldSkipBecauseSM610Unsupported(Device))


Generally, it's odd to proceed with a null Device with any other operations. That's a setup failure, and if we're on HLK, we should abort at that point, otherwise, we should skip at that point. I understand the desire to still try a shader compile at this time because you don't have a device that supports 6.10 at all, and that's fine for now, but this is a test-development-time convenience, and it doesn't justify executing test code beyond this point.

if we are in HLK mode we should always just continue the test and let something else be the cause of failure

I don't think so. I think we should abort the test with a failure if we failed to create the device (which would have logged an error). You return at this point when not HLK because the usefulness of proceeding has ended. The same goes for HLK, but usefulness of proceeding ended even earlier (there's no point in going through with anything else in the HLK if device creation failed).

Any other failure that happens down the line as a result of a null Device will just add noise to the failing test result that needs to be ignored (the real failure was the device creation failure).

I think you want to return from this point if (!Device) whether or not you're in HLK mode. The difference is whether you log a comment and call this a skip or log an error and abort.

tex3d · 2026-04-02T22:58:08Z

tools/clang/unittests/HLSLExec/LinAlgTests.cpp


+  if (EmulateTest) {
+    hlsl_test::LogWarningFmt(L"EmulateTest flag set. Tests are NOT REAL");
+    return D3D12SDK->createDevice(&D3DDevice, D3D_SHADER_MODEL_6_8, false);


Yes, you should put it above the other device creation. Otherwise, the non-null D3DDevice CComPtr will cause the second call to fail because the pointer is already initialized.

I would normally recommend you move this into setupClass and call this createDevice() method instead of embedding the logic inside this createDevice() method, but I see that it's really a convenient way for you to keep going when you have no valid device so you can perform compilation before skipping/aborting, and to use the same emulation path when recreating a device in setupMethod. So perhaps it's ok to just move it earlier.

It's unexpected that your createDevice always returns true, even when it fails, unless it's in HLK mode. That's counterintuitive. I get that you return this from setupClass, which means it would fail on device creation failure, but that specific decision probably belongs in setupClass, not inside createDevice. And by extension, in setupMethod when recreating the device. That way it's more obvious you've chosen to proceed with a device creation failure in this code.

Your createDevice seems to invert the SkipUnsupported parameter when calling D3D12SDK->createDevice. If this flag is true, D3D12SDK->createDevice will register a skip. You do not want that in HLK mode, yet you initialize it to FailIfRequirementsNotMet which is true in HLK mode. If you'd rather control test skips separately, you should just pass false here instead.

Also, regarding FailIfRequirementsNotMet, when in HLK mode, your log message says it's failing because this FailIfRequirementsNotMet is set. But this is a minute internal code detail of this particular method, not visible to the testing interface. I think it should suffice to say that it is a device creation failure for the required shader model support.

Also, in setupMethod, you assume/log that the device was lost if you have a null D3DDevice, or it's been removed. But you also explicitly proceed if device creation fails, resulting in a null D3DDevice, just so you can try compilation before aborting. So you're going to get a bunch of these misleading "device lost" messages, unless you update the method to account for this condition.

tex3d

Approving assuming some follow-up adjustments are coming to address some concerns in a subsequent PR, since none should be blocking for the merge now.

V-FEXrt · 2026-04-03T01:42:55Z

Thanks Tex! Here is the tracking issue! I'll be working on this feedback first thing tomorrow #8332

github-project-automation bot added this to HLSL Roadmap Apr 1, 2026

github-project-automation bot moved this to New in HLSL Roadmap Apr 1, 2026

V-FEXrt marked this pull request as ready for review April 1, 2026 04:23

alsepkow approved these changes Apr 1, 2026

View reviewed changes

tools/clang/unittests/HLSLExec/LinAlgTests.cpp Show resolved Hide resolved

tools/clang/unittests/HLSLExec/LinAlgTests.cpp Show resolved Hide resolved

tools/clang/unittests/HLSLExec/LinAlgTests.cpp Outdated Show resolved Hide resolved

damyanp reviewed Apr 1, 2026

View reviewed changes

tools/clang/unittests/HLSLExec/HlslTestDataTypes.h Outdated Show resolved Hide resolved

tools/clang/unittests/HLSLExec/HlslTestDataTypes.h Outdated Show resolved Hide resolved

tools/clang/unittests/HLSLExec/LinAlgTests.cpp Outdated Show resolved Hide resolved

V-FEXrt mentioned this pull request Apr 1, 2026

Remove EmulateTest Flag #8327

Open

alsepkow approved these changes Apr 1, 2026

View reviewed changes

tex3d reviewed Apr 1, 2026

View reviewed changes

V-FEXrt added 2 commits April 1, 2026 18:37

[NFC] Move common HLSL types

3eb0d28

[SM6.10][HLK] Restructure LinAlg Tests

70fefce

V-FEXrt force-pushed the linalg-hlk-infra branch from 5c10f18 to 70fefce Compare April 2, 2026 01:03

V-FEXrt added 3 commits April 1, 2026 19:21

Simplify extra switch verify to use existing helpers

62fffa6

align to 128 per feedback

70f5fcf

format

6627072

tex3d reviewed Apr 2, 2026

View reviewed changes

tex3d approved these changes Apr 3, 2026

View reviewed changes

V-FEXrt mentioned this pull request Apr 3, 2026

Basic LinAlg HLK - Framework fixes #8332

Open

V-FEXrt merged commit eb67a90 into microsoft:main Apr 3, 2026
12 checks passed

github-project-automation bot moved this from New to Done in HLSL Roadmap Apr 3, 2026

V-FEXrt deleted the linalg-hlk-infra branch April 3, 2026 01:43

Conversation

V-FEXrt commented Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alsepkow left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

damyanp left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

tex3d left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

V-FEXrt commented Apr 1, 2026

Uh oh!

tex3d commented Apr 1, 2026

Uh oh!

V-FEXrt commented Apr 2, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tex3d left a comment

Choose a reason for hiding this comment

Uh oh!

V-FEXrt commented Apr 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

V-FEXrt commented Apr 1, 2026 •

edited

Loading

github-actions bot commented Apr 1, 2026 •

edited

Loading