LLVM and SPIRV-LLVM-Translator pulldown (WW08 2026)#21335

Draft
iclsrc wants to merge 3723 commits into sycl from llvmspirv_pulldown

@iclsrc iclsrc commented Feb 20, 2026

bababuck and others added 30 commits February 9, 2026 14:48
Compressing to a single shuffle doesn't remove any information and the backend can better apply specific optimizations to a single shuffle.

Addresses #176218.
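
One way to see why merging loses no information: two chained single-source shuffles can always be written as one shuffle with a composed mask. The sketch below only illustrates that property; it is not the transformation in the patch, and `shuffle` and `composeMasks` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Single-source shuffle: result[i] = input[mask[i]].
std::vector<int> shuffle(const std::vector<int> &input,
                         const std::vector<std::size_t> &mask) {
  std::vector<int> result;
  result.reserve(mask.size());
  for (std::size_t index : mask) {
    assert(index < input.size() && "mask index out of range");
    result.push_back(input[index]);
  }
  return result;
}

// Build one mask such that
//   shuffle(shuffle(v, first), second) == shuffle(v, composeMasks(first, second)).
std::vector<std::size_t> composeMasks(const std::vector<std::size_t> &first,
                                      const std::vector<std::size_t> &second) {
  std::vector<std::size_t> composed;
  composed.reserve(second.size());
  for (std::size_t index : second) {
    assert(index < first.size() && "second mask indexes past first mask");
    composed.push_back(first[index]);
  }
  return composed;
}
```

With `first = {3, 2, 1, 0}` and `second = {0, 0, 2, 3}`, the composed mask is `{3, 3, 1, 0}`, so a backend sees the whole permutation in one place.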

---------

Co-authored-by: Luke Lau <luke_lau@igalia.com>
…9985)

Extend operands when computing ub - lb to avoid overflow in signed
arithmetic. E.g., i8: ub=127, lb=-128 yields 255, which overflows
without extension.
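
The i8 case above can be reproduced in plain C++. This is a minimal sketch of the arithmetic only, not the code in the patch; `extendedRange` and `truncatedRange` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Widen both bounds to 16 bits before subtracting, so the difference
// (up to 255 for 8-bit inputs) stays representable.
inline std::int16_t extendedRange(std::int8_t ub, std::int8_t lb) {
  return static_cast<std::int16_t>(static_cast<std::int16_t>(ub) -
                                   static_cast<std::int16_t>(lb));
}

// Model of the unextended computation: the result is forced back into
// 8 bits, where 255 cannot be represented (it wraps on two's-complement
// targets).
inline std::int8_t truncatedRange(std::int8_t ub, std::int8_t lb) {
  return static_cast<std::int8_t>(ub - lb);
}

// Usage: the bounds from the commit message.
inline void rangeExample() {
  assert(extendedRange(127, -128) == 255);
  assert(truncatedRange(127, -128) != 255);
}
```
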
Adds missing test coverage for reductions with intermediate stores,
including partial reductions with intermediate stores, as well as
chained min/max reductions with intermediate stores.
Clinger fast path bloats baremetal targets which are constrained in
binary size. Disabling it for baremetal libc builds.
The code seems to have considered the potential problem but did not
quite succeed in solving it ;)
Use new UTC support to re-generate check lines.
An AI told me these were missing and helped me add them.
…296)

The logic of `is_x86_64_non_windows` looks unnecessarily complicated and
is only used at one site... clean up the unused targets and refactor
x86_64 BLAKE3 asm sources into a separate filegroup. And then
`is_x86_64_non_windows` can be put inside a default condition.
Add set of FindLast tests where the selected expression is based on an
IV and could be sunk.
…serves X15 (#179738)

The target function to be checked by the Control Flow Guard Check
function is stored in `X15` on AArch64. This register is guaranteed to
be preserved by that function (on success), thus after it returns `X15`
can be used to branch to the target function instead of having to load
it from another register or the stack.
…180218)

Fixes an error in handling constant integers with a width in the (64; 128) range.
Found during review of
llvm/llvm-project#180182
…n (#180347)

Update stale links and remove duplication in table.
  CONFLICT (content): Merge conflict in llvm/lib/Transforms/IPO/IPO.cpp
…ared-libsan` (#164842)

This PR contains two commits:
- Add required dependencies when using `-shared-libsan` and fuzzer.
Since libFuzzer is a static library, we need to make sure that we add its
dependencies when building with `-shared-libsan`. E.g., libFuzzer uses
`ceilf()` from `libm.so` when building with a GNU toolchain.
Previously, the resulting command did not contain the required link
libraries, causing build failures
(only a static sanitizer runtime would trigger the call to
`linkSanitizerRuntimeDeps`).
    
- Correcting dependency order when using fuzzer.
When building using `-shared-libsan` the sanitizer library needs to be
first in link order.
Since the fuzzer requires `-lstdc++` we have to make sure that the
sanitizer library is added before `-lstdc++`.

---------

Signed-off-by: Björn Svensson <bjorn.a.svensson@est.tech>
Get the shared cache filepath and uuid that the inferior process is
using from debugserver, try to open that shared cache on the lldb host
mac and if the UUID matches, index all of the binaries in that shared
cache. When looking for binaries loaded in the process, get them from
the already-indexed shared cache.

Every time a binary is loaded, PlatformMacOSX may query the shared cache
filepath and uuid from the Process, and pass that to
HostInfo::GetSharedCacheImageInfo() if available (else fall back to the
old HostInfo::GetSharedCacheImageInfo method which only looks at lldb's
own shared cache), to get the file being requested.

ProcessGDBRemote caches the shared cache filepath and uuid from the
inferior, once it has a non-zero UUID. I added a lock for this ivar
specifically, so I don't have 20 threads all asking for the shared cache
information from debugserver and updating the cached answer. If we never
get back a non-zero UUID shared cache reply, we will re-query at every
library loaded notification. debugserver has been providing the shared
cache UUID since 2013, although I only added the shared cache filepath
field last November.
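
The caching behaviour described above (one lock, re-query until a valid reply arrives, then reuse the answer) can be sketched as follows. This is an assumption-laden illustration, not the lldb implementation: `SharedCacheInfoSketch`, `SharedCacheInfoCache`, and the use of an empty string to stand in for a zero UUID are all hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the reply from the debug stub.
// An empty uuid models "no non-zero UUID reported yet".
struct SharedCacheInfoSketch {
  std::string path;
  std::string uuid;
};

class SharedCacheInfoCache {
public:
  explicit SharedCacheInfoCache(std::function<SharedCacheInfoSketch()> query)
      : query_(std::move(query)) {
    assert(query_ && "query callback must be callable");
  }

  // Returns the cached reply, querying again only while no valid
  // UUID has been seen. The mutex keeps concurrent callers from
  // issuing duplicate queries and racing on the cached value.
  SharedCacheInfoSketch get() {
    std::lock_guard<std::mutex> guard(mutex_);
    if (cached_.uuid.empty())
      cached_ = query_();
    return cached_;
  }

private:
  std::mutex mutex_;
  std::function<SharedCacheInfoSketch()> query_;
  SharedCacheInfoSketch cached_;
};
```

Holding the lock across the query is the simplest choice here; it serializes callers, which matches the stated goal of not having many threads ask for the same information at once.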

Note that a process will not report its shared cache filepath or uuid at
initial launch. As dyld gets a chance to execute a bit, it will start
returning binaries -- it will be available at the point when libraries
start loading. (it won't be available yet when the binary & dyld are the
only two binaries loaded in the process)

I tested this by disabling lldb's scan of its own shared cache
pre-execution -- only loading the system shared cache when the inferior
process reports that it is using that. I got 6-7 additional testsuite
failures running lldb like that, because no system binaries were loaded
before execution start, and the tests assumed they would be.

rdar://148939795

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
…se notes (#180299) (#180650)

We were using one token for both pushing to the llvmbot fork and for
creating a pull request against the www-releases repository. Since the
fork and the repository have different owners, we were using a classic
access token, which has very coarse-grained permissions. By using two
separate tokens, we limit the permissions to just what we need to do the
task.

This is a re-commit of b6ee085 minus
the environment changes which were causing the workflow to fail.
This adds atomicrmw `uinc_wrap` and `udec_wrap` operations support for
SPIR-V. Since SPIR-V doesn't provide dedicated instructions for those
two operations, we have to use the `AtomicExpand` pass to expand the
operations into CAS forms.

Closes #177204.
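
For readers unfamiliar with the CAS form, here is a minimal C++ sketch of what such an expansion computes, following the `uinc_wrap` and `udec_wrap` semantics in the LLVM Language Reference. It is an illustration with hypothetical function names, not the output of `AtomicExpand`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// uinc_wrap: store (old >= bound) ? 0 : old + 1, return old.
inline std::uint32_t atomicUIncWrap(std::atomic<std::uint32_t> &obj,
                                    std::uint32_t bound) {
  std::uint32_t old = obj.load();
  std::uint32_t desired;
  do {
    desired = (old >= bound) ? 0u : old + 1u;
    // On failure, compare_exchange_weak refreshes `old` with the
    // current value, so the next iteration recomputes `desired`.
  } while (!obj.compare_exchange_weak(old, desired));
  return old;
}

// udec_wrap: store (old == 0 || old > bound) ? bound : old - 1, return old.
inline std::uint32_t atomicUDecWrap(std::atomic<std::uint32_t> &obj,
                                    std::uint32_t bound) {
  std::uint32_t old = obj.load();
  std::uint32_t desired;
  do {
    desired = (old == 0u || old > bound) ? bound : old - 1u;
  } while (!obj.compare_exchange_weak(old, desired));
  return old;
}

// Usage: a counter that cycles through 0..3.
inline void wrapExample() {
  std::atomic<std::uint32_t> counter{3};
  assert(atomicUIncWrap(counter, 3) == 3); // wraps to 0
  assert(counter.load() == 0);
  assert(atomicUDecWrap(counter, 3) == 0); // wraps back to 3
  assert(counter.load() == 3);
}
```
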
The test was manually generated and out-of-sync with
`cross-project-tests/debuginfo-tests/clang_llvm_roundtrip/Inputs/simplified_template_names.cpp`.

We update the test to:

1. Automate the test generation process by using
`llvm/utils/update_test_body.py`
2. Remove host machine info when updating the tests

Predecessor of #178986, the PR was split per reviewer's request, since
that change disturbed this test a lot.
…terType (#178345)

This patch introduces `get(T)` and `set(T, Val)` functions for Waitcnt
and removes getCounterRef() and getWait(). For this to work we also need
to move InstrCounterType to AMDGPUBaseInfo.h.

Please note that the member variables are still public to keep this
patch small.
They will be replaced in the follow-up patch.
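
The accessor shape described above might look roughly like this. It is a hypothetical sketch: `CounterKind`, its enumerators, and `WaitcntSketch` are invented for illustration and do not mirror the real `InstrCounterType` or `Waitcnt` definitions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical counter kinds; the real enumerators differ.
enum CounterKind : std::size_t { CounterA, CounterB, CounterC, NumCounterKinds };

class WaitcntSketch {
public:
  // Read the count for one counter kind.
  unsigned get(CounterKind kind) const {
    assert(kind < NumCounterKinds && "invalid counter kind");
    return counts_[kind];
  }

  // Overwrite the count for one counter kind.
  void set(CounterKind kind, unsigned value) {
    assert(kind < NumCounterKinds && "invalid counter kind");
    counts_[kind] = value;
  }

private:
  // Indexed by CounterKind; zero-initialized in this sketch.
  std::array<unsigned, NumCounterKinds> counts_{};
};
```

Indexing by an enum replaces per-counter reference getters with one pair of accessors. The sketch keeps its storage private, whereas the patch notes that the real members stay public for now.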
When a type test has two phases and is used by llvm.cond.loop to
implement a conditional trap, it is more efficient for two infinite
loops to be generated. Arrange for this by having the pass detect the
typical IR pattern used for conditional CFI traps and generate the second
llvm.cond.loop if found.

Part of this RFC:
https://discourse.llvm.org/t/rfc-optimizing-conditional-traps/89456

Reviewers: fmayer, vitalybuka

Reviewed By: vitalybuka

Pull Request: llvm/llvm-project#177687
This PR is mainly to address review suggestions in #179705.
…paces (#180661)

Since #171876, -amdgpu-to-rocdl (the pass) is now set up to handle
address spaces like `#gpu.address_space<global>`. Update the tests
accordingly.
Add S16 rules for G_FSQRT. S32 and S64 are expanded by the legalizer.
…e (#180496)

It should be a tune feature just like others.
…exception-escape` (#168324)

AI usage: Gemini 3 was used for rephrasing the documentation.

Closes llvm/llvm-project#164795

---------

Co-authored-by: EugeneZelenko <eugene.zelenko@gmail.com>
Co-authored-by: Baranov Victor <bar.victor.2002@gmail.com>
jsji added 2 commits February 24, 2026 22:18
Regression caused by 0dd21ad which removed address space casts
from CreateIRTemp assuming temporaries are only used for loads/stores.

However, on SPIR targets, regcall functions (like invoke_simd helpers)
may call spir_func functions with different calling conventions. When
the callee uses sret with generic address space (ptr addrspace(4)),
the ReturnValue temporary must have the correct address space:

- invoke_simd's simd_func_call_helper uses x86_regcallcc
- It calls user functions with spir_func that expect sret with ptr addrspace(4)
- Without address space cast, ReturnValue is ptr (private AS 0)
- This causes: "Calling a function with a bad signature!"

This patch adds a targeted fix: only apply address space cast for
regcall functions on SPIR/SPIRV targets, implementing CreateIRTemp
inline to avoid modifying headers.

Fixes: sycl/test/invoke_simd/return-type-struct.cpp
Should use l32 similar to clang/test/Driver/sycl-libspirv-toolchain.cpp

"
  // Select remangled libclc variant.
  // Decide long size based on host triple, because offloading targets are going
  // to match that.
  **// All known windows environments except Cygwin use 32-bit long.**
"

This bug was introduced when resolving conflicts.

commit ed037fb
Merge: 5934308 c5cb48c
Author: Wenju He <wenju.he@intel.com>
Date:   Tue Jan 27 05:24:14 2026 +0100

    Merge from 'main' to 'sycl-web' (45 commits)

CONFLICT (content): Merge conflict in libclc/cmake/modules/AddLibclc.cmake

diff --git a/clang/test/Driver/sycl-nvptx-link.cpp b/clang/test/Driver/sycl-nvptx-link.cpp
index 9f3d0bf..b51401d 100644
--- a/clang/test/Driver/sycl-nvptx-link.cpp
+++ b/clang/test/Driver/sycl-nvptx-link.cpp
@@ -42,7 +42,7 @@

 // CHECK: llvm-link
 // CHECK-SAME: -only-needed
-// CHECK-SAME: libspirv-nvptx64-nvidia-cuda.bc
+// CHECK-SAME: remangled-l64-signed_char.libspirv.bc
 // LIBDEVICE10-SAME: libdevice.10.bc
 // LIBDEVICE30-SAME: libdevice.compute_30.10.bc
 // LIBDEVICE35-SAME: libdevice.compute_35.10.bc
@jsji jsji force-pushed the llvmspirv_pulldown branch from de4d278 to 346832c February 25, 2026 00:58
jsji commented Feb 25, 2026

This is ready for review.

Others are cherry-picks.

CGM.getTarget().getTriple().isSPIRV()) &&
CurFnInfo->getCallingConvention() == llvm::CallingConv::X86_RegCall;

if (NeedsAddrSpaceCast) {

The fix itself seems fine but is this a bug in the original upstream change?

The problem only happens with ESIMD tests here, so not sure whether upstream would consider such case.

If the ESIMD IR is valid then I would argue they have to consider such a case. If it is invalid or breaking some rule then of course they shouldn't.

Yes, I agree. We probably can try to upstream this fix later.

@premanandrao

The two CFE items that @jsji asked me to look at seem okay to me.

Labels

disable-lint Skip linter check step and proceed with build jobs
