fix(cuda_std): pack warp shuffle/match return into i64 for LLVM 19 verifier #391

Draft
brandonros wants to merge 6 commits into Rust-GPU:main from brandonros:llvm-19-fixes

@brandonros brandonros commented May 3, 2026

If #386 lands first, this PR becomes smaller. Opening it as a draft for now.

Fixes the following build failure:

Compiling cuda_std v0.2.2 (https://github.com/brandonros/Rust-CUDA.git?rev=38212ab745b6d257992e37bc0bc2b7b659bebed4#38212ab7)
  error: LLVM module verification failed for cuda_std.ee6756d2761d070-cgu.0: Attribute 'align 4' applied to incompatible type!
           %6 = call align 4 { i32, i8 } @__nvvm_warp_shuffle(i32 %mask, i32 %mode, i32 %value, i32 %b, i32 %5)
         Attribute 'align 4' applied to incompatible type!
         ptr @__nvvm_warp_shuffle
         

  error: could not compile `cuda_std` (lib) due to 1 previous error

  thread 'main' (192541) panicked at cli/build.rs:33:10:
  called `Result::unwrap()` on an `Err` value: BuildFailed
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

brandonros and others added 6 commits April 29, 2026 09:12
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Collapse the per-LLVM duplication in the build script into a single
`LlvmFlavor` struct with two const instances (LLVM7, LLVM19). One
`find_llvm_config`, `find_llvm_as`, `configure_libintrinsics`, and
`rustc_llvm_build` now drive both toolchain paths;
`required_major_llvm_version`, `find_llvm_config_llvm7`,
`find_llvm_config_llvm19`, and `find_llvm_as_llvm19` are gone.
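As a rough illustration of the refactor described above, one struct with two const instances might look like the sketch below. Only the names `LlvmFlavor`, `LLVM7`, `LLVM19`, and the release tags `llvm-7.1.0` and `llvm-19.1.7` come from this commit message; the field names and the `select_flavor` helper are hypothetical and are not taken from the actual build script.

```rust
/// Hypothetical shape of the per-toolchain description.
/// Field names are illustrative assumptions, not the build script's real fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LlvmFlavor {
    /// Human-readable name used in diagnostics.
    pub name: &'static str,
    /// Major version that a discovered `llvm-config` must report.
    pub major_version: u32,
    /// Release tag of the prebuilt archive (as named in the commit message).
    pub release_tag: &'static str,
}

pub const LLVM7: LlvmFlavor = LlvmFlavor {
    name: "LLVM 7",
    major_version: 7,
    release_tag: "llvm-7.1.0",
};

pub const LLVM19: LlvmFlavor = LlvmFlavor {
    name: "LLVM 19",
    major_version: 19,
    release_tag: "llvm-19.1.7",
};

/// Hypothetical helper: choose the flavor from whether the `llvm19`
/// feature is enabled, so shared functions can take a single `LlvmFlavor`.
pub fn select_flavor(llvm19_enabled: bool) -> LlvmFlavor {
    if llvm19_enabled {
        LLVM19
    } else {
        LLVM7
    }
}
```

With a shape like this, a shared function such as `find_llvm_config` would take the selected flavor as a parameter instead of existing once per LLVM version.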

Functional changes that fall out of the refactor:

- Prebuilt LLVM download now works for the `llvm19` feature too, gated on
  `USE_PREBUILT_LLVM=1` or as an automatic fallback when no LLVM 19
  toolchain is found locally. New `PREBUILT_LLVM_URL_LLVM19` points at
  the `llvm-19.1.7` release tag.
- Prebuilt download now supports `linux-x86_64` and `linux-aarch64` in
  addition to `windows-x86_64`. The "currently disabled because of
  segfaults" note on Linux x86_64 is gone — the prebuild repos that
  produce these archives have been refactored to fix the underlying
  issue.
- `PREBUILT_LLVM_URL_LLVM7` retagged to lowercase `llvm-7.1.0/` to match
  the new release-tag scheme used by the prebuild repos.
- `libintrinsics.bc` is no longer checked in; both LLVM versions now
  assemble `libintrinsics.ll` on the fly using the `llvm-as` that ships
  next to the resolved `llvm-config`. Removes the only remaining
  version-specific branch and means the LLVM 7 path can no longer drift
  silently when the `.ll` changes.

The LLVM 7 candidate-search behavior is also slightly stricter:
previously `LLVM_CONFIG` only had to literally start with "7" (matching
7, 70, 700...) and a mismatched env var skipped straight to download;
now major-version match is exact and PATH `llvm-config` is tried as a
fallback before downloading. `USE_PREBUILT_LLVM=1` still forces direct
download.
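The difference between the old prefix check and the exact major-version match can be sketched as follows. Both function names are hypothetical, and the sketch assumes the version string has the usual `major.minor.patch` form printed by `llvm-config --version`; the real build script may parse it differently.

```rust
/// Old behaviour as described above: anything starting with "7" passes,
/// so "70.0.0" or "700.1.0" would be accepted by mistake.
pub fn loose_match(version: &str) -> bool {
    version.trim().starts_with('7')
}

/// Stricter behaviour: parse the component before the first '.' as an
/// integer and require it to equal the expected major version.
pub fn exact_major_match(version: &str, required_major: u32) -> bool {
    version
        .trim()
        .split('.')
        .next()
        .and_then(|major| major.parse::<u32>().ok())
        == Some(required_major)
}
```

Under this sketch, `"70.0.1"` passes the loose check but fails the exact one, while `"7.1.0"` passes both.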

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rifier

LLVM 19's verifier rejects the `align N` return-attribute that rustc's
C ABI lowering attaches to calls returning small aggregates like
{ i32, i8 } (align is only valid on pointer returns). Three intrinsic
wrappers in libintrinsics.ll triggered this:
  - __nvvm_warp_shuffle
  - __nvvm_warp_match_all_32
  - __nvvm_warp_match_all_64

Switch their return type from { i32, i8 } to a packed i64 (low 32 bits
= value, bit 32 = predicate). Primitive integer return ⇒ no struct ABI
⇒ no spurious return-attribute. Uses only LLVM 1.0-era IR primitives
(zext/shl/or), so it's safe under both LLVM 7 (CUDA 12.x libnvvm) and
LLVM 19 (CUDA 13.x libnvvm). Removes the now-redundant
WarpShuffleResult struct.
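The packing scheme above (low 32 bits = value, bit 32 = predicate) can be sketched on the Rust side as below. The function names are hypothetical and `u64` is used here for clarity; the actual wrappers in `cuda_std` and the exact signatures in `libintrinsics.ll` may differ.

```rust
/// Bit position of the predicate flag in the packed return value.
const PREDICATE_BIT: u32 = 32;

/// Pack a 32-bit value and a predicate into one 64-bit integer,
/// mirroring the zext/shl/or sequence described in the commit message.
pub fn pack_warp_result(value: u32, predicate: bool) -> u64 {
    (value as u64) | ((predicate as u64) << PREDICATE_BIT)
}

/// Split a packed result back into `(value, predicate)`.
pub fn unpack_warp_result(packed: u64) -> (u32, bool) {
    // Truncating to u32 keeps only the low 32 bits (the value).
    let value = packed as u32;
    // Bit 32 carries the predicate; higher bits are ignored.
    let predicate = (packed >> PREDICATE_BIT) & 1 != 0;
    (value, predicate)
}
```

Because the return type is a plain integer rather than a two-field struct, no aggregate return value is involved, which is what avoids the `align` return attribute that the verifier rejects.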

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>