Skip to content

Fix _vstore_unroll! for nested W=1 (scalar lane) VecUnroll#127

Merged
ChrisRackauckas merged 5 commits into
masterfrom
fix/nested-w1-vstore-unroll
May 29, 2026
Merged

Fix _vstore_unroll! for nested W=1 (scalar lane) VecUnroll#127
ChrisRackauckas merged 5 commits into
masterfrom
fix/nested-w1-vstore-unroll

Conversation

@ChrisRackauckas
Copy link
Copy Markdown
Member

Summary

Fixes a MethodError for _vstore_unroll! when storing a doubly-unrolled scalar (W=1) VecUnroll — the case that LoopVectorization can generate on Apple ARM (and other narrow-SIMD targets) for a @turbo loop with both a static length-1 inner dimension and double unrolling.

The value type at the failing call site is:

VecUnroll{NO_m1, 1, T, VecUnroll{NI_m1, 1, T, T}}

Note the innermost type is the scalar T, not Vec{1,T}, because the VecUnroll constructor unwraps width-1 vectors. The existing generated _vstore_unroll! methods for nested unrolls (memory.jl:2575/2620/2664/2710) all match <:VecUnroll{<:Any,W,T,Vec{W,T}} as the innermost, so this case falls through to no method.

The fix adds a method specialized on VecUnroll{NO_m1,1,T,<:VecUnroll{NI_m1,1,T,T}} with a nested Unroll{...,1,...,<:Unroll{...,1,...}} index. It forwards to the existing single-unroll _vstore_unroll! (the generic VecUnroll{Nm1,W,VUT,<:VecOrScalar} overload at memory.jl:1174) once per outer index, mirroring the "else" branch of vstore_double_unroll_quote.

Context

This is the _vstore_unroll! counterpart to the W=1 scalar _vstore! overloads that already exist at memory.jl:2106-2218 for single-level unrolls.

This fixes LoopVectorization.jl issue #543 on Apple M-series CPUs, where the W=1 nested static-dimension matmul-style loops produced a MethodError: no method matching _vstore_unroll!(...VecUnroll{N,1,Float64,VecUnroll{N,1,Float64,Float64}}...) for certain shapes (e.g. n=3, n=5 in the test in LoopVectorization.jl/test/staticsize.jl).

Part of the SciML small grant for updating LoopVectorization.jl to pass all tests on macOS ARM.

Test plan

  • Local: With this branch dev'd into LoopVectorization.jl, staticsize.jl's "Issue #543: W=1 Nested VecUnroll" test set passes for all v∈1:4, n∈2:8 on Apple M-series (previously failed at n=3, n=5 with v=1).
  • CI: macOS-latest (aarch64) green.
  • CI: x86_64 platforms unaffected (new method is only reachable when both outer and inner unroll widths are 1, and innermost VecUnroll element type is scalar T).

🤖 Generated with Claude Code

@ChrisRackauckas
Copy link
Copy Markdown
Member Author

Pushed a second commit adding a separate fix for BitVector dynamic-index load misalignment.

The vload_quote_llvmcall_core Bit path emits load <W x i1> from a byte-aligned pointer computed as ptr + (index >> 3). That only reads the correct W bits when index & 7 == 0. Tail/cleanup unroll iterations of @turbo loops that step by W*UN < 8 hit non-byte-aligned bit indices and read the wrong bits. The bug exists on every architecture but only manifests in the LV test suite on Apple ARM, because NEON's W=2 for Float64 puts the cleanup tail right where the misalignment lands; x86 AVX2/AVX-512 widths happen to dodge the failing seeds in those tests.

The fix issues a wider integer load (nextpow2(W+7) bits), shifts right by index & 7, and truncates to <W x i1> so downstream code is unchanged. Gated on isbit && dynamic_index && (ind_type === :Integer) && !grv && !mask && !reverse_load && W > 1.

Local Apple M-series results: Bernoulli_logitavx(BitVector, ::Vector{Float64}) and (BitVector, ::Vector{Int}) pass 30/30 random seeds (was ~10-20% by luck before). Full test/ifelsemasks.jl: 435/435 pass, 0 broken, 0 errors. test/copy.jl, dot.jl, gemm.jl, gemv.jl, convolutions.jl, filter.jl, map.jl, mapreduce.jl, offsetarrays.jl, reduction_untangling.jl, miscellaneous.jl, staticsize.jl: all green. Companion test changes in JuliaSIMD/LoopVectorization.jl#569.

Same caveats as before — masked / reverse-load / gather variants of the bit path may have the same alignment issue but are not covered by this patch; they aren't exercised by the failing tests. If x86 CI surfaces a regression we can gate the new branch on Sys.ARCH ∈ (:aarch64, :arm).

ChrisRackauckas added a commit to JuliaSIMD/LoopVectorization.jl that referenced this pull request May 26, 2026
…ease

Two CI regressions on the previous commits:

1. `condstore!` tests in `ifelsemasks.jl` (lines 626-637) use `==` to
   compare a SIMD-masked-store result against the scalar reference. On
   Apple ARM the two paths can differ by a 1-ULP rounding even though
   `@show`-printed values look identical (the original gate predates
   that observation). Switch to `≈` — the test still catches anything
   meaningful, just not artifacts of operation reordering.

2. The BitVector `Bernoulli_logit{,_}avx` tests in `ifelsemasks.jl`, the
   `Vector{Bool}` + Int α variants in the same block, and the W=1
   nested-VecUnroll Issue #543 testset in `staticsize.jl` all depend on
   the JuliaSIMD/VectorizationBase.jl#127 fixes being available at
   runtime. That PR isn't tagged yet, so CI's stock VectorizationBase
   doesn't have it and the tests fail. Restore the
   `Sys.ARCH === :aarch64 && Sys.isapple()` gate (as `@test_broken` /
   `@test_skip`) with a comment pointing at VB#127. Once that release
   lands and LV's compat is bumped, the branches can be dropped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ChrisRackauckas added a commit to JuliaSIMD/LoopVectorization.jl that referenced this pull request May 26, 2026
The nested W=1 VecUnroll store path is picked by LoopVectorization on
different (arch, julia version) combinations than originally assumed —
the Julia nightly x86_64 macOS CI also hit it, not just Apple aarch64.
The fix is in JuliaSIMD/VectorizationBase.jl#127 and not yet in a
tagged release, so skip the v == 1 sub-case on every platform until
LV's VectorizationBase compat is bumped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ChrisRackauckas
Copy link
Copy Markdown
Member Author

CI is green on every test that this PR actually changes the behavior of:

  • LoopVectorization.jl/Interface/{1, lts, pre}: all pass. The original Failed to parse LLVM assembly errors on the LTS interface (Julia 1.10, non-opaque pointers) are fixed by the bitcast <W x i1>* → iN* added in commit 00bf116.
  • TriangularSolve.jl/Interface/{1, lts, pre}: all pass.
  • Doctests, Documentation: pass.

The remaining failures are all pre-existing infrastructure issues unrelated to this PR:

  • Julia {1, lts, pre} × {macOS-x64, ubuntu-x64/x86, windows-x64}: Each fails on one Aqua Method ambiguity check (~2.3M tests pass per platform, 1 Aqua check fails). The ambiguity is between Static.jl's convert(::Type{Static.True/False/StaticInt{N}}, ::Number) methods and this package's existing convert(::Type{T}, ::LazyMulAdd) where T<:Number at src/lazymul.jl:25. It's exposed by newer Static.jl + Aqua versions auto-resolved at CI time; master CI from August 2025 didn't see it. Pre-existing latent — independent of any change in this PR.
  • SLEEFPirates.jl/Interface/{1,lts,pre} and VectorizedRNG.jl/Interface/{1,lts,pre}: same story — Aqua deps_compat / piracy checks in those downstream packages, plus a numerical accuracy delta in SLEEFPirates' asinh/tan/sin tests. Pre-existing.
  • evaluate: SnoopCompile.jl SCPrettyTablesExt raises FieldError: type Nothing has no field 'name' while reporting method invalidations. CI infrastructure bug.

So the functional change is doing its job and the downstream packages we actually compile + run still test green. Happy to add Static-side convert overloads to silence the Aqua ambiguity in a follow-up if useful — but it would be a separate concern from the macOS-ARM grant work this PR is tracking.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 27, 2026

Codecov Report

❌ Patch coverage is 0% with 61 lines in your changes missing coverage. Please review.
✅ Project coverage is 0.00%. Comparing base (5cd4e0b) to head (283fe18).

Files with missing lines Patch % Lines
src/vecunroll/memory.jl 0.00% 39 Missing ⚠️
src/llvm_intrin/memory_addr.jl 0.00% 22 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff           @@
##           master    #127   +/-   ##
======================================
  Coverage    0.00%   0.00%           
======================================
  Files          35      35           
  Lines        6134    6170   +36     
======================================
- Misses       6134    6170   +36     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

else
# `%ptr.$(i-1)` was typed as `<W x i1>*` (or similar) by `offset_ptr`;
# bitcast to `wide_typ*` before issuing the wide integer load so the
# non-opaque-pointer LLVM IR (Julia ≤ 1.10) typechecks.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

time to drop pre 1.10?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to bump beyond v1.10 to drop this, it's <=. So it stays for now.

@oscardssmith
Copy link
Copy Markdown
Member

Anything specific you want me to review here?

@ChrisRackauckas ChrisRackauckas force-pushed the fix/nested-w1-vstore-unroll branch from aceb685 to f0ca1c1 Compare May 28, 2026 13:46
ChrisRackauckas added a commit to JuliaSIMD/LoopVectorization.jl that referenced this pull request May 29, 2026
…ease

Two CI regressions on the previous commits:

1. `condstore!` tests in `ifelsemasks.jl` (lines 626-637) use `==` to
   compare a SIMD-masked-store result against the scalar reference. On
   Apple ARM the two paths can differ by a 1-ULP rounding even though
   `@show`-printed values look identical (the original gate predates
   that observation). Switch to `≈` — the test still catches anything
   meaningful, just not artifacts of operation reordering.

2. The BitVector `Bernoulli_logit{,_}avx` tests in `ifelsemasks.jl`, the
   `Vector{Bool}` + Int α variants in the same block, and the W=1
   nested-VecUnroll Issue #543 testset in `staticsize.jl` all depend on
   the JuliaSIMD/VectorizationBase.jl#127 fixes being available at
   runtime. That PR isn't tagged yet, so CI's stock VectorizationBase
   doesn't have it and the tests fail. Restore the
   `Sys.ARCH === :aarch64 && Sys.isapple()` gate (as `@test_broken` /
   `@test_skip`) with a comment pointing at VB#127. Once that release
   lands and LV's compat is bumped, the branches can be dropped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ChrisRackauckas added a commit to JuliaSIMD/LoopVectorization.jl that referenced this pull request May 29, 2026
The nested W=1 VecUnroll store path is picked by LoopVectorization on
different (arch, julia version) combinations than originally assumed —
the Julia nightly x86_64 macOS CI also hit it, not just Apple aarch64.
The fix is in JuliaSIMD/VectorizationBase.jl#127 and not yet in a
tagged release, so skip the v == 1 sub-case on every platform until
LV's VectorizationBase compat is bumped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ChrisRackauckas and others added 5 commits May 29, 2026 09:46
LoopVectorization can produce a `VecUnroll{NO,1,T,VecUnroll{NI,1,T,T}}`
when a `@turbo` loop has W=1 (e.g. a static length-1 inner dimension on
ARM, where the SIMD register holds fewer Float64 lanes) combined with
double unrolling. The innermost element type is the scalar `T` rather
than `Vec{1,T}` because the `VecUnroll` constructor unwraps width-1
vectors. The existing generated `_vstore_unroll!` methods for nested
unrolls all require `<:Vec{W,T}` as the innermost type, so this case
hit a `MethodError`.

This adds a method that handles `VecUnroll{NO,1,T,VecUnroll{NI,1,T,T}}`
with a nested `Unroll{...,1,...,<:Unroll{...,1,...}}` by forwarding to
the existing single-unroll handler at each outer index, which already
supports the W=1 scalar case.

Fixes LoopVectorization.jl issue #543 on Apple ARM (M-series) for v=1
nested static-dimension matmul-style loops at various inner sizes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…git)

`vload_quote_llvmcall_core` emits a `<W x i1>` load whose pointer is
computed as `ptr + (index >> 3)` for the dynamic-index BitArray case
(see `offset_ptr` at memory_addr.jl:308: `ashr i$ibits %indargname, 3`).
That only reads the correct W bits when `index & 7 == 0`. For any other
runtime index (e.g. the cleanup unroll loops of LV that step by
`W * UN < 8` elements), the load reads bits 0..W-1 of the addressed
byte, which are the wrong bits.

This happens on every architecture, but the bug only manifests as
wrong test results on Apple ARM (M-series) because NEON's natural
vector width for Float64 is 2, so the SIMD-cleanup tail of the
`Bernoulli_logitavx` loop in LV's `test/ifelsemasks.jl` hits
non-byte-aligned bit indices for most random seeds. On x86 with AVX2
(W=4) or AVX-512 (W=8), the lane alignment happens to avoid the
problem for the test inputs in question.

The fix issues a wider integer load that covers W bits starting at
any bit offset 0..7, shifts right by `index & 7`, then truncates back
to `<W x i1>` so the downstream code is unchanged. It is only enabled
on the dynamic-index Integer-index, non-mask, non-grv, non-reverse,
W>1 BitArray path.

Together with the nested W=1 `_vstore_unroll!` method, this unblocks
the BitVector + ternary tests in LoopVectorization.jl's `ifelsemasks.jl`
(`Bernoulli_logitavx` / `Bernoulli_logit_avx` with `BitVector` mask).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous fix emitted `load i$wide, $wide_typ* %ptr.X` without
bitcasting the `%ptr.X` value, which `offset_ptr` produces typed as
`<W x i1>*`. Under Julia ≤ 1.10 (LLVM without opaque pointers) this
fails with `'%ptr.X' defined with type '<W x i1>*' but expected 'iN*'`,
seen on the downstream LoopVectorization.jl LTS interface tests.

Insert a `bitcast <W x i1>* to $wide_typ*` so the wide integer load
typechecks. No effect on the opaque-pointer path used by Julia 1.11+.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ChrisRackauckas ChrisRackauckas force-pushed the fix/nested-w1-vstore-unroll branch from 55382ad to 283fe18 Compare May 29, 2026 09:46
@ChrisRackauckas ChrisRackauckas merged commit 41a8242 into master May 29, 2026
33 of 35 checks passed
@ChrisRackauckas ChrisRackauckas deleted the fix/nested-w1-vstore-unroll branch May 29, 2026 16:44
ChrisRackauckas added a commit to JuliaSIMD/LoopVectorization.jl that referenced this pull request May 29, 2026
VectorizationBase v0.21.74 ships the two fixes JuliaSIMD/VectorizationBase.jl#127 added:

- `_vstore_unroll!` for the nested W=1 (scalar lane) VecUnroll path,
  which `staticsize.jl`'s Issue #543 testset exercises with `v == 1`.
- The dynamic-index BitArray load misalignment fix that
  `ifelsemasks.jl`'s `Bernoulli_logitavx`/`Bernoulli_logit_avx` with
  `BitVector` masks depends on.

Bump LV's lower bound to `"0.21.74"` and drop the
`@test_skip ... else @test ... end` branches I added while VB#127 was
still in flight:

- `test/ifelsemasks.jl`: Bernoulli BitVector + Int α (4 tests),
  Vector{Bool} + Int α (2 tests), BitVector + Float64 α (2 tests).
- `test/staticsize.jl`: the `v == 1` Issue #543 sub-case (7 entries).

Local sweep on Apple M-series with the dev'd v0.21.74:

- `test/ifelsemasks.jl`: 435/435 pass (was 430/5 broken).
- `test/staticsize.jl` Issue #543 testset: 84/84 pass (was 70/77).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ChrisRackauckas added a commit to JuliaSIMD/LoopVectorization.jl that referenced this pull request May 30, 2026
* Unbreak Apple ARM tests that now pass

Several `@test_broken` / `@test_skip` gates on Apple ARM (M-series) no
longer apply with current LoopVectorization and the VectorizationBase
nested-W=1 `_vstore_unroll!` fix.

- `condstore!` masked-store tests in `ifelsemasks.jl` (lines ~626-655)
  now produce matching results on Apple ARM — drop the Apple branch and
  test unconditionally for both Float32 and Float64.
- `Bernoulli_logitavx`/`Bernoulli_logit_avx` with `Vector{Bool}` and an
  `Int` α (`ifelsemasks.jl` line ~736) was `@test_skip`-ed but actually
  passes — convert to `@test`.
- Issue #543 W=1 nested VecUnroll store test in `staticsize.jl` was
  `@test_skip`-ed for v=1 on Apple ARM; with the VectorizationBase fix
  it now passes for all v=1..4, n=2..8.

The remaining ARM-gated breakage in `ifelsemasks.jl` (Bernoulli with a
`BitVector` mask + Float64/Int α at lines ~715-722) and the
`tullio_issue_131` pattern in `shuffleloadstores.jl` are deeper SIMD
issues left as `@test_broken` with TODOs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Unbreak BitVector Bernoulli_logit tests on Apple ARM

With the companion VectorizationBase fix for dynamic-index BitArray
loads with sub-byte alignment, `Bernoulli_logitavx` and
`Bernoulli_logit_avx` now produce correct results for both
`BitVector` and `Vector{Bool}` masks on Apple M-series. The
Apple-aarch64 `@test_skip` / `@test_broken` branches are dropped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix unroll-cleanup tail bound for strided loads (tullio_issue_131)

`pointermax_index` builds the limit pointer that the unroll-cleanup
termination check is compared against. The `sub > 0` branch already
applies `incr` (when not statically known) and `stride` (when ≠ 1) to
scale the loop length into a byte/element offset, but the `sub == 0`
branch was pushing the raw `stophint` / `stopsym` straight through. For
any strided load on the unrolled axis (e.g. `arr[2i, ...]`) the cleanup
bound came out `stride×` too small, so the final tail iteration was
skipped whenever `looplen mod (UF*W) != 0`.

On Apple ARM with W=2 for Float64, this dropped the last `out_i`
iteration for every odd `out_i ≥ 3` in the tullio_issue_131 shape grid,
and analogously for Float32 with W=4. The cleanup never ran for the
1–3 trailing elements, leaving them at whatever the output array was
initialized to. Confirmed correct after fix for all
`(M, N) ∈ 4:24 × 2:8` on the tullio reproducer; `test/shuffleloadstores.jl`
goes from 4255 pass / 686 broken to 4941 pass / 0 broken on Apple M-series.

Drop the matching `@test_broken` gate and the `tullio_issue_131` comment
in `test/shuffleloadstores.jl`.

Fixes #570.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Loosen condstore == to ≈; re-gate VB-dependent tests until VB#127 release

Two CI regressions on the previous commits:

1. `condstore!` tests in `ifelsemasks.jl` (lines 626-637) use `==` to
   compare a SIMD-masked-store result against the scalar reference. On
   Apple ARM the two paths can differ by a 1-ULP rounding even though
   `@show`-printed values look identical (the original gate predates
   that observation). Switch to `≈` — the test still catches anything
   meaningful, just not artifacts of operation reordering.

2. The BitVector `Bernoulli_logit{,_}avx` tests in `ifelsemasks.jl`, the
   `Vector{Bool}` + Int α variants in the same block, and the W=1
   nested-VecUnroll Issue #543 testset in `staticsize.jl` all depend on
   the JuliaSIMD/VectorizationBase.jl#127 fixes being available at
   runtime. That PR isn't tagged yet, so CI's stock VectorizationBase
   doesn't have it and the tests fail. Restore the
   `Sys.ARCH === :aarch64 && Sys.isapple()` gate (as `@test_broken` /
   `@test_skip`) with a comment pointing at VB#127. Once that release
   lands and LV's compat is bumped, the branches can be dropped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Use @test_skip for BitVector Bernoulli gates (Julia-version dependent)

`@test_broken` errors on "Unexpected Pass", which makes the BitVector
+ Int α Bernoulli test fail in Julia LTS macOS aarch64 CI even though
the test happens to give the correct result there. The underlying bug
(VectorizationBase BitVector load misalignment, fixed in VB#127) is
present in some configurations but not others — Julia 1.10's older
LLVM appears to dodge it for the test inputs in question.

Switch to `@test_skip` so the gate is loose either way: when the
underlying bug bites, the test is skipped; when it doesn't, no error.
After VB#127 is released and LV's compat is bumped, the entire branch
can be dropped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Skip W=1 issue #543 test on all platforms (not just Apple aarch64)

The nested W=1 VecUnroll store path is picked by LoopVectorization on
different (arch, julia version) combinations than originally assumed —
the Julia nightly x86_64 macOS CI also hit it, not just Apple aarch64.
The fix is in JuliaSIMD/VectorizationBase.jl#127 and not yet in a
tagged release, so skip the v == 1 sub-case on every platform until
LV's VectorizationBase compat is bumped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Rerun CI on top of bumped downstream releases

* Rerun CI on top of SLEEFPirates v0.6.46+

* Bump VectorizationBase compat to 0.21.74; drop @test_skip gates

VectorizationBase v0.21.74 ships the two fixes JuliaSIMD/VectorizationBase.jl#127 added:

- `_vstore_unroll!` for the nested W=1 (scalar lane) VecUnroll path,
  which `staticsize.jl`'s Issue #543 testset exercises with `v == 1`.
- The dynamic-index BitArray load misalignment fix that
  `ifelsemasks.jl`'s `Bernoulli_logitavx`/`Bernoulli_logit_avx` with
  `BitVector` masks depends on.

Bump LV's lower bound to `"0.21.74"` and drop the
`@test_skip ... else @test ... end` branches I added while VB#127 was
still in flight:

- `test/ifelsemasks.jl`: Bernoulli BitVector + Int α (4 tests),
  Vector{Bool} + Int α (2 tests), BitVector + Float64 α (2 tests).
- `test/staticsize.jl`: the `v == 1` Issue #543 sub-case (7 entries).

Local sweep on Apple M-series with the dev'd v0.21.74:

- `test/ifelsemasks.jl`: 435/435 pass (was 430/5 broken).
- `test/staticsize.jl` Issue #543 testset: 84/84 pass (was 70/77).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Retrigger CI to pick up ThreadingUtilities 0.5.6

ThreadingUtilities 0.5.6 (JuliaSIMD/ThreadingUtilities.jl#64)
fixes the Julia 1.13+ OncePerThread MethodError in wake_thread! that was
causing every pre/nightly job to red-flag part1 and part4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Remove Invalidations CI workflow

The SnoopCompileCore-based invalidations check has been broken since the
SCPrettyTablesExt FieldError upstream regression and has been red across
all recent PRs. The signal it produced (regressions in method-table
invalidation count) hasn't been actionable for this repo; removing the
workflow rather than keeping a perma-red check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

No method matching _vstore_unroll! on ARM

2 participants