@hazzlim hazzlim commented Feb 3, 2026

This PR adds a Neon implementation of adjacent_find ⏭️ 🕵️ 🚀

Performance numbers (values are speedup figures relative to existing code; values greater than 1 indicate that the new code is faster):

| Benchmark | MSVC Speedup | Clang Speedup |
|---|---|---|
| `bm<AlgType::Std, int8_t>/2525/1142` | 16.97 | 10.455 |
| `bm<AlgType::Std, int16_t>/2525/1142` | 9.773 | 5.75 |
| `bm<AlgType::Std, int32_t>/2525/1142` | 5.06 | 3.373 |
| `bm<AlgType::Std, int64_t>/2525/1142` | 3.077 | 1.793 |
| `bm<AlgType::Rng, int8_t>/2525/1142` | 10.222 | 10.719 |
| `bm<AlgType::Rng, int16_t>/2525/1142` | 5.619 | 5.5 |
| `bm<AlgType::Rng, int32_t>/2525/1142` | 3.289 | 3.439 |
| `bm<AlgType::Rng, int64_t>/2525/1142` | 1.755 | 1.875 |
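For readers unfamiliar with how `adjacent_find` can be vectorized, here is a minimal sketch of the general idea for 8-bit elements. It is not the code from this PR: the function name `adjacent_find_u8_sketch`, the `SKETCH_HAS_NEON` macro, and the "detect a match in the block, then locate it with a scalar scan" strategy are all assumptions made for illustration. The sketch loads 16 elements and the 16 elements shifted by one position, compares them lane by lane, and only drops to scalar code once a block reports a match. On non-ARM64 targets it compiles to the plain scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

#if defined(__aarch64__) || defined(_M_ARM64)
#include <arm_neon.h>
#define SKETCH_HAS_NEON 1
#else
#define SKETCH_HAS_NEON 0
#endif

// Hypothetical helper (not the PR's implementation): returns the index of the
// first element that equals its successor, or n if there is no such pair.
inline std::size_t adjacent_find_u8_sketch(const std::uint8_t* data, std::size_t n) {
    std::size_t i = 0;
#if SKETCH_HAS_NEON
    // Each iteration tests the 16 pairs (i, i+1) .. (i+15, i+16),
    // so it reads data[i] through data[i+16] and needs i + 17 <= n.
    while (i + 17 <= n) {
        const uint8x16_t cur  = vld1q_u8(data + i);
        const uint8x16_t next = vld1q_u8(data + i + 1);
        const uint8x16_t eq   = vceqq_u8(cur, next); // 0xFF in lanes where cur == next
        if (vmaxvq_u8(eq) != 0) {
            break; // a match is somewhere in this block; the scalar loop finds it
        }
        i += 16;
    }
#endif
    // Scalar loop: handles the tail, the block that contained a match,
    // and the whole range when Neon is unavailable.
    for (; i + 1 < n; ++i) {
        if (data[i] == data[i + 1]) {
            return i;
        }
    }
    return n;
}

// Small usage example with strictly increasing data and one planted pair.
inline void adjacent_find_u8_sketch_demo() {
    std::vector<std::uint8_t> v(40);
    for (std::size_t k = 0; k < v.size(); ++k) {
        v[k] = static_cast<std::uint8_t>(k);
    }
    assert(adjacent_find_u8_sketch(v.data(), v.size()) == v.size()); // no equal neighbours
    v[21] = 20; // now v[20] == v[21]
    assert(adjacent_find_u8_sketch(v.data(), v.size()) == 20);
}
```

A production implementation would more likely turn the comparison result into a bitmask and compute the match position directly instead of rescanning the block. With 128-bit vectors, narrower element types fit more lanes per compare (16 for `int8_t` versus 2 for `int64_t`), which is one plausible reason for the larger speedups in the narrower rows of the table.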

@hazzlim hazzlim requested a review from a team as a code owner February 3, 2026 10:12
@github-project-automation github-project-automation bot moved this to Initial Review in STL Code Reviews Feb 3, 2026
@StephanTLavavej StephanTLavavej self-assigned this Feb 3, 2026
@StephanTLavavej StephanTLavavej added the "performance" (Must go faster) and "ARM64" (Related to the ARM64 architecture) labels Feb 3, 2026