@Alexhuszagh
Owner

This applies all upstream patches (bug fixes, etc.) that can be added to our repo: that is, those without the const changes and those with external dependencies.

VoidStarKat and others added 13 commits September 21, 2025 10:39
Fix bug #126: handle NaNs correctly in min() and max()
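
For readers unfamiliar with the issue class, here is a minimal sketch of NaN-aware `min`/`max`. It is not the code from the patch: it uses `f32` rather than the crate's half-precision types, the helper names `nan_aware_min` and `nan_aware_max` are hypothetical, and it assumes the same convention as `f32::min`/`f32::max` in the Rust standard library (if exactly one operand is NaN, return the other).

```rust
/// Hypothetical helper (not from the patch): returns the smaller operand.
/// If exactly one operand is NaN, the other operand is returned;
/// if both are NaN, the result is NaN.
fn nan_aware_min(a: f32, b: f32) -> f32 {
    if a.is_nan() {
        b
    } else if b.is_nan() {
        a
    } else if b < a {
        b
    } else {
        a
    }
}

/// Hypothetical helper (not from the patch): returns the larger operand,
/// with the same NaN convention as `nan_aware_min`.
fn nan_aware_max(a: f32, b: f32) -> f32 {
    if a.is_nan() {
        b
    } else if b.is_nan() {
        a
    } else if b > a {
        b
    } else {
        a
    }
}
```

A naive `if a < b { a } else { b }` gives an order-dependent result when one operand is NaN, because every comparison with NaN is false. The explicit `is_nan` checks avoid that.
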
LoongArch is a RISC instruction set architecture and currently a Tier-2
(with host-tools) target [^1] in the Rust upstream community.

This patch introduces FP16 conversion functions based on the LoongArch
SIMD extension to improve performance.

Benchmarks:

```
HalfFloatSliceExt::convert_from_f32_slice/constants
                        time:   [10.816 ns 10.823 ns 10.831 ns]
                        change: [-63.769% -63.728% -63.693%] (p = 0.00 < 0.05)
                        Performance has improved.

HalfFloatSliceExt::convert_from_f32_slice/large
                        time:   [137.68 ns 137.77 ns 137.88 ns]
                        change: [-94.847% -94.841% -94.834%] (p = 0.00 < 0.05)
                        Performance has improved.

HalfFloatSliceExt::convert_from_f64_slice/constants
                        time:   [12.656 ns 12.669 ns 12.684 ns]
                        change: [-78.455% -78.418% -78.367%] (p = 0.00 < 0.05)
                        Performance has improved.

HalfFloatSliceExt::convert_from_f64_slice/large
                        time:   [544.15 ns 544.49 ns 544.91 ns]
                        change: [-89.799% -89.791% -89.781%] (p = 0.00 < 0.05)
                        Performance has improved.

HalfFloatSliceExt::convert_to_f32_slice/constants
                        time:   [6.0412 ns 6.0442 ns 6.0482 ns]
                        change: [-74.100% -74.068% -74.042%] (p = 0.00 < 0.05)
                        Performance has improved.

HalfFloatSliceExt::convert_to_f32_slice/large
                        time:   [512.78 ns 513.08 ns 513.45 ns]
                        change: [-77.628% -77.526% -77.422%] (p = 0.00 < 0.05)
                        Performance has improved.

HalfFloatSliceExt::convert_to_f64_slice/constants
                        time:   [10.779 ns 10.784 ns 10.792 ns]
                        change: [-49.028% -48.922% -48.813%] (p = 0.00 < 0.05)
                        Performance has improved.

HalfFloatSliceExt::convert_to_f64_slice/large
                        time:   [923.19 ns 923.77 ns 924.50 ns]
                        change: [-80.876% -80.862% -80.849%] (p = 0.00 < 0.05)
                        Performance has improved.
```

[^1]: https://doc.rust-lang.org/stable/rustc/platform-support/loongarch-linux.html
LoongArch64 FP16 hardware support
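
To illustrate what the benchmarked slice conversions compute, here is a scalar sketch of decoding IEEE 754 binary16 bit patterns into `f32`. It is not the LoongArch SIMD code from the patch. The function names `f16_bits_to_f32` and `convert_bits_to_f32_slice` are hypothetical, and the input is assumed to be raw `u16` bit patterns rather than the crate's own half-precision type.

```rust
/// Hypothetical scalar decoder (not from the patch): converts one
/// IEEE 754 binary16 bit pattern into the equivalent f32 value.
fn f16_bits_to_f32(bits: u16) -> f32 {
    let sign = ((bits & 0x8000) as u32) << 16; // sign moved to bit 31
    let exp = ((bits >> 10) & 0x1F) as u32; // 5-bit biased exponent
    let man = (bits & 0x03FF) as u32; // 10-bit mantissa

    let out = if exp == 0 {
        // Zero or subnormal: the value is man * 2^-24, which f32 holds exactly.
        // 0x3380_0000 is the bit pattern of 2^-24.
        let magnitude = man as f32 * f32::from_bits(0x3380_0000);
        sign | magnitude.to_bits()
    } else if exp == 0x1F {
        // Infinity (man == 0) or NaN (man != 0); the payload is preserved.
        sign | 0x7F80_0000 | (man << 13)
    } else {
        // Normal number: rebias the exponent from 15 to 127 and widen the mantissa.
        sign | ((exp + 112) << 23) | (man << 13)
    };
    f32::from_bits(out)
}

/// Hypothetical slice wrapper: converts element by element.
/// Panics if the two slices differ in length.
fn convert_bits_to_f32_slice(src: &[u16], dst: &mut [f32]) {
    assert_eq!(src.len(), dst.len(), "source and destination lengths must match");
    for (d, &s) in dst.iter_mut().zip(src) {
        *d = f16_bits_to_f32(s);
    }
}
```

A hardware path such as the one this commit describes would replace the per-element loop with vector conversion instructions. The scalar version serves only as a reference for the expected results.
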
@Alexhuszagh Alexhuszagh self-assigned this Sep 21, 2025
@Alexhuszagh Alexhuszagh added the upstream-patches Add patches from upstream. label Sep 21, 2025
@Alexhuszagh Alexhuszagh merged commit c2e254e into main Sep 21, 2025
63 of 65 checks passed
@Alexhuszagh Alexhuszagh deleted the rebase branch September 21, 2025 16:34