perf: Improve Huffman leaf sorting with bucket sort #244
Merged
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Replaces the generic `qsort` with a specialized bucket sort and insertion sort hybrid for `pm_leaf_t` arrays. The previous `qsort`'s indirect comparator calls were a performance bottleneck. This custom implementation exploits the natural clustering of frequency weights, leading to short intra-bucket lists where insertion sort is branch-friendly and significantly faster.
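The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `leaf_t` struct is a hypothetical stand-in for `pm_leaf_t` (whose real layout is not shown here), and bucketing by the bit length of the weight is an assumed scheme chosen only because it groups similar weights cheaply. The names `bucket_of`, `leaf_less`, and `sort_leaves` are likewise invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pm_leaf_t; the real fields may differ. */
typedef struct {
    uint32_t weight; /* symbol frequency */
    uint16_t symbol; /* symbol value, used here as a tie-breaker */
} leaf_t;

/* Assumed bucketing: bucket 0 holds weight 0, bucket k holds weights
 * whose bit length is k (1..32). */
enum { NUM_BUCKETS = 33 };

static unsigned bucket_of(uint32_t w) {
    unsigned b = 0;
    while (w) { b++; w >>= 1; }
    return b;
}

/* Inline comparison: no indirect call through a function pointer. */
static int leaf_less(const leaf_t *a, const leaf_t *b) {
    if (a->weight != b->weight) return a->weight < b->weight;
    return a->symbol < b->symbol;
}

/* Sorts src[0..n) ascending by (weight, symbol) into dst[0..n). */
static void sort_leaves(const leaf_t *src, leaf_t *dst, size_t n) {
    size_t start[NUM_BUCKETS + 1] = {0};
    size_t next[NUM_BUCKETS];

    /* Pass 1: histogram of bucket sizes, shifted by one slot. */
    for (size_t i = 0; i < n; i++) start[bucket_of(src[i].weight) + 1]++;

    /* Prefix sum: start[b] becomes the first index of bucket b. */
    for (unsigned b = 0; b < NUM_BUCKETS; b++) start[b + 1] += start[b];

    /* Pass 2: scatter each leaf into its bucket's range. */
    for (unsigned b = 0; b < NUM_BUCKETS; b++) next[b] = start[b];
    for (size_t i = 0; i < n; i++) dst[next[bucket_of(src[i].weight)]++] = src[i];

    /* Pass 3: insertion sort inside each (typically short) bucket. */
    for (unsigned b = 0; b < NUM_BUCKETS; b++) {
        for (size_t i = start[b] + 1; i < start[b + 1]; i++) {
            leaf_t key = dst[i];
            size_t j = i;
            while (j > start[b] && leaf_less(&key, &dst[j - 1])) {
                dst[j] = dst[j - 1];
                j--;
            }
            dst[j] = key;
        }
    }
}
```

Because buckets are laid out in increasing weight order, sorting each bucket independently yields a fully sorted array. The insertion sort cost is quadratic only in the bucket size, which is why the scheme relies on weights spreading across buckets rather than piling into one.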
Removes the `ZXC_QSORT` macro, which previously abstracted the standard library `qsort`. The specialized bucket sort for Huffman leaf arrays supersedes it, so the abstraction is no longer needed.
Enhances the NEON64 path within `zxc_lz77_find_best_match` to process 32 bytes per iteration. This is achieved by performing two 16-byte vector comparisons within the loop, reducing overhead and improving the speed of extending LZ77 matches. The NEON32 logic is also explicitly separated into its own block.
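A rough sketch of the 32-bytes-per-iteration idea follows. It is not the code from `zxc_lz77_find_best_match`: the function name `match_length` and its signature are hypothetical, and the scalar tail that pinpoints the first mismatching byte is a simplification picked for clarity. The NEON block is compiled only on AArch64; other targets fall back to the scalar loop.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Returns how many leading bytes of a and b are equal, up to max_len.
 * Hypothetical helper for illustration only. */
static size_t match_length(const uint8_t *a, const uint8_t *b, size_t max_len) {
    size_t len = 0;
#if defined(__aarch64__)
    /* Two 16-byte compares per iteration, so 32 bytes per loop trip. */
    while (len + 32 <= max_len) {
        uint8x16_t eq0 = vceqq_u8(vld1q_u8(a + len), vld1q_u8(b + len));
        uint8x16_t eq1 = vceqq_u8(vld1q_u8(a + len + 16), vld1q_u8(b + len + 16));
        /* Each equal lane is 0xFF; the horizontal minimum is 0xFF only
         * when all 32 bytes matched. */
        if (vminvq_u8(vandq_u8(eq0, eq1)) != 0xFF) break;
        len += 32;
    }
#endif
    /* Scalar tail: finishes the remainder and locates the exact mismatch. */
    while (len < max_len && a[len] == b[len]) len++;
    return len;
}
```

Combining the two comparison results before a single branch halves the number of loop-control checks per byte compared with a 16-byte loop, which is the overhead reduction the description refers to.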