Skip to content

Add trailing_nans parameter to control NaN placement in sort results#235

Open
r-devulap wants to merge 11 commits into
numpy:mainfrom
r-devulap:nan
Open

Add trailing_nans parameter to control NaN placement in sort results#235
r-devulap wants to merge 11 commits into
numpy:mainfrom
r-devulap:nan

Conversation

@r-devulap
Copy link
Copy Markdown
Member

@r-devulap r-devulap commented May 11, 2026

Summary of changes in nan branch

trailing_nans parameter

Added bool trailing_nans = true to all sort/select routines (qsort, qselect, partial_qsort, argsort, argselect). When hasnan=true, this controls where NaN values land in the result:

  • trailing_nans=true (default): NaNs at the end, regardless of sort direction
  • trailing_nans=false: NaNs at the beginning, regardless of sort direction

Previously the library replaced NaNs with ±inf as a workaround; now it partitions them directly to the correct end. Bit-exact NaN payloads are preserved.

descending parameter for argselect

Added bool descending = false to argselect, completing its API to match argsort and qselect. When descending=true:

  • The k-th element is the k-th largest
  • All elements before index k are ≥ it
  • SIMD path partitions at mirror position arrsize-1-k then reverses; scalar path selects a std::greater comparator

Tests

  • Added trailing/leading NaN coverage for qsort, argsort, and qselect
  • Replaced the single test_argselect with four tests: ascending, descending, trailing NaNs, leading NaNs
  • New utils/custom-compare.h with NaN-aware comparators shared across tests

All sorting and selection routines (qsort, qselect, partial_qsort,
argsort, argselect) now accept an optional `bool trailing_nans`
parameter (default: true). When hasnan=true:

- trailing_nans=true  (default): NaN values appear at the end of the
  result, independent of sort direction.
- trailing_nans=false: NaN values appear at the beginning of the
  result, independent of sort direction.

Previously, NaN placement was coupled to the sort direction: ascending
always put NaN at the end, descending always put NaN at the beginning.
This change decouples the two.

Implementation notes:
- qsort path: replace_inf_with_nan now takes both `descending` and
  `trailing_nans`; when they differ, the real-value portion is rotated
  before NaN slots are written back.
- qselect path: NaN is moved to the desired end before partitioning,
  keeping the existing move_nans_to_{end,start}_of_array helpers.
- argsort path: after std_argsort_withnan + optional reverse, a
  std::rotate moves NaN indices to the desired end if needed.
- argselect path: comparator lambda respects trailing_nans.
- keyvalue_qsort/select path: descending reverse now only reverses
  the real portion [0, index_last_elem], leaving trailing NaN in place.
- Scalar fallback: four NaN-aware comparators (compare_nan_end/begin,
  compare_arg_nan_end/begin) added to utils/custom-compare.h.

Tests updated to use compare_nan_end for descending reference sorts
and to skip value-comparison checks when arr[k] is NaN.
@seberg
Copy link
Copy Markdown
Member

seberg commented May 11, 2026

Nice, I have to take a slightly closer look. But should it be nans_largest=true or so (I dunno what the best name is). Because I think with the current choice passing descending=true, has_nan=true would default to trailing_nans=true and thus change the result?
(Just a very minor translation of where you use the direct value vs. the "if they differ".)

@MaanasArora
Copy link
Copy Markdown

MaanasArora commented May 11, 2026

Thanks, this is nice! Going to take a deeper look soon - but, I was wondering if it's actually easier to just change replace_nan_with_inf when possible, to use negative (versus positive) infinity based on trailing_nans? Rotation seems a bit roundabout and could have some overhead, I suppose. Though not entirely familiar with how possible that is (though I did try it locally it and it seemed to work).

@r-devulap
Copy link
Copy Markdown
Member Author

Thanks, this is nice! Going to take a deeper look soon - but, I was wondering if it's actually easier to just change replace_nan_with_inf when possible, to use negative (versus positive) infinity based on trailing_nans? Rotation seems a bit roundabout and could have some overhead, I suppose. Though not entirely familiar with how possible that is (though I did try it locally it and it seemed to work).

It might easier to just reuse existing helper functions: move_nans_to_end_of_array and move_nans_to_start_of_array to move the nan's at the start and then only sort the remaining elements. I will make that change.

r-devulap added 9 commits May 25, 2026 04:01
- Add bool descending = false parameter to all argselect APIs and
  internal implementations (avx512, avx2, scalar, static, dispatch)
- In xss_argselect: reverse the index array after partitioning when
  descending is true; pass descending to std_argselect_withnan for
  NaN-array fast path
- std_argselect_withnan: honour descending flag in the comparator
- scalar argselect: pick comparator once based on descending and
  trailing_nans, then call nth_element once
- Tests: replace test_argselect with four typed tests covering
  ascending, descending, trailing NaNs and leading NaNs
- IS_ARG_PARTITIONED: add descending parameter forwarded to
  IS_ARR_PARTITIONED
argselect_ always partitions in ascending order. For descending, we
must select at position arrsize-1-k so the k-th largest lands at that
mirror index; reversing then moves it to position k with the correct
left/right partition invariant.
@r-devulap
Copy link
Copy Markdown
Member Author

@seberg @MaanasArora This looks good to me - happy to merge once you sign off.

@r-devulap
Copy link
Copy Markdown
Member Author

Nice, I have to take a slightly closer look. But should it be nans_largest=true or so (I dunno what the best name is). Because I think with the current choice passing descending=true, has_nan=true would default to trailing_nans=true and thus change the result?

NumPy, as of today, does not use a descending flag. I’m not sure which other libraries rely on this flag, and I’m not particularly concerned about that change. The newly introduced trailing_nans flag handles NaN placement independently of sort order—it simply lets users specify where NaNs should appear, regardless of whether the sort is ascending or descending. Personally, I don’t find assigning an order to NaNs meaningful. In that sense, this flag mainly provides users with a clear preference for where they want their NaNs.

@seberg
Copy link
Copy Markdown
Member

seberg commented May 27, 2026

Sounds good, I prefer the preference to be "at the end" or "at the front", rather than "NaNs are considered larger/smaller".
(IMO, the larger/smaller convention mostly makes sense if you think of it in terms of implementing a comparison function.)

@MaanasArora
Copy link
Copy Markdown

MaanasArora commented May 29, 2026

Thanks a lot, I had a brief look and this makes sense to me! I like this way of moving nans to the start/end and sorting the rest; within the nans we don't have an order so hopefully this approach could even have performance benefits, may be nice to run the benchmarks!

Agreed that thinking of NaNs as smaller/larger is confusing in most cases, so position is better. Would trailing_nans be a bit confusing still, compared to nans_last for example? Just a nit sorry, probably fine, given the docs...

Copy link
Copy Markdown
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM. The one ask that I think would be nice is to add all-NaN tests since codex has a point that the logic around it works out but it is a bit fishy.
The compare_arg_* feels like it could be refactored for pretty code, but that doesn't seem too important.

You seem to change the approach for partitioning from NaN replacement to NaN moving. That seems sensible to me, I am just a bit curious what that means speed-wise (not vital either to me, we may also see it in NumPy when bumping)?

Comment thread tests/.test-qsort.cpp.swp
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidental commit, please remove.

if (UNLIKELY(hasnan)) {
if (descending) {
if (!trailing_nans) {
index_first_elem = move_nans_to_start_of_array(arr, arrsize);
Copy link
Copy Markdown
Member

@seberg seberg May 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codex noticed that index_first_elem can overflow to MAX_SIZE_T here and in similar paths. That seems to work out just fine with qselect_ overflowing to consider the range as empty.
(Since this is an index and not a size, -1/MAX_SIZE_T is effectively invalid after all.)

It might be clearer to have inclusive indexes, but honestly, I think it is OK, it would be nice to add an "all_nan" test case.

Comment thread lib/x86simdsort-scalar.h
else {
cmp = compare_arg_nan_begin<T, std::less<T>>(arr);
}
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be nice to put this into a get_argcmp_func to avoid duplicating this if that works.
Might be possible to refactor it more so that the arg comparisons use the get_cmp_func() to avoid the compare_arg_* family, but I don't think it's vital.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants