33 changes: 21 additions & 12 deletions README.md
@@ -37,9 +37,9 @@ how fast this is relative to `std::sort`.

## Sort an array of built-in integers and floats
```cpp
void x86simdsort::qsort(T* arr, size_t size, bool hasnan, bool descending);
void x86simdsort::qselect(T* arr, size_t k, size_t size, bool hasnan, bool descending);
void x86simdsort::partial_qsort(T* arr, size_t k, size_t size, bool hasnan, bool descending);
void x86simdsort::qsort(T* arr, size_t size, bool hasnan, bool descending, bool trailing_nans);
void x86simdsort::qselect(T* arr, size_t k, size_t size, bool hasnan, bool descending, bool trailing_nans);
void x86simdsort::partial_qsort(T* arr, size_t k, size_t size, bool hasnan, bool descending, bool trailing_nans);
```
Supported datatypes: `T` $\in$ `[_Float16, uint16_t, int16_t, float, uint32_t,
int32_t, double, uint64_t, int64_t]`
@@ -56,8 +56,8 @@ data types.

## Arg sort routines on arrays
```cpp
std::vector<size_t> arg = x86simdsort::argsort(const T* arr, size_t size, bool hasnan, bool descending);
std::vector<size_t> arg = x86simdsort::argselect(const T* arr, size_t k, size_t size, bool hasnan);
std::vector<size_t> arg = x86simdsort::argsort(const T* arr, size_t size, bool hasnan, bool descending, bool trailing_nans);
std::vector<size_t> arg = x86simdsort::argselect(const T* arr, size_t k, size_t size, bool hasnan, bool descending, bool trailing_nans);
```
Supported datatypes: `T` $\in$ `[_Float16, uint16_t, int16_t, float, uint32_t, int32_t, double,
uint64_t, int64_t]` Note that argsort and argselect are not accelerated with SIMD when using 16-bit
@@ -174,13 +174,22 @@ Supported datatypes: `uint16_t, int16_t, _Float16, uint32_t, int32_t, float,
uint64_t, int64_t, double`. Note that `_Float16` will require building this
library with g++ >= 12.x. All the functions have an optional argument `bool
hasnan` set to `false` by default (these are relevant to floating point data
types only). If your array has NAN's, the the behaviour of the sorting routine
is undefined. If `hasnan` is set to true, NAN's are always sorted to the end of
the array. In addition to that, qsort will replace all your NAN's with
`std::numeric_limits<T>::quiet_NaN`. The original bit-exact NaNs in
the input are not preserved. Also note that the arg methods (argsort and
argselect) will not use the SIMD based algorithms if they detect NAN's in the
array. You can read details of all the implementations
types only). If your array has NaN values, the behaviour of the sorting routine
is undefined unless `hasnan` is set to `true`. When `hasnan=true`, NaN placement
is controlled by the optional `bool trailing_nans` parameter (default `true`):

- `trailing_nans=true` (default): NaN values are placed at the **end** of the
result, regardless of sort direction.
- `trailing_nans=false`: NaN values are placed at the **beginning** of the
result, regardless of sort direction.
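
The placement rule in the two bullets above can be modelled with a small scalar reference. This is only a sketch of the documented semantics built on `std::partition` and `std::sort`; `reference_sort` is a hypothetical helper name, not part of x86-simd-sort, and it says nothing about how the SIMD kernels are implemented.

```cpp
#include <algorithm>
#include <cmath>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical scalar model (not the library's code): sorts the non-NaN
// values in the requested direction and keeps all NaNs together at one end.
template <typename T>
void reference_sort(std::vector<T> &v,
                    bool descending = false,
                    bool trailing_nans = true)
{
    auto sort_range = [descending](auto first, auto last) {
        if (descending) { std::sort(first, last, std::greater<T>()); }
        else { std::sort(first, last); }
    };
    if (trailing_nans) {
        // Non-NaN values first, NaNs moved to the end.
        auto mid = std::partition(
                v.begin(), v.end(), [](T x) { return !std::isnan(x); });
        sort_range(v.begin(), mid);
    }
    else {
        // NaNs moved to the beginning, non-NaN values after them.
        auto mid = std::partition(
                v.begin(), v.end(), [](T x) { return std::isnan(x); });
        sort_range(mid, v.end());
    }
}
```

Under this model, the sort direction only affects the non-NaN range; the NaN block stays at the end chosen by `trailing_nans`.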

All routines accept an optional `bool descending` parameter (default `false`).
When `descending=true`, results are in descending order. For `argselect`, the
k-th element becomes the k-th **largest**, with all elements before index k
being greater than or equal to it.
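
To make the descending `argselect` contract concrete, here is a scalar sketch using `std::nth_element` over an index vector. `reference_argselect` is a hypothetical name used for illustration only; the sketch assumes NaN-free input and a 0-based `k`, and it is not the library's implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical scalar model (not the library's code), assuming no NaNs:
// returns indices such that arr[idx[k]] is in its final sorted position
// for the requested direction.
template <typename T>
std::vector<std::size_t> reference_argselect(const T *arr,
                                             std::size_t k,
                                             std::size_t size,
                                             bool descending = false)
{
    std::vector<std::size_t> idx(size);
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    if (k >= size) { return idx; } // nothing to select
    auto cmp = [arr, descending](std::size_t a, std::size_t b) {
        return descending ? arr[a] > arr[b] : arr[a] < arr[b];
    };
    // After this call, every index before position k refers to a value that
    // is not ordered after arr[idx[k]] under cmp.
    std::nth_element(idx.begin(), idx.begin() + k, idx.end(), cmp);
    return idx;
}
```

With `descending=true`, the values referenced by `idx[0..k)` are all greater than or equal to `arr[idx[k]]`, which matches the wording above.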

Note that the arg methods (argsort and argselect) will not use the SIMD based
algorithms if they detect NaN values in the array. You can read details of all the implementations
[here](https://github.com/intel/x86-simd-sort/blob/main/src/README.md).

## Performance comparison on AVX-512: `object_qsort` v/s `std::sort`
52 changes: 38 additions & 14 deletions lib/x86simdsort-avx2.cpp
@@ -5,33 +5,57 @@

#define DEFINE_ALL_METHODS(type) \
template <> \
void qsort(type *arr, size_t arrsize, bool hasnan, bool descending) \
void qsort(type *arr, \
size_t arrsize, \
bool hasnan, \
bool descending, \
bool trailing_nans) \
{ \
x86simdsortStatic::qsort(arr, arrsize, hasnan, descending); \
x86simdsortStatic::qsort( \
arr, arrsize, hasnan, descending, trailing_nans); \
} \
template <> \
void qselect( \
type *arr, size_t k, size_t arrsize, bool hasnan, bool descending) \
void qselect(type *arr, \
size_t k, \
size_t arrsize, \
bool hasnan, \
bool descending, \
bool trailing_nans) \
{ \
x86simdsortStatic::qselect(arr, k, arrsize, hasnan, descending); \
x86simdsortStatic::qselect( \
arr, k, arrsize, hasnan, descending, trailing_nans); \
} \
template <> \
void partial_qsort( \
type *arr, size_t k, size_t arrsize, bool hasnan, bool descending) \
void partial_qsort(type *arr, \
size_t k, \
size_t arrsize, \
bool hasnan, \
bool descending, \
bool trailing_nans) \
{ \
x86simdsortStatic::partial_qsort(arr, k, arrsize, hasnan, descending); \
x86simdsortStatic::partial_qsort( \
arr, k, arrsize, hasnan, descending, trailing_nans); \
} \
template <> \
std::vector<size_t> argsort( \
const type *arr, size_t arrsize, bool hasnan, bool descending) \
std::vector<size_t> argsort(const type *arr, \
size_t arrsize, \
bool hasnan, \
bool descending, \
bool trailing_nans) \
{ \
return x86simdsortStatic::argsort(arr, arrsize, hasnan, descending); \
return x86simdsortStatic::argsort( \
arr, arrsize, hasnan, descending, trailing_nans); \
} \
template <> \
std::vector<size_t> argselect( \
const type *arr, size_t k, size_t arrsize, bool hasnan) \
std::vector<size_t> argselect(const type *arr, \
size_t k, \
size_t arrsize, \
bool hasnan, \
bool descending, \
bool trailing_nans) \
{ \
return x86simdsortStatic::argselect(arr, k, arrsize, hasnan); \
return x86simdsortStatic::argselect( \
arr, k, arrsize, hasnan, descending, trailing_nans); \
}

#define DEFINE_KEYVALUE_METHODS_BASE(type1, type2) \
60 changes: 42 additions & 18 deletions lib/x86simdsort-icl.cpp
@@ -8,76 +8,100 @@
namespace xss {
namespace avx512 {
template <>
void qsort(uint16_t *arr, size_t size, bool hasnan, bool descending)
void qsort(uint16_t *arr,
size_t size,
bool hasnan,
bool descending,
bool trailing_nans)
{
x86simdsortStatic::qsort(arr, size, hasnan, descending);
x86simdsortStatic::qsort(arr, size, hasnan, descending, trailing_nans);
}
template <>
void qselect(uint16_t *arr,
size_t k,
size_t arrsize,
bool hasnan,
bool descending)
bool descending,
bool trailing_nans)
{
x86simdsortStatic::qselect(arr, k, arrsize, hasnan, descending);
x86simdsortStatic::qselect(
arr, k, arrsize, hasnan, descending, trailing_nans);
}
template <>
void partial_qsort(uint16_t *arr,
size_t k,
size_t arrsize,
bool hasnan,
bool descending)
bool descending,
bool trailing_nans)
{
x86simdsortStatic::partial_qsort(arr, k, arrsize, hasnan, descending);
x86simdsortStatic::partial_qsort(
arr, k, arrsize, hasnan, descending, trailing_nans);
}
template <>
void qsort(int16_t *arr, size_t size, bool hasnan, bool descending)
void qsort(int16_t *arr,
size_t size,
bool hasnan,
bool descending,
bool trailing_nans)
{
x86simdsortStatic::qsort(arr, size, hasnan, descending);
x86simdsortStatic::qsort(arr, size, hasnan, descending, trailing_nans);
}
template <>
void qselect(int16_t *arr,
size_t k,
size_t arrsize,
bool hasnan,
bool descending)
bool descending,
bool trailing_nans)
{
x86simdsortStatic::qselect(arr, k, arrsize, hasnan, descending);
x86simdsortStatic::qselect(
arr, k, arrsize, hasnan, descending, trailing_nans);
}
template <>
void partial_qsort(int16_t *arr,
size_t k,
size_t arrsize,
bool hasnan,
bool descending)
bool descending,
bool trailing_nans)
{
x86simdsortStatic::partial_qsort(arr, k, arrsize, hasnan, descending);
x86simdsortStatic::partial_qsort(
arr, k, arrsize, hasnan, descending, trailing_nans);
}
} // namespace avx512
namespace fp16_icl {
#ifdef __FLT16_MAX__
template <>
void qsort(_Float16 *arr, size_t size, bool hasnan, bool descending)
void qsort(_Float16 *arr,
size_t size,
bool hasnan,
bool descending,
bool trailing_nans)
{
x86simdsortStatic::qsort(arr, size, hasnan, descending);
x86simdsortStatic::qsort(arr, size, hasnan, descending, trailing_nans);
}
template <>
void qselect(_Float16 *arr,
size_t k,
size_t arrsize,
bool hasnan,
bool descending)
bool descending,
bool trailing_nans)
{
x86simdsortStatic::qselect(arr, k, arrsize, hasnan, descending);
x86simdsortStatic::qselect(
arr, k, arrsize, hasnan, descending, trailing_nans);
}
template <>
void partial_qsort(_Float16 *arr,
size_t k,
size_t arrsize,
bool hasnan,
bool descending)
bool descending,
bool trailing_nans)
{
x86simdsortStatic::partial_qsort(arr, k, arrsize, hasnan, descending);
x86simdsortStatic::partial_qsort(
arr, k, arrsize, hasnan, descending, trailing_nans);
}
#endif
} // namespace fp16_icl
20 changes: 14 additions & 6 deletions lib/x86simdsort-internal.h
@@ -10,7 +10,8 @@
XSS_HIDE_SYMBOL void qsort(T *arr, \
size_t arrsize, \
bool hasnan = false, \
bool descending = false); \
bool descending = false, \
bool trailing_nans = true); \
template <typename T1, typename T2> \
XSS_HIDE_SYMBOL void keyvalue_qsort(T1 *key, \
T2 *val, \
@@ -22,7 +23,8 @@
size_t k, \
size_t arrsize, \
bool hasnan = false, \
bool descending = false); \
bool descending = false, \
bool trailing_nans = true); \
template <typename T1, typename T2> \
XSS_HIDE_SYMBOL void keyvalue_select(T1 *key, \
T2 *val, \
@@ -35,7 +37,8 @@
size_t k, \
size_t arrsize, \
bool hasnan = false, \
bool descending = false); \
bool descending = false, \
bool trailing_nans = true); \
template <typename T1, typename T2> \
XSS_HIDE_SYMBOL void keyvalue_partial_sort(T1 *key, \
T2 *val, \
@@ -47,10 +50,15 @@
XSS_HIDE_SYMBOL std::vector<size_t> argsort(const T *arr, \
size_t arrsize, \
bool hasnan = false, \
bool descending = false); \
bool descending = false, \
bool trailing_nans = true); \
template <typename T> \
XSS_HIDE_SYMBOL std::vector<size_t> \
argselect(const T *arr, size_t k, size_t arrsize, bool hasnan = false); \
XSS_HIDE_SYMBOL std::vector<size_t> argselect(const T *arr, \
size_t k, \
size_t arrsize, \
bool hasnan = false, \
bool descending = false, \
bool trailing_nans = true); \
}

namespace xss {