Skip to content

Conversation

@shreyas-omkar
Copy link

Fix for #635

Base.issorted on AbstractGPUArray currently falls back to scalar iteration, which triggers scalar indexing errors and prevents correct usage on GPU-backed arrays.

This PR introduces a GPU-native implementation of issorted using a kernel-based approach, ensuring the operation executes entirely on the device.

Tests

(GPUArrays with base as test_args)

image

(GPUArrays)
Screenshot 2026-01-15 181343

@shreyas-omkar
Copy link
Author

@kshyatt please review.

if order isa Base.Order.ReverseOrdering
rev = !rev
elseif !(order isa Base.Order.ForwardOrdering)
throw(ArgumentError("custom orderings are not supported on GPU"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC Ordering types are supposed to implement lt(ord, a, b), so we could probably use that?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Base.Order.lt is not GPU-safe. It calls methods that throws. I'm getting jl_f_throw_methoderror if used. Am I missing something??

@kshyatt
Copy link
Member

kshyatt commented Jan 16, 2026

Thanks for getting to this so quickly! I'm at a conference today so may not get to it immediately but perhaps @maleadt has spare time? If not I'll look on Monday.

@shreyas-omkar
Copy link
Author

shreyas-omkar commented Jan 16, 2026

@maleadt I have removed the unnecessary using KernelAbstractions import as suggested

I initially tried using Base.Order.lt(ord, a, b), but this results in invalid GPU IR.
To avoid this, the current implementation resolves order and rev on the host and passes plain lt / by functions into the kernel, which keeps the device code simple and GPU-safe.

Please let me know if I'm missing something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants