GPU: Improvements for sorting / thrust external allocator #14103
@mconcas: To use the thrust external allocator, see https://github.com/davidrohr/AliceO2/blob/c9e82dd6c8452852ea2e2eadfff5c4ac9887c901/GPU/Common/GPUCommonAlgorithmThrust.h#L96 The way it works is:
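As a rough illustration of the general idea (not the code in this PR), Thrust can take its temporary storage from a user-supplied allocator object passed through the execution policy, e.g. `thrust::cuda::par(alloc)`. The allocator only needs a `value_type`, an `allocate` and a `deallocate` member. The sketch below is host-only so that it compiles without CUDA. The class name `ExternalScratchAllocator`, the 256-byte alignment, and the bump-pointer strategy are all assumptions made for illustration; the linked `GPUCommonAlgorithmThrust.h` may do this differently.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical name, not taken from the PR. A bump allocator over a buffer
// owned by the caller, exposing the members Thrust looks for on a
// temporary-storage allocator: value_type, allocate(), deallocate().
class ExternalScratchAllocator
{
 public:
  using value_type = char;

  ExternalScratchAllocator(char* buffer, std::size_t capacity)
    : mBuffer(buffer), mCapacity(capacity) {}

  // Hand out the next aligned slice of the external buffer.
  char* allocate(std::ptrdiff_t numBytes)
  {
    if (numBytes < 0 || static_cast<std::size_t>(numBytes) > mCapacity - mUsed) {
      throw std::bad_alloc();
    }
    const std::size_t n = static_cast<std::size_t>(numBytes);
    const std::size_t padded = (n + kAlignment - 1) / kAlignment * kAlignment;
    if (padded > mCapacity - mUsed) {
      throw std::bad_alloc();
    }
    char* ptr = mBuffer + mUsed;
    mUsed += padded;
    return ptr;
  }

  // Nothing is freed per call; the owner rewinds with reset() instead.
  void deallocate(char*, std::size_t) {}

  void reset() { mUsed = 0; }
  std::size_t used() const { return mUsed; }

 private:
  static constexpr std::size_t kAlignment = 256; // assumed alignment
  char* mBuffer;
  std::size_t mCapacity;
  std::size_t mUsed = 0;
};

// Host-only demonstration of the bookkeeping. With Thrust, the buffer would
// be device memory and the call would look like:
//   thrust::sort(thrust::cuda::par(alloc), first, last);
inline std::size_t demoScratchReuse()
{
  std::vector<char> pool(1024);
  ExternalScratchAllocator alloc(pool.data(), pool.size());
  char* a = alloc.allocate(100); // padded to 256 bytes
  char* b = alloc.allocate(256);
  assert(a == pool.data() && b == pool.data() + 256);
  alloc.deallocate(b, 256);
  alloc.deallocate(a, 100);
  const std::size_t peak = alloc.used();
  alloc.reset(); // the buffer can now be reused for the next call
  return peak;
}
```

The point of such a design is that sorting no longer triggers a device allocation on every call: the scratch memory comes from a pool the framework already manages, and it is rewound once the algorithm has finished.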
c9e82dd to 9ceb34e
Performance problem fixed.
Error while checking build/O2/fullCI_slc9 for c9e82dd at 2025-03-24 18:55: Full log here.
Error while checking build/O2/fullCI_slc9 for 9ceb34e at 2025-03-24 20:27: Full log here.
9ceb34e to 471fdbf
…s not working any more
…on device from host
…ze the last kernel
471fdbf to 1627d1a
Unfortunately, this PR causes a major performance regression both on my NVIDIA GPU and on the EPNs, and I don't understand why yet. For now I just want to run it through the CI and show @mconcas how to use the external allocator.