GPU: Possibility to use NO_FAST_MATH and deterministic mode for RTC #14109
After trying a bit more: a fully deterministic mode via only a runtime switch with RTC is impossible, since the host code precomputes some values that are used for the initialization. If the host code is compiled with -ffast-math, the result is therefore already non-reproducible. There is nothing we can do about that.
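To illustrate why fast-math on the host side alone is enough to break reproducibility, here is a minimal sketch that is not taken from the O2 code base. It shows that float addition is not associative, so a compiler that is allowed to reassociate (as -ffast-math permits) can legitimately produce a different value for the same source expression. The function names `sumLeft` and `sumRight` are hypothetical, and the two groupings are written out by hand to stand in for what a reassociating compiler might choose.

```cpp
// Illustrative only: the same three floats summed with two different groupings.
// Under strict IEEE semantics the grouping is fixed by the source code;
// with -ffast-math the compiler may pick either one.

// Evaluates (a + b) + c exactly as written.
float sumLeft(float a, float b, float c)
{
  return (a + b) + c;
}

// Evaluates a + (b + c), i.e. one possible reassociation of the above.
float sumRight(float a, float b, float c)
{
  return a + (b + c);
}

// Without fast-math, for a = 1e20f, b = -1e20f, c = 1.0f:
//   sumLeft  gives 1.0f  (the large terms cancel first, then 1 is added)
//   sumRight gives 0.0f  (1 is absorbed into -1e20f before the cancellation)
```

Any value precomputed on the host with such freedom and then passed to the device as an initialization constant can differ between builds, regardless of how the RTC kernels themselves are compiled.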
Error while checking build/O2/fullCI_slc9 for 0c652eb at 2025-03-25 20:24: Full log here.
…erministic mode in one place
Error while checking build/O2/fullCI_slc9 for 7cb35d8 at 2025-03-26 04:25: Full log here.
Fixes a related bug in the constexpr optimization, which is very minor but was breaking the deterministic mode.
A fully deterministic mode is not yet possible with the runtime switch, since the thrust sort kernels are not recompiled with RTC; need to check what to do about that.
This setup will now also allow a follow-up PR to set NO_FAST_MATH for the compression / decompression kernels, fixing the float inconsistency in the track model when used with RTC.