Description
Hi! I have a question: I'm considering using Accelerate for scientific computation that we want to be reproducible.
The goal is to be able to run the same computation again on a different setup and get the same result, e.g. with CUDA on a GPU or on an x86 CPU.
I found in https://github.com/AccelerateHS/accelerate/blob/master/accelerate.cabal that it is possible to eliminate a large source of platform- and optimisation-dependent behaviour with -fno-fast-math. That's great. I also saw a -fno-fast-permute-const flag; I'm not sure what it does, but I'm curious about it.
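For reference, this is how I plan to turn those flags off; I'm assuming the usual cabal.project syntax for disabling a dependency's flags, with the flag names (fast-math, fast-permute-const) taken from the linked accelerate.cabal:

```
-- cabal.project (sketch): disable Accelerate's fast-math and
-- fast-permute-const flags for the whole build
package accelerate
  flags: -fast-math -fast-permute-const
```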
Would you say disabling fast-math is enough to get IEEE 754 floating-point behaviour with a deterministic order of operations, so that the calculation result is the same no matter the platform?
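To make the concern concrete, here is a minimal sketch of the kind of computation I have in mind (using the interpreter backend purely for illustration): a parallel fold over Floats, where the grouping of the additions changes the rounded result.

```haskell
import qualified Data.Array.Accelerate             as A
import qualified Data.Array.Accelerate.Interpreter as I

-- Float addition is not associative: summing this input left-to-right
-- gives 1.0 (the first 1 is absorbed by 1e8), a pairwise/tree reduction
-- gives 0.0, and the exact answer is 2.0. So the result of a parallel
-- fold can depend on how the backend groups the additions.
sumV :: A.Acc (A.Vector Float) -> A.Acc (A.Scalar Float)
sumV = A.fold (+) 0

main :: IO ()
main = print (I.run (sumV (A.use xs)))
  where
    xs = A.fromList (A.Z A.:. 4) [1.0e8, 1, -1.0e8, 1] :: A.Vector Float
```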
(I can make a PR afterwards to try to summarise this discussion in the README or wherever applicable.)
So:
- Can I get reproducible floating point results?
- (additionally) What's the effect of -fno-fast-permute-const?
Thanks!