# Port stress/response calculations to the GPU #1187

mfherbst merged 9 commits into JuliaMolSim:master from
Conversation
Merged master. Adapted tests to the refactoring brought by #1182. Additionally, removed this problematic bit of code in … It turns out that comparison of Duals does not take place on the GPU, as long as all XC operations are done on the CPU. This might become a concern again in the future, once …
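As a side note on the comparison semantics mentioned above: ordering two Duals only inspects their primal values, the partials are ignored. A minimal hand-rolled sketch (a hypothetical `MiniDual` type, not ForwardDiff's actual `Dual`) illustrating this:

```julia
# Hand-rolled stand-in for ForwardDiff.Dual, just to show the comparison
# semantics: `<` only looks at the primal value, never the partial. In this PR
# such comparisons stay on the CPU because XC operations run there.
struct MiniDual{T<:Real} <: Real
    val::T       # primal value
    partial::T   # derivative component
end
Base.:<(a::MiniDual, b::MiniDual) = a.val < b.val

println(MiniDual(1.0, 5.0) < MiniDual(2.0, -3.0))   # prints "true": partials are ignored
println(MiniDual(2.0, 0.0) < MiniDual(1.0, 9.0))    # prints "false"
```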
```diff
     copyto!(y, _mul(p, x))
 end
-function Base.:*(p::AbstractFFTs.Plan, x::AbstractArray{<:Complex{<:Dual{Tg}}}) where {Tg}
+function _mul(p::AbstractFFTs.Plan, x::AbstractArray{<:Complex{<:Dual{Tg}}}) where {Tg}
```
Again this feels strange and is surprising to me. Why did you need this?
Without this workaround, the GPU compiler throws an invalid LLVM IR error during stress calculations. I think there is confusion around which method of `Base.:*` to use, but I don't understand why.
Ok, this we need to understand.
@niklasschmitz I recall we anyway only needed this because on the AbstractFFTs side this was not properly supported. Could it be that now it is and we can drop our type-piracy workaround altogether?
I made progress there by properly reading the error message, and it turns out that a more specific definition of `Base.:*` in cufft takes priority: https://github.com/JuliaGPU/CUDA.jl/blob/44cde93bf03812012da5c883b6532d80a5226268/lib/cufft/fft.jl#L359-L377. While I understand the problem now, I don't see a better way to deal with it than the current solution. Any help/suggestion is welcome.
CUDA.jl overloads `Base.:*` for more specific types than `AbstractFFTs.Plan` and `AbstractArray`, but it breaks with (complex) Duals.
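The dispatch effect described here can be sketched with toy types (hypothetical `Plan`/`CuPlan`/`Dual` stand-ins, not the real CUDA.jl or ForwardDiff types): a method that is more specific in the plan argument wins `Base.:*` dispatch regardless of element type, whereas an internal helper like `_mul` sidesteps the `*` method table entirely.

```julia
# Toy stand-ins: Plan ~ AbstractFFTs.Plan, CuPlan ~ CUDA.jl's concrete plan
# type, Dual ~ ForwardDiff.Dual. Names are illustrative only.
abstract type Plan end
struct CuPlan <: Plan end
struct Dual{T<:Real} <: Real
    val::T
end

# Generic method, analogous to a package-level `Base.:*` fallback:
Base.:*(::Plan, x::AbstractArray) = "generic rule"

# More specific in the first argument, analogous to CUDA.jl's own overload;
# it wins dispatch even for element types it cannot actually handle:
Base.:*(::CuPlan, x::AbstractArray) = "CUDA-specific rule"

# Routing the Dual-aware rule through an internal helper avoids competing
# in the `Base.:*` method table altogether:
_mul(p::Plan, x::AbstractArray{<:Complex{<:Dual}}) = "Dual-aware helper"

x = [complex(Dual(1.0), Dual(0.0))]
println(CuPlan() * x)       # prints "CUDA-specific rule"
println(_mul(CuPlan(), x))  # prints "Dual-aware helper"
```

This is why renaming the overload to `_mul` (as in the diff above) restores control over which code path handles Dual-valued arrays.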
So it's a bug in CUDA.jl, effectively? Their typing is too broad, as it covers Duals, which they don't support?
That's my understanding, yes.
Once there is an issue opened and referenced here (please @mention me), this is fine.
I reorganized the ForwardDiff tests: …

I additionally addressed the various concerns of @Technici4n on comments in … I believe the last remaining issue is the overload of … which fails to compile with CUDA. I made progress there by properly reading the error message, and it turns out that a more specific definition of `Base.:*` in cufft takes priority.
This PR enables ForwardDiff calculations (stress and response) on the GPU. Main changes are:
- … (`src/workarounds/forwarddiff_rules.jl`)
- … `DftFunctionals.jl` package

With this PR, all ForwardDiff workflows currently tested on the CPU successfully run on both NVIDIA and AMD GPUs.
Future improvements will come with:
- … `PlaneWaveBasis` instantiation