-
Notifications
You must be signed in to change notification settings - Fork 5
Reverse rules for Enzyme #86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
... and 1 file with indirect coverage changes 🚀 New features to boost your workflow:
|
|
Your PR no longer requires formatting changes. Thank you for your contribution! |
4408849 to
ec8354b
Compare
7478c90 to
9091c56
Compare
6ab2c5a to
7e76c77
Compare
|
OK, I think this is ready for review! |
|
1.12 failures really seem unrelated/Enzyme internal, not sure what to do about it other than disable the Enzyme tests on 1.12 until it becomes more stable there. |
f478efa to
830c82c
Compare
|
OK, for unknown and likely extremely brain melting reasons, adding a |
|
Culprit seems to be inlining for |
d4de7b3 to
718ddda
Compare
f48b4d6 to
b86559d
Compare
| vdU = view(∂USVᴴ[1], :, 1:minmn) | ||
| vdS = Diagonal(diagview(∂USVᴴ[2])[1:minmn]) | ||
| vdVᴴ = view(∂USVᴴ[3], 1:minmn, :) | ||
| svd_pullback!(A.dval, Aval, (vU, vS, vVᴴ), (vdU, vdS, vdVᴴ)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I keep forgetting about this. There is some effect from the adjoints of U and Vᴴ in the extra columns that affects the adjoint of A. Maybe I can try to include this into svd_pullback!.
|
OK, I have now gone through and added support for running Enzyme with the GLA algorithms + |
|
No idea what's up with the Windows test... |
Co-authored-by: Jutho <Jutho@users.noreply.github.com>
|
@Jutho would it be helpful for me to leave a big comment in the extension explaining how putting |
|
Yes, though I should probably read the Enzyme manual at some point. But some executive summary at the top of the extension could indeed be a useful reminder for future interventions 😄 |
|
Added! Hopefully that helps a bit... |
|
Figured out why the |
|
FWIW I think the Mac runners are very underpowered. I may just run the tests on linux for now if we're in CI mode. On my Mac laptop they run no problem. |
|
It's a bit odd to me that coverage isn't picking up the new files but ok. In either case I think this is ready for more review. |
| (left_polar!, left_polar_pullback!), | ||
| (right_polar!, right_polar_pullback!), | ||
| ) | ||
| @eval begin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still very much confused by the logic below. Probably this is easier via a conversation, but let me already write down the questions so that I don't forget:
cache_argis always set tonothinginaugmented_primal; is it then useful to do theargval = something(cache_arg, ...)logic inreverse?- Why is
A.valonly copied intocache_Awhenisa(arg, Const)? - What is the case
TA == Nothing && TB == nothingcovering? Do we have an implementation sowewhere whereinitalize_outputcreates(nothing,nothing)because we cannot allocate the output beforehand? Is that a likely case forargto be marked asConst? How can we in that case compute the pullback since we didn't copyA.val? - Could the
TA == Nothing && TB == nothingcase be similarly handled by checkingret === arg.val? Or are there case where the latter is also not satisfied, but we still want the behaviorshadow = dret = arg.dval? - Is
∂arg = isa(arg, Const) ? nothing : dargcompatible with our pullbacks? I think the pullbacks accepts individual factors in the factorization to have anothingadjoint, so should that be∂arg = isa(arg, Const) ? (nothing, nothing) : darg?
Probably more questions, but as these already illustrate, I have a very limited understanding of what is going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least one or two of these are typos, so it's good to ask because now.
For 3, TA == Nothing && TB == Nothing covers some of the GenericLinearAlgebra cases that @lkdvos and I were discussing above. The output is allocated in the augmented primal and placed on the tape so that we can reference it later, as A's value will be once I've fixed the typo. Some of these caches/tapes are more useful in more complex computational graphs than we're yet testing in our basic "does the rule work" tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For 4, that could likely work as well, though I'm not completely sold on whether it's easier to interpret for humans than simply using TA and TB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For 5, if arg is a Const the pullback is never entered. I think Enzyme's activity analysis should protect us by marking the arg as non-constant but I agree it might make sense to protect ourselves at first and remove the !isa(arg, Const) part of the if statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think 1 & 2 are now resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These were genuine questions, so I don't know if they can be "resolved" in the usual sense, rather than by improving my understanding 😄 .
I am also trying to understand whether these Enzyme "bindings" are completely generic, and would thus automatically deal with the TensorMap case as well, provided our pullback functions are overloaded. They mostly seem to be, except for maybe the occasional broadcast add .+=. Maybe VectorInterface.jl could help with that (in a later stage).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I didn't mean to imply they were not genuine! Only some arose because of foolish things I did, I think.
No description provided.