Skip to content

Conversation

@kshyatt
Copy link
Member

@kshyatt kshyatt commented Oct 22, 2025

No description provided.

@codecov
Copy link

codecov bot commented Oct 22, 2025

Codecov Report

❌ Patch coverage is 3.08642% with 157 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...ixAlgebraKitEnzymeExt/MatrixAlgebraKitEnzymeExt.jl 0.63% 157 Missing ⚠️
Files with missing lines Coverage Δ
src/common/initialization.jl 100.00% <100.00%> (ø)
src/common/safemethods.jl 100.00% <100.00%> (ø)
...ixAlgebraKitEnzymeExt/MatrixAlgebraKitEnzymeExt.jl 0.63% <0.63%> (ø)

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions
Copy link

github-actions bot commented Oct 22, 2025

Your PR no longer requires formatting changes. Thank you for your contribution!

@kshyatt kshyatt force-pushed the ksh/enzyme branch 2 times, most recently from 4408849 to ec8354b Compare October 24, 2025 12:26
@kshyatt
Copy link
Member Author

kshyatt commented Dec 16, 2025

OK, I think this is ready for review!

@kshyatt
Copy link
Member Author

kshyatt commented Dec 17, 2025

1.12 failures really seem unrelated/Enzyme internal, not sure what to do about it other than disable the Enzyme tests on 1.12 until it becomes more stable there.

@kshyatt kshyatt force-pushed the ksh/enzyme branch 4 times, most recently from f478efa to 830c82c Compare December 19, 2025 15:18
@kshyatt
Copy link
Member Author

kshyatt commented Dec 19, 2025

OK, for unknown and likely extremely brain melting reasons, adding a repr(a) call specifically for ComplexF32 fixes all the problems with NaN we were seeing. It's upsetting and I'm not sure why, maybe inlining? Either way, while I investigate that, this should help unblock CI a little

@kshyatt
Copy link
Member Author

kshyatt commented Dec 19, 2025

Culprit seems to be inlining for ComplexF32 specifically, which is weird, and makes me unhappy, but at least we don't have to print to an unused string now

@kshyatt kshyatt force-pushed the ksh/enzyme branch 2 times, most recently from d4de7b3 to 718ddda Compare December 22, 2025 17:32
Comment on lines +179 to +186
vdU = view(∂USVᴴ[1], :, 1:minmn)
vdS = Diagonal(diagview(∂USVᴴ[2])[1:minmn])
vdVᴴ = view(∂USVᴴ[3], 1:minmn, :)
svd_pullback!(A.dval, Aval, (vU, vS, vVᴴ), (vdU, vdS, vdVᴴ))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I keep forgetting about this. There is some effect from the adjoints of U and Vᴴ in the extra columns that affects the adjoint of A. Maybe I can try to include this into svd_pullback!.

@kshyatt
Copy link
Member Author

kshyatt commented Jan 11, 2026

OK, I have now gone through and added support for running Enzyme with the GLA algorithms + BigFloat. The rules for left_polar and right_polar have a segfault I will look into more but I'm not sure if that should be a blocker for this. Sadly, we cannot use test_reverse with BigFloat due to an internal Enzyme issue.

@kshyatt
Copy link
Member Author

kshyatt commented Jan 11, 2026

No idea what's up with the Windows test...

Co-authored-by: Jutho <Jutho@users.noreply.github.com>
@kshyatt
Copy link
Member Author

kshyatt commented Jan 12, 2026

@Jutho would it be helpful for me to leave a big comment in the extension explaining how putting dret on the tape/cache works and what it accomplishes?

@Jutho
Copy link
Member

Jutho commented Jan 12, 2026

Yes, though I should probably read the Enzyme manual at some point. But some executive summary at the top of the extension could indeed be a useful reminder for future interventions 😄

@kshyatt
Copy link
Member Author

kshyatt commented Jan 12, 2026

Added! Hopefully that helps a bit...

@kshyatt
Copy link
Member Author

kshyatt commented Jan 13, 2026

Figured out why the left_polar and right_polar pullbacks are failing with BigFloat -- sylvester segfaults currently.

@kshyatt
Copy link
Member Author

kshyatt commented Jan 13, 2026

FWIW I think the Mac runners are very underpowered. I may just run the tests on linux for now if we're in CI mode. On my Mac laptop they run no problem.

@kshyatt
Copy link
Member Author

kshyatt commented Jan 13, 2026

It's a bit odd to me that coverage isn't picking up the new files but ok. In either case I think this is ready for more review.

(left_polar!, left_polar_pullback!),
(right_polar!, right_polar_pullback!),
)
@eval begin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still very much confused by the logic below. Probably this is easier via a conversation, but let me already write down the questions so that I don't forget:

  • cache_arg is always set to nothing in augmented_primal; is it then useful to do the argval = something(cache_arg, ...) logic in reverse?
  • Why is A.val only copied into cache_A when isa(arg, Const)?
  • What is the case TA == Nothing && TB == nothing covering? Do we have an implementation sowewhere where initalize_output creates (nothing,nothing) because we cannot allocate the output beforehand? Is that a likely case for arg to be marked as Const? How can we in that case compute the pullback since we didn't copy A.val?
  • Could the TA == Nothing && TB == nothing case be similarly handled by checking ret === arg.val? Or are there case where the latter is also not satisfied, but we still want the behavior shadow = dret = arg.dval?
  • Is ∂arg = isa(arg, Const) ? nothing : darg compatible with our pullbacks? I think the pullbacks accepts individual factors in the factorization to have a nothing adjoint, so should that be ∂arg = isa(arg, Const) ? (nothing, nothing) : darg ?

Probably more questions, but as these already illustrate, I have a very limited understanding of what is going on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least one or two of these are typos, so it's good to ask because now.

For 3, TA == Nothing && TB == Nothing covers some of the GenericLinearAlgebra cases that @lkdvos and I were discussing above. The output is allocated in the augmented primal and placed on the tape so that we can reference it later, as A's value will be once I've fixed the typo. Some of these caches/tapes are more useful in more complex computational graphs than we're yet testing in our basic "does the rule work" tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 4, that could likely work as well, though I'm not completely sold on whether it's easier to interpret for humans than simply using TA and TB

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 5, if arg is a Const the pullback is never entered. I think Enzyme's activity analysis should protect us by marking the arg as non-constant but I agree it might make sense to protect ourselves at first and remove the !isa(arg, Const) part of the if statement

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 1 & 2 are now resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were genuine questions, so I don't know if they can be "resolved" in the usual sense, rather than by improving my understanding 😄 .

I am also trying to understand whether these Enzyme "bindings" are completely generic, and would thus automatically deal with the TensorMap case as well, provided our pullback functions are overloaded. They mostly seem to be, except for maybe the occasional broadcast add .+=. Maybe VectorInterface.jl could help with that (in a later stage).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't mean to imply they were not genuine! Only some arose because of foolish things I did, I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants