Some questions about applying the DEQ model to new applications #35

Regarding the code logic:

1. As I understand it, the DEQ model uses a neural network to carry out the entire iterative optimization process, so f_solver should already return the equilibrium point. Why, then, is there another forward pass during training, and why are the resulting new_z1s and the original z1s both used to compute the JAC_loss? I'm currently working on a vehicle state estimation task. Doesn't this mean that the previously estimated z1s is used to solve the fixed-point equation to obtain new_z1s, after which the dynamic prediction model is run again?

<img width="2134" height="944" alt="Image" src="https://github.com/user-attachments/assets/bee3dcad-0408-422e-8eda-b1f683fa5a68" />
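
To make question 1 concrete, here is a minimal scalar sketch of the pattern as I currently read it. It is plain Python rather than the repository code, and the function `f`, the helpers `df_dz` and `solve_fixed_point`, and the values of `x` and `w` are all made up for illustration. If my reading is right, the extra pass changes the value very little (since `f(z*) ≈ z*`), and its purpose may be to expose the local derivative of `f` at the equilibrium, which is what a Jacobian penalty would act on.

```python
import math

def f(z, x, w):
    # Hypothetical one-layer "dynamics": z_next = tanh(w * z + x)
    return math.tanh(w * z + x)

def df_dz(z, x, w):
    # Analytic derivative of f with respect to z (scalar Jacobian)
    return w * (1.0 - math.tanh(w * z + x) ** 2)

def solve_fixed_point(x, w, z0=0.0, tol=1e-12, max_iter=500):
    # Plain fixed-point iteration, standing in for an Anderson/Broyden solver
    z = z0
    for _ in range(max_iter):
        z_next = f(z, x, w)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z

x, w = 0.3, 0.5                            # illustrative input and weight
z_star = solve_fixed_point(x, w)           # analogous to z1s
new_z = f(z_star, x, w)                    # analogous to new_z1s (one extra pass)
jac_penalty = df_dz(z_star, x, w) ** 2     # scalar analogue of a Jacobian loss

print(abs(new_z - z_star), jac_penalty)
```

In this toy case `new_z` and `z_star` agree to solver tolerance, so the extra pass does not look like a second round of estimation. Whether the same interpretation holds for the actual training code is exactly what I would like confirmed.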

2. I see item() and detach() calls in the Anderson and Broyden solvers, and I don't quite understand what these two calls do or why they are needed there. If item() and detach() are used, the gradient path through the solver is cut. How can the network still be trained effectively?

<img width="2132" height="1507" alt="Image" src="https://github.com/user-attachments/assets/dc0969ba-7cf5-4732-97a4-24f561edfe56" />
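
On question 2, my tentative understanding is that the gradient may not need to flow through the solver iterations at all: by the implicit function theorem, the derivative of the equilibrium with respect to a parameter depends only on derivatives of `f` evaluated at the equilibrium. The scalar sketch below checks this numerically. It uses plain Python floats, so nothing is tracked through the solver, and `f`, `solve_fixed_point`, `implicit_grad_w`, and the constants are hypothetical names and values chosen for illustration, not taken from the repository.

```python
import math

def f(z, x, w):
    # Hypothetical scalar layer: z_next = tanh(w * z + x)
    return math.tanh(w * z + x)

def solve_fixed_point(x, w, z0=0.0, tol=1e-12, max_iter=500):
    # The solver works on plain floats: no gradient information is kept
    z = z0
    for _ in range(max_iter):
        z_next = f(z, x, w)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z

def implicit_grad_w(z_star, x, w):
    # Implicit function theorem for z* = f(z*, x, w):
    #   dz*/dw = (df/dw) / (1 - df/dz), both evaluated at z*
    s = 1.0 - math.tanh(w * z_star + x) ** 2
    df_dz = w * s
    df_dw = z_star * s
    return df_dw / (1.0 - df_dz)

x, w, h = 0.3, 0.5, 1e-5
z_star = solve_fixed_point(x, w)
grad_implicit = implicit_grad_w(z_star, x, w)

# Central finite difference through the whole solve, as an independent check
grad_fd = (solve_fixed_point(x, w + h) - solve_fixed_point(x, w - h)) / (2 * h)

print(grad_implicit, grad_fd)
```

The two gradients agree closely in this toy case even though the solver itself carries no gradient. I assume the detach() and item() calls in the real solvers play a similar role, but I would appreciate confirmation.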

3. How should the overall loss function be designed? For a state estimation task, can a weighted sum of the estimated-state MSE and JAC_loss be used? And how should the deep equilibrium model itself be designed to make effective use of its capabilities?
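
For question 3, the weighted combination I have in mind looks like the sketch below. The names `mse`, `total_loss`, and `jac_weight`, and the default weight of 0.1, are my own placeholders rather than anything from the repository, and I do not know what weight would be appropriate in practice.

```python
def mse(pred, target):
    # Mean squared error between estimated and reference states
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def total_loss(pred, target, jac_loss, jac_weight=0.1):
    # Weighted sum: task loss plus a Jacobian regularization term.
    # jac_weight is a hypothetical hyperparameter that would need tuning.
    return mse(pred, target) + jac_weight * jac_loss

# Example with made-up numbers: MSE = 2.0, regularizer = 0.2 * 0.5
loss = total_loss([1.0, 2.0], [1.0, 4.0], jac_loss=0.5, jac_weight=0.2)
print(loss)
```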
