Some questions about applying the DEQ model to new applications #35

@Alisa742


Regarding the code logic:

  1. As I understand it, the DEQ model uses a neural network to carry out the entire iterative optimization process, and f_solver should already return the equilibrium point. Why is there another forward pass during training, and why are the resulting new_z1s and the original z1s used to calculate JAC_loss? I'm currently working on a vehicle state estimation task. Doesn't this mean that the previous estimate z1s is used to solve the fixed-point equation to obtain new_z1s, and that the dynamic prediction model is then run once more?
  2. I see item() and detach() calls in the Anderson and Broyden solvers, and I don't quite understand what these two calls do or why they are needed there. If item() and detach() are used, the gradient flow is cut. How can the network still be trained effectively?
  3. How should the overall loss function be designed? For a state estimation task, can a weighted sum of the estimated-state MSE and JAC_loss be used? More generally, how should a deep equilibrium model be designed to make effective use of its capabilities?
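
To make the three questions concrete, here is a minimal scalar sketch of the idea I believe is behind them; it is an analogy under stated assumptions, not the repository's code. It assumes the equilibrium layer is a contraction `f(z, x, w) = tanh(w*z + x)`, that the solver runs without any gradient tracking (the role detach() seems to play), and that the gradient is recovered afterwards from one extra evaluation of `f` at the equilibrium via the implicit function theorem. All names (`f`, `solve_fixed_point`, `implicit_grad_w`, `total_loss`, `jac_weight`) are hypothetical, and the squared scalar derivative is only a stand-in for a Jacobian penalty such as JAC_loss.

```python
import math


def f(z, x, w):
    """One application of the (hypothetical) scalar equilibrium layer."""
    return math.tanh(w * z + x)


def solve_fixed_point(x, w, z0=0.0, tol=1e-13, max_iter=1000):
    """Plain fixed-point iteration with no gradient bookkeeping at all.

    This plays the role of a gradient-free solver: only the final value
    z* matters, not the path taken to reach it.
    """
    z = z0
    for _ in range(max_iter):
        z_next = f(z, x, w)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


def implicit_grad_w(z_star, x, w):
    """Return (dz*/dw, df/dz) using only derivatives of f evaluated at z*.

    From z* = f(z*, x, w), the implicit function theorem gives
    dz*/dw = (df/dw) / (1 - df/dz), so one extra look at f at the
    equilibrium is enough; the solver iterations are never differentiated.
    """
    s = 1.0 - math.tanh(w * z_star + x) ** 2  # derivative of tanh at the equilibrium
    df_dz = s * w        # scalar "Jacobian" of f with respect to z
    df_dw = s * z_star   # derivative of f with respect to the weight
    return df_dw / (1.0 - df_dz), df_dz


def total_loss(z_star, target, df_dz, jac_weight=0.1):
    """Weighted sum of an estimation MSE and a Jacobian penalty (assumed form)."""
    mse = (z_star - target) ** 2
    jac_penalty = df_dz ** 2  # scalar stand-in for a Frobenius-norm penalty
    return mse + jac_weight * jac_penalty


x, w, target = 0.3, 0.5, 0.8
z_star = solve_fixed_point(x, w)
dz_dw, df_dz = implicit_grad_w(z_star, x, w)

# Cross-check against a central finite difference through the whole solver.
eps = 1e-5
fd = (solve_fixed_point(x, w + eps) - solve_fixed_point(x, w - eps)) / (2 * eps)

loss = total_loss(z_star, target, df_dz)
print("z* =", z_star)
print("implicit dz*/dw =", dz_dw, "finite difference =", fd)
print("loss =", loss)
```

If this picture is right, the extra forward pass is what reconnects the equilibrium to the parameters, and the solver itself never needs gradients. Whether a weighted sum like `total_loss` suits state estimation, and how to choose `jac_weight`, is exactly what question 3 asks.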
