Question
In the paper, you mentioned "The System 1 model is trained with three major objectives: embedding
alignment among different goals, noise prediction for the diffusion policy, and critic prediction."
However, I can't find the training details in the code. Has the training code not been released yet?