Hello,
I followed your steps exactly: I started training from the pretrained checkpoint, and the best RMSE I achieved was 0.91, which does not match the result from using your final checkpoint directly. Is there a reason for this behaviour? I suspect the disp_loss term that supervises from the teacher, since the RMSE improved when I decreased its weight.
Best.