Hello! This is a great study. I find HunyuanDiT especially stable compared to DiTs without LSCs, but this model is a bit outdated, so I want to add LSCs to other DiTs myself.
I managed to add some LSC adapters to a pre-trained DiT. Interestingly, although the generated images are largely corrupted, I can still make out the overall contours of the objects.
Anyway, my main question now is how to decide when to start stage 2 training, i.e., training the entire model. Should I wait until the generated image quality is almost fully restored before moving on to stage 2?