Setting up the pretraining/finetuning scripts takes a lot of code. Since they're fairly mature, we should move them into the core library's API.