Update modernbert-base-pretrain.yaml with a few comments #213
ahxxm wants to merge 1 commit into AnswerDotAI:pretraining_documentation from
Conversation
This PR adds a few comments to the config and enables a few monitors.
Hello, I am not sure these kinds of comments belong in a config file; no other config would have them. As for what's blocking the doc branch from being merged: mainly that we have a lot to do, are trying to focus on blocking issues, and want to make sure everything is correct before merging (as well as adding what is still missing, notably for reproducing all the experiments).
At least "legacy reasons" deserves some explaination? Agree that sequence_packing one As for |
|
A slightly related question about reproduction: did you pre-tokenize all samples? I found that resuming training from a checkpoint takes much longer than expected, and I guess that's because it has to tokenize the text training dataset again in order to deterministically (hopefully) skip the exact tokens already used for training -- which might deserve a section in the README? The dataset format determines the unit choice of
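For context, here is a rough sketch of what pre-tokenizing ahead of time could look like, assuming a HuggingFace dataset and the mosaicml-streaming MDS shard format that a Composer dataloader can resume from without re-reading the raw text; every name, path, and the single-column layout below are placeholders, not the repo's actual data pipeline.

```python
# Minimal sketch only: dataset name, tokenizer checkpoint, output path, and the
# single-column layout are placeholders, not the project's actual conversion script.
import numpy as np
from datasets import load_dataset
from streaming import MDSWriter  # mosaicml-streaming
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")  # placeholder checkpoint
dataset = load_dataset("my_text_dataset", split="train", streaming=True)  # placeholder dataset

# Write token ids to MDS shards once, so a resumed run only has to skip shard
# entries instead of re-tokenizing the whole text dataset deterministically.
with MDSWriter(out="pretok-shards/", columns={"input_ids": "bytes"}, compression="zstd") as out:
    for sample in dataset:
        ids = tokenizer(sample["text"], truncation=False)["input_ids"]
        out.write({"input_ids": np.asarray(ids, dtype=np.uint32).tobytes()})
```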
Changes
Add a few comments and enable a few monitors.
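For reference, a hedged sketch of what "enable a few monitors" could map to on the Python side, assuming the config's monitor entries are built into standard Composer callbacks; the actual keys and arguments in modernbert-base-pretrain.yaml may differ.

```python
# Sketch only: assumes the YAML monitors correspond to standard Composer callbacks;
# the real config may enable a different set or use different arguments.
from composer.callbacks import LRMonitor, MemoryMonitor, SpeedMonitor

callbacks = [
    SpeedMonitor(window_size=100),  # throughput averaged over a sliding window of batches
    LRMonitor(),                    # logs the current learning rate each step
    MemoryMonitor(),                # logs GPU memory statistics
]
# These would then be passed to composer.Trainer(callbacks=callbacks, ...).
```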
Discussions
Wondering what's blocking this doc branch from being merged.
Tests
Not needed? I tested with my own dataset.