I haven't yet gone through and carefully checked the problem sizes for the runs. For flux we can get this programmatically from the jobspec, and for the others I will manually check the submission configs. I am leaving a note here because I meant to do it before adding the data and forgot. We don't want to be comparing 🍎 and 🟠.
To be clear, I think most were run at 256 256 128, and for the runs where that originally didn't work I was careful to check using the right output directory. But we should also check directly from the data.
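
For the flux side, here is a rough sketch of what the programmatic check could look like. It assumes each jobspec is available as JSON (or an already-parsed dict) with the usual `tasks[].command` argument list, and that the problem size shows up as three consecutive integer arguments near the end of the command. Both function names and the `expected` default are hypothetical and would need adjusting to how each app actually takes its size.

```python
import json


def problem_size_from_jobspec(jobspec):
    """Return the last run of three consecutive integer args in the command, or None.

    Assumption: the problem size is passed on the command line as three
    integers (e.g. "256 256 128"). Adjust if an app uses flags instead.
    """
    if isinstance(jobspec, str):
        jobspec = json.loads(jobspec)
    for task in jobspec.get("tasks", []):
        command = [str(arg) for arg in task.get("command", [])]
        # Scan from the end so a leading integer flag value (e.g. "-n 4")
        # is not mistaken for part of the problem size.
        for i in range(len(command) - 3, -1, -1):
            window = command[i:i + 3]
            if all(arg.isdigit() for arg in window):
                return tuple(int(arg) for arg in window)
    return None


def mismatched_runs(jobspecs, expected=(256, 256, 128)):
    """Map run id -> detected size for every run that does not match `expected`."""
    mismatches = {}
    for run_id, spec in jobspecs.items():
        size = problem_size_from_jobspec(spec)
        if size != expected:
            mismatches[run_id] = size
    return mismatches
```

Any run that comes back from `mismatched_runs` (including ones where no size could be detected and the value is `None`) would then be a candidate for a manual look before it goes into a comparison.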