Track 10min_16mb: PR #287 family rerun at 585s wallclock (mean val_bpb=1.1346) #483

Closed
tmustier wants to merge 1 commit into openai:main from tmustier:tmustier/pr287-585s-3seed

Conversation

@tmustier

Summary

This adds a 3-seed 8×H100 SXM rerun of the PR #287 family with FA3 under a 585s wallclock cap so all seeds stay within the 16,000,000-byte artifact limit.

Record folder:

  • records/track_10min_16mb/2026-03-23_PR287_XSA4_EMA_FA3_585s/

Results

| Seed | val_bpb | Artifact bytes |
|------|------------|----------------|
| 1337 | 1.13467834 | 15,399,663 |
| 42 | 1.13567697 | 15,580,753 |
| 2025 | 1.13337713 | 15,398,721 |
| Mean | 1.13457748 | |

  • Max artifact size: 15,580,753
  • One-sided t-test vs merged 1.1428: p = 0.0032
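
The summary figures above can be rechecked from the three per-seed rows. The following is a minimal sketch, not the script used for the submission: it assumes the reported p-value comes from a one-sample, one-sided t-test of the three val_bpb values against the fixed baseline 1.1428, and it uses the closed-form Student-t CDF that holds only for 2 degrees of freedom (three seeds). The variable names are illustrative.

```python
import math
from statistics import mean, stdev

# Per-seed results copied from the Results table: seed -> (val_bpb, artifact bytes).
results = {
    1337: (1.13467834, 15_399_663),
    42: (1.13567697, 15_580_753),
    2025: (1.13337713, 15_398_721),
}
BASELINE_BPB = 1.1428        # merged score quoted in the PR
ARTIFACT_LIMIT = 16_000_000  # byte limit quoted in the Summary

bpbs = [bpb for bpb, _ in results.values()]
mean_bpb = mean(bpbs)
max_bytes = max(size for _, size in results.values())

# One-sample t statistic against the fixed baseline (assumed test design).
# stdev() is the sample standard deviation (n - 1 in the denominator).
t_stat = (mean_bpb - BASELINE_BPB) / (stdev(bpbs) / math.sqrt(len(bpbs)))

# For exactly 2 degrees of freedom the Student-t CDF has a closed form:
#   F(t) = 1/2 + t / (2 * sqrt(t^2 + 2))
# The lower-tail value is the one-sided p-value for H1: mean < baseline.
assert len(bpbs) == 3, "closed form below is valid only for n = 3 (df = 2)"
p_value = 0.5 + t_stat / (2 * math.sqrt(t_stat**2 + 2))

print(f"mean val_bpb  = {mean_bpb:.8f}")
print(f"max artifact  = {max_bytes:,} bytes (limit {ARTIFACT_LIMIT:,})")
print(f"t = {t_stat:.2f}, one-sided p = {p_value:.4f}")
```

Under these assumptions the sketch agrees with the reported mean, maximum artifact size, and p-value. With more than three seeds the closed form no longer applies, and a general t-distribution routine would be needed instead.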

Files included

  • train_gpt.py
  • train.log
  • train_seed42.log
  • train_seed2025.log
  • README.md
  • submission.json

@tmustier
Author

Closing this because, while the 3-seed rerun is numerically strong and valid, it does not meet the intended novelty bar for submission beyond prior PR families.

@tmustier tmustier closed this Mar 23, 2026