Commit b4d202c
Warmup changes: only warm a few batches; extract to separate method in trainer class (#43)
* apply optimizer every batch, not every epoch; unscale gradients before clipping
* trainer tweaks
* apply optimizer every batch, not every epoch; unscale gradients before clipping
* extract warmup to separate method; switch to warming up set number of batches (user configurable)
* whitespace; num_workers revert
* ruff
* make parallelstrategy, spatial_mesh, ddp_placements attrs of trainer; other small tweaks
* remove deprecated config attrs
* ruff
* get device mesh from ps class attr
* ruff
* missing self. on some ps accesses
* Fix imports and missing self.ps
* rm legacy warmup_epochs
* Move attributes to base class for clarity
* remove warmup_epochs -- not useful to keep support for this
* call cleanup_or_resume trainer method directly
* rm unused vars
---------
Co-authored-by: Patrick Miles <miles30@tioga.llnl.gov>
Co-authored-by: Michael McKinsey <michaelmckinsey1@gmail.com>
1 parent 875f2fd commit b4d202c
6 files changed
Lines changed: 174 additions & 172 deletions
File tree
- ScaFFold
- configs
- utils