feat: add SFT entropy logging and validation loss monitoring by none0663 · Pull Request #1925 · THUDM/slime

none0663 · 2026-05-19T12:46:38Z

Add two monitoring features for SFT training to detect overfitting:

Training entropy (--log-sft-entropy):
- Computes token-level entropy under no_grad to avoid OOM
- Logged as train/entropy to TensorBoard/WandB
Validation loss (--val-data + --val-interval):
- Full DP-parallel val loss computation with dynamic batching
- Token-weighted aggregation across ranks (not rank-mean)
- CP-correct reduction via get_sum_of_sample_mean
- Deadlock-safe: all ranks synchronize before collective ops
- Runs initial val before training for baseline
- Also logs val/entropy when --log-sft-entropy is set
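As a rough illustration of the entropy metric described above, here is a framework-free sketch of token-level entropy averaged over unmasked tokens. The function names are hypothetical and not taken from the PR. In the real training loop this computation would operate on logit tensors inside a no-grad context, which is not shown here.

```python
import math

def token_entropy(logits):
    """Shannon entropy (in nats) of softmax(logits) for one token position."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def masked_mean_entropy(logits_per_token, loss_mask):
    """Average entropy over the tokens whose loss mask is non-zero."""
    total, count = 0.0, 0
    for logits, keep in zip(logits_per_token, loss_mask):
        if keep:
            total += token_entropy(logits)
            count += 1
    return total / count if count else 0.0
```

A uniform distribution over the vocabulary gives the maximum entropy, log(vocab_size), while a sharply peaked one gives a value near zero. Entropy that keeps falling on training data while validation loss rises is one possible sign of overfitting.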

Add two monitoring features for SFT training to detect overfitting:

1. Training entropy (--log-sft-entropy):
   - Computes token-level entropy under no_grad to avoid OOM
   - Logged as train/entropy to TensorBoard/WandB
2. Validation loss (--val-data + --val-interval):
   - Full DP-parallel val loss computation with dynamic batching
   - Token-weighted aggregation across ranks (not rank-mean)
   - CP-correct reduction via get_sum_of_sample_mean
   - Deadlock-safe: all ranks synchronize before collective ops
   - Runs initial val before training for baseline
   - Also logs val/entropy when --log-sft-entropy is set

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
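The difference between token-weighted aggregation and a plain mean of per-rank means can be sketched with a small simulation. This is an assumption-laden illustration, not the PR's code: the per-rank `(loss_sum, num_tokens)` pairs stand in for values that would be summed across data-parallel ranks by a collective, and both function names are hypothetical.

```python
def token_weighted_loss(per_rank):
    """per_rank: list of (loss_sum, num_tokens) pairs, one per DP rank.

    Sums losses and token counts separately, then divides once, so every
    token contributes equally regardless of which rank processed it.
    """
    total_loss = sum(loss for loss, _ in per_rank)
    total_tokens = sum(n for _, n in per_rank)
    return total_loss / total_tokens if total_tokens else 0.0

def rank_mean_loss(per_rank):
    """Naive alternative: average the per-rank mean losses."""
    means = [loss / n for loss, n in per_rank if n > 0]
    return sum(means) / len(means) if means else 0.0
```

With one rank holding 10 tokens at mean loss 1.0 and another holding a single token at loss 3.0, the rank mean is 2.0, while the token-weighted value is 13/11, roughly 1.18. Uneven shard sizes, which dynamic batching makes likely, are what make the two disagree.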

1. Divide local_tokens by cp_size before all_reduce to avoid overcounting when loss_masks are replicated across CP ranks.
2. Use max(0, start_rollout_id - 1) for baseline val step to avoid discontinuity and step collision on training resume.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
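Both fixes in this commit come down to small pieces of arithmetic, sketched below. The sketch assumes that every context-parallel (CP) rank in a group sees the same replicated loss mask and therefore reports the same token count. The function names are hypothetical, and the plain `sum` stands in for the all-reduce.

```python
def global_token_count(local_tokens_per_rank, cp_size):
    """Sum token counts over all ranks after dividing each by cp_size.

    If each of the cp_size ranks in a CP group reports the same count,
    dividing first makes the group contribute its tokens exactly once.
    """
    return sum(t / cp_size for t in local_tokens_per_rank)

def baseline_val_step(start_rollout_id):
    """Step under which the pre-training baseline validation is logged."""
    return max(0, start_rollout_id - 1)
```

For two DP groups with cp_size 2 reporting `[8, 8, 4, 4]`, the undivided sum would be 24, while the corrected count is 12. On a fresh run the baseline lands on step 0, and on a resume from rollout 5 it lands on step 4, directly before the first resumed step.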

1. When val_data has fewer samples than dp_size, replicate to all ranks instead of leaving empty shards (which would skip val entirely).
2. Skip baseline val when it would collide with the first periodic val at the same step (val_interval=1 + start_rollout_id=0).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
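The sharding fallback in the first item might look something like the following sketch. Strided sharding is an assumption made for illustration, since the PR text does not say how samples are assigned to ranks, and `shard_val_data` is a hypothetical name.

```python
def shard_val_data(samples, dp_rank, dp_size):
    """Return the validation samples a given data-parallel rank evaluates.

    Normally each rank takes a strided slice. When there are fewer samples
    than ranks, some slices would be empty, so every rank gets the full
    set instead.
    """
    if len(samples) < dp_size:
        return list(samples)  # replicate: no rank is left without work
    return list(samples[dp_rank::dp_size])
```

Replication does not skew a token-weighted average: each rank contributes the same loss sum and the same token count, so the ratio of the totals is unchanged.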

none0663 and others added 3 commits May 19, 2026 20:20

none0663 wants to merge 3 commits into THUDM:main from none0663:feature/sft-entropy-and-val-loss
