fix: warmup uses full token budget for DP by ZhangLirong-amd · Pull Request #1024 · ROCm/ATOM

ZhangLirong-amd · 2026-06-02T02:56:29Z

Warmup now uses the full max_num_batched_tokens instead of dividing it by dp_size. Under DP attention, each rank's MoE sees up to dp_size * local_tokens after the all-gather, so warmup must exercise the full token budget to capture the true peak activation / CUDA-graph footprint. Dividing by dp_size under-sized the warmup, which let decode run out of memory later. The warning message is updated accordingly.
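The all-gather arithmetic in the description can be sketched as follows. This is a minimal illustration, not ATOM code: the function names and the example sizes are hypothetical, and it assumes every DP rank contributes its tokens to each rank's MoE input.

```python
def moe_tokens_after_all_gather(local_tokens_per_rank: list[int]) -> int:
    """Tokens one rank's MoE layer processes after gathering from all DP ranks."""
    return sum(local_tokens_per_rank)


def moe_tokens_upper_bound(local_tokens: int, dp_size: int) -> int:
    """Worst case from the description: every rank contributes local_tokens."""
    return dp_size * local_tokens


# Hypothetical sizes, chosen only for illustration.
dp_size = 4
local_tokens = 2048
uneven = [2048, 512, 0, 1024]  # per-rank loads are not necessarily equal

print(moe_tokens_after_all_gather(uneven))
print(moe_tokens_upper_bound(local_tokens, dp_size))
```

The upper bound is the case that matters for memory sizing, since peak activation follows the largest gathered batch rather than the average one.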

Copilot

Pull request overview

Updates model warmup behavior so it exercises the full configured batch token budget (max_num_batched_tokens) rather than scaling it down by data-parallel size, aiming to better match peak activation / CUDA-graph memory seen during real DP-attention decode workloads.

Changes:

Set warmup_max_tokens to max_num_batched_tokens (no longer divided by dp_size).
Update the warmup warning message to reflect the new sizing behavior.
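The sizing change listed above can be sketched as a before/after pair. This is an illustration under assumptions, not the code in atom/model_engine/model_runner.py: the helper names are hypothetical, and floor division is assumed for the old rule because the PR only says the budget was divided by dp_size.

```python
def warmup_max_tokens_before(max_num_batched_tokens: int, dp_size: int) -> int:
    # Old rule: scale the budget down by the data-parallel size
    # (floor division is an assumption, see the note above).
    return max_num_batched_tokens // dp_size


def warmup_max_tokens_after(max_num_batched_tokens: int, dp_size: int) -> int:
    # New rule: use the full budget; dp_size no longer affects the result.
    return max_num_batched_tokens


# Hypothetical configuration values, chosen only for illustration.
budget, dp = 16384, 8
print(warmup_max_tokens_before(budget, dp))
print(warmup_max_tokens_after(budget, dp))
```

With these example values the old rule warms up with 2048 tokens and the new rule with 16384, so the warmup pass exercises a batch eight times larger.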

+                f"{self.label}: warmup_max_tokens={warmup_max_tokens} (=max_num_batched_tokens) "
+                f"< max_model_len={max_model_len}. "
                f"Using {num_seqs} seq with length {seq_len} for warmup."


Copilot AI review requested due to automatic review settings June 2, 2026 02:56

Copilot started reviewing on behalf of ZhangLirong-amd June 2, 2026 02:56

Copilot AI reviewed Jun 2, 2026

Comment thread atom/model_engine/model_runner.py

Comment on lines +1069 to 1071

f"{self.label}: warmup_max_tokens={warmup_max_tokens} (=max_num_batched_tokens) "
f"< max_model_len={max_model_len}. "
f"Using {num_seqs} seq with length {seq_len} for warmup."

ZhangLirong-amd changed the title ~~fix: warmup uses full token budget~~ fix: warmup uses full token budget for DP Jun 2, 2026

fix: warmup uses full token budget

c84a3b1

ZhangLirong-amd force-pushed the dp_mem branch from f438c6f to c84a3b1 June 2, 2026 06:53

ZhangLirong-amd wants to merge 1 commit into main from dp_mem
