Closed
2 changes: 2 additions & 0 deletions tests/integration/test_lists/test-db/l0_dgx_b200.yml
@@ -385,3 +385,5 @@ l0_dgx_b200:
tests:
- accuracy/test_llm_api_autodeploy.py::TestModelRegistryAccuracy::test_autodeploy_from_registry[deepseek-ai_DeepSeek-R1-0528-True]
- accuracy/test_llm_api_autodeploy.py::TestQwen3_5_397B_MoE::test_nvfp4[8]
# ------------- AutoDeploy Perf Sanity ---------------
- perf/test_perf_sanity.py::test_e2e[aggr_upload-deepseek_r1_fp8_ad_blackwell-r1_fp8_ad_ws8_1k1k] TIMEOUT (120)
Comment on lines +388 to +389
Contributor

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check if DeepSeek-R1 FP8 perf tests exist in QA scheduled lists

# Search for deepseek_r1 or DeepSeek-R1 perf test entries in QA lists
rg -i 'deepseek.r1.*fp8|r1_fp8' tests/integration/test_lists/qa/ -C2

# Also check the QA README for guidance on which list file to use
cat tests/integration/test_lists/qa/README.md

Repository: NVIDIA/TensorRT-LLM

Length of output: 5866


🏁 Script executed:

cat tests/integration/test_lists/qa/llm_perf_sanity.yml

Repository: NVIDIA/TensorRT-LLM

Length of output: 4125


Add this test to tests/integration/test_lists/qa/llm_perf_sanity.yml for QA scheduled runs.

The new AutoDeploy perf-sanity test for DeepSeek-R1 FP8 on 8x B200 is correctly added to the test-db list (l0_dgx_b200.yml) for pre-merge CI, but it is missing from the QA scheduled perf test list. Per guidelines, new performance-sensitive configurations (here, AutoDeploy on Blackwell) should establish a baseline and catch regressions through QA scheduled runs. Add this test to llm_perf_sanity.yml with appropriate conditions (e.g., B200 or newer GPUs, at least 8 GPUs) so that it runs regularly on qualifying hardware.
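
The gap described above (a perf test listed in test-db but absent from the QA list) can also be checked mechanically. The following is a minimal sketch, not part of the repository: `perf_test_ids` and `missing_from_qa` are hypothetical helper names, and the sketch assumes both list files hold entries as plain `- perf/...` lines, optionally followed by a `TIMEOUT (N)` annotation. The actual layout of llm_perf_sanity.yml may differ.

```python
import re

# Assumption: perf entries appear as list items starting with this prefix.
PERF_PREFIX = "- perf/"


def perf_test_ids(list_text):
    """Collect perf test IDs from the raw text of a test-list file."""
    ids = set()
    for raw in list_text.splitlines():
        line = raw.strip()
        if not line.startswith(PERF_PREFIX):
            continue  # skips comments, section keys, and non-perf tests
        entry = line[2:]  # drop the leading "- "
        # Drop a trailing "TIMEOUT (N)" annotation if present.
        entry = re.sub(r"\s+TIMEOUT\s*\(\d+\)\s*$", "", entry)
        ids.add(entry)
    return ids


def missing_from_qa(test_db_text, qa_text):
    """Return perf test IDs present in test-db but absent from the QA list."""
    return sorted(perf_test_ids(test_db_text) - perf_test_ids(qa_text))
```

In practice the two arguments would be the contents of tests/integration/test_lists/test-db/l0_dgx_b200.yml and tests/integration/test_lists/qa/llm_perf_sanity.yml; a non-empty result lists the entries still to be added to the QA list.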

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/test_lists/test-db/l0_dgx_b200.yml` around lines 388 - 389,
Add the new AutoDeploy perf-sanity entry for DeepSeek-R1 FP8 to the QA scheduled
list file llm_perf_sanity.yml: include the exact test identifier
perf/test_perf_sanity.py::test_e2e[aggr_upload-deepseek_r1_fp8_ad_blackwell-r1_fp8_ad_ws8_1k1k]
and ensure it has selection criteria to run only on machines with B200 or newer
GPUs and at least 8 GPUs (gpu_count >= 8) so it is included in regular QA
perf_sanity runs; mirror the naming and tags used in
tests/integration/test_lists/test-db/l0_dgx_b200.yml and add any necessary
scheduling metadata consistent with other entries in llm_perf_sanity.yml.

@@ -0,0 +1,20 @@
metadata:
  model_name: deepseek_r1_0528_fp8
  supported_gpus:
  - B200
hardware:
  gpus_per_node: 8
server_configs:
  # 1k1k config - AutoDeploy backend, 8 GPUs (DeepSeek-R1 0528 FP8 on DGX B200)
  - name: "r1_fp8_ad_ws8_1k1k"
    model_name: "deepseek_r1_0528_fp8"
    backend: "_autodeploy"
    extra_llm_api_config_path: "examples/auto_deploy/model_registry/configs/deepseek-r1.yaml"
    world_size: 8
    client_configs:
      - name: "con64_iter10_1k1k"
        concurrency: 64
        iterations: 10
        isl: 1024
        osl: 1024
        backend: "openai"
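
A config like the one above can be sanity-checked before it reaches CI. The sketch below is illustrative only: `check_config` is a hypothetical helper, the config is mirrored as a Python dict rather than parsed from the file, and two assumptions are made that the diff does not confirm, namely that `client_configs` is nested under each server entry and that the test runs on a single node (so `world_size` should not exceed `gpus_per_node`).

```python
# Mirror of the new config as a plain dict (the nesting is an assumption).
config = {
    "metadata": {"model_name": "deepseek_r1_0528_fp8", "supported_gpus": ["B200"]},
    "hardware": {"gpus_per_node": 8},
    "server_configs": [
        {
            "name": "r1_fp8_ad_ws8_1k1k",
            "model_name": "deepseek_r1_0528_fp8",
            "backend": "_autodeploy",
            "world_size": 8,
            "client_configs": [
                {"name": "con64_iter10_1k1k", "concurrency": 64,
                 "iterations": 10, "isl": 1024, "osl": 1024, "backend": "openai"},
            ],
        }
    ],
}


def check_config(cfg):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    model = cfg["metadata"]["model_name"]
    gpus = cfg["hardware"]["gpus_per_node"]
    for server in cfg["server_configs"]:
        name = server["name"]
        if server["model_name"] != model:
            problems.append(f"{name}: model_name differs from metadata")
        # Single-node assumption: all ranks must fit on one node.
        if server["world_size"] > gpus:
            problems.append(
                f"{name}: world_size {server['world_size']} exceeds gpus_per_node {gpus}"
            )
        for client in server.get("client_configs", []):
            for key in ("concurrency", "iterations", "isl", "osl"):
                if client[key] <= 0:
                    problems.append(f"{name}/{client['name']}: {key} must be positive")
    return problems
```

Calling `check_config(config)` on the dict above returns an empty list; a multi-node setup would need the `world_size` check relaxed.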