[None][test] Split verl tests into 19 fine-grained per-case wrappers by Superjomn · Pull Request #14037 · NVIDIA/TensorRT-LLM

Superjomn · 2026-05-12T06:59:36Z

Summary

Replace 3 file-level verl wrappers with 19 fine-grained per-case wrappers in tests/integration/defs/verl/test_verl_cases.py (one wrapper per upstream verl pytest case)
Bump verl pin: 4cda6af -> d324b01
Update l0_verl test list to enumerate the 19 case-level entries
Drop the 3 blanket nvbugs/5981833 waives that were skipping the old file-level wrappers

Why

File-level wrappers force us to skip an entire upstream test file when a single case is flaky. Per-case wrappers let us waive precisely.

Test plan

Local: pytest --collect-only tests/integration/defs/verl/test_verl_cases.py -> 19 collected
Local: 4 pure HTTP-mock adapter cases pass when run directly against a verl clone at d324b01 (test_make_async_request_{get,post,http_error,max_attempts_exceeded}_method)
Verified all 19 case names are unique across the 5 verl test files - no -k substring collisions
CI: /bot run --extra-stage "DGX_B200-4_GPUs-AutoDeploy-1" to exercise the 4xB200 l0_verl post-merge stage

Notes

The verl_setup fixture builds gdrcopy + nvshmem + DeepEP and needs root for /usr/local/lib. It runs cleanly in CI containers but cannot run on a non-root dev shell, so full validation depends on CI.

Summary by CodeRabbit

Tests
- Refactored integration test suite to use individual test functions for finer-grained control and coverage.
- Expanded test cases with additional scenarios for placement groups, async operations, multimodal generation, and inter-node rollout.
Chores
- Updated test infrastructure configuration and removed obsolete test entries.

Replace the 3 file-level verl wrappers (test_async_server, test_adapter, test_rollout_utils) with 19 wrappers that map 1-to-1 to upstream verl pytest cases. Each wrapper shells out to `pytest <file> -k <case>` so individual cases can be waived without skipping a whole file. Also: - Bump verl pin: 4cda6af -> d324b01 - Expand l0_verl test list to the 19 case-level entries - Remove the 3 blanket waives under nvbugs/5981833 Local smoke (2xH800 NVL dev host): - pytest --collect-only: 19 / 19 collected - 4 pure HTTP-mock adapter cases run directly in a verl clone: 4/4 PASS - All 19 case names verified unique across the 5 verl test files (no -k substring collision) The verl_setup fixture (gdrcopy/nvshmem/DeepEP build) needs root and will only run cleanly in CI containers; full validation depends on /bot run on 4xB200. Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>

coderabbitai · 2026-05-12T07:03:20Z

📝 Walkthrough

Walkthrough

The PR refactors the verl integration test infrastructure to enable fine-grained test waiving. Test wrapper functions now map 1-to-1 to individual pytest cases via a new _run_single helper, replacing coarse module-level wrappers. The verl repository pinning is updated, test lists expanded with all new granular cases, and obsolete waivers removed.

Changes

Verl Test Infrastructure Refactoring

Layer / File(s)	Summary
Test wrapper refactoring for fine-grained waiving `tests/integration/defs/verl/test_verl_cases.py`	Module docstring now documents 1-to-1 mapping between wrapper functions and individual pytest cases. Path constants are added for verl test file locations. New `_run_single(verl_file, case_name, timeout)` helper executes a single case via `-k` filtering. Three coarse wrappers (`test_async_server`, `test_adapter`, `test_rollout_utils`) are replaced by nineteen granular test functions covering placement groups, async generation, request methods, multimodal generation, wake/sleep cycles, inter-node rollout, and abort scenarios, with extended timeouts (900s) for slower multimodal and multi-node cases.
Configuration and test synchronization `tests/integration/defs/verl/verl_config.yml`, `tests/integration/test_lists/test-db/l0_verl.yml`, `tests/integration/test_lists/waives.txt`	Verl repository is pinned to commit `d324b01`. Integration test list (`l0_verl`) is expanded from three coarse tests to nineteen granular test cases matching the new wrapper structure. Three obsolete waive entries for the previous wrapper functions are removed.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 9.09% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely describes the main change: splitting verl tests from 3 file-level wrappers into 19 fine-grained per-case wrappers.
Description check	✅ Passed	The description includes all required sections (Summary, Why, Test plan) with clear explanations of what changed, why, and how it was verified.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/integration/defs/verl/test_verl_cases.py`:
- Around line 139-142: The wrapper _run_single currently uses "-k" which matches
substrings; change it to invoke pytest with the exact node id instead: build a
node id by joining the test file and case name (e.g. "{verl_file}::{case_name}"
or "{verl_file}::ClassName::test_name" when needed) and pass that as a
positional arg via _run_verl_test's extra_args instead of ["-k", case_name], so
pytest runs exactly that single test; update _run_single to construct and pass
the full node id to _run_verl_test.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 910098c5-8e32-40a3-b5e2-611f7480c33b

📥 Commits

Reviewing files that changed from the base of the PR and between 7bc328f and e0907d5.

📒 Files selected for processing (4)

tests/integration/defs/verl/test_verl_cases.py
tests/integration/defs/verl/verl_config.yml
tests/integration/test_lists/test-db/l0_verl.yml
tests/integration/test_lists/waives.txt

💤 Files with no reviewable changes (1)

tests/integration/test_lists/waives.txt

coderabbitai · 2026-05-12T07:03:23Z

+def _run_single(verl_file, case_name, timeout=600):
+    """Run exactly one verl pytest case by name."""
+    _run_verl_test(verl_file, extra_args=["-k", case_name], timeout=timeout)



⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use exact pytest node IDs instead of -k for single-case wrappers.

Line 141 uses -k, which can select multiple tests via substring/expression matching. That breaks the “exactly one case” guarantee and can silently reduce waive precision as upstream adds similarly named tests.

Suggested patch (exact-case execution)

def _run_verl_test(test_path, extra_args=None, timeout=600): """Run a test from the verl repo via subprocess.""" - full_path = os.path.join(VERL_ROOT, test_path) - assert os.path.exists(full_path), f"Verl test not found: {full_path}" - cmd = [sys.executable, "-m", "pytest", full_path, "-v", "--tb=short"] + test_file = test_path.split("::", 1)[0] + full_test_file = os.path.join(VERL_ROOT, test_file) + assert os.path.exists(full_test_file), f"Verl test not found: {full_test_file}" + node_id = os.path.join(VERL_ROOT, test_path) + cmd = [sys.executable, "-m", "pytest", node_id, "-v", "--tb=short"] if extra_args: cmd.extend(extra_args) @@ def _run_single(verl_file, case_name, timeout=600): """Run exactly one verl pytest case by name.""" - _run_verl_test(verl_file, extra_args=["-k", case_name], timeout=timeout) + _run_verl_test(f"{verl_file}::{case_name}", timeout=timeout)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/defs/verl/test_verl_cases.py` around lines 139 - 142, The wrapper _run_single currently uses "-k" which matches substrings; change it to invoke pytest with the exact node id instead: build a node id by joining the test file and case name (e.g. "{verl_file}::{case_name}" or "{verl_file}::ClassName::test_name" when needed) and pass that as a positional arg via _run_verl_test's extra_args instead of ["-k", case_name], so pytest runs exactly that single test; update _run_single to construct and pass the full node id to _run_verl_test.

Superjomn · 2026-05-12T07:17:47Z

/bot --help

github-actions · 2026-05-12T07:18:01Z

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

Details

run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental) --high-priority]

Launch build/test pipelines. All previously running jobs will be killed.

--reuse-test (optional)pipeline-id (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will be always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.

--disable-reuse-test (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-PyTorch-1, xxx" (OPTIONAL) : Only run the specified test stages. Supports wildcard * for pattern matching (e.g., "*PerfSanity*" matches all stages containing PerfSanity). Examples: "A10-PyTorch-1, xxx", "PerfSanity". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--test-backend "pytorch, cpp" (OPTIONAL) : Skip test stages which don't match the specified backends. Only support [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Supports wildcard * for pattern matching. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx", --extra-stage "Post-Merge".

--detailed-log (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.

--debug (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in the stage-list parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

--high-priority (OPTIONAL) : Run the pipeline with high priority. This option is restricted to authorized users only and will route the job to a high-priority queue.

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

Superjomn · 2026-05-12T07:18:57Z

/bot run --extra-stage "DGX_B200-4_GPUs-AutoDeploy-1"

tensorrt-cicd · 2026-05-12T07:24:30Z

PR_Github #47921 [ run ] triggered by Bot. Commit: e0907d5 Link to invocation

tensorrt-cicd · 2026-05-12T12:20:56Z

PR_Github #47921 [ run ] completed with state SUCCESS. Commit: e0907d5
/LLM/main/L0_MergeRequest_PR pipeline #37768 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

github-actions Bot assigned Superjomn May 12, 2026

coderabbitai Bot reviewed May 12, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[None][test] Split verl tests into 19 fine-grained per-case wrappers#14037

[None][test] Split verl tests into 19 fine-grained per-case wrappers#14037
Superjomn wants to merge 1 commit into
NVIDIA:mainfrom
Superjomn:fine-grained-verl-test

Superjomn commented May 12, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 12, 2026

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 12, 2026

Uh oh!

Superjomn commented May 12, 2026

Uh oh!

github-actions Bot commented May 12, 2026

kill

skip

reuse-pipeline

Uh oh!

Superjomn commented May 12, 2026

Uh oh!

tensorrt-cicd commented May 12, 2026

Uh oh!

tensorrt-cicd commented May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Superjomn commented May 12, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

Test plan

Notes

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 12, 2026

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 12, 2026

Choose a reason for hiding this comment

Uh oh!

Superjomn commented May 12, 2026

Uh oh!

github-actions Bot commented May 12, 2026

GitHub Bot Help

kill

skip

reuse-pipeline

Uh oh!

Superjomn commented May 12, 2026

Uh oh!

tensorrt-cicd commented May 12, 2026

Uh oh!

tensorrt-cicd commented May 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Superjomn commented May 12, 2026 •

edited by coderabbitai Bot

Loading