
[TRTLLM-12627][ci] Narrow tensorrt_llm/serve/ MGPU trigger to disagg-only files#14022

Open
QiJune wants to merge 1 commit into NVIDIA:main from QiJune:serve-mg

Conversation

@QiJune
Collaborator

@QiJune QiJune commented May 12, 2026

Summary

  • Replace the blanket tensorrt_llm/serve/ directory pattern in getMultiGpuFileChanged's relatedFileList with the 8 files under serve/ that have active disagg / multi-process integration (verified by grepping for from tensorrt_llm.llmapi.disagg_utils import ...).
  • Drops the trigger for tool parsers, benchmark scripts, and multimodal / protocol / response-handler code under serve/, none of which has multi-rank semantics.
  • Files kept: openai_server.py, openai_disagg_server.py, openai_disagg_service.py, router.py, openai_client.py, cluster_storage.py, disagg_auto_scaling.py, metadata_server.py.

Why

tensorrt_llm/serve/ is the highest-volume broad pattern in the multi-GPU rule over the last 12 weeks: 53 commits hit it, 19 of them solo-via-serve (no other pattern matched). Of those 19 solo commits, only 3 touched genuinely multi-process code paths (two openai_client.py "fix lost requests" fixes and one router.py threading optimization). The other 16 were tool parsers, benchmark clients, multimodal API, harmony parsers, and PNG encoding: all single-process work that does not need MGPU coverage.

openai_server.py is in the keep list because, despite being mostly HTTP-layer code, it owns the disagg cluster worker lifecycle (register / heartbeat / deregister via DisaggClusterWorker). Dropping it would create a real coverage gap for disagg cluster regressions.
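Schematically, that lifecycle looks like the sketch below. This is illustrative only: the class name, the dict-backed storage, and the method names are stand-ins, not the actual DisaggClusterWorker API.

```python
import threading
import time


class ClusterWorkerSketch:
    """Illustrative register/heartbeat/deregister lifecycle, loosely modeled
    on what openai_server.py owns for disagg clusters (not the real API)."""

    def __init__(self, storage: dict, worker_id: str, interval: float = 5.0):
        self.storage = storage          # stand-in for cluster storage
        self.worker_id = worker_id
        self.interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._heartbeat_loop, daemon=True)

    def start(self) -> None:
        self.storage[self.worker_id] = time.time()       # register
        self._thread.start()

    def _heartbeat_loop(self) -> None:
        while not self._stop.wait(self.interval):
            self.storage[self.worker_id] = time.time()   # heartbeat

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()
        self.storage.pop(self.worker_id, None)           # deregister
```

If openai_server.py were dropped from the trigger list, a regression in any of these three steps could merge without MGPU coverage.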

Measured impact (12 weeks, 1017 commits)

| Metric                 | Before | After  | Δ        |
|------------------------|--------|--------|----------|
| Solo-via-serve commits | 19     | 10     | −9       |
| Total MGPU triggers    | 346    | 337    | −9       |
| MGPU hit rate          | 34.0 % | 33.1 % | −0.88 pp |
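The deltas above are straightforward to sanity-check against the 1017-commit window:

```python
# Sanity-check the measured-impact numbers (12 weeks, 1017 commits).
TOTAL_COMMITS = 1017
before_triggers, after_triggers = 346, 337
before_solo, after_solo = 19, 10

# Both deltas come from the same 9 dropped solo-via-serve commits.
assert before_triggers - after_triggers == 9
assert before_solo - after_solo == 9

rate_before = 100 * before_triggers / TOTAL_COMMITS  # ~34.0
rate_after = 100 * after_triggers / TOTAL_COMMITS    # ~33.1
print(f"hit rate: {rate_before:.1f}% -> {rate_after:.1f}% "
      f"({rate_after - rate_before:+.2f} pp)")
```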

All 3 historical disagg-relevant solo commits still trigger MGPU under the narrow set (they touch router.py or openai_client.py). Zero measured loss of true-positive coverage.

Coverage preservation argument

A pattern is removable when (a) the dropped files lack multi-rank semantics, and (b) any commit that would have legitimately needed MGPU coverage still matches via some other pattern or the narrow serve subset.
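Expressed as code, the removability criterion is a predicate over the trigger patterns and the historical commits that genuinely needed MGPU coverage. A minimal sketch with hypothetical helper names (the real matching lives in getMultiGpuFileChanged in jenkins/L0_MergeRequest.groovy):

```python
from fnmatch import fnmatch


def matches_any(path: str, patterns: list[str]) -> bool:
    """True if a changed file hits any MGPU trigger pattern (path or glob)."""
    return any(fnmatch(path, p) for p in patterns)


def pattern_is_removable(pattern: str,
                         all_patterns: list[str],
                         mgpu_relevant_commits: list[list[str]]) -> bool:
    """Removable iff every commit that legitimately needed MGPU coverage
    still matches via the remaining patterns."""
    remaining = [p for p in all_patterns if p != pattern]
    return all(
        any(matches_any(f, remaining) for f in files)
        for files in mgpu_relevant_commits
    )


# The 3 historical disagg-relevant solo commits touched router.py or
# openai_client.py, both of which stay in the narrow keep list.
narrow = ["tensorrt_llm/serve/router.py", "tensorrt_llm/serve/openai_client.py"]
patterns = ["tensorrt_llm/serve/*"] + narrow
relevant = [["tensorrt_llm/serve/openai_client.py"],
            ["tensorrt_llm/serve/openai_client.py"],
            ["tensorrt_llm/serve/router.py"]]
print(pattern_is_removable("tensorrt_llm/serve/*", patterns, relevant))  # True
```

Dropped files no longer match: matches_any("tensorrt_llm/serve/tool_parser/x.py", narrow) is False, which is exactly the false-positive reduction being claimed.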

| Dropped sub-bucket | 12w hits | 12w solo | Why no MGPU |
|--------------------|----------|----------|-------------|
| tool_parser/ (16 files) | 9 | 3 | Pure string parsing |
| scripts/ (7 files) | 7 | 2 | Client-side benchmark scripts |
| Server API non-disagg (e.g., openai_protocol.py, chat_utils.py, harmony_adapter.py, postprocess_handlers.py, resource_governor.py, visual_gen_utils.py) | 14 | 3 | Single-process API / response handling |

Caveats

  • Future risk: if a maintainer adds a new disagg-related file under serve/, it will not auto-trigger MGPU until added to this list. The blanket prefix today is "automatic" but indiscriminate; the narrow list trades that automation for false-positive reduction. All current disagg / cluster files (verified by disagg_utils import grep) are in the keep list.
  • perf_metrics.py also imports ServerRole from disagg_utils but uses it only as a metric label — no orchestration. Left out intentionally to maintain the "active multi-process integration" criterion.
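The disagg_utils-import grep mentioned above can be reproduced with a short script. Here disagg_importers is a hypothetical helper, and the directory argument is an assumption about the repo layout:

```python
import re
from pathlib import Path

# Matches "from tensorrt_llm.llmapi.disagg_utils import ..." at any indent.
IMPORT_RE = re.compile(r"from\s+tensorrt_llm\.llmapi\.disagg_utils\s+import")


def disagg_importers(serve_dir: str) -> list[str]:
    """Names of serve/ modules that import disagg_utils (the keep-list grep)."""
    return sorted(
        p.name for p in Path(serve_dir).glob("*.py")
        if IMPORT_RE.search(p.read_text(encoding="utf-8", errors="ignore"))
    )
```

Note the results still need manual triage: perf_metrics.py would show up in this grep even though its ServerRole import is only a metric label, which is why it is excluded from the keep list above.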

Summary by CodeRabbit

  • Chores
    • Updated CI/CD pipeline configuration to expand multi-GPU testing triggers for additional serving module files.

Review Change Stack

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
@QiJune QiJune requested review from a team as code owners May 12, 2026 02:56
@QiJune QiJune requested review from dpitman-nvda and yiqingy0 May 12, 2026 02:56
@coderabbitai
Contributor

coderabbitai Bot commented May 12, 2026

📝 Walkthrough

Walkthrough

This PR extends the Jenkins CI pipeline's multi-GPU test trigger detection to include eight additional serve-related Python modules from the tensorrt_llm package. Changes to cluster storage, auto-scaling, metadata, OpenAI client, server, and router implementations will now automatically trigger multi-GPU test execution.

Changes

Multi-GPU Test Trigger Expansion

| Layer / File(s) | Summary |
|-----------------|---------|
| Multi-GPU test trigger files (jenkins/L0_MergeRequest.groovy) | The getMultiGpuFileChanged function's relatedFileList is expanded to recognize eight additional tensorrt_llm/serve/*.py modules (cluster_storage.py, disagg_auto_scaling.py, metadata_server.py, openai_client.py, openai_disagg_server.py, openai_disagg_service.py, openai_server.py, router.py) as multi-GPU related, broadening when the pipeline forces multi-GPU test execution based on PR file diffs. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|------------|--------|-------------|
| Title check | ✅ Passed | The title clearly and specifically describes the main change: narrowing the MGPU trigger scope for tensorrt_llm/serve/ files to disagg-only files, with proper ticket reference and CI type prefix. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | PR description comprehensively explains the change, rationale, and impact with detailed analysis and caveats. |


Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
jenkins/L0_MergeRequest.groovy (1)

941-948: ⚡ Quick win

Add an inline maintenance note for this curated serve allowlist.

Given this is intentionally selective, a short comment here about “update this list when new disagg/multi-process serve files are added” would reduce future missed-trigger risk.

💡 Suggested diff
         "tensorrt_llm/parameter.py",
+        // Keep this serve allowlist in sync with disagg/multi-process integration.
+        // If a new serve file participates in disagg flows, add it here.
         "tensorrt_llm/serve/cluster_storage.py",
         "tensorrt_llm/serve/disagg_auto_scaling.py",
         "tensorrt_llm/serve/metadata_server.py",
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@jenkins/L0_MergeRequest.groovy` around lines 941 - 948, Add a short inline
maintenance comment immediately above this curated serve allowlist (the block
containing "tensorrt_llm/serve/cluster_storage.py",
"tensorrt_llm/serve/disagg_auto_scaling.py",
"tensorrt_llm/serve/metadata_server.py", "tensorrt_llm/serve/openai_client.py",
"tensorrt_llm/serve/openai_disagg_server.py",
"tensorrt_llm/serve/openai_disagg_service.py",
"tensorrt_llm/serve/openai_server.py", "tensorrt_llm/serve/router.py")
indicating that the list is intentionally selective and must be updated whenever
new disagg or multi-process serve files are added to avoid missed CI triggers;
keep the note brief and actionable so reviewers and future editors see it when
maintaining the allowlist.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9e888c85-0c80-450f-94a3-bf41721e7063

📥 Commits

Reviewing files that changed from the base of the PR and between 64260ba and dcdb288.

📒 Files selected for processing (1)
  • jenkins/L0_MergeRequest.groovy

Collaborator

@JunyiXu-nv JunyiXu-nv left a comment


LGTM

