Skip to content

[#13697][feat] Add model YAML defaults loader#13898

Open
guan404ming wants to merge 1 commit intoNVIDIA:mainfrom
guan404ming:feat/model-config-yaml-loader
Open

[#13697][feat] Add model YAML defaults loader#13898
guan404ming wants to merge 1 commit intoNVIDIA:mainfrom
guan404ming:feat/model-config-yaml-loader

Conversation

@guan404ming
Copy link
Copy Markdown

@guan404ming guan404ming commented May 8, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced model configuration loading to merge both architecture-level YAML defaults and model-provided defaults for more comprehensive configuration.
  • Tests

    • Added comprehensive test coverage for model configuration loading and merging functionality.

Description

related to #13697

I add a shared loader for per-model default YAML configs.

  • support loading <model>.yaml defaults for PyTorch flow and adds the merge behavior needed for a follow-up AutoDeploy PR to load <model>_ad_default.yaml before <model>.yaml.
  • YAML lookup currently uses the HF architectures[0] name as the model key.

Test Coverage

Added unit tests for:

  • missing config files
  • <model>.yaml loading
  • optional <model>_ad_default.yaml loading
  • merge precedence
  • nested dict merge behavior
  • invalid top-level YAML types

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: Guan-Ming (Wesley) Chiu <105915352+guan404ming@users.noreply.github.com>
@svc-trtllm-gh-bot svc-trtllm-gh-bot added the Community want to contribute PRs initiated from Community label May 8, 2026
@guan404ming guan404ming marked this pull request as ready for review May 8, 2026 09:42
@guan404ming guan404ming requested review from a team as code owners May 8, 2026 09:42
@guan404ming guan404ming requested a review from syuoni May 8, 2026 09:42
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 8, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

This PR introduces a new YAML-based model configuration loader that computes merged defaults from architecture-level YAML files and optional model-code-provided defaults. The loader is integrated into model initialization and includes comprehensive test coverage for validation, precedence, and error cases.

Changes

Model Configuration Defaults Loading

Layer / File(s) Summary
YAML Config Loading Infrastructure
tensorrt_llm/llmapi/model_config_loader.py
New module exports DEFAULT_MODEL_CONFIGS_DIR, load_model_defaults() to conditionally load and merge <arch>.yaml and optional <arch>_ad_default.yaml files across search directories, and merge_model_defaults() to deep-merge arbitrary dictionary layers with right-side precedence. Includes _load_yaml_if_exists() helper that validates YAML content is a mapping and treats missing/empty files as empty dicts.
Model Loader Integration
tensorrt_llm/_torch/pyexecutor/model_loader.py
load_config_and_apply_defaults() now extracts architecture name from config.pretrained_config.architectures, loads YAML defaults via load_model_defaults(), retrieves code defaults from model_cls.get_model_defaults() when available, merges both sets via merge_model_defaults(), and applies the merged result while logging applied defaults.
Tests and Validation
tests/unittest/llmapi/test_model_config_loader.py
Tests verify empty-file and missing-file handling, conditional include_ad_default layer inclusion, YAML-over-AD-default precedence, deep-merge behavior for nested dicts, search-directory override ordering, DEFAULT_MODEL_CONFIGS_DIR path resolution, empty YAML interpretation, non-mapping YAML error handling, merge_model_defaults right-wins semantics, and graceful handling of missing search directories.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the main feature being added: a model YAML defaults loader with proper type and ticket reference.
Description check ✅ Passed The description explains the motivation and implementation details, includes test coverage information, but lacks explicit clarity on specific sections and leaves the PR checklist unchecked.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/unittest/llmapi/test_model_config_loader.py (1)

33-162: QA list updates are unnecessary for this PR scope.

This PR adds/updates unit tests only; no integration test-list entry under tests/integration/test_lists/qa/ is needed.

As per coding guidelines: “If the PR only touches unittest/ or narrow unit scope, say explicitly whether QA list updates are unnecessary or optional.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unittest/llmapi/test_model_config_loader.py` around lines 33 - 162,
Update the PR metadata to explicitly state QA list updates are unnecessary for
this change: add a short note to the PR description (or commit message) saying
this PR only modifies unit tests (e.g. references
test_returns_empty_when_no_files /
test_merge_model_defaults_right_wins_and_skips_empty in
tests/unittest/llmapi/test_model_config_loader.py) and therefore no integration
QA list entry under tests/integration/test_lists/qa/ is required. Keep the note
concise and place it in the PR description near the changelog or QA checklist so
reviewers see the justification immediately.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tensorrt_llm/llmapi/model_config_loader.py`:
- Around line 55-71: Sanitize the incoming arch string before using it to
construct file paths: in the code that checks if not arch and then iterates
search_dirs, validate/normalize arch (e.g., ensure it matches a safe pattern
like r'^[A-Za-z0-9_.-]+$' or use Path(arch).name and reject if arch contains
path separators or '..'); if invalid, return {} or raise ValueError. Then use
the sanitized name (e.g., sanitized_arch) when calling _load_yaml_if_exists for
f"{sanitized_arch}.yaml" and f"{sanitized_arch}_ad_default.yaml" so
_load_yaml_if_exists and _deep_merge only operate on safe filenames.

---

Nitpick comments:
In `@tests/unittest/llmapi/test_model_config_loader.py`:
- Around line 33-162: Update the PR metadata to explicitly state QA list updates
are unnecessary for this change: add a short note to the PR description (or
commit message) saying this PR only modifies unit tests (e.g. references
test_returns_empty_when_no_files /
test_merge_model_defaults_right_wins_and_skips_empty in
tests/unittest/llmapi/test_model_config_loader.py) and therefore no integration
QA list entry under tests/integration/test_lists/qa/ is required. Keep the note
concise and place it in the PR description near the changelog or QA checklist so
reviewers see the justification immediately.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4f5be979-12a5-4eb9-a71a-49786e665ea0

📥 Commits

Reviewing files that changed from the base of the PR and between 2e4b05c and 4d71474.

📒 Files selected for processing (3)
  • tensorrt_llm/_torch/pyexecutor/model_loader.py
  • tensorrt_llm/llmapi/model_config_loader.py
  • tests/unittest/llmapi/test_model_config_loader.py

Comment on lines +55 to +71
if not arch:
return {}

if search_dirs is None:
search_dirs = (DEFAULT_MODEL_CONFIGS_DIR,)

merged: Dict[str, Any] = {}
for directory in search_dirs:
directory = Path(directory)
if include_ad_default:
ad_layer = _load_yaml_if_exists(directory / f"{arch}_ad_default.yaml")
if ad_layer:
merged = _deep_merge(merged, ad_layer)

model_layer = _load_yaml_if_exists(directory / f"{arch}.yaml")
if model_layer:
merged = _deep_merge(merged, model_layer)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sanitize arch before building YAML file paths to prevent path traversal.

arch is used directly in f"{arch}.yaml" / f"{arch}_ad_default.yaml". A malicious value containing path separators or .. can make the loader read files outside intended config directories.

🔒 Proposed hardening patch
+import re
 from pathlib import Path
 from typing import Any, Dict, Optional, Sequence
@@
 DEFAULT_MODEL_CONFIGS_DIR: Path = Path(__file__).resolve().parent.parent / "model_configs"
+_ARCH_NAME_PATTERN = re.compile(r"^[A-Za-z0-9_]+$")
@@
 def load_model_defaults(
@@
-    if not arch:
+    if not arch:
         return {}
+    if not _ARCH_NAME_PATTERN.fullmatch(arch):
+        raise ValueError(f"Invalid architecture name for model defaults: {arch!r}")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tensorrt_llm/llmapi/model_config_loader.py` around lines 55 - 71, Sanitize
the incoming arch string before using it to construct file paths: in the code
that checks if not arch and then iterates search_dirs, validate/normalize arch
(e.g., ensure it matches a safe pattern like r'^[A-Za-z0-9_.-]+$' or use
Path(arch).name and reject if arch contains path separators or '..'); if
invalid, return {} or raise ValueError. Then use the sanitized name (e.g.,
sanitized_arch) when calling _load_yaml_if_exists for f"{sanitized_arch}.yaml"
and f"{sanitized_arch}_ad_default.yaml" so _load_yaml_if_exists and _deep_merge
only operate on safe filenames.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Community want to contribute PRs initiated from Community

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants