[None][fix] Fix and unwaive AutoDeploy accuracy tests #13925
bmarimuthu-nv wants to merge 3 commits into NVIDIA:main from
Conversation
Signed-off-by: Balamurugan Marimuthu <246387390+bmarimuthu-nv@users.noreply.github.com>
Force-pushed from 62866f2 to 107756e
@coderabbitai summary
✅ Actions performed: Summary regeneration triggered.
/bot run --stage-list "A10-Build_Docs, A10-PackageSanityCheck-PY310-UB2204, A100X-PackageSanityCheck-PY312-UB2404, A30-AutoDeploy-1, H100_PCIe-AutoDeploy-1, DGX_B200-AutoDeploy-1, A100X-PyTorch-1, DGX_H100-4_GPUs-AutoDeploy-1, DGX_B200-4_GPUs-AutoDeploy-1, DGX_H100-4_GPUs-AutoDeploy-Post-Merge-1, DGX_B200-8_GPUs-AutoDeploy-Post-Merge-1" --disable-fail-fast
📝 Walkthrough
This PR contains three independent model and test updates: Gemma4 switches from a custom tokenizer wrapper to AutoTokenizer.
Changes
- Gemma4 AutoTokenizer Migration
- Qwen3.5 MoE Token-Type Field Cleanup
- Test Reference and Configuration Updates
🎯 Review effort: 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ❌ 2 failed (2 warnings) | ✅ 3 passed
- accuracy: 93.75
- accuracy: 89.045
- quant_algo: FP8_BLOCK_SCALES
  accuracy: 93.75
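The lines above come from the accuracy-reference file under review (`tests/integration/defs/accuracy/references/gsm8k.yaml`). For orientation only, a reference entry in that style appears to pair an accuracy threshold with an optional `quant_algo`; the following is a hypothetical sketch inferred from the diff, with a made-up model key:

```yaml
# Hypothetical shape of a gsm8k.yaml reference entry, inferred from the
# diff above; the model name is invented for illustration.
example-org/example-model:
  - accuracy: 93.75            # plain reference entry (no quant_algo)
  - quant_algo: FP8_BLOCK_SCALES
    accuracy: 93.75            # reference for a quantized variant
```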
No, this was registered by the PyTorch backend; the AD flow doesn't use the quant_algo row.
The difference could be due to some setup changes, I think.
PR_Github #47450 [ run ] triggered by Bot. Commit:
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tensorrt_llm/_torch/auto_deploy/models/custom/modeling_gemma4.py (1)
2291-2299: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Update the transformers requirement to ≥5.5.0 or add a version guard before the AutoTokenizer.from_pretrained() calls.
The code uses AutoTokenizer.from_pretrained() at lines 2296 and 2635 to load Gemma4 tokenizers, but the repo is pinned to transformers==5.3.0. Gemma4 was introduced in Transformers v5.5.0; attempting to load it with 5.3.0 will fail with "Transformers does not recognize this architecture." Either update requirements.txt to transformers>=5.5.0 throughout the repo, or add a version check (similar to the pattern in transformers_causal_mask.py) to guard these calls and fail fast with a clear error message if the requirement is not met. Also applies to: 2632-2635
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tensorrt_llm/_torch/auto_deploy/models/custom/modeling_gemma4.py` around lines 2291 - 2299, The Gemma4 tokenizer calls in ADGemma4Processor.from_pretrained (and the similar call near the end of modeling_gemma4.py) rely on Transformers >=5.5.0; either update the repo requirement to transformers>=5.5.0 or add a runtime version guard before calling AutoTokenizer.from_pretrained() that checks transformers.__version__ (or uses packaging.version.parse) and raises a clear, fast-failing error indicating the minimum version required. Locate the two AutoTokenizer.from_pretrained usages in modeling_gemma4.py (inside ADGemma4Processor.from_pretrained and the other Gemma4 loader) and implement the version check there so the code never calls AutoTokenizer when the installed transformers is <5.5.0.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4e8b3ad0-f423-48b0-a7ea-a9494d07e7bc
📒 Files selected for processing (6)
- tensorrt_llm/_torch/auto_deploy/models/custom/modeling_gemma4.py
- tensorrt_llm/_torch/auto_deploy/models/custom/modeling_qwen3_5_moe.py
- tensorrt_llm/_torch/auto_deploy/models/custom/modeling_qwen3_5_moe_ir.py
- tests/integration/defs/accuracy/references/gsm8k.yaml
- tests/integration/defs/accuracy/test_llm_api_autodeploy.py
- tests/integration/test_lists/waives.txt
💤 Files with no reviewable changes (1)
- tests/integration/test_lists/waives.txt
PR_Github #47450 [ run ] completed with state
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Description
Test Coverage
PR Checklist
Please review the following before submitting your PR:
- PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
- PR follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
- Test cases are provided for new code paths (see test instructions).
- Any new dependencies have been scanned for license and vulnerabilities.
- CODEOWNERS updated if ownership changes.
- Documentation updated as needed.
- Tava architecture diagram updated if there is a significant design change in the PR.
- The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment `/bot help`.