[https://nvbugs/6163033][fix] Guard q_a_proj.weight dict access behind nvfp4_fused_a; update test to `checkpoint_format="mistral_large_3"` #14033
Conversation
Three issues blocked `TestMistralLarge3_675B::test_fp8[latency_moe_deepgemm]`:

1. In `modeling_deepseekv3.py`, the NVFP4-fused-A gate used `&=`, which is not short-circuited on bools. For FP8 checkpoints, `has_nvfp4()` is False and `nvfp4_fused_a` is already False via the earlier `and`, but the `&=` still dereferenced `weights[".../q_a_proj.weight"]`, raising `KeyError`. Guard the dict access behind `nvfp4_fused_a` with an `if`.
2. The FP8 variant of the Mistral Large 3 test used `checkpoint_format="mistral"` while the NVFP4 sibling uses `"mistral_large_3"`. With `"mistral"`, the weight mapper resolves to `MistralWeightMapper` (no MLA renames for `wq_a`, `wkv_a_with_mqa`, etc.), so the renamed keys the DeepSeek loader expects never exist. Align the test with its sibling.
3. `MistralCommonImageProcessor.__call__` required `images` positionally and always wrapped the prompt in a multimodal chat conversation. Plain-text callers (the `text_processor` path and accuracy evaluators like MMLU/GSM8K) must not apply a chat template over raw continuation prompts. Make `images` default to `None`, and when absent, tokenize directly via the underlying transformers tokenizer.

Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com>
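The first issue can be reproduced in isolation. Below is a minimal sketch (the dict contents and key names are illustrative, not the real checkpoint layout) of why the `&=` gate failed and how the `if` guard fixes it:

```python
# FP8 checkpoint: the fused-A key is absent (illustrative values only).
weights = {"other.weight": "fp8"}

nvfp4_fused_a = False  # already False via the earlier `and` chain

# Buggy pattern: `&=` is bitwise, so the right-hand side is evaluated
# unconditionally and the missing key raises KeyError.
try:
    nvfp4_fused_a &= (weights["model.q_a_proj.weight"] == "nvfp4")
except KeyError as exc:
    print("KeyError despite the flag already being False:", exc)

# Fixed pattern: guard the dict access behind the flag with `if`,
# restoring short-circuit behavior.
if nvfp4_fused_a:
    nvfp4_fused_a = weights["model.q_a_proj.weight"] == "nvfp4"
print(nvfp4_fused_a)  # the flag stays False and no exception is raised
```

The same pitfall applies to `|=` on bools: unlike `and`/`or`, the augmented bitwise operators always evaluate their right-hand side.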
📝 Walkthrough

This PR updates model loading and input processing across the DeepSeek v3 and Mistral implementations. DeepSeek v3 FP4 dtype validation is guarded to avoid accessing weights when FP4 fusion is disabled. Mistral's image processor now supports text-only inputs by making `images` optional. A test checkpoint format is corrected for the Mistral Large 3 variant.

Changes: Model Loading and Input Processing Updates
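The checkpoint-format correction can be illustrated with a toy mapper-selection sketch (the rename table and `map_weights` function here are hypothetical stand-ins for TensorRT-LLM's weight-mapper registry):

```python
# Hypothetical MLA rename table: only the Mistral Large 3 mapper applies it.
MLA_RENAMES = {"wq_a": "q_a_proj", "wkv_a_with_mqa": "kv_a_proj_with_mqa"}

def map_weights(checkpoint_format: str, weights: dict) -> dict:
    """Toy stand-in for checkpoint_format-based weight-mapper selection."""
    if checkpoint_format == "mistral_large_3":
        return {MLA_RENAMES.get(k, k): v for k, v in weights.items()}
    # Plain "mistral" mapper: keys pass through unchanged, so the renamed
    # keys the DeepSeek loader expects never exist.
    return dict(weights)

ckpt = {"wq_a": 1, "wkv_a_with_mqa": 2}
assert "q_a_proj" in map_weights("mistral_large_3", ckpt)
assert "q_a_proj" not in map_weights("mistral", ckpt)  # loader's key missing
```

This is why the FP8 test had to switch to `checkpoint_format="mistral_large_3"`: with `"mistral"` the loader looks up keys the mapper never produced.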
🧹 Nitpick comments (1)
tensorrt_llm/_torch/models/modeling_mistral.py (1)
312-312: ⚡ Quick win: add type annotations to the `__call__` signature.

The updated method is untyped; please annotate `text`, `images`, and the return type to keep this path consistent with the project's typing rules.

Proposed patch:

```diff
- def __call__(self, text, images=None, **kwargs):
+ def __call__(
+     self,
+     text: str,
+     images: list[Image.Image] | None = None,
+     **kwargs,
+ ) -> dict[str, torch.Tensor]:
```

As per coding guidelines: "Python code should use type annotations for all function arguments and return types; return type `None` if function does not return."
Noneif function does not return".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tensorrt_llm/_torch/models/modeling_mistral.py` at line 312, The __call__ method currently lacks type annotations; update its signature in modeling_mistral.py to annotate the parameters and return type (e.g., text: str | Sequence[str], images: Optional[Any] = None, -> Any) or, if it returns nothing, -> None; import or use typing primitives (Optional, Sequence, Any) or more specific project types (e.g., torch.Tensor, PIL.Image.Image) as appropriate so the signature reads something like def __call__(self, text: str | Sequence[str], images: Optional[Any] = None, **kwargs) -> Any and adjust imports accordingly to satisfy the project's typing rules.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: bf7162af-00ae-4fee-b792-e5f82248efe8
📒 Files selected for processing (3)
- tensorrt_llm/_torch/models/modeling_deepseekv3.py
- tensorrt_llm/_torch/models/modeling_mistral.py
- tests/integration/defs/accuracy/test_llm_api_pytorch.py
Summary
Problems: the `&=` gate probed `q_a_proj.weight` even when the flag was already False; the test used `checkpoint_format="mistral"`, so the wrong weight mapper was selected and MLA keys were never renamed; `MistralCommonImageProcessor.__call__` required `images` positionally and always applied a chat template over plain-text prompts.

Fixes: guard the `q_a_proj.weight` dict access behind `nvfp4_fused_a`; update the test to `checkpoint_format="mistral_large_3"`; make `images` default to `None` and tokenize directly when there are no images.

Test plan
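The image-processor fix can be sketched as follows (the class and tokenizer below are toy stand-ins, not the real `MistralCommonImageProcessor` implementation):

```python
class ImageProcessorSketch:
    """Toy processor: text-only calls skip the chat template entirely."""

    def __init__(self, tokenizer):
        self.tokenizer = tokenizer  # stands in for the transformers tokenizer

    def __call__(self, text, images=None, **kwargs):
        if images is None:
            # Plain-text path (text_processor, MMLU/GSM8K evaluators):
            # tokenize the raw continuation prompt directly.
            return {"input_ids": self.tokenizer.encode(text)}
        # Multimodal path: wrap prompt and images in a chat conversation
        # before applying the chat template (details elided in this sketch).
        return {"conversation": [{"role": "user", "content": [text, *images]}]}


class ToyTokenizer:
    def encode(self, text):
        return [ord(c) for c in text]  # trivial stand-in encoding


proc = ImageProcessorSketch(ToyTokenizer())
print(proc("hi"))  # {'input_ids': [104, 105]} — no chat template applied
```

Making `images` keyword-optional keeps the multimodal path unchanged while letting existing text-only callers pass a bare prompt.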