
[https://nvbugs/6163033][fix] Guard q_a_proj.weight dict access behind nvfp4_fused_a; update test to `chec#14033

Open
tensorrt-cicd wants to merge 1 commit into NVIDIA:main from tensorrt-cicd:repair-bot-bug6163033

Conversation

@tensorrt-cicd
Collaborator

@tensorrt-cicd tensorrt-cicd commented May 12, 2026

Summary

  • Root cause: a non-short-circuiting `&=` probed `q_a_proj.weight`; the test used `checkpoint_format="mistral"`, so the wrong weight mapper was selected and the MLA keys were never renamed; `MistralCommonImageProcessor.__call__` required `images` positionally and always applied a chat template to plain-text prompts
  • Fix: guard the `q_a_proj.weight` dict access behind `nvfp4_fused_a`; update the test to `checkpoint_format="mistral_large_3"`; make `images` default to `None` and tokenize directly when there are no images
  • Automated fix generated by repair-bot

Test plan

  • Verify fix on the same GPU type as the original failure
  • Check for regressions in related tests

Links

Summary by CodeRabbit

Release Notes

  • New Features

    • Mistral model now supports text-only input processing without requiring images.
  • Bug Fixes

    • Improved DeepSeekV3 FP4-fused loading by refining validation conditions and control flow logic.
  • Tests

    • Updated Mistral checkpoint format in accuracy test suite.

Review Change Stack

…ut issues

Three issues blocked TestMistralLarge3_675B::test_fp8[latency_moe_deepgemm]:

1. In modeling_deepseekv3.py, the NVFP4-fused-A gate used `&=`, which does not
   short-circuit on bools. For FP8 checkpoints, `has_nvfp4()` is False and
   `nvfp4_fused_a` is already False via the earlier `and`, but the `&=` still
   dereferenced `weights[".../q_a_proj.weight"]`, raising a KeyError. Guard the
   dict access behind `nvfp4_fused_a` with an `if`.
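   The failure mode and the fix can be sketched in a few lines. This is a
   simplified standalone illustration, not the actual loader code: the dict
   key and the surrounding flag logic are reduced to the minimum needed to
   show why `&=` evaluates its right-hand side even when the flag is already
   False.

   ```python
   # FP8 checkpoint: no q_a_proj key was ever written.
   weights = {"model.layers.0.mlp.up_proj.weight": object()}

   has_nvfp4 = False
   is_lite = False
   nvfp4_fused_a = has_nvfp4 and not is_lite  # already False

   # Buggy pattern: `&=` is eager, so the dict access runs regardless
   # of the flag and raises KeyError on FP8 checkpoints.
   buggy_raised = False
   try:
       nvfp4_fused_a &= (
           weights["model.layers.0.self_attn.q_a_proj.weight"] is not None
       )
   except KeyError:
       buggy_raised = True

   # Fixed pattern: an `if` short-circuits, so the dict is only probed
   # when NVFP4 fusion is actually in play.
   if nvfp4_fused_a:
       nvfp4_fused_a = (
           weights["model.layers.0.self_attn.q_a_proj.weight"] is not None
       )

   print(buggy_raised, nvfp4_fused_a)  # True False
   ```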

2. The FP8 variant of the Mistral Large 3 test used `checkpoint_format="mistral"`
   while the NVFP4 sibling uses `"mistral_large_3"`. With "mistral" the weight
   mapper resolves to `MistralWeightMapper` (no MLA renames for `wq_a`,
   `wkv_a_with_mqa`, etc.), so the renamed keys the DeepSeek loader expects
   never exist. Align the test with its sibling.
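   To see why the format string matters, consider a toy rename pass. The
   rename tables below are invented for this sketch (the real mappings live
   in the project's weight-mapper classes); the point is only that with an
   empty rename table the keys the DeepSeek loader expects never materialize.

   ```python
   # Hypothetical rename tables for illustration only.
   mistral_renames = {}  # MistralWeightMapper: no MLA renames
   mistral_large_3_renames = {
       "wq_a": "q_a_proj",
       "wkv_a_with_mqa": "kv_a_proj_with_mqa",
   }

   def remap(keys, renames):
       """Apply every old->new substring rename to each checkpoint key."""
       out = []
       for key in keys:
           for old, new in renames.items():
               key = key.replace(old, new)
           out.append(key)
       return out

   checkpoint_keys = ["layers.0.attention.wq_a.weight"]
   print(remap(checkpoint_keys, mistral_renames))          # keys unchanged
   print(remap(checkpoint_keys, mistral_large_3_renames))  # MLA-renamed keys
   ```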

3. `MistralCommonImageProcessor.__call__` required `images` positionally and
   always wrapped the prompt in a multimodal chat conversation. Plain-text
   callers (the `text_processor` path and accuracy evaluators like MMLU/GSM8K)
   must not apply a chat template over raw continuation prompts. Make `images`
   default to None, and when absent tokenize directly via the underlying
   transformers tokenizer.
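   The fixed control flow can be sketched as below. This is a minimal
   stand-in class, assuming a Hugging Face style tokenizer with an
   `encode` method; the real processor's state, return types, and
   multimodal path differ.

   ```python
   class MistralCommonImageProcessorSketch:
       """Toy sketch of the text-only fast path; not the real class."""

       def __init__(self, tokenizer):
           self.tokenizer = tokenizer

       def __call__(self, text, images=None, **kwargs):
           if not images:
               # Text-only path: tokenize the raw prompt directly,
               # without wrapping it in a multimodal chat conversation.
               return {"input_ids": self.tokenizer.encode(text)}
           # Image path: keep the existing base64 + chat-template flow.
           return self._build_multimodal_inputs(text, images, **kwargs)

       def _build_multimodal_inputs(self, text, images, **kwargs):
           # Placeholder for the existing multimodal path.
           raise NotImplementedError
   ```

   With this signature, plain-text callers such as accuracy evaluators can
   invoke the processor with `text` alone and get raw token ids back.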

Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com>
@coderabbitai
Contributor

coderabbitai Bot commented May 12, 2026

📝 Walkthrough

This PR updates model loading and input processing across DeepSeek v3 and Mistral implementations. DeepSeek v3 FP4 dtype validation is guarded to avoid accessing weights when FP4 fusion is disabled. Mistral's image processor now supports text-only inputs by making images optional. A test checkpoint format is corrected for the Mistral Large 3 variant.

Changes

Model Loading and Input Processing Updates

  • DeepSeek v3 FP4 dtype validation guard — tensorrt_llm/_torch/models/modeling_deepseekv3.py
    The q_a_proj dtype refinement of nvfp4_fused_a is now guarded by both not is_lite and the prior nvfp4_fused_a flag, avoiding weight access when FP4 fusion is already disabled.
  • Mistral text-only input support — tensorrt_llm/_torch/models/modeling_mistral.py
    MistralCommonImageProcessor.__call__ accepts optional images; text-only inputs bypass multimodal chat-template construction and return only token ids, while image inputs follow the existing base64 + multimodal path.
  • Mistral checkpoint format test correction — tests/integration/defs/accuracy/test_llm_api_pytorch.py
    TestMistralLarge3_675B.test_fp8 now uses checkpoint_format="mistral_large_3" to match the model variant.

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 0.00%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Title check — ❓ Inconclusive: The title is partially related to the changeset; it mentions guarding q_a_proj.weight and updating the test, which are real aspects of the change, but it is cut off mid-sentence and does not capture all three main fixes (the DeepSeek fix, the Mistral checkpoint format fix, and the Mistral image processor signature change). Resolution: complete the title and consider whether it should cover all fixes or be more concise, e.g. "Guard q_a_proj.weight access, update Mistral test checkpoint format, make images optional in processor."

✅ Passed checks (3 passed)

  • Description check — ✅ Passed: The PR description clearly explains the root causes, fixes, and test plan, but the required PR Checklist section is missing its confirmation checkbox, and the description does not explicitly address all template sections such as PR Follows CODING GUIDELINES and CODEOWNERS updates.
  • Linked Issues check — ✅ Passed: Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check — ✅ Passed: Check skipped because no linked issues were found for this pull request.



@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
tensorrt_llm/_torch/models/modeling_mistral.py (1)

312-312: ⚡ Quick win

Add type annotations to __call__ signature.

The updated method is untyped; please annotate text, images, and the return type to keep this path consistent with the project’s typing rules.

Proposed patch
-    def __call__(self, text, images=None, **kwargs):
+    def __call__(
+        self,
+        text: str,
+        images: list[Image.Image] | None = None,
+        **kwargs,
+    ) -> dict[str, torch.Tensor]:

As per coding guidelines "Python code should use type annotations for all function arguments and return types; return type None if function does not return".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tensorrt_llm/_torch/models/modeling_mistral.py` at line 312, The __call__
method currently lacks type annotations; update its signature in
modeling_mistral.py to annotate the parameters and return type (e.g., text: str
| Sequence[str], images: Optional[Any] = None, -> Any) or, if it returns
nothing, -> None; import or use typing primitives (Optional, Sequence, Any) or
more specific project types (e.g., torch.Tensor, PIL.Image.Image) as appropriate
so the signature reads something like def __call__(self, text: str |
Sequence[str], images: Optional[Any] = None, **kwargs) -> Any and adjust imports
accordingly to satisfy the project's typing rules.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: bf7162af-00ae-4fee-b792-e5f82248efe8

📥 Commits

Reviewing files that changed from the base of the PR and between defc515 and 15010d7.

📒 Files selected for processing (3)
  • tensorrt_llm/_torch/models/modeling_deepseekv3.py
  • tensorrt_llm/_torch/models/modeling_mistral.py
  • tests/integration/defs/accuracy/test_llm_api_pytorch.py

