
[https://nvbugs/6163033][fix] Guard q_a_proj.weight dict access behind nvfp4_fused_a; update test to `chec#14033

Open
tensorrt-cicd wants to merge 1 commit into NVIDIA:main from tensorrt-cicd:repair-bot-bug6163033

Conversation

@tensorrt-cicd
Collaborator

@tensorrt-cicd tensorrt-cicd commented May 12, 2026

Summary

  • Root cause: a non-short-circuiting `&=` probed `q_a_proj.weight`; the test used `checkpoint_format="mistral"`, so the wrong weight mapper was selected and the MLA keys were never renamed; `MistralCommonImageProcessor.__call__` required `images` positionally and always applied a chat template to plain-text prompts
  • Fix: guard the `q_a_proj.weight` dict access behind `nvfp4_fused_a`; update the test to `checkpoint_format="mistral_large_3"`; make `images` default to `None` and tokenize directly when there are no images
  • Automated fix generated by repair-bot

Test plan

  • Verify fix on the same GPU type as the original failure
  • Check for regressions in related tests

Links

Summary by CodeRabbit

Release Notes

  • New Features

    • Mistral model now supports text-only input processing without requiring images.
  • Bug Fixes

    • Improved DeepSeekV3 FP4-fused loading by refining validation conditions and control flow logic.
  • Tests

    • Updated Mistral checkpoint format in accuracy test suite.

Review Change Stack

…ut issues

Three issues blocked TestMistralLarge3_675B::test_fp8[latency_moe_deepgemm]:

1. In modeling_deepseekv3.py, the NVFP4-fused-A gate used `&=`, which does not
   short-circuit on bools. For FP8 checkpoints, `has_nvfp4()` is False and
   `nvfp4_fused_a` is already False via the earlier `and`, but the `&=` still
   dereferenced `weights[".../q_a_proj.weight"]`, raising a KeyError. Guard the
   dict access behind `nvfp4_fused_a` with an `if`.
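   The failure mode and the fix can be sketched in a few lines. This is a
   simplified standalone illustration, not the actual loader code: the dict
   key and the surrounding flag logic are reduced to the minimum needed to
   show why `&=` evaluates its right-hand side even when the flag is already
   False.

   ```python
   # FP8 checkpoint: no q_a_proj key was ever written.
   weights = {"model.layers.0.mlp.up_proj.weight": object()}

   has_nvfp4 = False
   is_lite = False
   nvfp4_fused_a = has_nvfp4 and not is_lite  # already False

   # Buggy pattern: `&=` is eager, so the dict access runs regardless
   # of the flag and raises KeyError on FP8 checkpoints.
   buggy_raised = False
   try:
       nvfp4_fused_a &= (
           weights["model.layers.0.self_attn.q_a_proj.weight"] is not None
       )
   except KeyError:
       buggy_raised = True

   # Fixed pattern: an `if` short-circuits, so the dict is only probed
   # when NVFP4 fusion is actually in play.
   if nvfp4_fused_a:
       nvfp4_fused_a = (
           weights["model.layers.0.self_attn.q_a_proj.weight"] is not None
       )

   print(buggy_raised, nvfp4_fused_a)  # True False
   ```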

2. The FP8 variant of the Mistral Large 3 test used `checkpoint_format="mistral"`
   while the NVFP4 sibling uses `"mistral_large_3"`. With "mistral" the weight
   mapper resolves to `MistralWeightMapper` (no MLA renames for `wq_a`,
   `wkv_a_with_mqa`, etc.), so the renamed keys the DeepSeek loader expects
   never exist. Align the test with its sibling.
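   To see why the format string matters, consider a toy rename pass. The
   rename tables below are invented for this sketch (the real mappings live
   in the project's weight-mapper classes); the point is only that with an
   empty rename table the keys the DeepSeek loader expects never materialize.

   ```python
   # Hypothetical rename tables for illustration only.
   mistral_renames = {}  # MistralWeightMapper: no MLA renames
   mistral_large_3_renames = {
       "wq_a": "q_a_proj",
       "wkv_a_with_mqa": "kv_a_proj_with_mqa",
   }

   def remap(keys, renames):
       """Apply every old->new substring rename to each checkpoint key."""
       out = []
       for key in keys:
           for old, new in renames.items():
               key = key.replace(old, new)
           out.append(key)
       return out

   checkpoint_keys = ["layers.0.attention.wq_a.weight"]
   print(remap(checkpoint_keys, mistral_renames))          # keys unchanged
   print(remap(checkpoint_keys, mistral_large_3_renames))  # MLA-renamed keys
   ```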

3. `MistralCommonImageProcessor.__call__` required `images` positionally and
   always wrapped the prompt in a multimodal chat conversation. Plain-text
   callers (the `text_processor` path and accuracy evaluators like MMLU/GSM8K)
   must not apply a chat template over raw continuation prompts. Make `images`
   default to None, and when absent tokenize directly via the underlying
   transformers tokenizer.
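   The fixed control flow can be sketched as below. This is a minimal
   stand-in class, assuming a Hugging Face style tokenizer with an
   `encode` method; the real processor's state, return types, and
   multimodal path differ.

   ```python
   class MistralCommonImageProcessorSketch:
       """Toy sketch of the text-only fast path; not the real class."""

       def __init__(self, tokenizer):
           self.tokenizer = tokenizer

       def __call__(self, text, images=None, **kwargs):
           if not images:
               # Text-only path: tokenize the raw prompt directly,
               # without wrapping it in a multimodal chat conversation.
               return {"input_ids": self.tokenizer.encode(text)}
           # Image path: keep the existing base64 + chat-template flow.
           return self._build_multimodal_inputs(text, images, **kwargs)

       def _build_multimodal_inputs(self, text, images, **kwargs):
           # Placeholder for the existing multimodal path.
           raise NotImplementedError
   ```

   With this signature, plain-text callers such as accuracy evaluators can
   invoke the processor with `text` alone and get raw token ids back.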

Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com>
@coderabbitai
Contributor

coderabbitai Bot commented May 12, 2026

📝 Walkthrough

This PR updates model loading and input processing across DeepSeek v3 and Mistral implementations. DeepSeek v3 FP4 dtype validation is guarded to avoid accessing weights when FP4 fusion is disabled. Mistral's image processor now supports text-only inputs by making images optional. A test checkpoint format is corrected for the Mistral Large 3 variant.

Changes

Model Loading and Input Processing Updates

  • DeepSeek v3 FP4 dtype validation guard — tensorrt_llm/_torch/models/modeling_deepseekv3.py
    The q_a_proj dtype refinement of nvfp4_fused_a is now guarded by both not is_lite and the prior nvfp4_fused_a flag, avoiding weight access when FP4 fusion is already disabled.
  • Mistral text-only input support — tensorrt_llm/_torch/models/modeling_mistral.py
    MistralCommonImageProcessor.__call__ accepts optional images; text-only inputs bypass multimodal chat-template construction and return only token ids, while image inputs follow the existing base64 + multimodal path.
  • Mistral checkpoint format test correction — tests/integration/defs/accuracy/test_llm_api_pytorch.py
    TestMistralLarge3_675B.test_fp8 now uses checkpoint_format="mistral_large_3" to match the model variant.

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 0.00%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Title check — ❓ Inconclusive: The title is partially related to the changeset; it mentions guarding q_a_proj.weight and updating the test, which are real aspects of the change, but it is cut off mid-sentence and does not capture all three main fixes (the DeepSeek fix, the Mistral checkpoint format fix, and the Mistral image processor signature change). Resolution: complete the title and consider whether it should cover all fixes or be more concise, e.g. "Guard q_a_proj.weight access, update Mistral test checkpoint format, make images optional in processor."

✅ Passed checks (3 passed)

  • Description check — ✅ Passed: The PR description clearly explains the root causes, fixes, and test plan, but the required PR Checklist section is missing its confirmation checkbox, and the description does not explicitly address all template sections such as PR Follows CODING GUIDELINES and CODEOWNERS updates.
  • Linked Issues check — ✅ Passed: Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check — ✅ Passed: Check skipped because no linked issues were found for this pull request.



@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
tensorrt_llm/_torch/models/modeling_mistral.py (1)

312-312: ⚡ Quick win

Add type annotations to __call__ signature.

The updated method is untyped; please annotate text, images, and the return type to keep this path consistent with the project’s typing rules.

Proposed patch
-    def __call__(self, text, images=None, **kwargs):
+    def __call__(
+        self,
+        text: str,
+        images: list[Image.Image] | None = None,
+        **kwargs,
+    ) -> dict[str, torch.Tensor]:

As per coding guidelines "Python code should use type annotations for all function arguments and return types; return type None if function does not return".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tensorrt_llm/_torch/models/modeling_mistral.py` at line 312, The __call__
method currently lacks type annotations; update its signature in
modeling_mistral.py to annotate the parameters and return type (e.g., text: str
| Sequence[str], images: Optional[Any] = None, -> Any) or, if it returns
nothing, -> None; import or use typing primitives (Optional, Sequence, Any) or
more specific project types (e.g., torch.Tensor, PIL.Image.Image) as appropriate
so the signature reads something like def __call__(self, text: str |
Sequence[str], images: Optional[Any] = None, **kwargs) -> Any and adjust imports
accordingly to satisfy the project's typing rules.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: bf7162af-00ae-4fee-b792-e5f82248efe8

📥 Commits

Reviewing files that changed from the base of the PR and between defc515 and 15010d7.

📒 Files selected for processing (3)
  • tensorrt_llm/_torch/models/modeling_deepseekv3.py
  • tensorrt_llm/_torch/models/modeling_mistral.py
  • tests/integration/defs/accuracy/test_llm_api_pytorch.py

