Skip to content

[https://nvbugs/6157892][fix] Restore the pre-#12743 AutoProcessor.from_pretrained(...) assignment for `text#13905

Open
tensorrt-cicd wants to merge 1 commit intoNVIDIA:mainfrom
tensorrt-cicd:repair-bot-bug6157892
Open

[https://nvbugs/6157892][fix] Restore the pre-#12743 AutoProcessor.from_pretrained(...) assignment for `text#13905
tensorrt-cicd wants to merge 1 commit intoNVIDIA:mainfrom
tensorrt-cicd:repair-bot-bug6157892

Conversation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

@tensorrt-cicd tensorrt-cicd commented May 8, 2026

Summary

  • Root cause: PR [https://nvbugs/5969216][fix] Ministral3 loading fix #12743 replaced text_processor = AutoProcessor.from_pretrained(...) with text_processor = self._processor (a MistralCommonImageProcessor), whose __call__ both requires a positional images arg and always applies apply_chat_template, corrupting raw text prompts.
  • Fix: Restore the pre-[https://nvbugs/5969216][fix] Ministral3 loading fix #12743 AutoProcessor.from_pretrained(...) assignment for text_processor in the mistral_large_3 branch, matching the intent documented in the in-line comment.
  • Automated fix generated by repair-bot

Test plan

  • Verify fix on the same GPU type as the original failure
  • Check for regressions in related tests

Links

Summary by CodeRabbit

  • Bug Fixes
    • Improved input processor initialization for different model variants, ensuring correct processor assignment for multimodal handling in specific model configurations.

…only path

MistralCommonImageProcessor.__call__ always applies apply_chat_template,
which corrupts raw text prompts (e.g. MMLU/GSM8K) and requires an images
positional argument. Revert the text_processor assignment so the text-only
branch uses AutoProcessor.from_pretrained as before, matching the intent
documented in the adjacent comment.

Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 8, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

The Mistral3InputProcessor.__init__ method now instantiates an AutoProcessor upfront before branching on model type. For mistral_large_3, the image processor and text processor are assigned separately—image handling uses MistralCommonImageProcessor while text processing uses the preloaded AutoProcessor. Other mistral types retain their original processor alignment.

Changes

Processor Initialization Refactoring

Layer / File(s) Summary
Processor Initialization
tensorrt_llm/_torch/models/modeling_mistral.py
AutoProcessor is created once at method start. For mistral_large_3 model type, _processor is set to MistralCommonImageProcessor for multimodal image handling, while text_processor is set to the preloaded auto_processor. Other mistral model types keep _processor = auto_processor and text_processor = _processor.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly identifies the fix as restoring pre-PR #12743 behavior for AutoProcessor assignment in text_processor, directly matching the main change in the code modification.
Description check ✅ Passed The PR description includes a clear summary of the root cause, the fix applied, and a test plan. However, it does not follow the repository's required template structure with 'Description' and 'Test Coverage' sections.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tensorrt_llm/_torch/models/modeling_mistral.py (1)

1-1: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add/update the NVIDIA copyright header in this modified file.

This source file is modified but does not include the required NVIDIA copyright header/current modification year.

As per coding guidelines: All C++, Python, and other source files must contain NVIDIA copyright header with current modification year.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tensorrt_llm/_torch/models/modeling_mistral.py` at line 1, This file
(modeling_mistral.py) is missing the required NVIDIA copyright header; add or
update the standard NVIDIA copyright/license header at the very top of the file
before any imports (i.e., place it above the existing import copy), ensure the
header uses the current modification year, and match the same header format used
in other project Python sources so the file contains the required NVIDIA
copyright and license lines.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@tensorrt_llm/_torch/models/modeling_mistral.py`:
- Line 1: This file (modeling_mistral.py) is missing the required NVIDIA
copyright header; add or update the standard NVIDIA copyright/license header at
the very top of the file before any imports (i.e., place it above the existing
import copy), ensure the header uses the current modification year, and match
the same header format used in other project Python sources so the file contains
the required NVIDIA copyright and license lines.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3f445d26-1606-4142-b4b2-77ef3f363d57

📥 Commits

Reviewing files that changed from the base of the PR and between 2e4b05c and a1d3f1b.

📒 Files selected for processing (1)
  • tensorrt_llm/_torch/models/modeling_mistral.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants