
[https://nvbugs/6160248][fix] AutoDeploy: fixed broken pattern matching of fuse_rope_into_trtllm_attention transform #14038

Open: MrGeva wants to merge 1 commit into NVIDIA:main from nv-auto-deploy:fix/ad-rope-fusion-unwrap-contiguous-call-method

Conversation

Collaborator

MrGeva commented May 12, 2026

The recent commit 6cd23bc changed fuse_gemms_mixed_children to emit narrow → contiguous (call_method) → view instead of split_with_sizes (closure) → getitem. But _try_trace_to_fused_qkv._trace_narrow only handles view → narrow directly — it does not unwrap a call_method("contiguous", ...) node sitting between the view and the narrow, and _unwrap_contiguous only handles call_function contiguous, not the call_method form. Result: the passthrough leg silently fails and _trtllm_fused_qkv is never set.

Fix:

  • Extend _unwrap_contiguous to also skip call_method nodes whose target is the string "contiguous".
  • In _trace_narrow and _trace_split, apply _unwrap_contiguous to the view's input before testing for narrow / getitem.

Summary by CodeRabbit

  • Bug Fixes
    • Improved attention optimization logic to correctly handle additional tensor operation patterns, ensuring more reliable detection and fusion of fused QKV operations across diverse model architectures.

Review Change Stack

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

…ugh rope trace

After fuse_gemms_mixed_children switched its post-fusion split path from
``split_with_sizes-closure → getitem`` to ``narrow → contiguous → view``,
the contiguous node is emitted via ``graph.call_method("contiguous", ...)``,
i.e. a ``call_method`` op rather than a ``call_function``. This broke
``fuse_rope_into_trtllm_attention``'s QKV-passthrough leg: ``_unwrap_contiguous``
only handled ``call_function`` forms, and ``_trace_narrow`` walked
``view → narrow`` directly without peeling intermediate contiguous nodes.
As a result the passthrough silently failed to fire and ``_trtllm_fused_qkv``
was never set, causing
``test_gemm_fusion_trtllm.py::test_fuse_qkv_passthrough_with_rope`` to fail.

Fix:
- Extend ``_unwrap_contiguous`` to also skip ``call_method`` nodes whose
  target is the string ``"contiguous"``.
- In ``_trace_narrow`` and ``_trace_split`` apply ``_unwrap_contiguous`` to
  the view's input before testing for narrow / getitem.

Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
Collaborator Author

MrGeva commented May 12, 2026

/bot run --extra-stage "DGX_B200-4_GPUs-AutoDeploy-1, DGX_H100-4_GPUs-AutoDeploy-1" --disable-fail-fast

@MrGeva MrGeva marked this pull request as ready for review May 12, 2026 07:35
@MrGeva MrGeva requested a review from a team as a code owner May 12, 2026 07:35
@MrGeva MrGeva requested a review from Fridah-nv May 12, 2026 07:35
Contributor

coderabbitai Bot commented May 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 189305ee-d4e0-41af-95cb-7623e43f0f82

📥 Commits

Reviewing files that changed from the base of the PR and between 7bc328f and d0f0d60.

📒 Files selected for processing (1)
  • tensorrt_llm/_torch/auto_deploy/transform/library/fuse_rope_into_trtllm_attention.py

📝 Walkthrough

The PR enhances RoPE-attention graph tracing by expanding the _unwrap_contiguous helper to recognize additional PyTorch contiguous emission patterns, then applies this enhanced helper in both split and narrow QKV tracing paths to ensure intervening contiguous nodes don't block correct source identification.

Changes

Contiguous unwrapping in fused RoPE-attention tracing
(tensorrt_llm/_torch/auto_deploy/transform/library/fuse_rope_into_trtllm_attention.py)

  • Expand contiguous-unwrapping helper: _unwrap_contiguous now recognizes call_method("contiguous") and call_function forms (aten.contiguous.default overloads and Tensor.contiguous method call_function variants) in addition to the original pattern, allowing the trace to skip past these contiguous emissions.
  • Apply unwrapping in QKV tracing paths: _unwrap_contiguous is called on view_input in both _trace_split and _trace_narrow to remove contiguous nodes between the view/reshape operation and its input, ensuring the fused QKV source is correctly identified.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Description check (⚠️ Warning): The PR description clearly explains the issue and the solution; however, the Description and Test Coverage sections are missing or not filled in according to the template structure. Resolution: add a dedicated Description section explaining the issue and solution, and a Test Coverage section listing the relevant tests (e.g., test_fuse_qkv_passthrough_with_rope) that validate this fix.

✅ Passed checks (4 passed)

  • Title check: The title clearly identifies the specific issue (broken pattern matching in fuse_rope_into_trtllm_attention) and the fix category (fix), with proper NVBugs ticket reference.
  • Docstring Coverage: No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
  • Linked Issues check: Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@tensorrt-cicd
Collaborator

PR_Github #47923 [ run ] triggered by Bot. Commit: d0f0d60 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #47923 [ run ] completed with state SUCCESS. Commit: d0f0d60
/LLM/main/L0_MergeRequest_PR pipeline #37770 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation
