
Fix attention for non-standard literal #4877

Open
ahsan-ca wants to merge 1 commit into develop from fix-attention-non-standard-literal

Conversation

@ahsan-ca (Contributor)

Motivation

Technical Details

Changelog Category

Add a CHANGELOG.md entry for any option other than Not Applicable

    • Added: New functionality.
    • Changed: Changes to existing functionality.
    • Removed: Functionality or support that has been removed. (Compared to a previous release)
    • Optimized: Component performance that has been optimized or improved.
    • Resolved Issues: Known issues from a previous version that have been resolved.
    • Not Applicable: This PR is not to be included in the changelog.

@ahsan-ca self-assigned this May 12, 2026
Copilot AI review requested due to automatic review settings May 12, 2026 19:13
@ahsan-ca requested a review from causten as a code owner May 12, 2026 19:13
@ahsan-ca added the bugfix (Fixes a bug found in the code.) and simple (small or simple changes) labels May 12, 2026
Copilot AI (Contributor) left a comment

Pull request overview

This PR fixes an attention fusion regression: capturing a constant-folded, non-standard-strided bias literal into the attention group makes MLIR lowering emit a migraphx.literal with non-standard strides, which rocMLIR rejects. The change keeps such literals outside the group so they enter as regular inputs, while preserving the existing inlining behavior for standard-strided literals.
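
For intuition, here is a self-contained illustration of what "standard" strides means (this is not MIGraphX's actual shape class; has_standard_strides is a hypothetical helper):

    #include <cstddef>
    #include <iostream>
    #include <vector>

    // Hypothetical helper (not MIGraphX's shape::standard()): strides are
    // "standard" when they equal the packed row-major strides implied by
    // the dimensions; a transposed view breaks this pattern.
    bool has_standard_strides(const std::vector<std::size_t>& lens,
                              const std::vector<std::size_t>& strides)
    {
        std::size_t expected = 1;
        for(std::size_t i = lens.size(); i > 0; --i)
        {
            if(strides[i - 1] != expected)
                return false;
            expected *= lens[i - 1];
        }
        return true;
    }

    int main()
    {
        std::cout << has_standard_strides({2, 3}, {3, 1}) << '\n'; // 1: packed row-major
        std::cout << has_standard_strides({3, 2}, {1, 3}) << '\n'; // 0: transposed layout
    }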

Changes:

  • Update attention fusion’s constant-capture logic to skip non-standard @literal instructions when expanding the attention group (see the sketch after this list).
  • Add a regression test covering the failing “transposed/non-standard literal bias” scenario and a companion test ensuring standard literal bias is still inlined.
  • Ensure the expected grouped program structure matches the intended behavior via sort() equivalence.
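
A minimal sketch of that capture guard, under assumed surroundings (only the two-line check mirrors the diff; ins, group_inputs, and the loop itself are illustrative):

    // Inside a hypothetical attention-group expansion loop. name() and
    // get_shape().standard() are real MIGraphX instruction/shape calls;
    // group_inputs is illustrative.
    for(instruction_ref input : ins->inputs())
    {
        // Leave non-standard literals (e.g. a constant-folded, transposed
        // bias) outside the group so they enter as regular inputs rather
        // than being inlined as migraphx.literal ops.
        if(input->name() == "@literal" and not input->get_shape().standard())
            continue;
        group_inputs.push_back(input);
    }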

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Files:
  • src/fuse_attention.cpp: Prevents non-standard @literal nodes from being captured into attention groups to avoid MLIR literal lowering failures.
  • test/fuse_attention.cpp: Adds regression/behavioral tests for non-standard vs standard bias literal handling in attention fusion.

Comment thread src/fuse_attention.cpp
Comment on lines +178 to +180
// bias) outside the group so they enter as a parameter
// instead, where mlir_contiguous + adjust_param_shapes can
// normalise the layout.
@ahsan-ca force-pushed the fix-attention-non-standard-literal branch from 97287d6 to a51abb0 on May 12, 2026 19:24
Comment thread src/fuse_attention.cpp
// Leave non-standard literals (e.g. constant-folded transposed
// bias) outside the group so they enter as a parameter
// instead, where mlir_contiguous + adjust_param_shapes can
// normalise the layout.
Collaborator left a comment

Just remove this comment. There are so many things incorrect in this comment that it will just cause more confusion.

Comment thread src/fuse_attention.cpp
// instead, where mlir_contiguous + adjust_param_shapes can
// normalise the layout.
if(input->name() == "@literal" and not input->get_shape().standard())
continue;
Collaborator left a comment

This doesn't seem like the correct fix. We need to skip literals that are not iota literals, but you are not checking whether it's an iota literal.
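
For context, a self-contained sketch of what an iota check could look like (is_iota is a hypothetical helper, not an existing MIGraphX function; a real check would read the flattened values out of the @literal instruction):

    #include <cstdint>
    #include <numeric>
    #include <vector>

    // Hypothetical helper: a literal is "iota" when its flattened values
    // are exactly 0, 1, ..., n-1 (e.g. position-index constants).
    bool is_iota(const std::vector<std::int64_t>& values)
    {
        std::vector<std::int64_t> expected(values.size());
        std::iota(expected.begin(), expected.end(), std::int64_t{0});
        return values == expected;
    }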

@codecov bot commented May 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4877      +/-   ##
===========================================
+ Coverage    92.85%   92.87%   +0.02%     
===========================================
  Files          584      585       +1     
  Lines        30147    30123      -24     
===========================================
- Hits         27992    27976      -16     
+ Misses        2155     2147       -8     
Files with missing lines    Coverage Δ
src/fuse_attention.cpp      98.01% <100.00%> (+0.01%) ⬆️

... and 15 files with indirect coverage changes


Labels

bugfix (Fixes a bug found in the code.), simple (small or simple changes)


3 participants