[Fix] Fix sparse_attn_v4_paged_prefill for MI308 #1003

Open
yitingw1 wants to merge 4 commits into ROCm:main from yitingw1:fix_DSV4

yitingw1 commented Jun 1, 2026

Motivation

Fix DeepSeek V4 op sparse_attn_v4_paged_prefill for MI308.

sparse_attn_v4_paged_prefill calls pa_sparse_prefill_opus, which requires gfx950. On gfx942, the op now falls back to the Triton implementation, _sparse_attn_v4_paged_prefill_triton.
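
The fallback described above can be sketched as a try/except around the fast kernel. This is a minimal illustration, not the code in this PR: `prefill_with_fallback`, `_stub_opus_kernel`, and `_stub_triton_kernel` are hypothetical stand-ins, and passing the architecture as a string is an assumption made only so the example runs without a GPU.

```python
from typing import Callable, List, Tuple

Result = Tuple[str, List[float]]


def prefill_with_fallback(
    fast_kernel: Callable[[str, List[float]], Result],
    fallback_kernel: Callable[[str, List[float]], Result],
    arch: str,
    q: List[float],
) -> Result:
    """Try the fast kernel first; if it raises RuntimeError, use the fallback."""
    try:
        return fast_kernel(arch, q)
    except RuntimeError:
        # Assumption: an unsupported architecture surfaces as RuntimeError.
        return fallback_kernel(arch, q)


def _stub_opus_kernel(arch: str, q: List[float]) -> Result:
    # Hypothetical stand-in for a kernel that only supports gfx950.
    if arch != "gfx950":
        raise RuntimeError(f"kernel requires gfx950, got {arch}")
    return ("opus", q)


def _stub_triton_kernel(arch: str, q: List[float]) -> Result:
    # Hypothetical stand-in for a portable Triton implementation.
    return ("triton", q)


result_gfx942 = prefill_with_fallback(
    _stub_opus_kernel, _stub_triton_kernel, "gfx942", [1.0, 2.0]
)
result_gfx950 = prefill_with_fallback(
    _stub_opus_kernel, _stub_triton_kernel, "gfx950", [1.0, 2.0]
)
```

Catching only `RuntimeError` keeps unrelated failures, such as a `TypeError` from a wrong argument, visible instead of silently routing them to the fallback. The trade-off is that any `RuntimeError` from the fast path, not just an unsupported architecture, also triggers the fallback.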

Submission Checklist

yitingw1 requested a review from valarLip on June 1, 2026 07:56
attn_sink,
softmax_scale,
)
except RuntimeError:
Collaborator

We should fall back to the Triton implementation once we get a runtime error…

Author

Yes, we fall back to the Triton implementation (line 389) once we get a runtime error.

2 participants