[example] Linalg to XeGPU fused attention implementation. by charithaintc · Pull Request #153 · llvm/lighthouse

charithaintc · 2026-05-18T23:04:19Z

This example demonstrates how to optimize a standard attention kernel written at the linalg level into a fused attention kernel that can run on a GPU.

Main steps involved:

Generate the standard attention payload on 4D tensors (batch x head x ctx_len x d_head).
Tile and fuse the outer parallel dims (batch and head).
Vectorize and bufferize.
Use transform extensions to generate the inner tiled reduction loop (until we have a better solution).
Distribute to GPU workgroups.
Set XeGPU layouts and lower to binary.
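As a point of reference for the first step, the semantics of the unfused payload can be sketched in plain Python. This is only an illustration of standard attention on (batch x head x ctx_len x d_head) data, not the linalg IR the example generates; the function names are hypothetical, and the scale is assumed to be the conventional 1/sqrt(d_head).

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_2d(q, k, v):
    """Standard attention for one (batch, head) slice.

    q, k, v are nested lists of shape [ctx_len][d_head].
    """
    d_head = len(q[0])
    scale = 1.0 / math.sqrt(d_head)  # assumed conventional scaling
    out = []
    for q_row in q:
        # One row of Q @ K^T, scaled.
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        probs = softmax(scores)
        # probs @ V
        out.append(
            [sum(p * v_row[j] for p, v_row in zip(probs, v)) for j in range(d_head)]
        )
    return out


def attention_4d(q, k, v):
    """Apply attention_2d independently over the batch and head dims."""
    return [
        [attention_2d(q[b][h], k[b][h], v[b][h]) for h in range(len(q[b]))]
        for b in range(len(q))
    ]
```

The batch and head loops carry no dependencies between iterations, which is why they are the dims that get tiled, fused, and later distributed to workgroups.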

Currently this depends on a fix for #147.

tkarna

Thanks, this is a great addition. Looks good at a high level.

I understand that end-to-end execution requires additional changes in upstream MLIR that are still pending. Please ping us when this is ready for final review.

tkarna · 2026-05-19T08:24:02Z

+from lighthouse.dialects.transform.transform_ext import TransformExtensionDialect
+
+
+class GenerateFusedAttention(


nit: Consider changing the name. We use "generate" for methods that generate payload IR from scratch whereas this is a transform that's applied to an existing payload module.

charithaintc (May 21, 2026): Makes sense. I renamed it to ReplaceWithFusedAttentionOp. Open to other suggestions. Keep in mind that this will be deprecated once we have the upstream solution.

tkarna · 2026-05-19T08:30:23Z

+    Computes fused attention:
+    output = softmax(Q @ K^T / sqrt(n_head)) @ V


nit: would be good to mention how the generated version differs from standard flash attention, i.e. what is being fused.

charithaintc (May 21, 2026): I renamed this to gpu_attention_payload because at the payload level there is no fusion. It's just standard attention.
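To illustrate what "fused" usually means in this context, here is a minimal sketch of a flash-attention-style tiled reduction over K/V for a single query row, compared against the unfused computation. It assumes the usual online-softmax formulation and a 1/sqrt(d_head) scale (the quoted docstring writes sqrt(n_head)); the PR thread does not spell out the exact scheme the transform generates, and all names here are hypothetical.

```python
import math


def attention_row_reference(q_row, k, v):
    """Unfused: materialize all scores, then softmax, then multiply by V."""
    d_head = len(q_row)
    scale = 1.0 / math.sqrt(d_head)
    scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [
        sum(w * v_row[j] for w, v_row in zip(weights, v)) / total
        for j in range(d_head)
    ]


def attention_row_fused(q_row, k, v, tile=2):
    """Fused: walk K/V in tiles, keeping a running max, sum, and accumulator."""
    d_head = len(q_row)
    scale = 1.0 / math.sqrt(d_head)
    running_max = -math.inf
    running_sum = 0.0
    acc = [0.0] * d_head
    for start in range(0, len(k), tile):
        k_tile = k[start:start + tile]
        v_tile = v[start:start + tile]
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k_tile]
        new_max = max(running_max, max(scores))
        # Rescale earlier partial results to the new max (0.0 on the first tile).
        correction = math.exp(running_max - new_max)
        weights = [math.exp(s - new_max) for s in scores]
        running_sum = running_sum * correction + sum(weights)
        acc = [
            a * correction + sum(w * v_row[j] for w, v_row in zip(weights, v_tile))
            for j, a in enumerate(acc)
        ]
        running_max = new_max
    # Normalize once at the end instead of per tile.
    return [a / running_sum for a in acc]
```

The fused variant never holds the full ctx_len x ctx_len score matrix, only one tile of scores plus the running statistics, which is the property an inner tiled reduction loop is meant to provide.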

adam-smnk · 2026-05-19T10:28:02Z

+            q_load_op = q_load_ops[0]
+            k_load_op = k_load_ops[0]
+            v_load_op = v_load_ops[0]
+            scale_op = scale_ops[0]
+            output_op = output_ops[0]


nit: could use extra checks to ensure these are the expected ops

charithaintc (May 21, 2026): Added more checks.
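The kind of check suggested here could look roughly like the sketch below. It is not the code added in the PR: `expect_single_op` and `FakeOp` are hypothetical, the op name used in the usage line is a placeholder, and the helper only assumes that each matched op exposes a `name` string.

```python
from dataclasses import dataclass


@dataclass
class FakeOp:
    """Stand-in for a matched operation; only the `name` attribute is used."""
    name: str


def expect_single_op(ops, expected_name, role):
    """Return the only op in `ops`, verifying its count and op name.

    Raises ValueError with a descriptive message instead of silently
    taking ops[0] from an unexpected match result.
    """
    ops = list(ops)
    if len(ops) != 1:
        raise ValueError(f"expected exactly one {role} op, found {len(ops)}")
    op = ops[0]
    if op.name != expected_name:
        raise ValueError(f"{role}: expected '{expected_name}', got '{op.name}'")
    return op


# Usage with a placeholder op name:
q_load_op = expect_single_op([FakeOp("vector.transfer_read")], "vector.transfer_read", "Q load")
```

Failing early with the role in the message makes it easier to tell which of the Q, K, V, scale, or output matches went wrong when the payload changes shape.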

charithaintc added 2 commits May 18, 2026 23:01

add linalg to xegpu fused attention example

e1c8124

remove hard coded layout params

589b014

tkarna reviewed May 19, 2026

adam-smnk reviewed May 19, 2026

charithaintc added 5 commits May 21, 2026 22:10

Merge branch 'main' into xegpu_fused_attention_example

b3038b1

rename fused atten

7a46618

rename payload gen

866a246

remove memory utils from payload

7096e08

add more checks for input op types

ce1725d

charithaintc wants to merge 7 commits into llvm:main from charithaintc:xegpu_fused_attention_example
