Implement device_gemm_universal_preshuffle_instance for RDNA4 #3429
base: develop
Conversation
ErwinTerpstra left a comment:
One small comment, beyond that it looks good!
library/include/ck/library/tensor_operation_instance/gpu/gemm_universal_preshuffle.hpp
(force-pushed d8e7a2e to 3dee146)
Can we remove some code duplication between the examples?
(force-pushed 3dee146 to 6eb3f39)
Apart from some refactoring of the example code to remove duplication, which would be nice to have, it looks good to me.
(force-pushed 6eb3f39 to 15248f6)
I have removed the duplication in the last commit (15248f6). Thanks.
Proposed changes
Summary:
- Implementation of device_gemm_universal_preshuffle_instance for RDNA4
- FP8/FP16 WMMA examples
- WMMA instances
- WMMA instances added to existing tests
Checklist
Please put an `x` into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.
- Ran `clang-format` on all changed files
Discussion
If this is a relatively large or complex change, feel free to start a discussion explaining why you chose this solution and what alternatives you considered.