@vpietila-amd vpietila-amd commented Jan 14, 2026

Proposed changes

This PR has four main changes:

  1. Unified the XDL and WMMA warp-level descriptions under a single WarpGemmDescriptor concept and corresponding types.
  2. Refactored the convolution algorithm concepts for the conv dispatcher so that they have a well-defined hierarchy. The convolution algorithms are grouped into three categories: XDL, WMMA, and DL. The XDL and WMMA algorithms share a common base algorithm concept, ConvAlgorithm, from which the hierarchy of XDL and WMMA algorithms is derived.
  3. The convolution algorithm specializations are now bit flags, so that multiple specializations can be defined at the same time. This is what makes the hierarchy of the conv algorithms possible.
  4. The thread clusters related to the input/output tiles now have more descriptive names.
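To illustrate item 3, here is a minimal sketch of how specializations expressed as bit flags can be combined and queried at compile time. The enum `ConvSpecialization`, its enumerators, and the helper `has_all` are hypothetical names chosen for this example; they are not taken from the PR.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical enum: the name and the enumerators are assumptions for
// illustration only, not the identifiers used in this PR.
enum class ConvSpecialization : std::uint32_t {
    None        = 0,
    Xdl         = 1u << 0,
    Wmma        = 1u << 1,
    LargeTensor = 1u << 2,
    MultipleD   = 1u << 3,
};

// Combine two specializations into one value by OR-ing their bits.
constexpr ConvSpecialization operator|(ConvSpecialization a, ConvSpecialization b) {
    using U = std::underlying_type_t<ConvSpecialization>;
    return static_cast<ConvSpecialization>(static_cast<U>(a) | static_cast<U>(b));
}

// True if every bit set in `required` is also set in `value`.
constexpr bool has_all(ConvSpecialization value, ConvSpecialization required) {
    using U = std::underlying_type_t<ConvSpecialization>;
    return (static_cast<U>(value) & static_cast<U>(required)) == static_cast<U>(required);
}

// One algorithm can now carry several specializations at once.
constexpr auto spec = ConvSpecialization::Xdl | ConvSpecialization::LargeTensor;
static_assert(has_all(spec, ConvSpecialization::Xdl));
static_assert(has_all(spec, ConvSpecialization::LargeTensor));
static_assert(!has_all(spec, ConvSpecialization::Wmma));
```

Because a derived algorithm can carry the bits of its base plus its own, a concept for the base category can test only the base bit, which is one way such flags could support a hierarchy.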

The unified warp GEMM description (item 1) lets us treat XDL and WMMA algorithms on an equal footing and reduces the amount of boilerplate code.

The hierarchical description of the conv algorithms (item 2) simplifies the convolution algorithm concepts and allows better compile-time error messages when no corresponding factory is found for a given algorithm description.

Together, items 1 and 2 make it easier to add new factories.

Ville Pietilä added 30 commits December 18, 2025 04:36
Ville Pietilä and others added 30 commits January 8, 2026 05:22
… TileTransferParameters concept for conv algorithms.
…s' into vpietila/ckb-refactor-warp-gemm-descriptors
2 participants