
BF16 matmul from KernelBench working #145

Open
rengolin wants to merge 4 commits into llvm:main from rengolin:kb_bf16

Conversation

@rengolin
Member

Performance is twice that of F32, which is around 85% of peak, so this is a good first step.

@rengolin rengolin requested a review from adam-smnk May 13, 2026 17:54
@rengolin
Member Author

@adam-smnk some of the "automation" for schedules we discussed today.


@adam-smnk adam-smnk left a comment


Overall looks good 👍

Makes do with what we have today.
As we generalize and improve different schedules, we should see complexity shift from pipelines back into transforms and IR.

Member


High-level question: how will this scale further, e.g. to bf16 on AVX512 vs. AMX?
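One hypothetical way to handle that scaling (not part of this PR; the pipeline names and feature strings below are illustrative assumptions) would be to key pipeline selection on the detected target features, so new ISAs only add entries rather than new control flow:

```python
# Hypothetical sketch: choosing a BF16 pipeline by CPU feature.
# Pipeline and feature names here are illustrative, not actual
# identifiers from this repository.
def select_bf16_pipeline(target_features: set) -> str:
    # Prefer AMX when the tile extensions are available.
    if "amx-bf16" in target_features:
        return "bf16-amx-pipeline"
    # Fall back to AVX512 BF16 dot-product instructions.
    if "avx512bf16" in target_features:
        return "bf16-avx512-pipeline"
    # Generic lowering when no native BF16 support exists.
    return "bf16-generic-pipeline"
```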

```python
return kb_default_pipeline

# Level 1 matmuls should use the same pipelines
if kernel_name.startswith("level1") and "matrix_multiplication" in kernel_name:
```
Member


Some level1 kernels also just use _Matmul_ in their names. Not sure if you tested with those too.
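A case-insensitive match would catch both naming conventions. This is only a sketch of the idea; the helper name is hypothetical and not taken from the PR:

```python
# Hypothetical sketch: match level1 kernels named either with
# "matrix_multiplication" or "Matmul" (case-insensitive).
def is_level1_matmul(kernel_name: str) -> bool:
    name = kernel_name.lower()
    return name.startswith("level1") and (
        "matrix_multiplication" in name or "matmul" in name
    )
```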

