Open
Commits
42 commits
5175aad
Naive implementation of grouped linear op
timmoon10 Jan 7, 2026
5ffd57e
Use grouped GEMM tex functions
timmoon10 Jan 7, 2026
2ee42da
Support quantized compute
timmoon10 Jan 8, 2026
93e71df
Debug test failures with MXFP8 or NVFP4 params
timmoon10 Jan 8, 2026
fdddc47
Add multiply op
timmoon10 Jan 10, 2026
b448a17
Bug fixes
timmoon10 Jan 10, 2026
3f38897
Expose option for custom op fusions
timmoon10 Jan 14, 2026
a359b67
Add tests for custom ops
timmoon10 Jan 14, 2026
5f7204f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 14, 2026
8ddb8ce
Fix linter warnings and numerical test failures
timmoon10 Jan 14, 2026
cfc2617
Tweak pattern matching logic with fixed window sizes
timmoon10 Jan 15, 2026
0ce5dfb
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 15, 2026
9bf5843
Merge branch 'main' into tmoon/custom-fused-ops
timmoon10 Jan 15, 2026
4992903
Use TF32 tols in fused op tests
timmoon10 Jan 15, 2026
9ab7751
Review suggestion from @greptile-apps
timmoon10 Jan 15, 2026
a086d81
Merge branch 'main' into tmoon/custom-fused-ops
timmoon10 Jan 15, 2026
f05f7a8
Merge branch 'main' into tmoon/grouped-linear-op
timmoon10 Jan 15, 2026
9348138
Fix linter warnings
timmoon10 Jan 15, 2026
5366729
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 15, 2026
1b0b229
Merge branch 'tmoon/grouped-linear-op' into tmoon/cute-gemm-swiglu
timmoon10 Jan 15, 2026
3bbe881
Merge branch 'tmoon/custom-fused-ops' into tmoon/cute-gemm-swiglu
timmoon10 Jan 15, 2026
321646e
Initial impl of fused op for grouped MLP
timmoon10 Jan 16, 2026
e137451
Import group GEMM+SwiGLU kernel
timmoon10 Jan 17, 2026
11da59d
Merge branch 'main' into tmoon/cute-gemm-swiglu
timmoon10 Jan 20, 2026
cb728bb
Add unit test for grouped MLP op
timmoon10 Jan 20, 2026
e7459cc
Call fused group GEMM + SwiGLU kernel
timmoon10 Jan 21, 2026
b15ca0d
Debug test failures
timmoon10 Jan 21, 2026
3da2c17
Get test to not pass trivially
timmoon10 Jan 22, 2026
0270eb1
Handle interleaving for SwiGLU
timmoon10 Jan 22, 2026
0b09790
Fix numeric tests, except for probs grad
timmoon10 Jan 22, 2026
7c40290
Use pre-swizzled scales from GEMM+SwiGLU output
timmoon10 Jan 22, 2026
a098cc0
Add scaled SwiGLU op
timmoon10 Jan 23, 2026
e4f51d3
Avoid CPU splits in group GEMM+SwiGLU kernel
timmoon10 Jan 23, 2026
fb28b6e
Debug scaled SwiGLU
timmoon10 Jan 23, 2026
b0bf34d
Handle case where fused kernel is not available
timmoon10 Jan 24, 2026
000c273
Revert to plain tensor concat
timmoon10 Jan 24, 2026
e2ea4d2
Support GLU interleaving in plain SwiGLU op
timmoon10 Jan 24, 2026
4c6c35f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 24, 2026
caf580b
Remove MultiplyExtraInput op
timmoon10 Jan 24, 2026
b36007e
Merge branch 'main' into tmoon/cute-gemm-swiglu
timmoon10 Jan 25, 2026
36e6918
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 25, 2026
ba28c6f
Fix linter warnings
timmoon10 Jan 25, 2026