-
Notifications
You must be signed in to change notification settings - Fork 263
Addition of Stream-K tests using Tile Engine #3514
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
| {"m": 256, "n": 256, "k": 128, "split_k": 1}, | ||
| {"m": 512, "n": 256, "k": 256, "split_k": 1}, | ||
| {"m": 256, "n": 512, "k": 256, "split_k": 1} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these sizes cover the existing cases in test/ck_tile/gemm_streamk?
If so, did we want to remove the old test files and replace them with these to avoid adding redundant tests?
7db8519 to
0e6e6a7
Compare
| - **SKIPPED**: Kernel validation returned "Arguments not supported" (expected for certain problem sizes/configurations) ⚠️ | ||
| - **FAILED**: Actual error or incorrect computation results ❌ | ||
|
|
||
| When a kernel's `IsSupportedArgument()` check fails (e.g., due to vector alignment requirements, dimension constraints, or padding limitations), the test is automatically skipped rather than failed. This allows comprehensive testing across various problem sizes while gracefully handling configurations that don't meet specific kernel requirements. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My opinion is there should never be a set of arguments passed to the test suite that are not supported.
We should have a set of black and white tests that we know will either absolutely pass or absolutely fail.
Because in a scenario, where IsSupportedAruguments()'s implementation is changed, for instance someone unintentionally reduces the vector alignment requirements, it will result in valid kernels just being skipped rather than failing. So, we might miss this regression.
IsSupportedArgument() is pretty fragile in a sense because it calls a lot of getter functions from all over the place.
This change adds an implementation for generating Stream-K tests using Tile Engine. This will generate various test executables for different combinations based on the config files. This addition has simple tests running for bf16 and fp16, with both atomic and reduction strategies and compv3 pipeline. The tests rely on the implementation of Stream-K in Tile Engine.
0e6e6a7 to
89e4aba
Compare
| # Enable parallel compilation optimizations | ||
| # Set up job pools for better parallel compilation control | ||
| set_property(GLOBAL PROPERTY JOB_POOLS | ||
| compile_heavy=4 # Limit heavy compilations to prevent OOM | ||
| compile_normal=16 # Allow more parallel normal compilations | ||
| ) | ||
|
|
||
| # Enable compiler cache if available and explicitly requested | ||
| # Disabled by default due to permission issues in CI environments | ||
| option(ENABLE_CCACHE_TESTS "Enable ccache for test compilation" OFF) | ||
| if(ENABLE_CCACHE_TESTS) | ||
| find_program(CCACHE_PROGRAM ccache) | ||
| if(CCACHE_PROGRAM) | ||
| set(CMAKE_CXX_COMPILER_LAUNCHER ${CCACHE_PROGRAM}) | ||
| message(STATUS "Using ccache for faster test compilation") | ||
| else() | ||
| message(WARNING "ccache requested but not found") | ||
| endif() | ||
| else() | ||
| message(STATUS "ccache disabled for tests (use -DENABLE_CCACHE_TESTS=ON to enable)") | ||
| endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tabbed in.
Also there are job pools set up here. Where are they used?
| - **`--gen_individual`**: Generate all kernel headers in parallel during CMake configuration | ||
| - **`--gen_single`**: Generate individual kernel header for each configuration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference between these two is not very clear for me
|
|
||
| ## Data Type Support | ||
| - ✅ **fp16, bf16**: Fully supported - all layouts (rcr, rrr, ccr, crr) | ||
| - ❌ **fp64**: Not supported (hardware MFMA limitation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fp64 mfma is supported on gfx9 (minus gfx908). Is this more of a CK limitation?
This PR introduces the generation of unit tests for Stream-K using Tile Engine. This will allow us to scale up the unit tests we have for better coverage and maintainability. It establishes a small test targeting fp16 and bf16 data types and covers both atomic and parallel reduction strategies within the compv3 pipeline. It also lays the groundwork for expanding to the full set of Stream-K smoke and extended tests.
These are the supported instances:
Checklist
Please put an
xinto the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.clang-formaton all changed filesDiscussion
If this is a relatively large or complex change, feel free to start a discussion by explaining why you chose the solution you did and what alternatives you considered