-
Notifications
You must be signed in to change notification settings - Fork 8
Add rocSPARSE CSR SpGEMM #51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
YvanMokwinski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a general comment: I might not be used to cutting-edge C++, but I don't find the code friendly to read.
| to_rocsparse_indextype<typename output_type::offset_type>(), | ||
| to_rocsparse_indextype<typename output_type::index_type>(), | ||
| rocsparse_index_base_zero, | ||
| to_rocsparse_datatype<typename output_type::scalar_type>())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we could have something like:
using output_scalar_t = typename output_type::scalar_type;
It would simplify the writings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have merged them into a helper function to create a matrix handle easily from the csr_view
| to_rocsparse_datatype<value_type>(), rocsparse_spgemm_alg_default, | ||
| rocsparse_spgemm_stage_buffer_size, &buffer_size, nullptr)); | ||
| if (buffer_size > this->buffer_size_) { | ||
| this->alloc_.deallocate(workspace_, this->buffer_size_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's have a reallocate method in the allocator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is close to the array more than the allocator?
do you imagine something like alloc_.reallocate(&workspace_, old_buffer_size_, new_buffer_size_)?
| tensor_scalar_t<A> alpha = alpha_optional.value_or(1); | ||
| value_type alpha_val = alpha; | ||
| auto beta_optional = __detail::get_scaling_factor(d); | ||
| value_type beta = beta_optional.value_or(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure that relying on a and d types for alpha and beta is the best approach.
| __detail::is_csr_view_v<C> | ||
| void multiply_compute(spgemm_state_t& spgemm_handle, A&& a, B&& b, C&& c) { | ||
| auto d = __rocsparse::create_null_matrix<std::remove_reference_t<C>>(); | ||
| spgemm_handle.multiply_compute(a, b, c, scaled(0.0, d)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we have either scaled(0.0, d) or d?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is to reuse the 4 arguments SpGEMM. It will give the same setup when using rocsparse_spgemm for C = A*B, null_matrix set the all pointer to nullptr.
605f085 to
64d662f
Compare
|
@yhmtsai What's the status of this PR? Anything required on my part here, or are you still updating, waiting for @YvanMokwinski to comment on your most recent updates? |
c39ef55 to
eeb3a67
Compare
|
@YvanMokwinski @BenBrock forgot to mention it can be checked again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @yhmtsai, looks mostly good, I just have a couple of small comments.
As a favor to me, can you please avoid performing Git force pushes on branches for which you have an open PR? If no one is looking at your code yet, force pushing is mostly fine. However, if you force push to a branch I've already reviewed, it becomes very difficult to see what's changed recently. Please use merge commits instead of rebasing if necessary.
| throw_if_error(rocsparse_create_csr_descr( | ||
| &descr, __backend::shape(a)[0], __backend::shape(a)[1], a.values().size(), | ||
| a.rowptr().data(), a.colind().data(), a.values().data(), | ||
| to_rocsparse_indextype<typename matrix_type::offset_type>(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use the tensor_traits inspector instead of depending on scalar_type, index_type, and offset_type typedefs. This is cleaner (you can directly write tensor_scalar_t<mat>) and will work with types we don't define like mdspan as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will it be uniform for the other types? like Coo, offset_type does not make sense.
|
|
||
| } // namespace __rocsparse | ||
|
|
||
| class spgemm_state_t { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is okay for now, but eventually we need to have one operation_state_t that encapsulates this (if that's how you want to design it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if the operation_state_t is for user-friendly in the end. for now, I think separating individually should be clearer and easier for future change
| template <matrix A, matrix B, matrix C> | ||
| requires __detail::has_csr_base<A> && __detail::has_csr_base<B> && | ||
| __detail::is_csr_view_v<C> | ||
| void multiply_inspect(spgemm_state_t& spgemm_handle, A&& a, B&& b, C&& c) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to at least add a version that takes in three matrices and returns a state (see multiply.hpp).
|
@BenBrock Sure, I can avoid force rebasing. It is what we need to do to keep linear history in Ginkgo and I try to reorganize the commit history to describe the change. As we always do squash, my commit history will be merged together anyway. |
|
Hi @BenBrock I have used merge way to update this branch. hope this will help the review process. |
BenBrock
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have one comment about the CMakeLists.txt, but otherwise looks fine to merge.
test/gtest/CMakeLists.txt
Outdated
|
|
||
| function(add_rocsparse_lang file_list) | ||
| if (ENABLE_ROCSPARSE) | ||
| set_source_files_properties(${${file_list}} PROPERTIES LANGUAGE HIP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we perhaps just have this inline? Personally I find that easier to read than needing to go to a separate function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. agree with it. It is quite small so no need for function now
Co-authored-by: Benjamin Brock <brock@cs.berkeley.edu>
|
after push, it automatically dismiss the review. @BenBrock could you check it again? |
Summary:
This PR adds rocSPARSE CSR SpGEMM with 3 args (C = alpha A * B), 4 args (C = alpha A * B + beta D), and the symbolic/numeric phase.
Details:
potential TODO:
Merge Checklist: