Skip to content

Conversation

@yhmtsai
Copy link
Contributor

@yhmtsai yhmtsai commented Mar 27, 2025

Summary:
This PR adds rocSPARSE CSR SpGEMM with 3 args (C = alpha A * B), 4 args (C = alpha A * B + beta D), and the symbolic/numeric phase.

Details:

  • Add spgemm state
  • Add 4 arguments version of SpGEMM (C = alpha A * B + beta D)
  • the 3 arguments is passing a null matrix D into 4 arguments version
  • also adds the symbolic_compute, symbolic_fill, and numeric stage
  • currently, three type of test: spgemm similar to the current CPU version, spgemm_reuse: checking symbolic/numeric, and spgemm_4args: checking 4 arguments version

potential TODO:

  • create a structure to handle the rocsparse handle in one place
  • helper for create rocsparse_mat (is already merged in another pr)

Merge Checklist:

  • Passing CI
  • Update documentation or README.md
  • Additional Test/example added (if applicable) and passing
  • At least one reviewer approval
  • (optional) Clang sanitizer scan run and triaged
  • Clang formatter applied (verified as part of passing CI)

@yhmtsai yhmtsai self-assigned this Mar 27, 2025
Copy link

@YvanMokwinski YvanMokwinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a general comment: I might not be used to cutting-edge C++, but I don't find the code friendly to read.

to_rocsparse_indextype<typename output_type::offset_type>(),
to_rocsparse_indextype<typename output_type::index_type>(),
rocsparse_index_base_zero,
to_rocsparse_datatype<typename output_type::scalar_type>()));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we could have something like:

using output_scalar_t = typename output_type::scalar_type;
It would simplify the writings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have merged them into a helper function to create a matrix handle easily from the csr_view

to_rocsparse_datatype<value_type>(), rocsparse_spgemm_alg_default,
rocsparse_spgemm_stage_buffer_size, &buffer_size, nullptr));
if (buffer_size > this->buffer_size_) {
this->alloc_.deallocate(workspace_, this->buffer_size_);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's have a reallocate method in the allocator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is close to the array more than the allocator?
do you imagine something like alloc_.reallocate(&workspace_, old_buffer_size_, new_buffer_size_)?

tensor_scalar_t<A> alpha = alpha_optional.value_or(1);
value_type alpha_val = alpha;
auto beta_optional = __detail::get_scaling_factor(d);
value_type beta = beta_optional.value_or(1);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that relying on a and d types for alpha and beta is the best approach.

__detail::is_csr_view_v<C>
void multiply_compute(spgemm_state_t& spgemm_handle, A&& a, B&& b, C&& c) {
auto d = __rocsparse::create_null_matrix<std::remove_reference_t<C>>();
spgemm_handle.multiply_compute(a, b, c, scaled(0.0, d));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we have either scaled(0.0, d) or d?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is to reuse the 4 arguments SpGEMM. It will give the same setup when using rocsparse_spgemm for C = A*B, null_matrix set the all pointer to nullptr.

@yhmtsai yhmtsai force-pushed the dev/yhmtsai/rocsparse_spgemm branch from 605f085 to 64d662f Compare April 17, 2025 14:43
@BenBrock
Copy link
Collaborator

@yhmtsai What's the status of this PR? Anything required on my part here, or are you still updating, waiting for @YvanMokwinski to comment on your most recent updates?

@yhmtsai yhmtsai force-pushed the dev/yhmtsai/rocsparse_spgemm branch from c39ef55 to eeb3a67 Compare May 22, 2025 14:10
@yhmtsai yhmtsai marked this pull request as ready for review May 22, 2025 14:49
@yhmtsai
Copy link
Contributor Author

yhmtsai commented Jun 5, 2025

@YvanMokwinski @BenBrock forgot to mention it can be checked again.
I decide not to introduce a class to handle the rocsparse handle in this PR.
I am also thinking whether using shared_ptr to handle rocsparse matrix handle but I keep it as current state until matrix_opt.
I think 4arg SpGEMM are not supported by the other backend, so I keep them in rocsparse only

BenBrock
BenBrock previously approved these changes Jun 9, 2025
Copy link
Collaborator

@BenBrock BenBrock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @yhmtsai, looks mostly good, I just have a couple of small comments.

As a favor to me, can you please avoid performing Git force pushes on branches for which you have an open PR? If no one is looking at your code yet, force pushing is mostly fine. However, if you force push to a branch I've already reviewed, it becomes very difficult to see what's changed recently. Please use merge commits instead of rebasing if necessary.

throw_if_error(rocsparse_create_csr_descr(
&descr, __backend::shape(a)[0], __backend::shape(a)[1], a.values().size(),
a.rowptr().data(), a.colind().data(), a.values().data(),
to_rocsparse_indextype<typename matrix_type::offset_type>(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the tensor_traits inspector instead of depending on scalar_type, index_type, and offset_type typedefs. This is cleaner (you can directly write tensor_scalar_t<mat>) and will work with types we don't define like mdspan as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will it be uniform for the other types? like Coo, offset_type does not make sense.


} // namespace __rocsparse

class spgemm_state_t {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is okay for now, but eventually we need to have one operation_state_t that encapsulates this (if that's how you want to design it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if the operation_state_t is for user-friendly in the end. for now, I think separating individually should be clearer and easier for future change

template <matrix A, matrix B, matrix C>
requires __detail::has_csr_base<A> && __detail::has_csr_base<B> &&
__detail::is_csr_view_v<C>
void multiply_inspect(spgemm_state_t& spgemm_handle, A&& a, B&& b, C&& c) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to at least add a version that takes in three matrices and returns a state (see multiply.hpp).

@yhmtsai
Copy link
Contributor Author

yhmtsai commented Jun 26, 2025

@BenBrock Sure, I can avoid force rebasing. It is what we need to do to keep linear history in Ginkgo and I try to reorganize the commit history to describe the change. As we always do squash, my commit history will be merged together anyway.

@yhmtsai
Copy link
Contributor Author

yhmtsai commented Aug 7, 2025

Hi @BenBrock I have used merge way to update this branch. hope this will help the review process.
I only do adaption on changes in this pr not revert something in this pr here

BenBrock
BenBrock previously approved these changes Aug 20, 2025
Copy link
Collaborator

@BenBrock BenBrock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have one comment about the CMakeLists.txt, but otherwise looks fine to merge.


function(add_rocsparse_lang file_list)
if (ENABLE_ROCSPARSE)
set_source_files_properties(${${file_list}} PROPERTIES LANGUAGE HIP)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we perhaps just have this inline? Personally I find that easier to read than needing to go to a separate function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. agree with it. It is quite small so no need for function now

Co-authored-by: Benjamin Brock <brock@cs.berkeley.edu>
@yhmtsai
Copy link
Contributor Author

yhmtsai commented Aug 21, 2025

after push, it automatically dismiss the review. @BenBrock could you check it again?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants