Basic Sparse functionality in Numba #1676
base: main

Conversation
Codecov Report: ❌ the patch check failed because patch coverage (83.20%) is below the target coverage (100.00%).

@@ Coverage Diff @@
##             main    #1676      +/-   ##
==========================================
- Coverage   81.70%   81.67%   -0.04%
==========================================
  Files         246      251       +5
  Lines       53632    52549    -1083
  Branches     9438     9271     -167
==========================================
- Hits        43822    42919     -903
+ Misses       7329     7258      -71
+ Partials     2481     2372     -109
* Handle static shape
* Rename to more readable Op classes
* Simplify perform
2304 -> 288
Co-authored-by: Adrian Seyboldt <aseyboldt@users.noreply.github.com>
Co-authored-by: Jesse Grabowski <48652735+jessegrabowski@users.noreply.github.com>
jessegrabowski left a comment:

approved with some questions
@numba_basic.numba_njit
def sparse_multiply_scalar(x, y):
    if same_dtype:
        z = x.copy()
This can't ever be inplace?

The base Op probably doesn't have an inplace optimization, as I basically copied the perform method. Will double-check.
@overload(numba_deepcopy)
def numba_deepcopy_sparse(x):
What's deep about this?

sparse_matrix.copy() does a deepcopy, just like array.copy(). But for other types, like list or rng, there's a difference between copy and deepcopy, hence the more explicit name.
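As context for that naming choice, a quick scipy check (outside numba) confirms that `.copy()` on a sparse matrix duplicates the underlying buffers rather than aliasing them:

```python
import numpy as np
import scipy.sparse as sp

x = sp.csr_matrix(np.eye(3))
y = x.copy()

# copy() duplicates the underlying data/indices buffers,
# so mutating y leaves x untouched
y.data[:] = 42.0
print(x.data[0], y.data[0])  # -> 1.0 42.0
print(np.shares_memory(x.data, y.data))  # -> False
```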
def numba_funcify_CSMProperties(op, node, **kwargs):
    @numba_basic.numba_njit
    def csm_properties(x):
        # Reconsider this int32/int64. Scipy/base PyTensor use int32 for indices/indptr.
Are we able to just go to int64 ourselves, or do we need to wait for upstream to change?

Would need to change stuff in the pre-existing Ops so that the fallback to object mode stays compatible. Would leave that for a later PR if we decide to.
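As a data point on the current defaults the comment refers to: scipy picks the smallest sufficient index dtype, which is int32 for matrices whose dimensions fit, so that is what the pre-existing Ops see in practice:

```python
import numpy as np
import scipy.sparse as sp

x = sp.csr_matrix(np.eye(3))
# scipy downcasts indices/indptr to int32 whenever the values fit
print(x.indices.dtype, x.indptr.dtype)
```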
shape_obj = c.box(typ.shape, struct_ptr.shape)

# Call scipy.sparse.cs[c|r]_matrix
cls_obj = c.pyapi.unserialize(c.pyapi.serialize_object(typ.instance_class))
Does this line mean that we always have to come back to Python during construction of a numba sparse array?

No, just at the end of the outer jitted function, if there's a sparse variable in the outputs. You need this for every numba type: it's where the conversion from the internal numba representation to Python objects happens. If a function only uses sparse arrays internally, this isn't called.
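The boxing path ends up invoking the same three-array constructor scipy exposes publicly; in plain Python, that call looks like this (the arrays here are just an illustrative 2x2 identity):

```python
import numpy as np
import scipy.sparse as sp

# CSR components of a 2x2 identity matrix
data = np.array([1.0, 1.0])
indices = np.array([0, 1], dtype=np.int32)
indptr = np.array([0, 1, 2], dtype=np.int32)

# The (data, indices, indptr) form is what boxing hands back to scipy
x = sp.csr_matrix((data, indices, indptr), shape=(2, 2))
print(x.toarray())
```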
@overload(sp.sparse.csr_matrix)
def overload_csr_matrix(arg1, shape, dtype=None):
    if not isinstance(arg1, types.BaseAnonymousTuple) or len(arg1) != 3:
        return None
What does it mean to return None from an overload? Does it fail?

Overloads work by trying all registered methods until one works, so numba will keep trying until one matches.
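A rough pure-Python analogy of that dispatch loop (the names here are illustrative, not numba's internals): each candidate either returns an implementation or None, and the first non-None result wins.

```python
# Illustrative sketch of @overload-style dispatch: candidates are tried
# in registration order; returning None means "this overload does not apply".
_overloads = []

def register(fn):
    _overloads.append(fn)
    return fn

@register
def overload_for_3tuple(arg):
    if not isinstance(arg, tuple) or len(arg) != 3:
        return None  # reject: fall through to the next candidate
    return lambda a: f"3-tuple impl: {a}"

@register
def overload_for_int(arg):
    if not isinstance(arg, int):
        return None
    return lambda a: f"int impl: {a}"

def dispatch(arg):
    for candidate in _overloads:
        impl = candidate(arg)
        if impl is not None:
            return impl(arg)
    raise TypeError(f"no overload matched {type(arg)}")

print(dispatch((1, 2, 3)))  # -> 3-tuple impl: (1, 2, 3)
print(dispatch(7))          # -> int impl: 7
```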
    return impl


@overload(np.shape)
How does this interact with other overloads of np.shape? E.g. if I import this code and then call np.shape on an array in numba mode, does it still work as expected?

Yes, like your question above: when this overload returns None, numba will keep trying other overloads of np.shape until one returns something other than None.
out[0] = self.comparison(x, y).astype("uint8")
# FIXME: Scipy csc > csc outputs csr format, but make_node assumes it will be the same as inputs
# Casting to respect make_node, but this is very inefficient
# TODO: Why not go with default bool?
Why not indeed.

I suspect some archaic C bug. We should try to remove it in a separate PR.
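For reference, scipy's own sparse comparison already yields a bool-dtyped result, which is what the TODO suggests relying on instead of the uint8 cast:

```python
import numpy as np
import scipy.sparse as sp

a = sp.csr_matrix(np.eye(3))
b = sp.csr_matrix(np.zeros((3, 3)))

# scipy's elementwise sparse comparison produces a sparse matrix of bools
c = a > b
print(c.dtype)  # bool
```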
inputs = [x, y]  # Need to convert? e.g. assparse
- outputs = [psb.SparseTensorType(dtype=x.type.dtype, format=myformat)()]
+ outputs = [SparseTensorType(dtype=x.type.dtype, format=myformat)()]
Not your code, but I hate the name myformat.

Agree.
x = sp.sparse.csr_matrix(np.eye(100))

y = test_fn(x)
assert y is not x and np.all(x.data == y.data) and np.all(x.indices == y.indices)
Do you also need to test x.data is not y.data, or is that guaranteed by the first check?

It would be even better to check not np.shares_memory. I'll do that.
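`np.shares_memory` makes the no-aliasing check explicit. A minimal sketch of the strengthened assertion, using scipy's `.copy()` as a stand-in for the jitted copy function under test:

```python
import numpy as np
import scipy.sparse as sp

x = sp.csr_matrix(np.eye(100))
y = x.copy()  # stand-in for the jitted copy being tested

# equal contents, but no shared buffers
assert np.all(x.data == y.data)
assert not np.shares_memory(x.data, y.data)
assert not np.shares_memory(x.indices, y.indices)
```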
We had support for boxing/unboxing Sparse objects in numba, but we couldn't do anything with them. This PR implements the basic functionality.

TODO

Related: sparse module (#1674)

📚 Documentation preview 📚: https://pytensor--1676.org.readthedocs.build/en/1676/