Conversation

@ricardoV94 (Member) commented Oct 18, 2025

We had support for boxing/unboxing Sparse objects in numba, but we couldn't do anything with them.

This PR implements the basic functionality (see the usage sketch below the list):

  1. CSMProperties (Op that retrieves the attributes from a CSM Matrix)
  2. CSM (Op that rebuilds a Sparse variable from the attributes)
  3. astype()
  4. csr_matrix, csc_matrix scipy constructor overloads
  5. MulSD (as a POC)
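
A rough sketch of what this enables inside a jitted function (illustrative only: `scale_csr` is made up, and it assumes the overloads registered under pytensor/link/numba/dispatch/sparse are imported):

```python
import numba
import numpy as np
import scipy as sp

import pytensor.link.numba.dispatch.sparse  # noqa: F401  (registers the overloads)


@numba.njit
def scale_csr(x):
    # CSMProperties equivalent: read the raw attributes off the CSR matrix
    data, indices, indptr, shape = x.data, x.indices, x.indptr, x.shape
    # CSM equivalent: rebuild a sparse matrix from (transformed) attributes,
    # via the scipy csr_matrix constructor overload
    return sp.sparse.csr_matrix((data * 2.0, indices, indptr), shape=shape)


y = scale_csr(sp.sparse.csr_matrix(np.eye(3)))
```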

TODO


📚 Documentation preview 📚: https://pytensor--1676.org.readthedocs.build/en/1676/

@codecov codecov bot commented Nov 18, 2025

Codecov Report

❌ Patch coverage is 83.20000% with 42 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.67%. Comparing base (8617558) to head (1de9697).
⚠️ Report is 30 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pytensor/link/numba/dispatch/sparse/basic.py | 82.38% | 24 Missing and 7 partials ⚠️ |
| pytensor/link/numba/dispatch/sparse/math.py | 90.76% | 3 Missing and 3 partials ⚠️ |
| pytensor/link/numba/dispatch/basic.py | 0.00% | 3 Missing ⚠️ |
| pytensor/tensor/type.py | 33.33% | 1 Missing and 1 partial ⚠️ |

❌ Your patch check has failed because the patch coverage (83.20%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1676      +/-   ##
==========================================
- Coverage   81.70%   81.67%   -0.04%     
==========================================
  Files         246      251       +5     
  Lines       53632    52549    -1083     
  Branches     9438     9271     -167     
==========================================
- Hits        43822    42919     -903     
+ Misses       7329     7258      -71     
+ Partials     2481     2372     -109     
| Files with missing lines | Coverage Δ |
|---|---|
| pytensor/link/numba/dispatch/compile_ops.py | 92.64% <ø> (ø) |
| pytensor/link/numba/dispatch/sparse/__init__.py | 100.00% <100.00%> (ø) |
| pytensor/sparse/variable.py | 76.79% <100.00%> (ø) |
| pytensor/tensor/type.py | 94.20% <33.33%> (-0.41%) ⬇️ |
| pytensor/link/numba/dispatch/basic.py | 81.77% <0.00%> (-3.25%) ⬇️ |
| pytensor/link/numba/dispatch/sparse/math.py | 90.76% <90.76%> (ø) |
| pytensor/link/numba/dispatch/sparse/basic.py | 82.38% <82.38%> (ø) |

... and 4 files with indirect coverage changes


ricardoV94 force-pushed the numba_sparse_ops branch 2 times, most recently from 14a11e2 to 285d7ca on November 19, 2025 12:12
ricardoV94 force-pushed the numba_sparse_ops branch 3 times, most recently from b0057c4 to 7c2edbc on January 15, 2026 17:40
ricardoV94 marked this pull request as ready for review January 15, 2026 17:40
ricardoV94 changed the title from "Implement Sparse Ops in Numba" to "Basic Sparse functionality in Numba" on Jan 15, 2026
ricardoV94 force-pushed the numba_sparse_ops branch 4 times, most recently from e9c1320 to dc3e431 on January 16, 2026 16:18
ricardoV94 and others added 4 commits January 16, 2026 17:53
Co-authored-by: Adrian Seyboldt <aseyboldt@users.noreply.github.com>
Co-authored-by: Jesse Grabowski <48652735+jessegrabowski@users.noreply.github.com>
Co-authored-by: Jesse Grabowski <48652735+jessegrabowski@users.noreply.github.com>
Co-authored-by: Jesse Grabowski <48652735+jessegrabowski@users.noreply.github.com>
@jessegrabowski (Member) left a comment:

approved with some questions

@numba_basic.numba_njit
def sparse_multiply_scalar(x, y):
    if same_dtype:
        z = x.copy()
Member: This can't ever be inplace?

Member Author: The base Op probably doesn't have an inplace optimization, as I basically copied the perform method. Will double check.

@overload(numba_deepcopy)
def numba_deepcopy_sparse(x):
Member: What's deep about this?

Member Author (ricardoV94, Jan 18, 2026): sparse_matrix.copy() does a deepcopy, just like array.copy(). But for other types, like list or rng, there's a difference between copy and deepcopy, hence the more explicit name.
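
For the list case, plain Python illustrates the distinction (not PR code, just context):

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)   # new outer list, inner lists still shared
deep = copy.deepcopy(nested)  # inner lists copied as well

nested[0][0] = 99
assert shallow[0][0] == 99  # the shallow copy sees the mutation
assert deep[0][0] == 1      # the deep copy does not
```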

def numba_funcify_CSMProperties(op, node, **kwargs):
    @numba_basic.numba_njit
    def csm_properties(x):
        # Reconsider this int32/int64. Scipy/base PyTensor use int32 for indices/indptr.
Member: Are we able to just go to int64 ourselves, or do we need to wait for upstream to change?

Member Author: We'd need to change the pre-existing Ops so that the fallback to object mode stays compatible. I'd leave that for a later PR, if we decide to do it.

shape_obj = c.box(typ.shape, struct_ptr.shape)

# Call scipy.sparse.cs[c|r]_matrix
cls_obj = c.pyapi.unserialize(c.pyapi.serialize_object(typ.instance_class))
Member: Does this line mean that we always have to come back to Python during construction of a numba sparse array?

Member Author (ricardoV94, Jan 18, 2026): No, just at the end of the outer jitted function, if there's a sparse variable in the outputs. You need this for every numba type; it's where the conversion from the internal numba representation to Python objects happens.

If a function only uses sparse arrays internally, this isn't called.
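
In other words (a hypothetical sketch, assuming this PR's sparse support is registered):

```python
import numba
import numpy as np
import scipy as sp


@numba.njit
def f(x):
    # Any sparse values built here stay in numba's native representation;
    # no Python interaction happens mid-function.
    y = sp.sparse.csr_matrix((x.data * 2.0, x.indices, x.indptr), shape=x.shape)
    return y  # boxing back to a real scipy matrix happens only here, on return


out = f(sp.sparse.csr_matrix(np.eye(4)))  # unboxed on entry, boxed on exit
```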

@overload(sp.sparse.csr_matrix)
def overload_csr_matrix(arg1, shape, dtype=None):
    if not isinstance(arg1, types.BaseAnonymousTuple) or len(arg1) != 3:
        return None
Member: What does it mean to return None from an overload? It fails?

Member Author: Overloads work by trying all registered implementations until one matches, so returning None just tells numba to keep trying the remaining candidates.
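
The pattern in a nutshell (a toy example of numba's overload API, not PR code):

```python
import numba
from numba.core import types
from numba.extending import overload


def describe(x):
    # Pure-Python stub; numba dispatches on the typed overloads below.
    raise NotImplementedError


@overload(describe)
def describe_integer(x):
    if not isinstance(x, types.Integer):
        return None  # "not my type": numba moves on to the next candidate

    def impl(x):
        return "integer"

    return impl


@overload(describe)
def describe_float(x):
    if not isinstance(x, types.Float):
        return None

    def impl(x):
        return "float"

    return impl


@numba.njit
def demo():
    return describe(1), describe(1.0)  # -> ("integer", "float")
```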

    return impl


@overload(np.shape)
Member: How does this interact with other overloads of np.shape? E.g., if I import this code and then call np.shape on an array in numba mode, does it still work as expected?

Member Author: Yes, like your question above. When this overload returns None, numba will keep trying other overloads of np.shape until one returns something other than None.

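So arrays are unaffected; a quick check (runnable as-is, since numba ships its own np.shape support for ndarrays):

```python
import numba
import numpy as np


@numba.njit
def shape_of(a):
    return np.shape(a)


# Even with the sparse overload registered, ndarrays still hit numba's
# built-in np.shape, because the sparse overload returns None for them.
assert shape_of(np.zeros((2, 3))) == (2, 3)
```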

out[0] = self.comparison(x, y).astype("uint8")
# FIXME: Scipy csc > csc outputs csr format, but make_node assumes it will be the same as inputs
# Casting to respect make_node, but this is very inefficient
# TODO: Why not go with default bool?
Member: why not indeed

Member Author: I suspect some archaic C bug. We should try to remove it in a separate PR.


  inputs = [x, y]  # Need to convert? e.g. assparse
- outputs = [psb.SparseTensorType(dtype=x.type.dtype, format=myformat)()]
+ outputs = [SparseTensorType(dtype=x.type.dtype, format=myformat)()]
Member: Not your code, but I hate the name myformat.

Member Author: Agree

x = sp.sparse.csr_matrix(np.eye(100))

y = test_fn(x)
assert y is not x and np.all(x.data == y.data) and np.all(x.indices == y.indices)
Member: Do you also need to test x.data is not y.data, or is that guaranteed by the first check?

Member Author: It would be even better to check not np.shares_memory. I'll do that.
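
For reference, that check would look like this (using x and y from the test above):

```python
import numpy as np

# Stronger than `y.data is not x.data`: np.shares_memory also catches
# views and slices that alias the same underlying buffer.
assert not np.shares_memory(y.data, x.data)
assert not np.shares_memory(y.indices, x.indices)
assert not np.shares_memory(y.indptr, x.indptr)
```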
