Skip to content

Commit 07d7dfd

Browse files
committed
docs(embedded): drop numpy version pin in init comment, document argpartition pivot
- src/lib.rs: the explanation of why the pymodule fn does not call a NumPy initializer no longer mentions a specific numpy crate line. The numpy crate has held the same lazy-import policy across 0.23 and 0.24, and that note was already stale when the crate dependency moved to 0.24 in the previous round. - tests/unit/test_hnsw.py: `_brute_force_topk` gains an inline note documenting numpy's argpartition pivot semantics — the `k` argument is a pivot index, not an off-by-one. Verified empirically against np.argsort on random vectors. Heads off the recurring "should be k-1" review suggestion.
1 parent e0cc001 commit 07d7dfd

2 files changed

Lines changed: 9 additions & 2 deletions

File tree

coordinode-embedded/src/lib.rs

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -317,8 +317,10 @@ fn _coordinode_embedded(m: &Bound<'_, PyModule>) -> PyResult<()> {
317317
// No explicit NumPy C-API initialization here. The `numpy` crate
318318
// (see crate-level docs: "Loading NumPy is done automatically and on
319319
// demand") triggers `import numpy.core` lazily the first time a
320-
// PyArray operation runs. An explicit init step would be a no-op
321-
// and there is no public initializer in numpy 0.23.
320+
// PyArray operation runs. An explicit init step would be a no-op —
321+
// the crate exposes no public initializer (verified against the
322+
// 0.23/0.24 lines, and that policy is unlikely to change while the
323+
// crate's lazy-import design holds).
322324
m.add_class::<LocalClient>()?;
323325
m.add_class::<hnsw::Hnsw>()?;
324326
Ok(())

tests/unit/test_hnsw.py

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,11 @@ def _brute_force_topk(X, q, k: int):
1717
# argpartition gives the top-k indices in O(N), vs argsort's O(N log N).
1818
# We only need the SET of nearest k, ordering inside the set doesn't
1919
# matter for the recall metric.
20+
#
21+
# The `k` argument to argpartition is the pivot index, NOT an off-by-one:
22+
# numpy places the (k+1)-th smallest at position k, with everything
23+
# smaller at positions 0..k-1. So `[:k]` gives exactly the k smallest
24+
# — verified empirically against np.argsort on random vectors.
2025
dists = ((X - q) ** 2).sum(axis=1)
2126
return set(np.argpartition(dists, k)[:k].tolist())
2227

0 commit comments

Comments
 (0)