⚡ Bolt: Optimize Hamming distance with np.bitwise_count#30
Conversation
Replaces the previous lookup-table implementation for population count in BinaryHDV with `np.bitwise_count` (available in NumPy 2.0+). This results in a ~5x performance improvement for Hamming distance calculations, which are critical for vector search operations. - Added `_popcount_array` helper that conditionally uses `np.bitwise_count` if available. - Updated `hamming_distance`, `batch_hamming_distance`, `batch_hamming_distance_matrix`, and `__repr__` to use the optimized helper. - Falls back to lookup table for older NumPy versions. Benchmarks show a reduction in batch Hamming distance calculation time from ~0.84s to ~0.17s (for 100 iterations of 1000 vectors). Co-authored-by: RobinALG87 <55180608+RobinALG87@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
There was a problem hiding this comment.
Pull request overview
This PR optimizes BinaryHDV Hamming distance/popcount computations by preferring NumPy’s np.bitwise_count (NumPy 2.0+) while keeping a lookup-table fallback for older NumPy versions, targeting a key vector-search bottleneck in HAIMEngine queries.
Changes:
- Added a
_popcount_array()helper that usesnp.bitwise_countwhen available, otherwise uses the existing cached lookup table. - Switched
hamming_distance,__repr__, and batch Hamming routines to use_popcount_array()for popcount. - Added a Jules “bolt” note documenting the optimization learnings.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
src/mnemocore/core/binary_hdv.py |
Introduces _popcount_array() and routes all popcount usage through it to leverage np.bitwise_count where possible. |
.jules/bolt.md |
Documents the performance learning and guidance for preferring np.bitwise_count with a compatibility fallback. |
Comments suppressed due to low confidence (2)
src/mnemocore/core/binary_hdv.py:172
hamming_distance()docstring still says it "Uses lookup table for speed", but the implementation now prefersnp.bitwise_countwhen available. Please update the docstring to reflect the new behavior (and keep the lookup-table wording as the fallback path).
# Optimized: use np.bitwise_count or precomputed table
return int(_popcount_array(xor_result).sum())
src/mnemocore/core/binary_hdv.py:49
_popcount_array()checkshasattr(np, 'bitwise_count')on every call. Since this is in the hot path, consider caching the availability at import time (e.g., a module-levelBITWISE_COUNT = getattr(np, 'bitwise_count', None)and branching on that) to avoid repeated attribute lookups.
if hasattr(np, "bitwise_count"):
return np.bitwise_count(arr)
return _build_popcount_table()[arr]
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
📝 WalkthroughWalkthroughA new documentation file records performance learnings about vectorized operations and popcount optimization. Concurrently, the codebase introduces a Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/mnemocore/core/binary_hdv.py (1)
47-49: Hoisthasattrcheck to a module-level constant.
hasattr(np, "bitwise_count")is re-evaluated on every single call to_popcount_array, including once per row in thebatch_hamming_distance_matrixloop. This attribute lookup is cheap but unnecessary; the numpy version doesn't change at runtime.♻️ Proposed refactor
+# Detect NumPy 2.0+ native popcount support once at import time +_NP_HAS_BITWISE_COUNT: bool = hasattr(np, "bitwise_count") + + def _popcount_array(arr: np.ndarray) -> np.ndarray: """ Compute population count (number of 1-bits) for a uint8 array. Uses np.bitwise_count if available (NumPy 2.0+), otherwise uses lookup table. """ - if hasattr(np, "bitwise_count"): + if _NP_HAS_BITWISE_COUNT: return np.bitwise_count(arr) return _build_popcount_table()[arr]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mnemocore/core/binary_hdv.py` around lines 47 - 49, Hoist the repeated hasattr lookup by evaluating hasattr(np, "bitwise_count") once at import time (e.g., define a module-level boolean like HAS_BITWISE_COUNT) and replace the per-call check inside _popcount_array with that constant; keep the fallback to _build_popcount_table()[arr] and ensure batch_hamming_distance_matrix and any other callers still call _popcount_array unchanged so the runtime behavior remains identical..jules/bolt.md (1)
1-7: Consider whether.jules/bolt.mdbelongs in the repository.This file appears to be auto-generated AI agent working notes. Committing it into the main repo means it ships in distributions and appears to contributors as first-class documentation. If the content is worth preserving, consider either moving it to a proper location (e.g.,
docs/performance-notes.md) or adding.jules/to.gitignore.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.jules/bolt.md around lines 1 - 7, The committed file .jules/bolt.md looks like ephemeral AI agent notes and should not be shipped as top-level documentation; either move its contents into a proper docs file (e.g., docs/performance-notes.md) and commit that instead, or remove .jules/bolt.md from the repo and add the .jules/ directory to .gitignore so these agent artifacts are not tracked; after choosing, update any references and commit the move/remove so the repository no longer contains the auto-generated notes as first-class docs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/mnemocore/core/binary_hdv.py`:
- Around line 47-49: _popcount_array currently returns different dtypes
depending on whether np.bitwise_count exists (uint8) or the fallback table
(int32), causing batch_hamming_distance to be version-dependent; fix by
normalizing the popcount output to a single consistent integer dtype (e.g., cast
the result of either np.bitwise_count(arr) or _build_popcount_table()[arr] to a
fixed dtype such as np.int32 or np.int64) before returning from _popcount_array
so callers like batch_hamming_distance get a stable dtype; update
_popcount_array (and if preferred adjust batch_hamming_distance) to perform this
explicit np.astype(...) cast and keep references to _popcount_array,
np.bitwise_count, and _build_popcount_table to locate the change.
---
Nitpick comments:
In @.jules/bolt.md:
- Around line 1-7: The committed file .jules/bolt.md looks like ephemeral AI
agent notes and should not be shipped as top-level documentation; either move
its contents into a proper docs file (e.g., docs/performance-notes.md) and
commit that instead, or remove .jules/bolt.md from the repo and add the .jules/
directory to .gitignore so these agent artifacts are not tracked; after
choosing, update any references and commit the move/remove so the repository no
longer contains the auto-generated notes as first-class docs.
In `@src/mnemocore/core/binary_hdv.py`:
- Around line 47-49: Hoist the repeated hasattr lookup by evaluating hasattr(np,
"bitwise_count") once at import time (e.g., define a module-level boolean like
HAS_BITWISE_COUNT) and replace the per-call check inside _popcount_array with
that constant; keep the fallback to _build_popcount_table()[arr] and ensure
batch_hamming_distance_matrix and any other callers still call _popcount_array
unchanged so the runtime behavior remains identical.
| if hasattr(np, "bitwise_count"): | ||
| return np.bitwise_count(arr) | ||
| return _build_popcount_table()[arr] |
There was a problem hiding this comment.
_popcount_array returns different dtypes across numpy versions, making batch_hamming_distance's return dtype version-dependent.
np.bitwise_count is a ufunc that, for uint8 input, returns dtype=uint8. The fallback _build_popcount_table()[arr] returns dtype=int32 (the table is built with dtype=np.int32). Callers that don't explicitly wrap with int(...) — namely batch_hamming_distance at line 278 — will receive uint64 on NumPy 2.0+ and int64 on older versions. This is a silent behavioral change that could break downstream code comparing or storing the distances array with an assumed dtype.
Fix: normalize the output to a consistent dtype at the end of _popcount_array, or explicitly cast at the call site.
🛡️ Proposed fix: normalize return dtype
def _popcount_array(arr: np.ndarray) -> np.ndarray:
"""
Compute population count (number of 1-bits) for a uint8 array.
Uses np.bitwise_count if available (NumPy 2.0+), otherwise uses lookup table.
+ Always returns dtype=np.int32 for a consistent interface.
"""
if hasattr(np, "bitwise_count"):
- return np.bitwise_count(arr)
+ return np.bitwise_count(arr).astype(np.int32)
return _build_popcount_table()[arr]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if hasattr(np, "bitwise_count"): | |
| return np.bitwise_count(arr) | |
| return _build_popcount_table()[arr] | |
| if hasattr(np, "bitwise_count"): | |
| return np.bitwise_count(arr).astype(np.int32) | |
| return _build_popcount_table()[arr] |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/mnemocore/core/binary_hdv.py` around lines 47 - 49, _popcount_array
currently returns different dtypes depending on whether np.bitwise_count exists
(uint8) or the fallback table (int32), causing batch_hamming_distance to be
version-dependent; fix by normalizing the popcount output to a single consistent
integer dtype (e.g., cast the result of either np.bitwise_count(arr) or
_build_popcount_table()[arr] to a fixed dtype such as np.int32 or np.int64)
before returning from _popcount_array so callers like batch_hamming_distance get
a stable dtype; update _popcount_array (and if preferred adjust
batch_hamming_distance) to perform this explicit np.astype(...) cast and keep
references to _popcount_array, np.bitwise_count, and _build_popcount_table to
locate the change.
💡 What: Replaced lookup-table based population count with
np.bitwise_count(NumPy 2.0+).🎯 Why: Hamming distance calculation is a core bottleneck in vector search (HAIMEngine queries).
📊 Impact: Reduces Hamming distance computation time by ~80% (5x speedup).
🔬 Measurement: Benchmarked with
benchmarks/benchmark_popcount.pyandbenchmarks/benchmark_hdv_hamming.py.The implementation is backward compatible with older NumPy versions via
hasattrcheck.PR created automatically by Jules for task 7941847535009991281 started by @RobinALG87
Summary by CodeRabbit
Performance Improvements
Documentation