perf: reduce memory use when splitting IVF partitions#6687
Open
The optimize/append path created `IvfIndexBuilder` with `NoopIndexBuildProgress`, so progress callbacks were silently ignored. This adds a `progress` field to `OptimizeOptions` and passes it through to the builder in all index type variants of `optimize_vector_indices_v2`. Also adds shuffle stage reporting in `shuffle_data()`. Ref #6378 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…signment

Previously, `build_split_plan` loaded all raw vectors for every partition being split, and ran up to `num_cpus` partitions in parallel. For high-dimensional vectors this caused OOM. Similarly, `collect_candidate_moves` loaded neighbor partitions in parallel.

This splits the work into two phases:

- Training (parallel, low memory): sample 512 row IDs per partition, load only those vectors, train kmeans.
- Assignment (sequential, high memory): load full raw vectors one partition at a time. Candidate moves also run sequentially.

Ref #6378

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
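The training phase described in this commit can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code in this PR: `sample_row_ids` and `train_two_centroids` are hypothetical names, the stride-based sampling stands in for whatever sampling the PR actually uses, and the k-means here is a plain Lloyd iteration with a naive initialization.

```rust
/// Hypothetical constant mirroring the 512-row sample size described above.
const SPLIT_SAMPLE_SIZE: usize = 512;

/// Pick up to `n` row IDs at an even stride. This is an assumption made to
/// keep the sketch dependency-free; a real implementation may sample randomly.
fn sample_row_ids(row_ids: &[u64], n: usize) -> Vec<u64> {
    if row_ids.len() <= n {
        return row_ids.to_vec();
    }
    (0..n).map(|i| row_ids[i * row_ids.len() / n]).collect()
}

/// Squared L2 distance between two vectors of equal length.
fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Lloyd's k-means with k = 2, run over the sampled vectors only.
fn train_two_centroids(sample: &[Vec<f32>], iters: usize) -> [Vec<f32>; 2] {
    assert!(sample.len() >= 2, "need at least two vectors to split");
    let dim = sample[0].len();
    // Naive initialization: first and last sampled vector.
    let mut centroids = [sample[0].clone(), sample[sample.len() - 1].clone()];
    for _ in 0..iters {
        let mut sums = [vec![0f32; dim], vec![0f32; dim]];
        let mut counts = [0usize; 2];
        for v in sample {
            // Assign the vector to the closer of the two centroids.
            let c = if l2_sq(v, &centroids[0]) <= l2_sq(v, &centroids[1]) { 0 } else { 1 };
            counts[c] += 1;
            for (s, x) in sums[c].iter_mut().zip(v) {
                *s += x;
            }
        }
        // Recompute each centroid as the mean of its assigned vectors.
        for c in 0..2 {
            if counts[c] > 0 {
                centroids[c] = sums[c].iter().map(|s| s / counts[c] as f32).collect();
            }
        }
    }
    centroids
}
```

Only the sampled vectors are ever loaded in this phase, so its memory cost is bounded by the sample size times the vector dimension regardless of how large the partition is.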
Previously, splitting oversized IVF partitions during index optimization loaded all raw vectors for every split partition and their neighbors into memory simultaneously (~11.5 GB for 30 partitions at 3072 dims). This refactors the split path to reuse the existing streaming shuffle infrastructure: train new centroids from samples, then stream affected partition vectors through the IVF+quantizer transform pipeline into temp files on disk. Peak memory drops from O(all split vectors) to O(one batch). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
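To illustrate why streaming bounds peak memory by one batch, here is a minimal sketch of batch-at-a-time assignment. It is not the shuffle pipeline used in this PR: `stream_assign` and `nearest_centroid` are hypothetical names, the `sink` closure stands in for writing to temp files on disk, and the quantizer step is omitted.

```rust
use std::collections::HashMap;

/// Squared L2 distance between two vectors of equal length.
fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the centroid closest to `v` (first one wins on ties).
fn nearest_centroid(v: &[f32], centroids: &[Vec<f32>]) -> usize {
    let mut best = 0;
    let mut best_dist = f32::INFINITY;
    for (i, c) in centroids.iter().enumerate() {
        let d = l2_sq(v, c);
        if d < best_dist {
            best = i;
            best_dist = d;
        }
    }
    best
}

/// Consume batches of (row_id, vector) one at a time, hand each assignment
/// to `sink`, and return the number of rows routed to each partition.
fn stream_assign<I, F>(batches: I, centroids: &[Vec<f32>], mut sink: F) -> HashMap<usize, usize>
where
    I: IntoIterator<Item = Vec<(u64, Vec<f32>)>>,
    F: FnMut(usize, u64),
{
    let mut counts = HashMap::new();
    for batch in batches {
        for (row_id, v) in &batch {
            let partition = nearest_centroid(v, centroids);
            sink(partition, *row_id);
            *counts.entry(partition).or_insert(0) += 1;
        }
        // `batch` is dropped here, so only one batch of raw vectors is
        // alive at any point in the loop.
    }
    counts
}
```

If `batches` is a lazy iterator over a reader, the raw vectors never accumulate: what stays resident is the centroid table, the current batch, and the small per-partition counters.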
- Simplify three-way partition routing by extracting `split_reader` once
- Remove dead `AssignOp::Remove` variant and simplify `build_assign_batch`
- Add `Debug` impl for `PartitionAdjustment`
- Add `SPLIT_SAMPLE_SIZE` constant for kmeans training sample size
- Include partition index in "centroid not found" error message

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from 57cd64b to efc816e
wjones127 commented May 5, 2026
Extract `apply_centroid_splits` from `compute_split_centroids` to make the centroid ordering logic directly testable. Add a unit test verifying that K simultaneous splits on N partitions produce N+K centroids with unchanged partitions at their original indices and centroid2s appended in split order. Replaces the removed `finalize_split_plans_reassigns_filtered_centroid_ids` test. The other two removed tests' properties are now covered structurally (global nearest-centroid assignment) and by existing integration tests (`test_split_multiple_partitions_in_one_optimize`, `test_partition_split_on_append_multivec`). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
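The ordering property that the new unit test checks can be sketched as below. This is an illustration under assumptions, not the function from the PR: the `Split` struct and `apply_centroid_splits_sketch` are hypothetical, and the sketch assumes the first new centroid takes over the slot of the partition being split, which the commit message does not state explicitly.

```rust
/// Hypothetical description of one partition split.
struct Split {
    partition: usize,
    centroid1: Vec<f32>,
    centroid2: Vec<f32>,
}

/// Apply K splits to N centroids, producing N + K centroids.
/// Unsplit partitions keep their index; each `centroid2` is appended
/// in the order the splits are given.
fn apply_centroid_splits_sketch(centroids: &[Vec<f32>], splits: &[Split]) -> Vec<Vec<f32>> {
    let mut out = centroids.to_vec();
    for s in splits {
        // Assumption: centroid1 reuses the split partition's original slot.
        out[s.partition] = s.centroid1.clone();
        out.push(s.centroid2.clone());
    }
    out
}
```

Keeping untouched partitions at their original indices means existing partition IDs stay valid for every row that was not part of a split.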
We improve memory use when splitting IVF partitions in two of the stages:

- Training: we now sample just 512 of the vectors, which should be sufficient to train for just 2 centroids.
- Assignment: previously, the full raw vectors for every partition being split were loaded into a `Vec<SplitPlan>`. This is the largest source of peak memory use currently. If many partitions are being split, this can be > 100GB. We now instead stream these raw vectors through the partition assignment and quantization pipeline, just like we do in the case of new indices.

This PR also adds progress reporting to `optimize_indices`, to make this more observable.

Test workload: IVF_PQ append on 560K base rows (16 partitions, 3072-dim float32 vectors) with 160K new rows. This triggers partition splitting, since each partition exceeds the 32K row threshold.
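As a rough sanity check on the workload numbers, the arithmetic can be written out as below. The helper names are hypothetical and the figures cover only raw float32 vector data, so they are not a prediction of RSS.

```rust
/// Size of one float32 element in bytes.
const BYTES_PER_F32: u64 = 4;

/// Bytes needed to hold `rows` raw float32 vectors of dimension `dim`.
fn raw_vector_bytes(rows: u64, dim: u64) -> u64 {
    rows * dim * BYTES_PER_F32
}

/// Average rows per partition, assuming an even spread (an assumption;
/// real IVF partitions are usually unevenly sized).
fn avg_rows_per_partition(rows: u64, partitions: u64) -> u64 {
    rows / partitions
}
```

With these helpers, `avg_rows_per_partition(560_000, 16)` gives 35,000 rows, which is already above the 32K threshold before the append, and `raw_vector_bytes(720_000, 3072)` gives 8,847,360,000 bytes, or about 8.8 GB of raw vectors for the base and new rows combined.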
Peak RSS: 26.2 GB before, 4.1 GB after.
Runtime: 93s before, 16.5s after — 5.6x faster as well
Closes #6378