(draft) integrate determinant diversity in topk full precision search and disk search #1011

Draft
narendatha wants to merge 23 commits into main from u/narendatha/det_div_plugins

Conversation

@narendatha
Contributor

No description provided.

@narendatha narendatha changed the base branch from main to mhildebr/benchmark-plugins May 4, 2026 12:41
Base automatically changed from mhildebr/benchmark-plugins to main May 5, 2026 00:45
narendatha and others added 10 commits May 5, 2026 12:41
Replace kind()-based string equality checks with explicit is_match()
and get() phase-shape helpers on plugin structs. This avoids fragile
ordering assumptions and makes each plugin responsible for recognising
its own phase shape.
Co-authored-by: Copilot <copilot@github.com>
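
The phase-shape helpers described in this commit can be pictured with a minimal sketch. Everything here is an assumption made for illustration: `SearchPhase`, `DetDivParams`, the `power` field, and the two plugin structs are hypothetical stand-ins, not the types in diskann-benchmark. The point is only that each plugin pattern-matches on the phase itself rather than comparing a `kind()` string.

```rust
/// Hypothetical description of one search phase in a benchmark job.
/// The real input types in diskann-benchmark are not shown in this PR page.
#[derive(Debug, Clone, PartialEq)]
pub enum SearchPhase {
    TopK { k: usize },
    DeterminantDiversity { k: usize, power: f32 },
}

/// Hypothetical parameters a determinant-diversity plugin would extract.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct DetDivParams {
    pub k: usize,
    pub power: f32,
}

pub struct TopKPlugin;
pub struct DetDivPlugin;

impl TopKPlugin {
    /// Returns the phase parameters only if the phase has this plugin's shape.
    pub fn get(phase: &SearchPhase) -> Option<usize> {
        match phase {
            SearchPhase::TopK { k } => Some(*k),
            _ => None,
        }
    }

    /// True when this plugin recognises the phase; no string comparison involved.
    pub fn is_match(phase: &SearchPhase) -> bool {
        Self::get(phase).is_some()
    }
}

impl DetDivPlugin {
    /// Returns the determinant-diversity parameters, or None for other shapes.
    pub fn get(phase: &SearchPhase) -> Option<DetDivParams> {
        match phase {
            SearchPhase::DeterminantDiversity { k, power } => Some(DetDivParams {
                k: *k,
                power: *power,
            }),
            _ => None,
        }
    }

    pub fn is_match(phase: &SearchPhase) -> bool {
        Self::get(phase).is_some()
    }
}
```

Because `is_match()` is defined in terms of `get()`, the two cannot disagree, and the order in which plugins are tried no longer matters as long as the shapes are disjoint.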
…plugins

# Conflicts:
#	diskann-benchmark/src/backend/index/benchmarks.rs
#	diskann-benchmark/src/backend/index/product.rs
#	diskann-benchmark/src/backend/index/scalar.rs
#	diskann-benchmark/src/backend/index/search/plugins.rs
#	diskann-benchmark/src/backend/index/spherical.rs
#	diskann-benchmark/src/inputs/graph_index.rs
#	diskann-benchmark/src/inputs/mod.rs
Co-authored-by: Copilot <copilot@github.com>
…ojection issue

This commit explores approaches to wire real candidate vectors into async determinant-diversity post-processing.

Current state: IN COMPILATION ERROR (intentional for analysis)

Attempted approaches:
1. Initial shim-trait FullPrecisionVectorAccessor with async get_full_precision_vector()
   - Resulted in 'implementation not general enough' at search_with() call

2. Removed explicit for<'a> post_processor::DeterminantDiversity bound
   - Still fails - the constraint is inherent in search_with() signature itself

Root cause analysis:
- search_with() requires: PP: for<'a> SearchPostProcess<S::SearchAccessor<'a>, T, O>
- This means post-processor must work for ANY accessor lifetime 'a
- But query = queries.row(query_idx) is borrowed for specific loop iteration lifetime
- These are fundamentally incompatible - a borrowed value can't satisfy for<'a> generically

Compiler errors (3 total):
- 'not general enough': implementation needed for for<'a> but found specific '0
- 'does not live long enough': queries lifetime too short for 'static requirement
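
The bound shape behind these errors can be reproduced in a small, self-contained sketch. The names below (`Accessor`, `PostProcess`, `BorrowedQuery`, and this toy `search_with`) are hypothetical simplifications and not the DiskANN API; they only mirror the `for<'a>` requirement quoted in the root cause analysis. In this reduced setting, a post-processor that borrows its query does satisfy the bound, provided the impl keeps the query lifetime separate from the accessor lifetime.

```rust
/// Hypothetical accessor that borrows vector storage for a lifetime 'a.
pub struct Accessor<'a> {
    pub vectors: &'a [Vec<f32>],
}

/// Hypothetical, simplified post-processing trait, generic over the accessor type.
pub trait PostProcess<A> {
    fn process(&self, accessor: A, ids: &mut [u32]);
}

/// Mirrors the shape of the bound quoted above: `P` must work for an
/// accessor of ANY lifetime, because the accessor is created in here.
pub fn search_with<P>(store: &[Vec<f32>], post: &P) -> Vec<u32>
where
    P: for<'a> PostProcess<Accessor<'a>>,
{
    let accessor = Accessor { vectors: store };
    let mut ids: Vec<u32> = (0..store.len() as u32).collect();
    post.process(accessor, &mut ids);
    ids
}

fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Post-processor that borrows its query for a lifetime 'q.
pub struct BorrowedQuery<'q> {
    pub query: &'q [f32],
}

// Compiles: 'a is independent of 'q, so the impl exists for every 'a.
// Writing `impl<'q> PostProcess<Accessor<'q>> for BorrowedQuery<'q>` instead
// ties the two lifetimes together, and the call to `search_with` would then
// typically be rejected as "implementation is not general enough".
impl<'q, 'a> PostProcess<Accessor<'a>> for BorrowedQuery<'q> {
    fn process(&self, accessor: Accessor<'a>, ids: &mut [u32]) {
        // Order candidate ids by squared L2 distance to the borrowed query.
        ids.sort_by(|&x, &y| {
            let dx = squared_l2(&accessor.vectors[x as usize], self.query);
            let dy = squared_l2(&accessor.vectors[y as usize], self.query);
            dx.total_cmp(&dy)
        });
    }
}
```

Whether the same decoupling applies to the real `SearchStrategy` and `SearchPostProcess` traits depends on their associated types, which this sketch does not model.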

Files modified:
- diskann-benchmark/src/backend/index/benchmarks.rs:
  * Removed explicit for<'a> post_processor::DeterminantDiversity constraint
  * Narrowed plugin impl to FullPrecisionProvider<f32>

- diskann-benchmark/src/backend/index/post_processor/determinant_diversity.rs:
  * Added shim trait FullPrecisionVectorAccessor
  * Async method get_full_precision_vector(&mut self, id) -> impl Future<...>

Next steps to investigate:
- Move determinant-diversity outside search_with() as post-processing reranking
- This avoids HRTB entirely by applying after candidates are returned
- Benchmark impact: measure recall/QPS with external reranking vs baseline
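
The external-reranking idea in these next steps can be sketched as a plain function applied to candidates after the search returns. The greedy rule below is an assumption chosen for illustration and is not necessarily the determinant-diversity algorithm in this PR: it repeatedly picks the candidate with the largest residual norm after projecting out the vectors already chosen, which greedily grows the Gram determinant of the selected set. The function name and the `(id, vector)` candidate layout are hypothetical.

```rust
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Greedy determinant-style reranking over already-returned candidates.
/// Illustrative only: it ignores query relevance and takes owned vectors.
pub fn rerank_by_determinant(candidates: &[(u32, Vec<f32>)], k: usize) -> Vec<u32> {
    // Each residual starts as a copy of the candidate vector.
    let mut residuals: Vec<Vec<f32>> = candidates.iter().map(|(_, v)| v.clone()).collect();
    let mut taken = vec![false; candidates.len()];
    let mut out = Vec::with_capacity(k.min(candidates.len()));

    while out.len() < k {
        // Find the remaining candidate with the largest squared residual norm.
        let mut best: Option<(usize, f32)> = None;
        for (i, r) in residuals.iter().enumerate() {
            if taken[i] {
                continue;
            }
            let norm2 = dot(r, r);
            if best.map_or(true, |(_, b)| norm2 > b) {
                best = Some((i, norm2));
            }
        }
        // Stop early when every candidate has been used.
        let Some((chosen, norm2)) = best else { break };
        taken[chosen] = true;
        out.push(candidates[chosen].0);

        if norm2 <= f32::EPSILON {
            continue; // Nothing left to project out.
        }
        // Remove the chosen direction from every remaining residual.
        let basis = residuals[chosen].clone();
        for (i, r) in residuals.iter_mut().enumerate() {
            if taken[i] {
                continue;
            }
            let coeff = dot(r.as_slice(), &basis) / norm2;
            for (x, b) in r.iter_mut().zip(&basis) {
                *x -= coeff * b;
            }
        }
    }
    out
}
```

Since this runs on owned candidate data after the search call, it never appears in a `for<'a>` bound, which is the property the next steps are after. The cost is one extra pass over the candidates per query, so recall and QPS would need to be measured against the baseline.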

Related context:
- Disk index determinant-diversity works correctly (uses real vectors, shows 51-53% QPS cost)
- Shared algorithm fixed (distance-to-similarity scoring direction)
- Branch already merged with origin/main
Co-authored-by: Copilot <copilot@github.com>
…educe duplication

- Use for<'a, 'b> SearchStrategy bound (user-provided fix) to break HRTB lifetime
  projection issue in the search_with post-processor constraint
- Wire FullPrecisionVectorAccessor shim trait so async det-div post-processor fetches
  real candidate vectors instead of placeholder distances
- Populate QPS/latency metrics in async det-div benchmark path (previously all 'missing')
- Extract run_topk_timed helper to eliminate ~100 lines of duplicated loop/timing/recall
  machinery from DeterminantDiversity::run
- Update async-determinant-diversity.json example tag (async-index-build -> graph-index-build)
- Fix clippy::manual_async_fn in FullPrecisionVectorAccessor shim trait
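
A helper of the kind named above could look roughly like the following. Only the name `run_topk_timed` comes from the commit message; the signature, the `TimedRun` struct, and the single-threaded QPS definition are assumptions for illustration. The sketch factors the per-query loop, latency capture, and recall@k computation into one place so that each search variant only supplies a closure.

```rust
use std::collections::HashSet;
use std::time::{Duration, Instant};

/// Hypothetical result bundle; the real benchmark reports more fields.
pub struct TimedRun {
    pub latencies: Vec<Duration>,
    pub qps: f64,
    pub recall: f64,
}

/// Runs `search` once per query, timing each call and scoring recall@k
/// against `ground_truth`. Assumes each result list has unique ids.
pub fn run_topk_timed<F>(ground_truth: &[Vec<u32>], k: usize, mut search: F) -> TimedRun
where
    F: FnMut(usize) -> Vec<u32>,
{
    let mut latencies = Vec::with_capacity(ground_truth.len());
    let mut hits = 0usize;

    for (query_idx, truth) in ground_truth.iter().enumerate() {
        // Time only the search call, not the recall bookkeeping.
        let t0 = Instant::now();
        let result = search(query_idx);
        latencies.push(t0.elapsed());

        let expected: HashSet<u32> = truth.iter().take(k).copied().collect();
        hits += result
            .iter()
            .take(k)
            .filter(|id| expected.contains(*id))
            .count();
    }

    // Single-threaded QPS: queries divided by summed search time.
    let search_time: Duration = latencies.iter().sum();
    let secs = search_time.as_secs_f64();
    let total = ground_truth.len() * k;

    TimedRun {
        qps: if secs > 0.0 { ground_truth.len() as f64 / secs } else { 0.0 },
        recall: if total > 0 { hits as f64 / total as f64 } else { 0.0 },
        latencies,
    }
}
```

A caller passes a closure that performs one query, for example `run_topk_timed(&gt, 10, |q| search_one(q))`, where `search_one` is a placeholder for whichever search path is being benchmarked.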
@codecov-commenter

codecov-commenter commented May 6, 2026

Codecov Report

❌ Patch coverage is 11.76471% with 495 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.03%. Comparing base (be804aa) to head (701ce8e).
⚠️ Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...vider/async_/determinant_diversity_post_process.rs | 0.00% | 194 Missing ⚠️ |
| diskann-benchmark/src/backend/index/benchmarks.rs | 5.51% | 120 Missing ⚠️ |
| diskann-disk/src/search/provider/disk_provider.rs | 18.18% | 90 Missing ⚠️ |
| ...kend/index/post_processor/determinant_diversity.rs | 0.00% | 41 Missing ⚠️ |
| diskann-benchmark/src/inputs/graph_index.rs | 55.26% | 17 Missing ⚠️ |
| diskann-benchmark/src/inputs/disk.rs | 0.00% | 11 Missing ⚠️ |
| diskann-benchmark/src/inputs/post_processor.rs | 0.00% | 11 Missing ⚠️ |
| ...kann-benchmark/src/backend/index/search/plugins.rs | 62.96% | 10 Missing ⚠️ |
| diskann-tools/src/utils/search_disk_index.rs | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1011      +/-   ##
==========================================
- Coverage   90.63%   89.03%   -1.61%     
==========================================
  Files         460      463       +3     
  Lines       85424    85961     +537     
==========================================
- Hits        77427    76532     -895     
- Misses       7997     9429    +1432     
| Flag | Coverage Δ |
|---|---|
| miri | 89.03% <11.76%> (-1.61%) ⬇️ |
| unittests | 88.87% <11.76%> (-1.73%) ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files with missing lines | Coverage Δ |
|---|---|
| diskann-benchmark/src/backend/index/mod.rs | 100.00% <ø> (ø) |
| diskann-benchmark/src/backend/index/spherical.rs | 100.00% <ø> (ø) |
| diskann-benchmark/src/inputs/mod.rs | 78.26% <ø> (ø) |
| diskann-disk/src/build/builder/core.rs | 95.24% <100.00%> (+<0.01%) ⬆️ |
| diskann-tools/src/utils/search_disk_index.rs | 0.00% <0.00%> (ø) |
| ...kann-benchmark/src/backend/index/search/plugins.rs | 62.31% <62.96%> (-3.69%) ⬇️ |
| diskann-benchmark/src/inputs/disk.rs | 4.24% <0.00%> (-0.24%) ⬇️ |
| diskann-benchmark/src/inputs/post_processor.rs | 0.00% <0.00%> (ø) |
| diskann-benchmark/src/inputs/graph_index.rs | 39.93% <55.26%> (+2.91%) ⬆️ |
| ...kend/index/post_processor/determinant_diversity.rs | 0.00% <0.00%> (ø) |

... and 3 more

... and 45 files with indirect coverage changes

