[flat index] Flat Search Interface by arkrishn94 · Pull Request #983 · microsoft/DiskANN

arkrishn94 · 2026-04-28T05:36:46Z

This PR introduces a trait interface and a light index to support brute-force search for providers that can be used as/are a flat-index. There is an associated RFC that walks through the interface and associated implementation in diskann as a new flat module.

Rendered RFC link.

Motivation

The repo has no first-class surface for brute-force search today.

This PR introduces a small trait hierarchy that gives flat search the same provider-agnostic shape that Accessor / SearchStrategy give graph search, so any backend (in-memory full-precision, quantized, disk, remote) can plug in once and reuse a shared algorithm.

This allows implementations of algorithms - diverse search, knn search, range search, pre-filtered search - to live in the repo and let consumers only worry about defining the the data is accessed / provided; just like we do for graph search.

Important refactor

We start with an important refactor of BuildQueryComputer and its associated ElementRef<'a> that it acts on in diskann::providers.

`HasElementRef` - a new minimal shape trait

This zero-method traits was extracted so the streaming flat traits and the random-access Accessor can share their associated types without one inheriting from the other:

pub trait HasElementRef   { type ElementRef<'a>; }

Accessor: HasId + HasElementRef, and the flat traits below depend on the same two pieces independently.

`BuildQueryComputer<T>` - shared query preprocessing

We split the trait in diskann::provider to into:

a core trait that has the build_query_computer method and the QueryComputer associated type, and,
a sub-trait titled DistancesUnordered that has the distances_unordered method.

This split allows both graph and flat indexes to require building of a computer without dragging in random access.

pub trait BuildQueryComputer<T>: HasElementRef {
    type QueryComputer: for<'a> PreprocessedDistanceFunction<Self::ElementRef<'a>, f32> + ...;
    type Error: StandardError;
    fn build_query_computer(&self, query: T) -> Result<Self::QueryComputer, Self::Error>;
}

Flat search - core components

The following trait and its subtrait are the core traits that define how a brute-force scan over the index is implemented.

`OnElementsUnordered` — the core scan

Implemented in flat/iterator.rs, this is a single (async) method that drives the entire scan via callback. Implementations choose iteration order, prefetching, and bulk reads. Algorithms see only (Id, ElementRef) pairs.

pub trait OnElementsUnordered: HasId + HasElementRef + Send + Sync {
    type Error: StandardError;

    fn on_elements_unordered<F>(&mut self, f: F) -> impl SendFuture<Result<(), Self::Error>>
    where
        F: Send + for<'a> FnMut(Self::Id, Self::ElementRef<'a>);
}

`flat::DistancesUnordered<T>` — fused scan + score

This subtrait fuses scanning with scoring. The default body loops on_elements_unordered and calls evaluate_similarity on each element. Backends that can fuse retrieval with scoring can override it.

pub trait DistancesUnordered<T>: OnElementsUnordered + BuildQueryComputer<T> {
    fn distances_unordered<F>(
        &mut self,
        computer: &<Self as BuildQueryComputer<T>>::QueryComputer,
        f: F,
    ) -> impl SendFuture<Result<(), <Self as OnElementsUnordered>::Error>>
    where
        F: Send + FnMut(<Self as HasId>::Id, f32);
}

The computer type comes from the implementor's own BuildQueryComputer<T> impl, so the visitor produces the computer that drives its own scan.

`flat::SearchStrategy<P, T>` — the glue

Implemented in flat/strategy.rs. Mirrors graph::glue::SearchStrategy (disambiguated by module path):

pub trait SearchStrategy<P, T>: Send + Sync where P: DataProvider {
    type Visitor<'a>: DistancesUnordered<T> where Self: 'a, P: 'a;
    type Error: StandardError;

    fn create_visitor<'a>(
        &'a self,
        provider: &'a P,
        context: &'a P::Context,
    ) -> Result<Self::Visitor<'a>, Self::Error>;
}

The strategy is a pure visitor factory. The visitor it returns carries BuildQueryComputer<T> itself, so query preprocessing and the scan run through the same object. Strategies are stateless per-call config — constructed at the call site, used for one search and then dropped.

`FlatIndex<P>` — the top-level handle

flat/index.rs. Thin 'static wrapper around a DataProvider. The search is implemented as:

asks the strategy for a per-query visitor,
builds the query computer from the visitor (BuildQueryComputer::build_query_computer),
drives visitor.distances_unordered(&computer, ...) through a NeighborPriorityQueue,
hands the survivors to a graph::glue::SearchPostProcess to write into the output buffer.

Note the SearchPostProcess bound: the same trait used by graph search!

`FlatIterator` + `Iterated<I>` — opt-in convenience

For backends that naturally expose element-at-a-time access, FlatIterator is a lending async iterator with a single next(). Iterated<I> wraps any FlatIterator and provides the OnElementsUnordered impl (and DistancesUnordered by inheritance) by looping over next() and reborrowing each element.

A backend opts in at exactly the right layer: bulk-friendly backends implement OnElementsUnordered directly; element-at-a-time backends implement FlatIterator and use Iterated for the rest.

codecov-commenter · 2026-04-28T15:40:09Z

Codecov Report

❌ Patch coverage is 0% with 77 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.55%. Comparing base (95b6593) to head (57d86e9).

Files with missing lines	Patch %	Lines
diskann/src/flat/index.rs	0.00%	49 Missing ⚠️
diskann/src/flat/iterator.rs	0.00%	28 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #983      +/-   ##
==========================================
+ Coverage   89.51%   90.55%   +1.03%     
==========================================
  Files         460      462       +2     
  Lines       85424    85501      +77     
==========================================
+ Hits        76469    77424     +955     
+ Misses       8955     8077     -878

Flag	Coverage Δ
miri	`90.55% <0.00%> (+1.03%)`	⬆️
unittests	`90.51% <0.00%> (+1.15%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
diskann-disk/src/search/provider/disk_provider.rs	`90.89% <ø> (ø)`
diskann-garnet/src/provider.rs	`83.36% <ø> (ø)`
...rc/inline_beta_search/encoded_document_accessor.rs	`0.00% <ø> (ø)`
...ilter/src/inline_beta_search/inline_beta_filter.rs	`0.00% <ø> (ø)`
...odel/graph/provider/async_/inmem/full_precision.rs	`98.39% <ø> (ø)`
...s/src/model/graph/provider/async_/inmem/product.rs	`90.17% <ø> (ø)`
...rs/src/model/graph/provider/async_/inmem/scalar.rs	`92.78% <ø> (ø)`
...src/model/graph/provider/async_/inmem/spherical.rs	`88.13% <ø> (ø)`
...ders/src/model/graph/provider/async_/inmem/test.rs	`69.34% <ø> (ø)`
...ers/src/model/graph/provider/async_/postprocess.rs	`100.00% <ø> (ø)`
... and 6 more

... and 41 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copilot

Pull request overview

This PR introduces an RFC plus an initial “flat” (sequential scan) search surface in diskann, analogous to the existing graph/random-access search pipeline built around DataProvider/Accessor.

Changes:

Added an RFC describing the flat iterator/strategy/index abstraction and trade-offs.
Added a new diskann::flat module with FlatIterator, FlatSearchStrategy, FlatIndex::knn_search, and FlatPostProcess (+ CopyFlatIds).
Exported the new flat module from the crate root.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.

Show a summary per file

File	Description
rfcs/00983-flat-search.md	RFC describing the design for sequential (“flat”) index search APIs.
diskann/src/lib.rs	Exposes the new `flat` module publicly.
diskann/src/flat/mod.rs	New module root + re-exports for the flat search surface.
diskann/src/flat/iterator.rs	Defines the async lending iterator primitive `FlatIterator`.
diskann/src/flat/strategy.rs	Defines `FlatSearchStrategy` to create per-query iterators and query computers.
diskann/src/flat/index.rs	Implements `FlatIndex` and the brute-force `knn_search` scan algorithm.
diskann/src/flat/post_process.rs	Defines `FlatPostProcess` and a basic `CopyFlatIds` post-processor.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-28T16:13:01Z

+ * Licensed under the MIT license.
+ */
+
+//! [`FlatIndex`] — the index wrapper for an on which we do flat search.


Doc comment grammar issue: "the index wrapper for an on which we do flat search" is missing a noun (e.g., "for an index" / "around a provider"). Please fix to avoid confusing rustdoc output.

Suggested change

//! [`FlatIndex`] — the index wrapper for an on which we do flat search.

//! [`FlatIndex`] — the index wrapper around a [`DataProvider`] on which we do flat search.

Copilot · 2026-04-28T16:13:02Z

+    S: FlatIterator,
+    T: ?Sized,
+{
+    type Error = crate::error::Infallible;


CopyFlatIds uses crate::error::Infallible, but the analogous graph::glue::CopyIds uses std::convert::Infallible (graph/glue.rs:417). Using the std type here too would improve consistency and reduce cognitive overhead for readers comparing the two pipelines.

Suggested change

type Error = crate::error::Infallible;

type Error = std::convert::Infallible;

Copilot · 2026-04-28T16:13:02Z

+```rust
+pub struct FlatIndex<P: DataProvider> {
+    provider: P,
+    /* private */
+}
+
+impl<P: DataProvider> FlatIndex<P> {
+    pub fn new(provider: P) -> Self;
+    pub fn provider(&self) -> &P;
+
+    pub fn knn_search<S, T, O, OB>(
+        &self,
+        k: NonZeroUsize,
+        strategy: &S,
+        processor: &PP,
+        context: &P::Context,
+        query: &T,
+        output: &mut OB,
+    ) -> impl SendFuture<ANNResult<SearchStats>>
+    where
+        S: FlatSearchStrategy<P, T>,
+        T: ?Sized + Sync,
+        O: Send,
+        OB: SearchOutputBuffer<O> + Send + ?Sized,
+}


The RFC FlatIndex::knn_search signature uses processor: &PP but PP is not declared in the generic parameter list, and the struct shows provider as private while the implementation has it as a public field. Please align the RFC snippet with the actual API so the rendered RFC stays accurate.

Copilot · 2026-04-28T16:13:02Z

+/// - [`Self::build_query_computer`] is iterator-independent — the same query can be
+///   pre-processed once and used against multiple iterators.
+///
+/// Both methods may borrow from the strategy itself.


The doc comment says "Both methods may borrow from the strategy itself", but QueryComputer is bounded by + 'static, so build_query_computer cannot return a computer that borrows from self. Consider rewording to clarify that create_iter may borrow, while the query computer must own its state (or is otherwise 'static).

Suggested change

/// Both methods may borrow from the strategy itself.

/// [`Self::create_iter`] may return an iterator that borrows from the strategy itself

/// and the provider. [`Self::build_query_computer`] may use the strategy while

/// constructing the query computer, but the returned [`Self::QueryComputer`] must own

/// its state or otherwise satisfy `'static`.

Copilot · 2026-04-28T16:13:02Z

+
+            iter.on_elements_unordered(|id, element| {
+                let dist = computer.evaluate_similarity(element);
+                cmps += 1;


cmps is counted with a u32 and incremented once per scanned element. In flat search a full scan could exceed u32::MAX, which will panic on overflow in debug builds and wrap in release, producing invalid stats. Consider using saturating arithmetic (cap at u32::MAX) or switching the counter to a wider type before converting to SearchStats.

Suggested change

cmps += 1;

cmps = cmps.saturating_add(1);

Copilot · 2026-04-28T16:13:03Z

+            })
+        }
+    }
+}


New behavior (FlatIndex::knn_search) is introduced without tests. Given the repo has unit tests for graph search and output buffers, it would be good to add at least one test covering: (1) correct top-k ordering, (2) that CopyFlatIds writes expected (id, distance) pairs, and (3) that SearchStats { cmps, result_count } are consistent for a tiny in-memory iterator.

Suggested change

}

}

#[cfg(test)]

mod tests {

use super::*;

fn run_tiny_flat_scan(

k: usize,

items: &[(u32, f32)],

) -> (Vec<(u32, f32)>, SearchStats) {

let mut queue = NeighborPriorityQueue::new(k);

let mut cmps = 0u32;

for (id, distance) in items.iter().copied() {

cmps += 1;

queue.insert(Neighbor::new(id, distance));

}

let copied: Vec<(u32, f32)> = queue

.iter()

.take(k)

.map(|neighbor| (neighbor.id, neighbor.distance))

.collect();

let stats = SearchStats {

cmps,

hops: 0,

result_count: copied.len() as u32,

range_search_second_round: false,

};

(copied, stats)

}

#[test]

fn knn_search_keeps_top_k_in_distance_order() {

let (copied, stats) = run_tiny_flat_scan(

3,

&[(10, 4.0), (11, 1.5), (12, 3.0), (13, 0.5), (14, 2.0)],

);

assert_eq!(copied, vec![(13, 0.5), (11, 1.5), (14, 2.0)]);

assert_eq!(stats.result_count, 3);

}

#[test]

fn copied_flat_ids_match_expected_id_distance_pairs() {

let (copied, _) = run_tiny_flat_scan(2, &[(21, 9.0), (22, 1.25), (23, 4.5)]);

assert_eq!(copied, vec![(22, 1.25), (23, 4.5)]);

}

#[test]

fn search_stats_are_consistent_for_tiny_in_memory_scan() {

let items = &[(31, 7.0), (32, 2.0), (33, 5.0), (34, 1.0)];

let (copied, stats) = run_tiny_flat_scan(2, items);

assert_eq!(stats.cmps, items.len() as u32);

assert_eq!(stats.hops, 0);

assert_eq!(stats.result_count, copied.len() as u32);

assert!(!stats.range_search_second_round);

}

}

hildebrandmw

Thanks Aditya. Left a few general comments with some ideas on how we might improve our code sharing. In general, I'm not a fan of prefixing everything with Flat. We already have the flat module so flat::SearchStrategy reads fine to me as opposed to flat::FlatSearchStrategy, which is a little redundant.

hildebrandmw · 2026-04-29T21:39:46Z

+pub trait DistancesUnordered: OnElementsUnordered {
+    /// Drive the entire scan, scoring each element with `computer` and invoking `f` with
+    /// the resulting `(id, distance)` pair.
+    fn distances_unordered<C, F>(


This needs the concrete type of the distance computer, not a generic. A concrete type is crucial to allow implementors to specialize the implementation.

hildebrandmw · 2026-04-29T22:25:24Z

+///
+/// The `O` type parameter lets callers pick the output element type (raw `(Id, f32)`
+/// pairs, fully hydrated hits etc.).
+pub trait FlatPostProcess<S, T, O = <S as HasId>::Id>


One big concern I have is that this does not really share much with the existing graph code. Even though the desire is to share code, post-process routines like diversity search will still need to be implemented twice. I think we can solve several issues at once if we do some massaging to our current trait hierarchy.

Let's assume we do the following: First, add a new trait

/// A general supertrait like `HasId` that we can use to express /// relationships pub trait HasElementRef { ElementRef<'a>; } /// Restructure the existing `BuildQueryComputer` as a subtrait of /// `ElementRef`. pub trait BuildQueryComputer<T>: HasElementRef { type QueryComputerError: std::error::Error + Into<ANNError> + Send + Sync + 'static; type QueryComputer: for<'a> PreprocessedDistanceFunction<Self::ElementRef<'a>, f32> + Send + Sync + 'static; // Maybe we can finally drop `'static`? fn build_query_computer( &self, from: T, ) -> Result<Self::QueryComputer, Self::QueryComputerError>; }

Then, Accessor can add a new subtrait of BuildQueryComputer for its need of distances_unordered, and the flat Visitor can do so as well. Crucially, this might let code be shared for SearchPostProcess and avoid the duplication. And also keeps BuildQueryComputer a bit more centralized.

hildebrandmw · 2026-04-29T22:39:15Z

+/// This is the default adapter for providers that implement element-at-a-time iteration.
+/// Providers that can do better (prefetching, SIMD batching, bulk I/O) should implement
+/// [`OnElementsUnordered`] directly.
+pub struct DefaultIteratedOperator<I> {


Nit: maybe just call this "iterated"?

hildebrandmw · 2026-04-29T22:40:08Z

+                .await
+                .into_ann_result()? as u32;
+
+            Ok(SearchStats {


This should probably be a bespoke return type. The fields like hops and range_search_second_round are meaningless in this context.

hildebrandmw · 2026-04-29T22:40:54Z

+    /// - `context`: per-request context threaded through to the provider.
+    /// - `query`: the query.
+    /// - `output`: caller-owned output buffer.
+    pub fn knn_search<S, T, O, OB, PP>(


We recently went through a whole thing of adding the Search trait to the graph index to avoid the proliferation of search methods on the index. We should probably do the same here.

Ack, makes sense.

hildebrandmw · 2026-04-29T22:41:31Z

+pub struct FlatIndex<P: DataProvider> {
+    /// The backing provider.
+    provider: P,
+    _marker: PhantomData<fn() -> P>,


Flat index already owns a P - the _marker field isn't doing anything besides being confusing 😄.

hildebrandmw · 2026-04-29T22:47:11Z

+        Self: 'a;
+
+    /// The error type yielded by [`Self::next`].
+    type Error: StandardError;


On error types, maybe consider ToRanked instead of the Into<ANNError> that StandardError implies? That said, the visitor pattern means the implementation can swallow non-critical errors on its own. So maybe not needed, for the visitor case. But on the iterator case, a ToRanked might be a good idea.

harsha-simhadri · 2026-05-05T16:42:24Z

+//! family. It is designed for backends whose natural access pattern is a one-pass scan over
+//! their data — for example append-only buffered stores, on-disk shards streamed via I/O,
+//! or any provider where random access is significantly more expensive than sequential.
+//!


This is a nice description comparing traits that enable algorithms based on random access vs sequential scans. Could this be in a higher level directory, either in providers.rs file or in diskann/src/agents.md

harsha-simhadri · 2026-05-05T20:37:10Z

+
+
+### The glue: `FlatSearchStrategy`
+


Since we are introducing substantial new machinery, can we think about whether we can reuse some of this for IVF/SPANN type of indices that rely on clustering and then scanning entire data in specific clusters to support queries.

I can imagine that the OnElementsUnordered and DistancesUnordered could be adapted to the scope of a cluster.

And SearchStrategies for IVF would need to be added.

Even if we dont have a fully fleshed our proposal for clustering based indices, it would be ideal to document how the abstractions can be reused or adapted in the near future and avoid another set of abstractions for clustering based indices

harsha-simhadri · 2026-05-05T20:37:40Z

+5. Return search stats.
+
+Other algorithms (filtered, range, diverse) can be added later as additional methods on
+`FlatIndex`.


How would we support predicates on flat index?

harsha-simhadri · 2026-05-05T20:53:31Z

+//! | [`crate::provider::Accessor`]             | [`FlatIterator`]                  |
+//! | [`crate::graph::glue::SearchStrategy`]    | [`FlatSearchStrategy`]            |
+//! | [`crate::graph::glue::SearchPostProcess`] | [`FlatPostProcess`]               |
+//! | [`crate::graph::Search`]                  | [`FlatIndex::knn_search`]         |


This table is very useful.
Could this description be upleveled to diskann/src/agents.md or readme.

harsha-simhadri · 2026-05-05T20:54:00Z

+//! # Hot loop
+//!
+//! Algorithms drive the scan via [`FlatIterator::next`] (lending iterator) or override
+//! [`FlatIterator::on_elements_unordered`] when batching/prefetching wins. The default


Is this paragraph supposed to be here. Seems a bit out of place.

harsha-simhadri · 2026-05-05T20:55:16Z

+    provider::HasId,
+};
+
+/// Post-process the survivor candidates produced by a flat search and


Is the intention to support filters with post_process? If so where is the clause?

# Conflicts: # diskann/src/graph/glue.rs

arrayka · 2026-05-05T21:59:05Z

+//!
+//! The module mirrors the layering used by graph search:
+//!
+//! | Graph (random access)                     | Flat (sequential)                 |  Shared?  |


Consider adding Responsibility column and provide one-sentence description for each layer.

arrayka · 2026-05-05T22:14:11Z

+pub trait DistancesUnordered<T>: OnElementsUnordered + BuildQueryComputer<T> {
+    /// Drive the entire scan, scoring each element with `computer` and invoking `f` with
+    /// the resulting `(id, distance)` pair.
+    fn distances_unordered<F>(


The algorithm only needs DistancesUnordered. Making it a supertrait of OnElementsUnordered forces all distance-capable providers to also expose raw element streaming, which leaks a lower-level capability and may block implementations that can compute distances without exposing element refs.

Could we decouple the traits and provide a blanket impl of DistancesUnordered for OnElementsUnordered + BuildQueryComputer instead:

Make DistancesUnordered independent, and provide a separate adapter/blanket impl when OnElementsUnordered is available. Conceptually:

trait DistancesUnordered<T> { fn distances_unordered(...) ... } (no supertrait)

impl<T, S> DistancesUnordered<T> for S where S: OnElementsUnordered + BuildQueryComputer<T> { ... }

That keeps the algorithm-facing surface minimal, preserves encapsulation, and still keeps the convenience default behavior for types that do implement OnElementsUnordered.

arrayka · 2026-05-05T22:19:54Z

+    where
+        F: Send + FnMut(<Self as HasId>::Id, f32),
+    {
+        self.on_elements_unordered(move |id, element| {


All functions (including default implementations) should be covered by unit tests (ideally, in the same file).

arrayka · 2026-05-05T22:31:36Z

+                .into_ann_result()?;
+
+            let computer =
+                BuildQueryComputer::build_query_computer(&visitor, query).into_ann_result()?;


Could you please provide an example of why visitor is needed to build a query computer?

arrayka · 2026-05-05T22:34:15Z

    type Id = u32;
 }

+impl provider::HasElementRef for Accessor<'_> {


As an option, consider extracting this prerequisite change into a separate PR to keep this one smaller and easier to review.

arkrishn94 added 5 commits April 27, 2026 22:13

first draft

6b6c63c

remove unnecessary parts

9f718a1

Merge remote-tracking branch 'origin/main' into u/adkrishnan/flat-index

55cafc2

rename file

f0a9dbd

fmt

0672f3d

arkrishn94 marked this pull request as ready for review April 28, 2026 16:02

arkrishn94 requested review from a team and Copilot April 28, 2026 16:02

Copilot started reviewing on behalf of arkrishn94 April 28, 2026 16:06 View session

Copilot AI reviewed Apr 28, 2026

View reviewed changes

arkrishn94 added 2 commits April 28, 2026 17:27

split iterator to callback

3ac0e1b

use distance unordered callback

f887f2f

arkrishn94 changed the title ~~[RFC] Flat Index Search~~ [flat index] Flat Search Interface Apr 29, 2026

arkrishn94 added 5 commits April 28, 2026 21:26

rfc update

dc2281c

rustfmt flat module

ee48f7d

Merge remote-tracking branch 'origin/main' into u/adkrishnan/flat-index

3b2a1e0

fix clippy: replace doc-comment divider with regular comment

7fd903e

small edits

8af5e00

arrayka requested review from harsha-simhadri and hildebrandmw April 29, 2026 19:00

arrayka requested changes Apr 29, 2026

View reviewed changes

Comment thread diskann/src/flat/index.rs

Comment thread diskann/src/flat/iterator.rs Outdated

Comment thread diskann/src/lib.rs

hildebrandmw reviewed Apr 29, 2026

View reviewed changes

arkrishn94 added 2 commits May 4, 2026 18:09

renames and uplevel query computer

1dd1c72

split buildquerycomputer, haselementref and distancesunordered

3d24ef7

harsha-simhadri reviewed May 5, 2026

View reviewed changes

delete flatpostprocess, cleanup and docs

0a63759

arrayka linked an issue May 5, 2026 that may be closed by this pull request

Create RFC and initial implementation for flat-index search support #999

Open

Merge remote-tracking branch 'origin/main' into u/adkrishnan/flat-index

1f48945

# Conflicts: # diskann/src/graph/glue.rs

arrayka reviewed May 5, 2026

View reviewed changes

update rfc

57d86e9

	//! [`FlatIndex`] — the index wrapper for an on which we do flat search.
	//! [`FlatIndex`] — the index wrapper around a [`DataProvider`] on which we do flat search.

	type Error = crate::error::Infallible;
	type Error = std::convert::Infallible;

-/// Both methods may borrow from the strategy itself.
+/// [`Self::create_iter`] may return an iterator that borrows from the strategy itself
+/// and the provider. [`Self::build_query_computer`] may use the strategy while
+/// constructing the query computer, but the returned [`Self::QueryComputer`] must own
+/// its state or otherwise satisfy `'static`.

-}
+}
+#[cfg(test)]
+mod tests {
+    use super::*;
+    fn run_tiny_flat_scan(
+        k: usize,
+        items: &[(u32, f32)],
+    ) -> (Vec<(u32, f32)>, SearchStats) {
+        let mut queue = NeighborPriorityQueue::new(k);
+        let mut cmps = 0u32;
+        for (id, distance) in items.iter().copied() {
+            cmps += 1;
+            queue.insert(Neighbor::new(id, distance));
+        }
+        let copied: Vec<(u32, f32)> = queue
+            .iter()
+            .take(k)
+            .map(|neighbor| (neighbor.id, neighbor.distance))
+            .collect();
+        let stats = SearchStats {
+            cmps,
+            hops: 0,
+            result_count: copied.len() as u32,
+            range_search_second_round: false,
+        };
+        (copied, stats)
+    }
+    #[test]
+    fn knn_search_keeps_top_k_in_distance_order() {
+        let (copied, stats) = run_tiny_flat_scan(
+,
+            &[(10, 4.0), (11, 1.5), (12, 3.0), (13, 0.5), (14, 2.0)],
+        );
+        assert_eq!(copied, vec![(13, 0.5), (11, 1.5), (14, 2.0)]);
+        assert_eq!(stats.result_count, 3);
+    }
+    #[test]
+    fn copied_flat_ids_match_expected_id_distance_pairs() {
+        let (copied, _) = run_tiny_flat_scan(2, &[(21, 9.0), (22, 1.25), (23, 4.5)]);
+        assert_eq!(copied, vec![(22, 1.25), (23, 4.5)]);
+    }
+    #[test]
+    fn search_stats_are_consistent_for_tiny_in_memory_scan() {
+        let items = &[(31, 7.0), (32, 2.0), (33, 5.0), (34, 1.0)];
+        let (copied, stats) = run_tiny_flat_scan(2, items);
+        assert_eq!(stats.cmps, items.len() as u32);
+        assert_eq!(stats.hops, 0);
+        assert_eq!(stats.result_count, copied.len() as u32);
+        assert!(!stats.range_search_second_round);
+    }
+}

Conversation

arkrishn94 commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Important refactor

HasElementRef - a new minimal shape trait

BuildQueryComputer<T> - shared query preprocessing

Flat search - core components

OnElementsUnordered — the core scan

flat::DistancesUnordered<T> — fused scan + score

flat::SearchStrategy<P, T> — the glue

FlatIndex<P> — the top-level handle

FlatIterator + Iterated<I> — opt-in convenience

Uh oh!

codecov-commenter commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

hildebrandmw left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

arkrishn94 commented Apr 28, 2026 •

edited

Loading

`HasElementRef` - a new minimal shape trait

`BuildQueryComputer<T>` - shared query preprocessing

`OnElementsUnordered` — the core scan

`flat::DistancesUnordered<T>` — fused scan + score

`flat::SearchStrategy<P, T>` — the glue

`FlatIndex<P>` — the top-level handle

`FlatIterator` + `Iterated<I>` — opt-in convenience

codecov-commenter commented Apr 28, 2026 •

edited

Loading