[benchmark] Automatic Input Registration #1066

Open
hildebrandmw wants to merge 4 commits into main from mhildebr/automatic-input-registration

@hildebrandmw
Contributor

Part of ongoing work to simplify the benchmark API.

Unify input and benchmark registries into a single diskann_benchmark_runner::Registry and automatically register Benchmark::Inputs when a benchmark is registered via Registry::register or Registry::register_regression.

Since this can result in multiple registrations of the same input, these APIs now return a Result<(), RegistrationError>. An error is returned if an Input::tag() is already registered but the dynamic type of the input is different; re-registering the same type under the same tag is accepted.
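
The duplicate-tag rule described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Input` and `Benchmark` traits, the `Registry` fields, and the error shape are simplified, hypothetical stand-ins, not the actual definitions in diskann_benchmark_runner.

```rust
use std::any::TypeId;
use std::collections::HashMap;
use std::fmt;

/// Hypothetical stand-in for the runner's `Input` trait: every input type has a tag.
pub trait Input: 'static {
    fn tag() -> &'static str;
}

/// Hypothetical stand-in for the runner's `Benchmark` trait.
pub trait Benchmark {
    type Input: Input;
}

/// Assumed error shape: records only the conflicting tag.
#[derive(Debug, PartialEq, Eq)]
pub struct RegistrationError {
    pub tag: &'static str,
}

impl fmt::Display for RegistrationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "input tag \"{}\" is already registered with a different type", self.tag)
    }
}

impl std::error::Error for RegistrationError {}

/// Simplified unified registry: input tags and benchmark names live together.
#[derive(Default)]
pub struct Registry {
    inputs: HashMap<&'static str, TypeId>,
    benchmarks: Vec<String>,
}

impl Registry {
    /// Accept a repeated registration of the same type; reject a different
    /// type that reuses an existing tag.
    fn register_input<I: Input>(&mut self) -> Result<(), RegistrationError> {
        let id = TypeId::of::<I>();
        match self.inputs.get(I::tag()).copied() {
            Some(existing) if existing != id => Err(RegistrationError { tag: I::tag() }),
            Some(_) => Ok(()),
            None => {
                self.inputs.insert(I::tag(), id);
                Ok(())
            }
        }
    }

    /// Registering a benchmark registers its associated input first.
    pub fn register<B: Benchmark>(&mut self, name: &str) -> Result<(), RegistrationError> {
        self.register_input::<B::Input>()?;
        self.benchmarks.push(name.to_string());
        Ok(())
    }

    pub fn benchmark_count(&self) -> usize {
        self.benchmarks.len()
    }
}

// Example types: `A` and `A2` share the tag "type-a" but are distinct types.
pub struct A;
pub struct A2;
impl Input for A {
    fn tag() -> &'static str { "type-a" }
}
impl Input for A2 {
    fn tag() -> &'static str { "type-a" }
}

pub struct BenchA;
pub struct BenchAAgain;
pub struct BenchA2;
impl Benchmark for BenchA { type Input = A; }
impl Benchmark for BenchAAgain { type Input = A; }
impl Benchmark for BenchA2 { type Input = A2; }
```

In this sketch, two benchmarks that share input type `A` both register successfully, while a benchmark whose input `A2` reuses the tag "type-a" is rejected and is not added to the registry.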

Suggested Review Order

  • diskann_benchmark_runner:
    • The main change is in registry.rs, where Inputs is removed and its contents moved into Registry. The internal Registry::register_input is where the logic for rejecting different input types with the same tag resides.
    • The rest of the changes are minor tweaks to use Registry where either Inputs or Benchmarks were previously threaded.
  • diskann_benchmark:
    • README.md: The README has gotten a little out of date and would be even more so after this change. I took a pass at updating it.
    • The previously independent Input registration sites are removed and the Benchmark registrations are updated to use the new API, which can return RegistryError. I kept anyhow::Error as the return type for better error stack traces if we ever do end up with a duplicate registration error.
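
As a rough illustration of how registration errors can be threaded through per-backend registration functions, here is a minimal sketch. It uses `Box<dyn Error + Send + Sync>` as a standard-library stand-in for anyhow::Error, and the `Registry`, `RegistryError`, function names, tags, and type names are all hypothetical simplifications rather than the real API.

```rust
use std::collections::HashMap;
use std::error::Error;
use std::fmt;

/// Hypothetical error type; the real `RegistryError` may carry different data.
#[derive(Debug)]
pub struct RegistryError(String);

impl fmt::Display for RegistryError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "registration failed: {}", self.0)
    }
}

impl Error for RegistryError {}

/// Simplified registry mapping an input tag to the name of the type that owns it.
#[derive(Default)]
pub struct Registry {
    tags: HashMap<String, &'static str>,
}

impl Registry {
    pub fn register(&mut self, tag: &str, type_name: &'static str) -> Result<(), RegistryError> {
        match self.tags.get(tag).copied() {
            Some(existing) if existing != type_name => Err(RegistryError(format!(
                "tag \"{tag}\" is owned by {existing}, cannot register {type_name}"
            ))),
            _ => {
                self.tags.insert(tag.to_string(), type_name);
                Ok(())
            }
        }
    }
}

/// Stand-in for `anyhow::Error`: any error type converts into it through `?`.
pub type BoxError = Box<dyn Error + Send + Sync>;

// Hypothetical per-backend registration functions.
fn register_index(registry: &mut Registry) -> Result<(), BoxError> {
    registry.register("graph-index", "GraphIndexInput")?;
    Ok(())
}

fn register_exhaustive(registry: &mut Registry) -> Result<(), BoxError> {
    registry.register("exhaustive", "ExhaustiveInput")?;
    Ok(())
}

/// Top-level entry point: the first failing backend aborts registration.
pub fn register_all(registry: &mut Registry) -> Result<(), BoxError> {
    register_index(registry)?;
    register_exhaustive(registry)?;
    Ok(())
}
```

Returning a type-erased error at each level keeps the call sites to a single `?` while the original message still reaches the caller.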

Contributor

Copilot AI left a comment

Pull request overview

This PR unifies benchmark input and benchmark registration under diskann_benchmark_runner::Registry, with benchmark registration automatically registering the benchmark’s associated input type.

Changes:

  • Replaced separate Inputs/Benchmarks flows with a unified Registry.
  • Propagated registration errors through benchmark registration call sites.
  • Updated benchmark README/API documentation and runner tests for the new model.

Reviewed changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated 11 comments.

Show a summary per file
| File | Description |
| --- | --- |
| diskann-benchmark/src/utils/mod.rs | Updates stub registration to use Registry. |
| diskann-benchmark/src/main.rs | Creates one registry and passes it to the app. |
| diskann-benchmark/src/inputs/mod.rs | Removes centralized input registration. |
| diskann-benchmark/src/inputs/graph_index.rs | Removes graph input registration helper. |
| diskann-benchmark/src/inputs/filters.rs | Removes filter input registration helper. |
| diskann-benchmark/src/inputs/exhaustive.rs | Removes exhaustive input registration helper. |
| diskann-benchmark/src/inputs/disk.rs | Removes disk input registration helper. |
| diskann-benchmark/src/backend/mod.rs | Propagates registry registration errors. |
| diskann-benchmark/src/backend/index/spherical.rs | Registers spherical benchmarks through Registry. |
| diskann-benchmark/src/backend/index/search/plugins.rs | Updates docs to refer to Registry. |
| diskann-benchmark/src/backend/index/scalar.rs | Registers scalar benchmarks through Registry. |
| diskann-benchmark/src/backend/index/product.rs | Registers product benchmarks through Registry. |
| diskann-benchmark/src/backend/index/mod.rs | Threads unified registry through index backend. |
| diskann-benchmark/src/backend/index/benchmarks.rs | Updates graph index benchmark registration. |
| diskann-benchmark/src/backend/filters/mod.rs | Threads unified registry through filters backend. |
| diskann-benchmark/src/backend/filters/benchmark.rs | Updates metadata benchmark registration. |
| diskann-benchmark/src/backend/exhaustive/spherical.rs | Updates spherical exhaustive registration. |
| diskann-benchmark/src/backend/exhaustive/product.rs | Updates product exhaustive registration. |
| diskann-benchmark/src/backend/exhaustive/mod.rs | Propagates exhaustive registration errors. |
| diskann-benchmark/src/backend/exhaustive/minmax.rs | Updates minmax exhaustive registration. |
| diskann-benchmark/src/backend/disk_index/mod.rs | Threads unified registry through disk backend. |
| diskann-benchmark/src/backend/disk_index/benchmarks.rs | Updates disk regression registration. |
| diskann-benchmark/README.md | Revises benchmark API documentation. |
| diskann-benchmark-simd/src/lib.rs | Updates SIMD benchmark registration API. |
| diskann-benchmark-simd/src/bin.rs | Uses unified registry in SIMD binary. |
| diskann-benchmark-runner/src/test/mod.rs | Registers test benchmarks through Registry. |
| diskann-benchmark-runner/src/registry.rs | Introduces unified registry and duplicate input type checks. |
| diskann-benchmark-runner/src/lib.rs | Re-exports Registry and RegistryError. |
| diskann-benchmark-runner/src/jobs.rs | Parses jobs against unified registry inputs. |
| diskann-benchmark-runner/src/internal/regression.rs | Uses unified registry for regression checks. |
| diskann-benchmark-runner/src/input.rs | Adds input reflection helpers for duplicate checks. |
| diskann-benchmark-runner/src/benchmark.rs | Updates registry documentation links. |
| diskann-benchmark-runner/src/app.rs | Updates app execution to accept one registry. |
| diskann-benchmark-runner/dev/main.rs | Uses unified registry in dev runner. |

Comments suppressed due to low confidence (6)

diskann-benchmark/README.md:405

  • The command immediately above uses compute-groundtruth, but the example input tag is compute_groundtruth. Since the inputs subcommand looks up input tags, following this example would not display the new input.

diskann-benchmark/README.md:421

  • The preceding input implementation is for `crate::inputs::Input<ComputeGroundTruth>`, but this benchmark declares its associated input as `ComputeGroundTruth`. `Benchmark::Input` must implement `diskann_benchmark_runner::Input`, so the example will not compile as written.

        type Input = ComputeGroundTruth;

diskann-benchmark/README.md:429

  • `MatchScore` is a tuple struct and does not provide a `new` constructor in `diskann_benchmark_runner`; this example should use the actual constructor shape or it will not compile.

        Ok(MatchScore::new(0))

diskann-benchmark/README.md:479

  • This sentence still describes the old multi-argument dispatch model. `Benchmark::try_match` now evaluates one associated input, so "all arguments" is misleading in the updated API documentation.

        The method Benchmark::try_match returns both a successful MatchScore and an
        unsuccessful FailureScore. The registry will only invoke methods where all arguments

diskann-benchmark-runner/src/app.rs:192

  • The `run` documentation still refers to separate `inputs` and `benchmarks`, but the method now accepts a single unified `Registry`. Updating this wording would keep the public API docs consistent with the new signature.

        registry: &registry::Registry,

diskann-benchmark/README.md:493

  • The signature shown here does not match `Benchmark::description`: the actual trait method includes a `&self` receiver and returns `std::fmt::Result`. As written, the documented API signature is not implementable.

        fn description(f: &mut std::fmt::Formatter<'_>, from: Option<&Self::Input>);
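
For reference, a signature with the shape described in the last comment (a `&self` receiver and a `std::fmt::Result` return) might look like the sketch below. The trait here is a simplified, hypothetical stand-in for the real `Benchmark` trait, the `k` field on `ComputeGroundTruth` is invented for illustration, and the meaning given to `from` (a general description for `None`, an input-specific one for `Some`) is an assumption.

```rust
use std::fmt;

/// Simplified stand-in for the runner's `Benchmark` trait, showing only `description`.
pub trait Benchmark {
    type Input;

    /// Takes `&self` and returns `fmt::Result`, so `write!` errors can propagate.
    fn description(&self, f: &mut fmt::Formatter<'_>, from: Option<&Self::Input>) -> fmt::Result;
}

/// Hypothetical input type; the `k` field is an assumption for this example.
pub struct ComputeGroundTruth {
    pub k: usize,
}

pub struct RunGroundTruth;

impl Benchmark for RunGroundTruth {
    type Input = ComputeGroundTruth;

    fn description(&self, f: &mut fmt::Formatter<'_>, from: Option<&Self::Input>) -> fmt::Result {
        match from {
            // Assumed meaning: describe how this benchmark relates to a concrete input.
            Some(input) => write!(f, "ground truth with k = {}", input.k),
            // Assumed meaning: describe the benchmark in general.
            None => write!(f, "computes exact ground truth"),
        }
    }
}

/// Adapter that exposes `Benchmark::description` through `Display`.
pub struct Describe<'a, B: Benchmark>(&'a B, Option<&'a B::Input>)
where
    B::Input: 'a;

impl<'a, B: Benchmark> fmt::Display for Describe<'a, B>
where
    B::Input: 'a,
{
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.0.description(f, self.1)
    }
}

/// Render a description to a `String` for printing or testing.
pub fn describe<B: Benchmark>(bench: &B, from: Option<&B::Input>) -> String {
    Describe(bench, from).to_string()
}
```

With this shape, `describe(&RunGroundTruth, None)` yields the general description, and passing `Some(&ComputeGroundTruth { k: 10 })` yields the input-specific one.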

automatically register the associated input.

At run time, the front end will discover benchmarks in the input JSON file and use the tag
string in the "contents" field to select the correct input deserializer. Benchmarks will
// Otherwise, returns a failure.
fn try_match(from: &&'a Any) -> Result<MatchScore, FailureScore> {
from.try_match::<ComputeGroundTruth, Self>(from)
impl Benchmark for RunGroundTruth {
all other runs. Benchmark implementations do not need to worry about saving their input
as well as this is automatically handled by the benchmarking infrastructure.
The argument `output: &mut dyn diskann_benchmark_runner::Output` is a dynamic type where
all output should be written to. Additionally, it provides a
When the dispatcher cannot find any matching method for an input, it begins a process of
When the registry cannot find any matching method for an input, it begins a process of
finding the "nearest misses" by inspecting and ranking methods based on their `FailureScore`.
Benchmarks can opt in to this process by returning meaningful `FailureScore`s when an input is

// NOTE: This benchmark is heavily monomorphized. Each `(NBITS, T)` pair
// generates a full `Benchmark` impl via the `impl_sq_build!` macro in `mod imp`,
// generates a full `Registry` impl via the `impl_sq_build!` macro in `mod imp`,
};
}

// For the types below, `A` and `B` have distinct tags, but `A2`'s tag conflicts with `A`.
Comment on lines +409 to +415
// For the types below, `A` and `B` have distinct tags, but `A2`'s tag conflicts with `A`.
input!(A, "type-a");
input!(B, "type-b");
input!(A2, "type-a");

#[test]
fn test_name_conflicts() {
Comment on lines +42 to +43
//! // registry.register::<MyBenchmark>("my-bench");
//! // registry.register_regression::<MyRegressionBenchmark>("my-regression");
Comment on lines +202 to +203
if o.get().as_any().is::<crate::input::Wrapper<T>>() {
Ok(())

Ok(ParsedInner {
entry,
entry: entry.clone(),
@codecov-commenter

Codecov Report

❌ Patch coverage is 84.50704% with 33 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.48%. Comparing base (9fe7053) to head (2f5b0b1).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| diskann-benchmark-simd/src/lib.rs | 57.89% | 16 Missing ⚠️ |
| diskann-benchmark-runner/src/registry.rs | 88.88% | 8 Missing ⚠️ |
| diskann-benchmark/src/backend/index/benchmarks.rs | 61.90% | 8 Missing ⚠️ |
| diskann-benchmark-runner/src/test/mod.rs | 90.00% | 1 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (84.50%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1066      +/-   ##
==========================================
- Coverage   89.51%   89.48%   -0.04%     
==========================================
  Files         461      461              
  Lines       85920    85934      +14     
==========================================
- Hits        76911    76894      -17     
- Misses       9009     9040      +31     
| Flag | Coverage Δ |
| --- | --- |
| miri | 89.48% <84.50%> (-0.04%) ⬇️ |
| unittests | 89.10% <84.50%> (-0.04%) ⬇️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
| --- | --- |
| diskann-benchmark-runner/src/app.rs | 84.24% <100.00%> (-0.35%) ⬇️ |
| diskann-benchmark-runner/src/benchmark.rs | 89.21% <ø> (ø) |
| diskann-benchmark-runner/src/input.rs | 81.39% <100.00%> (+3.01%) ⬆️ |
| ...iskann-benchmark-runner/src/internal/regression.rs | 97.69% <100.00%> (-0.01%) ⬇️ |
| diskann-benchmark-runner/src/jobs.rs | 96.82% <100.00%> (ø) |
| diskann-benchmark-simd/src/bin.rs | 87.71% <100.00%> (-0.42%) ⬇️ |
| diskann-benchmark/src/backend/disk_index/mod.rs | 100.00% <100.00%> (ø) |
| diskann-benchmark/src/backend/exhaustive/minmax.rs | 100.00% <100.00%> (ø) |
| diskann-benchmark/src/backend/exhaustive/mod.rs | 100.00% <100.00%> (ø) |
| ...iskann-benchmark/src/backend/exhaustive/product.rs | 100.00% <100.00%> (ø) |

... and 20 more

... and 5 files with indirect coverage changes
