[benchmark] Automatic Input Registration#1066
Conversation
There was a problem hiding this comment.
Pull request overview
This PR unifies benchmark input and benchmark registration under diskann_benchmark_runner::Registry, with benchmark registration automatically registering the benchmark’s associated input type.
Changes:
- Replaced separate
Inputs/Benchmarksflows with a unifiedRegistry. - Propagated registration errors through benchmark registration call sites.
- Updated benchmark README/API documentation and runner tests for the new model.
Reviewed changes
Copilot reviewed 34 out of 34 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| diskann-benchmark/src/utils/mod.rs | Updates stub registration to use Registry. |
| diskann-benchmark/src/main.rs | Creates one registry and passes it to the app. |
| diskann-benchmark/src/inputs/mod.rs | Removes centralized input registration. |
| diskann-benchmark/src/inputs/graph_index.rs | Removes graph input registration helper. |
| diskann-benchmark/src/inputs/filters.rs | Removes filter input registration helper. |
| diskann-benchmark/src/inputs/exhaustive.rs | Removes exhaustive input registration helper. |
| diskann-benchmark/src/inputs/disk.rs | Removes disk input registration helper. |
| diskann-benchmark/src/backend/mod.rs | Propagates registry registration errors. |
| diskann-benchmark/src/backend/index/spherical.rs | Registers spherical benchmarks through Registry. |
| diskann-benchmark/src/backend/index/search/plugins.rs | Updates docs to refer to Registry. |
| diskann-benchmark/src/backend/index/scalar.rs | Registers scalar benchmarks through Registry. |
| diskann-benchmark/src/backend/index/product.rs | Registers product benchmarks through Registry. |
| diskann-benchmark/src/backend/index/mod.rs | Threads unified registry through index backend. |
| diskann-benchmark/src/backend/index/benchmarks.rs | Updates graph index benchmark registration. |
| diskann-benchmark/src/backend/filters/mod.rs | Threads unified registry through filters backend. |
| diskann-benchmark/src/backend/filters/benchmark.rs | Updates metadata benchmark registration. |
| diskann-benchmark/src/backend/exhaustive/spherical.rs | Updates spherical exhaustive registration. |
| diskann-benchmark/src/backend/exhaustive/product.rs | Updates product exhaustive registration. |
| diskann-benchmark/src/backend/exhaustive/mod.rs | Propagates exhaustive registration errors. |
| diskann-benchmark/src/backend/exhaustive/minmax.rs | Updates minmax exhaustive registration. |
| diskann-benchmark/src/backend/disk_index/mod.rs | Threads unified registry through disk backend. |
| diskann-benchmark/src/backend/disk_index/benchmarks.rs | Updates disk regression registration. |
| diskann-benchmark/README.md | Revises benchmark API documentation. |
| diskann-benchmark-simd/src/lib.rs | Updates SIMD benchmark registration API. |
| diskann-benchmark-simd/src/bin.rs | Uses unified registry in SIMD binary. |
| diskann-benchmark-runner/src/test/mod.rs | Registers test benchmarks through Registry. |
| diskann-benchmark-runner/src/registry.rs | Introduces unified registry and duplicate input type checks. |
| diskann-benchmark-runner/src/lib.rs | Re-exports Registry and RegistryError. |
| diskann-benchmark-runner/src/jobs.rs | Parses jobs against unified registry inputs. |
| diskann-benchmark-runner/src/internal/regression.rs | Uses unified registry for regression checks. |
| diskann-benchmark-runner/src/input.rs | Adds input reflection helpers for duplicate checks. |
| diskann-benchmark-runner/src/benchmark.rs | Updates registry documentation links. |
| diskann-benchmark-runner/src/app.rs | Updates app execution to accept one registry. |
| diskann-benchmark-runner/dev/main.rs | Uses unified registry in dev runner. |
Comments suppressed due to low confidence (6)
diskann-benchmark/README.md:405
- The command immediately above uses
compute-groundtruth, but the example input tag iscompute_groundtruth. Since theinputssubcommand looks up input tags, following this example would not display the new input.
**diskann-benchmark/README.md:421**
* The preceding input implementation is for `crate::inputs::Input<ComputeGroundTruth>`, but this benchmark declares its associated input as `ComputeGroundTruth`. `Benchmark::Input` must implement `diskann_benchmark_runner::Input`, so the example will not compile as written.
type Input = ComputeGroundTruth;
**diskann-benchmark/README.md:429**
* `MatchScore` is a tuple struct and does not provide a `new` constructor in `diskann_benchmark_runner`; this example should use the actual constructor shape or it will not compile.
Ok(MatchScore::new(0))
**diskann-benchmark/README.md:479**
* This sentence still describes the old multi-argument dispatch model. `Benchmark::try_match` now evaluates one associated input, so "all arguments" is misleading in the updated API documentation.
The method Benchmark::try_match returns both a successful MatchScore and an
unsuccessful FailureScore. The registry will only invoke methods where all arguments
**diskann-benchmark-runner/src/app.rs:192**
* The `run` documentation still refers to separate `inputs` and `benchmarks`, but the method now accepts a single unified `Registry`. Updating this wording would keep the public API docs consistent with the new signature.
registry: ®istry::Registry,
**diskann-benchmark/README.md:493**
* The signature shown here does not match `Benchmark::description`: the actual trait method includes a `&self` receiver and returns `std::fmt::Result`. As written, the documented API signature is not implementable.
fn description(f: &mut std::fmt::Formatter<'_>, from: Option<&Self::Input>);
</details>
---
💡 <a href="/microsoft/DiskANN/new/main?filename=.github/instructions/*.instructions.md" class="Link--inTextBlock" target="_blank" rel="noopener noreferrer">Add Copilot custom instructions</a> for smarter, more guided reviews. <a href="https://docs.github.com/en/copilot/customizing-copilot/adding-repository-custom-instructions-for-github-copilot" class="Link--inTextBlock" target="_blank" rel="noopener noreferrer">Learn how to get started</a>.
| automatically register the associated input. | ||
|
|
||
| At run time, the front end will discover benchmarks in the input JSON file and use the tag | ||
| string in the "contents" field to select the correct input deserializer. Benchmarks will |
| // Otherwise, returns a failure. | ||
| fn try_match(from: &&'a Any) -> Result<MatchScore, FailureScore> { | ||
| from.try_match::<ComputeGroundTruth, Self>(from) | ||
| impl Benchmark for RunGroundTruth { |
| all other runs. Benchmark implementations do not need to worry about saving their input | ||
| as well as this is automatically handled by the benchmarking infrastructure. | ||
| The argument `output: &mut dyn diskann_benchmark_runner::Output` is a dynamic type where | ||
| all output should be written too. Additionally, it provides a |
| When the dispatcher cannot find any matching method for an input, it begins a process of | ||
| When the registry cannot find any matching method for an input, it begins a process of | ||
| finding the "nearest misses" by inspecting and ranking methods based on their `FailureScore`. | ||
| Benchmarks can opt-in to this process by returning meaning `FailureScores` when an input is |
|
|
||
| // NOTE: This benchmark is heavily monomorphized. Each `(NBITS, T)` pair | ||
| // generates a full `Benchmark` impl via the `impl_sq_build!` macro in `mod imp`, | ||
| // generates a full `Registry` impl via the `impl_sq_build!` macro in `mod imp`, |
| }; | ||
| } | ||
|
|
||
| // For the types below, `A` and `B` have distinct tags, but `A2`'s tag conflicts with `A2`. |
| // For the types below, `A` and `B` have distinct tags, but `A2`'s tag conflicts with `A2`. | ||
| input!(A, "type-a"); | ||
| input!(B, "type-b"); | ||
| input!(A2, "type-a"); | ||
|
|
||
| #[test] | ||
| fn test_name_conflicts() { |
| //! // registry.register::<MyBenchmark>("my-bench"); | ||
| //! // registry.register_regression::<MyRegressionBenchmark>("my-regression"); |
| if o.get().as_any().is::<crate::input::Wrapper<T>>() { | ||
| Ok(()) |
|
|
||
| Ok(ParsedInner { | ||
| entry, | ||
| entry: entry.clone(), |
Codecov Report❌ Patch coverage is ❌ Your patch status has failed because the patch coverage (84.50%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #1066 +/- ##
==========================================
- Coverage 89.51% 89.48% -0.04%
==========================================
Files 461 461
Lines 85920 85934 +14
==========================================
- Hits 76911 76894 -17
- Misses 9009 9040 +31
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Part of ongoing work to simplify the benchmark API.
Unify input and benchmark registries into a single
diskann_benchmark_runner::Registryand automatically registerBenchmark::Inputs when a benchmark is registered viaRegistry::registerorRegistry::register_regression.Since this can result in multiple registration of the same input, these APIs are extended to return a
Result<(), RegistrationError>where an error is returned if anInput::tag()is already registered, but the dynamic type of the input is different.Suggested Review Order
diskann_benchmark_runner:registry.rs, whereInputsis removed and its contents moved intoRegistry. The internalRegistry::register_inputis where the logic for rejecting different input types with the same tag resides.Registrywhere eitherInputsorBenchmarkswere previously threaded.diskann_benchmark:README.md: The README has gotten a little out of date and would be even more so after this change. I took a pass at updating it.Inputregistration sites are removed and theBenchmarkregistrations are updated to use the new API which can returnRegistryError. I keptanyhow::Erroras the return type for better error stack-traces if we ever do end up with a duplicate registration error.