rapidsai · rapids-bot · Apr 9, 2026 · Mar 6, 2026 · Mar 6, 2026 · Mar 6, 2026
@@ -106,24 +106,36 @@ Running the benchmarks
 End-to-end: smaller-scale benchmarks (<1M to 10M)
 -------------------------------------------------
 
-The steps below demonstrate how to download, install, and run benchmarks on a subset of 10M vectors from the Yandex Deep-1B dataset By default the datasets will be stored and used from the folder indicated by the `RAPIDS_DATASET_ROOT_DIR` environment variable if defined, otherwise a datasets sub-folder from where the script is being called:
+The steps below demonstrate how to download, install, and run benchmarks on a subset of 10M vectors from the Yandex Deep-1B dataset. By default, datasets are stored in and read from the folder given by the `RAPIDS_DATASET_ROOT_DIR` environment variable if it is defined; otherwise, a `datasets` sub-folder of the directory from which the script is called is used.
 
 .. code-block:: bash
 
-
-    # (1) prepare dataset.
+    # (1) Prepare dataset.
     python -m cuvs_bench.get_dataset --dataset deep-image-96-angular --normalize
 
-    # (2) build and search index
-    python -m cuvs_bench.run --dataset deep-image-96-inner --algorithms cuvs_cagra --batch-size 10 -k 10
+.. code-block:: python
+
+    # (2) Build and search index.
+    from cuvs_bench.orchestrator import BenchmarkOrchestrator
+
+    orchestrator = BenchmarkOrchestrator(backend_type="cpp_gbench")
+    results = orchestrator.run_benchmark(
+        dataset="deep-image-96-inner",
+        algorithms="cuvs_cagra",
+        count=10,
+        batch_size=10,
+        build=True,
+        search=True,
+    )
 
-    # (3) export data
+.. code-block:: bash
+
+    # (3) Export data.
     python -m cuvs_bench.run --data-export --dataset deep-image-96-inner
 
-    # (4) plot results
+    # (4) Plot results.
     python -m cuvs_bench.plot --dataset deep-image-96-inner
 
-
 .. list-table::
 
  * - Dataset name
@@ -192,19 +204,33 @@ The steps below demonstrate how to download, install, and run benchmarks on a su
 .. code-block:: bash
 
     mkdir -p datasets/deep-1B
-    # (1) prepare dataset
+    # (1) Prepare dataset.
     # download manually "Ground Truth" file of "Yandex DEEP"
     # suppose the file name is deep_new_groundtruth.public.10K.bin
     python -m cuvs_bench.split_groundtruth --groundtruth datasets/deep-1B/deep_new_groundtruth.public.10K.bin
     # two files 'groundtruth.neighbors.ibin' and 'groundtruth.distances.fbin' should be produced
 
-    # (2) build and search index
-    python -m cuvs_bench.run --dataset deep-1B --algorithms cuvs_cagra --batch-size 10 -k 10
+.. code-block:: python
 
-    # (3) export data
+    # (2) Build and search index.
+    from cuvs_bench.orchestrator import BenchmarkOrchestrator
+
+    orchestrator = BenchmarkOrchestrator(backend_type="cpp_gbench")
+    results = orchestrator.run_benchmark(
+        dataset="deep-1B",
+        algorithms="cuvs_cagra",
+        count=10,
+        batch_size=10,
+        build=True,
+        search=True,
+    )
+
+.. code-block:: bash
+
+    # (3) Export data.
     python -m cuvs_bench.run --data-export --dataset deep-1B
 
-    # (4) plot results
+    # (4) Plot results.
     python -m cuvs_bench.plot --dataset deep-1B
 
 The usage of `python -m cuvs_bench.split_groundtruth` is:
@@ -414,7 +440,7 @@ Creating and customizing dataset configurations
 
 A single configuration will often define a set of algorithms, with associated index and search parameters, that can be generalized across datasets. We use YAML to define dataset-specific and algorithm-specific configurations.
 
-A default `datasets.yaml` is provided by CUVS in `${CUVS_HOME}/python/cuvs_bench/src/cuvs_bench/run/conf` with configurations available for several datasets. Here's a simple example entry for the `sift-128-euclidean` dataset:
+A default `datasets.yaml` is provided by cuVS in `${CUVS_HOME}/python/cuvs_bench/cuvs_bench/config/datasets/datasets.yaml` with configurations available for several datasets. Here's a simple example entry for the `sift-128-euclidean` dataset:
 
 .. code-block:: yaml
 
@@ -430,6 +456,9 @@ Configuration files for ANN algorithms supported by `cuvs-bench` are provided in
 .. code-block:: yaml
 
     name: cuvs_cagra
+    constraints:
+      build: cuvs_bench.config.algos.constraints.cuvs_cagra_build
+      search: cuvs_bench.config.algos.constraints.cuvs_cagra_search
     groups:
       base:
         build:
@@ -447,9 +476,11 @@ Configuration files for ANN algorithms supported by `cuvs-bench` are provided in
 
 The default parameters for which the benchmarks are run can be overridden by creating a custom YAML file for algorithms with a `base` group.
 
-There config above has 2 fields:
-1. `name` - define the name of the algorithm for which the parameters are being specified.
-2. `groups` - define a run group which has a particular set of parameters. Each group helps create a cross-product of all hyper-parameter fields for `build` and `search`.
+The config above has 3 fields:
+
+1. `name` - The name of the algorithm for which the parameters are being specified.
+2. `constraints` - Optional. Python import paths to functions that validate build and search parameter combinations (e.g. ``cuvs_bench.config.algos.constraints.cuvs_cagra_build``). Each function returns ``True`` if the parameters are valid, ``False`` otherwise; invalid combinations are skipped and not benchmarked.
+3. `groups` - Run groups, each with a set of parameters. Each group defines a cross-product of all hyper-parameter fields for `build` and `search`.
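+
+As a minimal sketch of what such a validation function might look like, the hypothetical example below rejects build parameter sets whose ``graph_degree`` exceeds ``intermediate_graph_degree``. The function name, its ``(params, dims)`` signature, and the rule itself are assumptions made for illustration; they are not taken from the ``cuvs_bench`` sources.
+
+.. code-block:: python
+
+    def example_build_constraint(params, dims):
+        """Hypothetical build constraint; the signature and rule are assumed."""
+        # Only apply the rule when both parameters are present.
+        if "graph_degree" in params and "intermediate_graph_degree" in params:
+            return params["graph_degree"] <= params["intermediate_graph_degree"]
+        # Nothing to check, so treat the combination as valid.
+        return True
+
+
+    # Valid: the graph degree does not exceed the intermediate degree.
+    ok = example_build_constraint(
+        {"graph_degree": 32, "intermediate_graph_degree": 64}, dims=96
+    )
+    # Invalid: a combination like this one would be skipped, not benchmarked.
+    skipped = example_build_constraint(
+        {"graph_degree": 64, "intermediate_graph_degree": 32}, dims=96
+    )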
 
 The table below contains all algorithms supported by cuVS. Each unique algorithm will have its own set of `build` and `search` settings. The :doc:`ANN Algorithm Parameter Tuning Guide <param_tuning>` contains detailed instructions on choosing build and search parameters for each supported algorithm.
 
@@ -626,4 +657,5 @@ Add a new entry to `algos.yaml` to map the name of the algorithm to its binary e
    build.rst
    datasets.rst
    param_tuning.rst
+   pluggable_backend.rst
    wiki_all_dataset.rst
@@ -4,6 +4,38 @@ cuVS Bench Parameter Tuning Guide
 
 This guide outlines the various parameter settings that can be specified in :doc:`cuVS Benchmarks <index>` yaml configuration files and explains the impact they have on corresponding algorithms to help inform their settings for benchmarking across desired levels of recall.
 
+Benchmark modes
+===============
+
+When you run benchmarks with ``BenchmarkOrchestrator.run_benchmark()``, you can choose how parameters are explored:
+
+**Sweep mode (default)**
+
+Pass ``mode="sweep"`` or omit ``mode``. The orchestrator builds the full Cartesian product of all build and search parameter lists defined in the algorithm YAML (see :doc:`Creating and customizing dataset configurations <index>`). Every valid combination (after constraint filtering) is run. Use this for exhaustive comparison across the configured parameter grid.
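+
+The expansion described above can be pictured with the short sketch below. It is not the orchestrator's implementation: ``expand_grid``, ``is_valid``, and the parameter values are hypothetical stand-ins for one YAML group and its constraint function.
+
+.. code-block:: python
+
+    import itertools
+
+
+    def expand_grid(param_lists):
+        """Yield one dict per combination of the given parameter lists."""
+        names = list(param_lists)
+        for values in itertools.product(*(param_lists[n] for n in names)):
+            yield dict(zip(names, values))
+
+
+    def is_valid(params):
+        # Hypothetical constraint used only for this illustration.
+        return params["graph_degree"] <= params["intermediate_graph_degree"]
+
+
+    # Hypothetical build parameter lists (2 x 2 = 4 combinations).
+    build_lists = {
+        "graph_degree": [32, 64],
+        "intermediate_graph_degree": [32, 64],
+    }
+
+    all_combinations = list(expand_grid(build_lists))
+    # Constraint filtering drops the (64, 32) combination, leaving 3 to run.
+    valid_combinations = [p for p in all_combinations if is_valid(p)]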
+
+**Tune mode**
+
+Pass ``mode="tune"`` to perform hyperparameter optimization using Optuna instead of running every combination. Tune mode takes the following arguments:
+
+- **constraints** (dict, required): The optimization target and optional bounds. One metric must be set to ``"maximize"`` or ``"minimize"`` (the goal). Other metrics can set hard limits with ``{"min": X}`` or ``{"max": X}``. Examples: ``{"recall": "maximize", "latency": {"max": 10}}`` or ``{"latency": "minimize", "recall": {"min": 0.95}}``.
+- **n_trials** (int, optional): Maximum number of Optuna trials (default 100). Ignored in sweep mode.
+
+Example:
+
+.. code-block:: python
+
+    results = orchestrator.run_benchmark(
+        mode="tune",
+        dataset="deep-image-96-inner",
+        algorithms="cuvs_cagra",
+        constraints={"recall": "maximize", "latency": {"max": 5.0}},
+        n_trials=50,
+        count=10,
+        batch_size=10,
+    )
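+
+To make the shape of the ``constraints`` dict concrete, the sketch below separates the goal from the hard limits and checks one set of measured metrics against those limits. The helpers ``split_constraints`` and ``within_bounds`` and the metric values are hypothetical; they illustrate the semantics described above and are not part of ``cuvs_bench`` or Optuna.
+
+.. code-block:: python
+
+    def split_constraints(constraints):
+        """Separate the optimization goal from the hard bounds."""
+        goals = {
+            m: v for m, v in constraints.items() if v in ("maximize", "minimize")
+        }
+        bounds = {m: v for m, v in constraints.items() if isinstance(v, dict)}
+        return goals, bounds
+
+
+    def within_bounds(metrics, bounds):
+        """Return True if every bounded metric respects its min/max limit."""
+        for metric, limit in bounds.items():
+            value = metrics[metric]
+            if "min" in limit and value < limit["min"]:
+                return False
+            if "max" in limit and value > limit["max"]:
+                return False
+        return True
+
+
+    goals, bounds = split_constraints(
+        {"recall": "maximize", "latency": {"max": 5.0}}
+    )
+    # Hypothetical measurements from one trial.
+    feasible = within_bounds({"recall": 0.97, "latency": 4.2}, bounds)
+    infeasible = within_bounds({"recall": 0.99, "latency": 7.5}, bounds)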
+
+The parameter tables below describe the build and search knobs that sweep mode varies and that tune mode can optimize.
+
 cuVS Indexes
 ============