add config examples to use PQ and SQ indexing and search for wiki-1M with Cohere embeddings #1047
harsha-simhadri wants to merge 5 commits into main
Pull request overview
Adds new benchmark configuration examples intended to demonstrate product quantization (PQ) and spherical quantization workflows for the wikipedia-1M Cohere embedding dataset.
Changes:
- Added a PQ graph-index build+search example JSON for wikipedia-1M.
- Added an exhaustive spherical-quantization example JSON (currently named as a wiki1M graph-index example).
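For readers unfamiliar with the PQ workflow these configs exercise, here is a minimal, generic sketch of product-quantization encoding and decoding. It is not DiskANN's implementation: the function names (`pq_encode`, `pq_decode`) and the nested-list codebook layout are assumptions made only for illustration, and the codebooks are taken as given rather than trained.

```python
from typing import List, Sequence


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pq_encode(vector: Sequence[float],
              codebooks: List[List[List[float]]]) -> List[int]:
    """Encode `vector` as one centroid index per chunk.

    Hypothetical layout: codebooks[m] is the list of centroids for chunk m,
    and every chunk has the same dimension.
    """
    num_chunks = len(codebooks)
    chunk_dim, remainder = divmod(len(vector), num_chunks)
    if remainder != 0:
        raise ValueError("vector length must be divisible by the number of chunks")

    codes = []
    for m, centroids in enumerate(codebooks):
        chunk = vector[m * chunk_dim:(m + 1) * chunk_dim]
        # Pick the index of the nearest centroid for this chunk.
        nearest = min(range(len(centroids)),
                      key=lambda k: squared_l2(chunk, centroids[k]))
        codes.append(nearest)
    return codes


def pq_decode(codes: Sequence[int],
              codebooks: List[List[List[float]]]) -> List[float]:
    """Reconstruct an approximate vector by concatenating the chosen centroids."""
    approx: List[float] = []
    for code, centroids in zip(codes, codebooks):
        approx.extend(centroids[code])
    return approx
```

Each vector is stored as a short list of centroid indices, which is why PQ trades some recall for a much smaller memory footprint during search.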
Reviewed changes
Copilot reviewed 1 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| diskann-benchmark/example/graph-index-spherical-quantization-wiki1M.json | Adds an exhaustive spherical quantization benchmark config (but currently references siftsmall test data / exhaustive tag). |
| diskann-benchmark/example/graph-index-product-quantization-wiki1M.json | Adds a PQ graph-index benchmark config for wikipedia-1M (currently uses an unregistered job type). |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
The PR accidentally renamed spherical-exhaustive.json to graph-index-spherical-quantization-wiki1M.json, breaking the spherical_quantization_intergration test, which references the old name.
- Restore spherical-exhaustive.json with its original content
- Update graph-index-spherical-quantization-wiki1M.json with proper content

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage diff, main vs. #1047:
| Metric | main | #1047 | +/- |
|---|---|---|---|
| Coverage | 90.60% | 89.47% | -1.13% |
| Files | 461 | 461 | |
| Lines | 85494 | 85494 | |
| Hits | 77462 | 76498 | -964 |
| Misses | 8032 | 8996 | +964 |
@@ -0,0 +1,55 @@
{
  "search_directories": [
    "../big-ann-benchmarks/data/wikipedia_cohere" ],
Minor: In the spirit of self‑service, consider including guidance on how to download the dataset, so that even an AI agent can follow the steps and retrieve the required data.
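One small step in that direction, pending real download instructions, would be a pre-flight check that reports which `search_directories` entries are missing before a benchmark run. The sketch below is only an illustration: the helper name `missing_search_directories` is hypothetical, and it assumes relative entries resolve against the directory the benchmark is launched from, which may not match how the benchmark tool actually resolves them.

```python
from pathlib import Path
from typing import Any, Dict, List, Union


def missing_search_directories(config: Dict[str, Any],
                               base_dir: Union[str, Path]) -> List[str]:
    """Return the `search_directories` entries that do not exist on disk.

    Assumption: relative entries are resolved against `base_dir`.
    """
    missing = []
    for entry in config.get("search_directories", []):
        path = Path(entry)
        if not path.is_absolute():
            path = Path(base_dir) / path
        if not path.is_dir():
            # Keep the original string so the message matches the config file.
            missing.append(entry)
    return missing
```

A caller could load the example JSON with `json.load`, pass the resulting dictionary to this helper, and print each missing entry alongside a pointer to the dataset download steps.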