Close vector search final-phase gaps with sample configs and queries #18168

Open
xiangfu0 wants to merge 3 commits into apache:master from xiangfu0:vector-phase-5

xiangfu0 (Contributor) commented Apr 11, 2026

What changed

This PR closes the remaining mergeable vector-search gaps for the final phase of the Pinot vector stack.

It makes the generic quantizer path real in the build and query path, wires HNSW runtime controls into actual search behavior, brings IVF_ON_DISK to filter-aware ANN parity, promotes approximate-radius capability flags only where real backend support exists, and narrows the mutable/offline gap with a mergeable convergence layer instead of speculative mutable IVF indexing.

Concretely, the diff includes:

  • real SQ8 and SQ4 quantizer integration through the IVF creator/reader/search paths
  • HNSW query-time controls for vectorEfSearch, relative-distance checking, and bounded-queue behavior
  • immutable and mutable HNSW on the same runtime-control surface
  • IVF_ON_DISK FILTER_THEN_ANN support with explain/debug reporting
  • backend-aware approximate radius support plus truthful capability metadata
  • clearer mutable fallback behavior and mixed consuming/immutable explainability
  • the final close-out design note under docs/design/vector-backends-closeout.md

Why this changed

After the earlier vector phases, the remaining gaps were mostly consistency and execution-path issues rather than broad roadmap work. Several features existed in validators, explain metadata, or capability flags without being fully real in the search path, and mutable versus immutable behavior still diverged in ways that were hard to reason about operationally.

This PR closes those concrete gaps while keeping the scope narrow enough to merge safely.

User manual

1. Sample table configs

HNSW

{
  "fieldConfigList": [
    {
      "name": "embedding",
      "encodingType": "RAW",
      "indexType": "VECTOR",
      "properties": {
        "vectorIndexType": "HNSW",
        "vectorDimension": 1536,
        "vectorDistanceFunction": "COSINE",
        "version": 1,
        "maxCon": "16",
        "beamWidth": "200"
      }
    }
  ]
}

IVF_FLAT with scalar quantization

{
  "fieldConfigList": [
    {
      "name": "embedding",
      "encodingType": "RAW",
      "indexType": "VECTOR",
      "properties": {
        "vectorIndexType": "IVF_FLAT",
        "vectorDimension": 768,
        "vectorDistanceFunction": "EUCLIDEAN",
        "version": 1,
        "nlist": "128",
        "trainSampleSize": "20000",
        "quantizer": "SQ8"
      }
    }
  ]
}

IVF_FLAT and IVF_ON_DISK now accept quantizer=FLAT|SQ8|SQ4.
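
As background on what an SQ8 setting implies, here is a minimal sketch of one common scalar-quantization scheme, in which each float dimension is mapped to a single byte. It is illustrative only: the class name `Sq8Sketch`, the per-vector min/max range, and the rounding rule are assumptions, and none of it is taken from the quantizer code in this PR.

```java
/** Illustrative SQ8-style scalar quantizer; hypothetical, not Pinot's implementation. */
final class Sq8Sketch {
  final float min;     // smallest value seen in the vector
  final float scale;   // width of one quantization step
  final byte[] codes;  // one unsigned 8-bit code per dimension

  private Sq8Sketch(float min, float scale, byte[] codes) {
    this.min = min;
    this.scale = scale;
    this.codes = codes;
  }

  /** Maps each float to an integer code in [0, 255] using the vector's own min/max range. */
  static Sq8Sketch encode(float[] vector) {
    float min = Float.POSITIVE_INFINITY;
    float max = Float.NEGATIVE_INFINITY;
    for (float x : vector) {
      min = Math.min(min, x);
      max = Math.max(max, x);
    }
    // Guard against a constant vector, where max == min would give a zero step.
    float scale = max > min ? (max - min) / 255f : 1f;
    byte[] codes = new byte[vector.length];
    for (int i = 0; i < vector.length; i++) {
      int q = Math.round((vector[i] - min) / scale);
      codes[i] = (byte) Math.max(0, Math.min(255, q));
    }
    return new Sq8Sketch(min, scale, codes);
  }

  /** Reconstructs approximate floats; each value is off by roughly half a step at most. */
  float[] decode() {
    float[] out = new float[codes.length];
    for (int i = 0; i < codes.length; i++) {
      out[i] = min + (codes[i] & 0xFF) * scale;
    }
    return out;
  }
}
```

Under this scheme a stored vector needs one byte per dimension instead of a four-byte float, at the cost of a small reconstruction error. An SQ4-style variant would use 16 levels and pack two codes per byte.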

IVF_ON_DISK with scalar quantization

{
  "fieldConfigList": [
    {
      "name": "embedding",
      "encodingType": "RAW",
      "indexType": "VECTOR",
      "properties": {
        "vectorIndexType": "IVF_ON_DISK",
        "vectorDimension": 768,
        "vectorDistanceFunction": "EUCLIDEAN",
        "version": 1,
        "nlist": "256",
        "trainSampleSize": "50000",
        "quantizer": "SQ4"
      }
    }
  ]
}

IVF_PQ

{
  "fieldConfigList": [
    {
      "name": "embedding",
      "encodingType": "RAW",
      "indexType": "VECTOR",
      "properties": {
        "vectorIndexType": "IVF_PQ",
        "vectorDimension": 768,
        "vectorDistanceFunction": "EUCLIDEAN",
        "version": 1,
        "nlist": "256",
        "trainSampleSize": "50000",
        "pqM": "32",
        "pqNbits": "8",
        "quantizer": "PQ"
      }
    }
  ]
}

IVF_PQ still accepts quantizer=FLAT for backward compatibility, but PQ is the intended setting.
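
The quantizer rules stated in this section can be summarized as a small lookup. The sketch below is a hypothetical illustration, not the `VectorIndexConfigValidator` logic from the diff; the class and method names are made up, and HNSW is deliberately left out because this description does not say which quantizer values it accepts.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical summary of the backend/quantizer combinations described above. */
final class QuantizerRulesSketch {
  enum Quantizer { FLAT, SQ8, SQ4, PQ }

  // Only the combinations named in this PR description are listed.
  private static final Map<String, Set<Quantizer>> DESCRIBED = Map.of(
      "IVF_FLAT", EnumSet.of(Quantizer.FLAT, Quantizer.SQ8, Quantizer.SQ4),
      "IVF_ON_DISK", EnumSet.of(Quantizer.FLAT, Quantizer.SQ8, Quantizer.SQ4),
      // FLAT is kept for backward compatibility; PQ is the intended setting.
      "IVF_PQ", EnumSet.of(Quantizer.PQ, Quantizer.FLAT));

  /** Returns true only when the pair is one of the combinations described above. */
  static boolean isDescribedCombination(String vectorIndexType, String quantizer) {
    Set<Quantizer> allowed = DESCRIBED.get(vectorIndexType);
    if (allowed == null) {
      return false; // backend not covered by this sketch
    }
    try {
      return allowed.contains(Quantizer.valueOf(quantizer));
    } catch (IllegalArgumentException e) {
      return false; // unknown quantizer name
    }
  }
}
```

For example, `isDescribedCombination("IVF_ON_DISK", "SQ4")` is true, while `isDescribedCombination("IVF_PQ", "SQ8")` is false.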

2. Sample SQL queries

Basic top-K ANN query

SELECT cosineDistance(embedding, ARRAY[0.12, 0.34, 0.56]) AS dist, doc_id
FROM my_table
WHERE VECTOR_SIMILARITY(embedding, ARRAY[0.12, 0.34, 0.56], 10)
ORDER BY dist ASC
LIMIT 10

HNSW runtime controls

set vectorEfSearch=128;
set vectorHnswUseRelativeDistance=false;
set vectorHnswUseBoundedQueue=false;
SELECT cosineDistance(embedding, ARRAY[0.12, 0.34, 0.56]) AS dist, doc_id
FROM my_table
WHERE VECTOR_SIMILARITY(embedding, ARRAY[0.12, 0.34, 0.56], 10)
ORDER BY dist ASC
LIMIT 10

These controls now affect real HNSW search behavior on both immutable and mutable HNSW segments.

IVF runtime controls

set vectorNprobe=16;
set vectorMaxCandidates=500;
set vectorExactRerank=true;
SELECT l2Distance(embedding, ARRAY[1.0, 2.0, 3.0]) AS dist, doc_id
FROM my_table
WHERE VECTOR_SIMILARITY(embedding, ARRAY[1.0, 2.0, 3.0], 20)
ORDER BY dist ASC
LIMIT 20
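
As background for `vectorNprobe`, the sketch below shows the textbook inverted-file search loop: rank the cluster centroids by distance to the query, scan only the `nprobe` closest posting lists, and keep the best `topK` candidates. It is a generic illustration under simplifying assumptions (squared L2 distance, everything in memory) and does not model `vectorMaxCandidates` or `vectorExactRerank`; all names are hypothetical and unrelated to the IVF readers in this PR.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Generic IVF probing sketch; hypothetical, not the Pinot IVF reader. */
final class IvfProbeSketch {
  /** Squared Euclidean distance between two vectors of equal length. */
  static float l2(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  /**
   * centroids[c] is the centroid of cluster c, lists[c] holds the doc ids assigned to it,
   * and vectors[docId] is the stored vector. Returns up to topK doc ids, nearest first.
   */
  static List<Integer> search(float[][] centroids, int[][] lists, float[][] vectors,
      float[] query, int nprobe, int topK) {
    // Rank clusters by centroid distance to the query.
    List<Integer> clusterOrder = new ArrayList<>();
    for (int c = 0; c < centroids.length; c++) {
      clusterOrder.add(c);
    }
    clusterOrder.sort(Comparator.comparingDouble(c -> l2(centroids[c], query)));

    // Collect candidates from the nprobe closest clusters only.
    List<Integer> candidates = new ArrayList<>();
    for (int i = 0; i < Math.min(nprobe, clusterOrder.size()); i++) {
      for (int docId : lists[clusterOrder.get(i)]) {
        candidates.add(docId);
      }
    }

    // Score the candidates exactly and keep the best topK.
    candidates.sort(Comparator.comparingDouble(d -> l2(vectors[d], query)));
    return new ArrayList<>(candidates.subList(0, Math.min(topK, candidates.size())));
  }
}
```

A larger `nprobe` scans more posting lists, which generally raises recall and latency together.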

Approximate radius query

SELECT l2Distance(embedding, ARRAY[1.0, 2.0, 3.0]) AS dist, doc_id
FROM my_table
WHERE VECTOR_SIMILARITY_RADIUS(embedding, ARRAY[1.0, 2.0, 3.0], 0.75)
ORDER BY dist ASC
LIMIT 200

3. Sample HTTP query-options usage

{
  "sql": "SELECT cosineDistance(embedding, ARRAY[0.12, 0.34, 0.56]) AS dist, doc_id FROM my_table WHERE VECTOR_SIMILARITY(embedding, ARRAY[0.12, 0.34, 0.56], 10) ORDER BY dist ASC LIMIT 10",
  "queryOptions": "vectorEfSearch=128;vectorHnswUseRelativeDistance=false;vectorHnswUseBoundedQueue=false"
}
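
A minimal sketch of how a client might assemble the request body above. The helper names are hypothetical, the escaping only covers quotes and backslashes, and how the payload is sent to the broker is left out because this description does not specify an endpoint.

```java
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper for building the JSON body shown above. */
final class QueryPayloadSketch {
  /** Joins options as key=value pairs separated by semicolons, in map iteration order. */
  static String queryOptions(Map<String, String> options) {
    return options.entrySet().stream()
        .map(e -> e.getKey() + "=" + e.getValue())
        .collect(Collectors.joining(";"));
  }

  /** Minimal JSON string escaping: backslashes and double quotes only. */
  private static String escape(String s) {
    return s.replace("\\", "\\\\").replace("\"", "\\\"");
  }

  /** Builds a body with the two fields used in the example: sql and queryOptions. */
  static String payload(String sql, Map<String, String> options) {
    return "{\"sql\":\"" + escape(sql) + "\",\"queryOptions\":\""
        + escape(queryOptions(options)) + "\"}";
  }
}
```

Passing a `LinkedHashMap` keeps the options in insertion order, which makes the generated string easy to compare against the sample.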

4. Behavior notes

  • Existing SQL remains unchanged: VECTOR_SIMILARITY(...) and VECTOR_SIMILARITY_RADIUS(...) keep the same surface.
  • HNSW supports the runtime controls in this PR.
  • IVF_FLAT, IVF_ON_DISK, and IVF_PQ now truthfully advertise approximate-radius support because the backend path is real.
  • IVF_ON_DISK now participates in filter-aware ANN execution and FILTER_THEN_ANN planning when supported.
  • Mutable HNSW is on the same runtime-control surface as immutable HNSW.
  • Mutable IVF-family ANN is still not introduced here. On consuming segments, unsupported mutable backends converge through explicit exact-scan fallback with explain/debug visibility instead of hidden behavior changes.
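
The mutable-segment behavior in the notes above can be pictured as a small decision function. This is a hypothetical sketch: the enum values, class name, and reason strings are invented for illustration and are not the names used in `FilterPlanNode` or the explain output.

```java
/** Hypothetical decision sketch for the fallback behavior described above. */
final class VectorFallbackSketch {
  enum Backend { HNSW, IVF_FLAT, IVF_ON_DISK, IVF_PQ }
  enum Mode { ANN, EXACT_SCAN_FALLBACK }

  static final class Decision {
    final Mode mode;
    final String reason; // illustrative text, not the real explain string

    Decision(Mode mode, String reason) {
      this.mode = mode;
      this.reason = reason;
    }
  }

  /** Immutable segments and mutable HNSW use ANN; mutable IVF-family falls back to exact scan. */
  static Decision decide(Backend backend, boolean mutableSegment) {
    if (!mutableSegment) {
      return new Decision(Mode.ANN, "immutable segment has a built " + backend + " index");
    }
    if (backend == Backend.HNSW) {
      return new Decision(Mode.ANN, "mutable HNSW shares the immutable runtime controls");
    }
    return new Decision(Mode.EXACT_SCAN_FALLBACK,
        "no mutable " + backend + " index on consuming segment; using exact scan");
  }
}
```

The point of returning a reason alongside the mode is that the fallback stays visible to explain and debug tooling instead of happening silently.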

Benchmark snapshot

I also ran a local synthetic benchmark pass against the new feature surfaces on a corpus of 4000 vectors with 64 dimensions each, using 30 queries, topK=10, nlist=64, nprobe=8, radiusTargetMatches=20, and radiusMaxCandidates=256.

Highlights:

  • Quantized IVF real-path search:
    • IVF_FLAT + FLAT: build 487.8 ms, size 1032.4 KB, recall@10 0.4700, p50 257.8 us
    • IVF_FLAT + SQ8: build 317.4 ms, size 282.9 KB, recall@10 0.4700, p50 190.2 us
    • IVF_FLAT + SQ4: build 305.2 ms, size 157.9 KB, recall@10 0.4633, p50 183.4 us
    • IVF_PQ: build 1605.5 ms, size 158.5 KB, recall@10 0.3533, p50 476.6 us
  • HNSW runtime controls affect both immutable and mutable search paths:
    • immutable default: recall@10 0.6100, p50 190.6 us
    • immutable ef=64: recall@10 0.0667, p50 56.9 us
    • mutable default: recall@10 0.6033, p50 1145.5 us
    • mutable ef=64: recall@10 0.1100, p50 813.3 us
  • IVF_ON_DISK filter-aware ANN:
    • 100% selectivity: recall@10 0.4567, p50 205.0 us vs exact filtered p50 1149.3 us
    • 10% selectivity: recall@10 0.3700, p50 75.5 us vs exact filtered p50 59.6 us
    • 1% selectivity: recall@10 0.2467, p50 29.7 us vs exact filtered p50 4.0 us
  • Approximate radius:
    • IVF_FLAT Euclidean approximate radius: recall 0.4200, p50 75.2 us vs exact scan p50 267.5 us
    • IVF_ON_DISK cosine approximate radius: recall 0.4200, p50 82.4 us vs exact scan p50 242.1 us
  • Mixed immutable plus mutable HNSW:
    • default: recall@10 0.6433, p50 2425.9 us
    • ef=64: recall@10 0.0600, p50 965.5 us

The benchmark intent was not to claim globally tuned numbers. It was to verify that the newly-realized control surfaces and backend capabilities execute through the intended paths and produce measurable latency/recall tradeoffs.
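
For reference, recall@k in results like these is conventionally the fraction of the exact top-k neighbors that the approximate search also returns. The sketch below computes it for one query; it assumes doc ids within each result list are distinct, and it is not the benchmark harness used for the numbers above.

```java
import java.util.HashSet;
import java.util.Set;

/** Conventional recall@k for a single query; hypothetical helper, not the PR's benchmark code. */
final class RecallSketch {
  /**
   * approxIds and exactIds are doc ids ordered nearest first.
   * Returns |top-k approx ∩ top-k exact| / |top-k exact|.
   */
  static double recallAtK(int[] approxIds, int[] exactIds, int k) {
    Set<Integer> truth = new HashSet<>();
    for (int i = 0; i < Math.min(k, exactIds.length); i++) {
      truth.add(exactIds[i]);
    }
    if (truth.isEmpty()) {
      return 1.0; // nothing to find, so nothing was missed
    }
    int hits = 0;
    for (int i = 0; i < Math.min(k, approxIds.length); i++) {
      if (truth.contains(approxIds[i])) {
        hits++;
      }
    }
    return (double) hits / truth.size();
  }
}
```

A reported figure such as recall@10 would then typically be the mean of this value over all benchmark queries.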

User and developer impact

Users get runtime controls and capability flags that now match real backend behavior, especially for HNSW and approximate radius queries. Filter-aware ANN semantics are more consistent across IVF backends, and consuming-segment fallback behavior is more explicit instead of relying on opaque degradation.

Developers get a documented final-phase design, a real quantizer abstraction in the vector path, and stronger regression coverage for mutable HNSW, IVF_ON_DISK filter-aware search, and radius execution behavior.

Root cause

The remaining vector-stack issues came from partial feature rollout across layers: validation and metadata had moved ahead of the actual readers/operators in some places, while mutable and immutable paths had diverged in others. That left capability mismatches, explain-only knobs, and backend-specific behavior that was not consistently reflected in runtime execution.

Validation

I ran the vector-focused local validation after rebasing onto the latest upstream/master:

./mvnw -pl pinot-common,pinot-core,pinot-segment-local,pinot-segment-spi -am \
  -Dtest=CalciteSqlParserVectorRadiusTest,VectorSimilarityRadiusPredicateTest,VectorConfigTest,VectorBackendCapabilitiesTest,VectorIndexConfigValidatorTest,VectorBackendTypeTest,VectorExecutionModeTest,SegmentWithNullValueVectorTest,VectorTransformFunctionTest,VectorSearchStrategyTest,FilterAwareVectorSearchTest,VectorSearchParamsTest,VectorSimilarityFilterOperatorTest,VectorRadiusFilterOperatorTest,ExactVectorScanFilterOperatorTest,VectorQueryExecutionContextTest,VectorCompoundQueryTest,VectorFunctionsTest,MutableVectorIndexTest,MutableSegmentImplNullValueVectorTest,MutableNullValueVectorTest,VectorIndexUtilsTest,IvfPqVectorIndexTest,IvfFlatVectorIndexTest,VectorQuantizationUtilsTest,VectorIndexTypeTest,IvfPqIndexFormatTest,VectorIndexTest,HnswVectorIndexCreatorTest,NullValueVectorCreatorTest,VectorIndexHandlerTest,NullValueVectorReaderImplTest,IvfFlatFilterAwareTest,IvfOnDiskFilterAwareTest,FilterPlanNodeTest \
  test -Dsurefire.failIfNoSpecifiedTests=false

I also ran the targeted follow-up validation for the final IVF_ON_DISK filter-aware hot-path fix:

./mvnw -pl pinot-segment-local,pinot-core -am \
  -Dtest=IvfOnDiskFilterAwareTest,FilterAwareVectorSearchTest,FilterPlanNodeTest,VectorRadiusFilterOperatorTest \
  test -Dsurefire.failIfNoSpecifiedTests=false

Result: BUILD SUCCESS

codecov-commenter commented Apr 11, 2026

Codecov Report

❌ Patch coverage is 63.34356% with 239 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.27%. Comparing base (34d3fd6) to head (1ffc1c6).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| .../readers/vector/LuceneHnswRuntimeControlUtils.java | 24.69% | 55 Missing and 6 partials ⚠️ |
| ...dex/readers/vector/IvfOnDiskVectorIndexReader.java | 47.16% | 48 Missing and 8 partials ⚠️ |
| ...index/readers/vector/IvfFlatVectorIndexReader.java | 70.27% | 10 Missing and 12 partials ⚠️ |
| ...t/index/readers/vector/IvfPqVectorIndexReader.java | 53.48% | 15 Missing and 5 partials ⚠️ |
| ...local/realtime/impl/vector/MutableVectorIndex.java | 72.13% | 15 Missing and 2 partials ⚠️ |
| .../segment/index/vector/VectorQuantizationUtils.java | 46.15% | 8 Missing and 6 partials ⚠️ |
| ...perator/filter/VectorSimilarityFilterOperator.java | 81.39% | 6 Missing and 2 partials ⚠️ |
| ...nt/index/readers/vector/HnswVectorIndexReader.java | 77.41% | 7 Missing ⚠️ |
| ...ava/org/apache/pinot/core/plan/FilterPlanNode.java | 25.00% | 4 Missing and 2 partials ⚠️ |
| ...ment/local/segment/index/vector/FlatQuantizer.java | 83.33% | 1 Missing and 4 partials ⚠️ |
| ... and 9 more | | |
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18168      +/-   ##
============================================
+ Coverage     63.13%   63.27%   +0.13%     
- Complexity     1610     1627      +17     
============================================
  Files          3213     3214       +1     
  Lines        195730   196281     +551     
  Branches      30240    30349     +109     
============================================
+ Hits         123583   124191     +608     
+ Misses        62281    62141     -140     
- Partials       9866     9949      +83     
| Flag | Coverage Δ |
|---|---|
| custom-integration1 | 100.00% <ø> (?) |
| integration | 100.00% <ø> (+100.00%) ⬆️ |
| integration1 | 100.00% <ø> (?) |
| integration2 | 0.00% <ø> (ø) |
| java-11 | 63.23% <63.34%> (+0.11%) ⬆️ |
| java-21 | 63.24% <63.34%> (+0.14%) ⬆️ |
| temurin | 63.27% <63.34%> (+0.13%) ⬆️ |
| unittests | 63.26% <63.34%> (+0.12%) ⬆️ |
| unittests1 | 55.23% <15.28%> (-0.15%) ⬇️ |
| unittests2 | 34.93% <48.61%> (+0.16%) ⬆️ |

xiangfu0 changed the title from "[codex] Close vector search final-phase gaps" to "Close vector search final-phase gaps" on Apr 11, 2026

Copilot AI left a comment

Pull request overview

This PR closes remaining gaps in Pinot’s vector-search “final phase” by making several previously parsed/advertised behaviors real in execution paths (quantizers, HNSW runtime knobs, IVF_ON_DISK filter-aware ANN, and approximate-radius support), and aligning capability metadata + explain/debug output with actual backend behavior.

Changes:

  • Wire SQ8/SQ4 quantizers through IVF_FLAT/IVF_ON_DISK build + read/search paths (with new v2 index format support).
  • Implement HNSW query-time controls (vectorEfSearch, relative-distance checks, bounded-queue behavior) across immutable + mutable Lucene HNSW.
  • Add approximate-radius reader SPI and route radius queries through threshold-aware candidate generation where supported; extend IVF_ON_DISK to be pre-filter aware.

Reviewed changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| pinot-spi/src/main/java/org/apache/pinot/spi/utils/CommonConstants.java | Adds query option keys for new HNSW runtime controls. |
| pinot-segment-spi/src/test/java/org/apache/pinot/segment/spi/index/creator/VectorIndexConfigValidatorTest.java | Adds validator test coverage for quantizer support/validation across backends. |
| pinot-segment-spi/src/test/java/org/apache/pinot/segment/spi/index/creator/VectorBackendTypeTest.java | Updates backend capability assertions (runtime params + IVF_ON_DISK coverage). |
| pinot-segment-spi/src/test/java/org/apache/pinot/segment/spi/index/creator/VectorBackendCapabilitiesTest.java | Updates/extends capabilities expectations and adds IVF_ON_DISK capability tests. |
| pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/reader/EfSearchAware.java | Extends SPI to carry additional HNSW runtime toggles (relative distance, bounded queue). |
| pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/reader/ApproximateRadiusVectorIndexReader.java | Introduces SPI for backends that can generate threshold-aware approximate radius candidates. |
| pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/creator/VectorIndexConfigValidator.java | Enables/validates quantizer selection by backend type (incl. legacy IVF_PQ behavior). |
| pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/creator/VectorExecutionMode.java | Removes outdated comment implying FILTER_THEN_ANN is not selectable. |
| pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/creator/VectorBackendType.java | Updates capability flags (runtime params, approximate radius, IVF_ON_DISK filter-aware). |
| pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/vector/IvfPqVectorIndexTest.java | Adds approximate-radius candidate-cap test for IVF_PQ reader. |
| pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/vector/IvfFlatVectorIndexTest.java | Adds SQ8/SQ4 round-trip tests and approximate-radius tests for IVF_FLAT. |
| pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfOnDiskFilterAwareTest.java | New tests for IVF_ON_DISK pre-filter behavior and debug counters. |
| pinot-segment-local/src/test/java/org/apache/pinot/segment/local/segment/index/creator/HnswVectorIndexCreatorTest.java | Adds tests proving HNSW runtime controls affect search + debug output. |
| pinot-segment-local/src/test/java/org/apache/pinot/segment/local/realtime/impl/vector/MutableVectorIndexTest.java | New tests for mutable HNSW runtime controls parity and debug output. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/vector/VectorQuantizationUtils.java | Adds quantizer type resolution + quantizer factory/serialization helpers. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/vector/IvfFlatVectorIndexCreator.java | Introduces IVF_FLAT file format v2 and writes quantized encoded vectors + quantizer metadata. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/LuceneHnswRuntimeControlUtils.java | New shared Lucene HNSW query/collector implementation honoring runtime controls. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfPqVectorIndexReader.java | Adds approximate-radius API and implementation for IVF_PQ. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfOnDiskVectorIndexReader.java | Adds v1/v2 format reading, quantizer-aware distance, pre-filter support, and approximate-radius path. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfFlatVectorIndexReader.java | Adds v1/v2 reading, quantizer-aware scoring, and approximate-radius path for IVF_FLAT. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/HnswVectorIndexReader.java | Routes HNSW searches through runtime-controlled Lucene query and surfaces debug fields. |
| pinot-segment-local/src/main/java/org/apache/pinot/segment/local/realtime/impl/vector/MutableVectorIndex.java | Aligns mutable HNSW with runtime controls + debug info; improves exception handling and resource closing. |
| pinot-core/src/test/java/org/apache/pinot/core/plan/FilterPlanNodeTest.java | Adds tests for mutable-segment vector fallback reason strings. |
| pinot-core/src/test/java/org/apache/pinot/core/operator/filter/VectorSimilarityFilterOperatorTest.java | Adds tests ensuring HNSW runtime knobs are dispatched/cleared and appear in explain output. |
| pinot-core/src/test/java/org/apache/pinot/core/operator/filter/VectorSearchParamsTest.java | Adds tests for parsing new HNSW runtime controls from query options. |
| pinot-core/src/test/java/org/apache/pinot/core/operator/filter/VectorRadiusFilterOperatorTest.java | Adds tests for approximate-radius routing and distance-function correctness for thresholds. |
| pinot-core/src/test/java/org/apache/pinot/core/operator/filter/FilterAwareVectorSearchTest.java | Asserts explain reports FILTER_THEN_ANN when pre-filter search is used. |
| pinot-core/src/main/java/org/apache/pinot/core/plan/FilterPlanNode.java | Makes vector fallback reasons segment-mutability-aware for better explainability. |
| pinot-core/src/main/java/org/apache/pinot/core/operator/filter/VectorSimilarityFilterOperator.java | Wires HNSW runtime params into readers, clears them, and extends explain/debug metadata. |
| pinot-core/src/main/java/org/apache/pinot/core/operator/filter/VectorSearchParams.java | Extends runtime params to include HNSW toggles and updates query-option parsing. |
| pinot-core/src/main/java/org/apache/pinot/core/operator/filter/VectorRadiusFilterOperator.java | Adds approximate-radius capability gating + explain attributes; uses threshold-aware candidate generation when available. |
| pinot-core/src/main/java/org/apache/pinot/core/operator/filter/VectorExplainContext.java | Carries additional HNSW runtime effective values into explain output. |
| pinot-common/src/main/java/org/apache/pinot/common/utils/config/QueryOptionsUtils.java | Adds parsing helpers for new HNSW boolean query options. |
| docs/design/vector-backends-closeout.md | Adds final-phase design closeout note documenting contracts/risks and insertion points. |

xiangfu0 marked this pull request as ready for review on Apr 11, 2026, 06:32
xiangfu0 changed the title from "Close vector search final-phase gaps" to "Close vector search final-phase gaps with sample configs and queries" on Apr 11, 2026
xiangfu0 added the labels vector (Related to vector similarity search) and index (Related to indexing (general)) on Apr 11, 2026

/** Controls whether HNSW uses relative-distance competitive checks during traversal.
* Defaults to true. Setting false disables score-threshold pruning. */
public static final String VECTOR_HNSW_USE_RELATIVE_DISTANCE = "vectorHnswUseRelativeDistance";
Contributor

Is HNSW implicit for all vector search? If so, remove it from the query option names for consistency.

Contributor Author

Addressed in 1ffc1c6e0c. I introduced canonical runtime option names vectorUseRelativeDistance and vectorUseBoundedQueue, kept the old HNSW-prefixed names as deprecated aliases for compatibility, and added conflict-checking plus regression tests in QueryOptionsUtils and VectorSearchParamsTest.
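
The alias handling described in this reply could look roughly like the sketch below. It is a hypothetical illustration, not the code added to `QueryOptionsUtils`: the class and method names are invented, and throwing on conflicting values is an assumption about what "conflict-checking" means here. Only the option names come from this thread.

```java
import java.util.Map;

/** Hypothetical resolver for a canonical boolean option with a deprecated alias. */
final class OptionAliasSketch {
  /**
   * Returns the parsed value of the canonical option, falling back to the deprecated alias.
   * Returns null when neither is set, so the caller can apply the backend default.
   */
  static Boolean resolve(Map<String, String> options, String canonical, String deprecatedAlias) {
    String canonicalValue = options.get(canonical);
    String aliasValue = options.get(deprecatedAlias);
    // Assumed conflict rule: both names set to different values is an error.
    if (canonicalValue != null && aliasValue != null
        && !canonicalValue.equalsIgnoreCase(aliasValue)) {
      throw new IllegalArgumentException(
          "Conflicting values for " + canonical + " and deprecated alias " + deprecatedAlias);
    }
    String value = canonicalValue != null ? canonicalValue : aliasValue;
    return value == null ? null : Boolean.valueOf(Boolean.parseBoolean(value));
  }
}
```

For example, `resolve(options, "vectorUseRelativeDistance", "vectorHnswUseRelativeDistance")` would accept either name on its own and reject a query that sets both to different values.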

Contributor

These are newly added configs, so there is no need to handle backward compatibility lol

Labels

index Related to indexing (general) vector Related to vector similarity search

4 participants