
SupportsHybrid + client-side RRF fallback #16

Merged: thorwhalen merged 1 commit into master from claude/hybrid-rrf on May 27, 2026

@thorwhalen
Member

SupportsHybrid + client-side RRF fallback

Closes #15. Refs #11 (third deferred item).

What ships

  • vd.hybrid_search(collection, ...) — single user-facing entry that works on every vd backend. Dispatches to a collection's native hybrid_search when isinstance(c, vd.SupportsHybrid); otherwise fuses dense collection.search() with a pure-Python BM25 lexical scan via Reciprocal Rank Fusion.
  • SupportsHybrid protocol refined — portable contract is RRF; per-backend weighted-fusion knobs (alpha, etc.) accepted via **kwargs.
  • vd.bm25_lexical_search(collection, query_text, ...) — exposed as a standalone helper; pluggable into hybrid_search via lexical_search=.
  • Three native adapters: weaviate, elasticsearch, redis — each implements _lexical_query using its backend's BM25 and a hybrid_search method that calls the shared AbstractCollection._hybrid_via_rrf orchestrator. Chosen because these adapters already provision text-side indexing at create_collection time.
  • Twelve other backends use the client-side BM25 + RRF fallback. Six of them (milvus, mongodb, pinecone, turbopuffer, qdrant, lancedb) have native hybrid in their underlying engine and are tracked for a follow-up — each needs its own infra surgery (sparse-vec models, Atlas Search index, BM25 function fields, FTS index) too large to cram into this PR.
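
The native-versus-fallback dispatch described above can be sketched with a runtime-checkable protocol. This is a minimal illustration, not vd's actual code: `SupportsHybridSketch`, `hybrid_search_sketch`, the `limit` parameter, and the toy classes are hypothetical names chosen for the example.

```python
from typing import Any, Callable, Protocol, runtime_checkable


@runtime_checkable
class SupportsHybridSketch(Protocol):
    """Hypothetical stand-in for vd.SupportsHybrid; isinstance() only checks
    that a `hybrid_search` attribute exists on the object."""

    def hybrid_search(self, query: str, *, limit: int = 10, **kwargs: Any) -> list: ...


def hybrid_search_sketch(
    collection: Any,
    query: str,
    *,
    fallback: Callable[..., list],
    limit: int = 10,
    **kwargs: Any,
) -> list:
    """Use the collection's native hybrid search if it has one, else the fallback."""
    if isinstance(collection, SupportsHybridSketch):
        return collection.hybrid_search(query, limit=limit, **kwargs)
    return fallback(collection, query, limit=limit, **kwargs)


class NativeCollection:
    """Toy collection that opts in by defining hybrid_search."""

    def hybrid_search(self, query, *, limit=10, **kwargs):
        return [("native", query)]


class PlainCollection:
    """Toy collection with no hybrid_search, so it takes the fallback path."""


def client_side_fallback(collection, query, *, limit=10, **kwargs):
    # Placeholder for "dense search + lexical scan + RRF".
    return [("fallback", query)]
```

Because the check happens at the top-level function rather than on a base-class method, a collection that lacks `hybrid_search` never appears to satisfy the protocol.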

Verification

  • 71 new hybrid tests pass across 13 reachable backends (memory, chroma, faiss, duckdb, lancedb, qdrant, pgvector, mongodb on the fallback path; weaviate, elasticsearch, redis on the native path).
  • Full vd suite: 355 passed, no regressions (was 344 before this PR — 11 new tests).
  • Verified locally against the docker-compose harness from tests/docker-compose.yml (weaviate/ES/redis on real engines).

Design decisions (confirmed with maintainer in #15)

  1. RRF-only portable contract. Weighted-blend alpha and other fusion variants go through **kwargs. Their semantics don't agree across backends, so promising them in the protocol would mislead.
  2. Top-level vd.hybrid_search is the user-facing entry, not a baseline method on Collection — so the runtime SupportsHybrid check is meaningful and the fallback path doesn't masquerade as native.
  3. Pure-Python BM25 for the fallback (no new deps; ~50 LOC; honest O(N) cost), pluggable via lexical_search= so users can drop in rank-bm25, a real FTS, or a backend-specific text search.
  4. pgvector stays on the fallback path in v1; tsvector + SQL fusion is a clean follow-up.
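
For a sense of what a dependency-free BM25 scan (decision 3) involves, here is a rough sketch of Okapi BM25 over an in-memory mapping of ids to text. It is not the PR's implementation: the tokenizer, the `k1`/`b` defaults, the non-negative IDF variant, and the function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on word characters (an assumed, simplistic tokenizer)."""
    return re.findall(r"\w+", text.lower())


def bm25_scores(docs, query_text, k1=1.5, b=0.75):
    """Score every document in `docs` (id -> text) against the query; one O(N) pass."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    if n_docs == 0:
        return {}
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    query_terms = set(tokenize(query_text))
    scores = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5)).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / (avg_len or 1))
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[doc_id] = score
    return scores


def bm25_rank(docs, query_text, limit=10):
    """Return ids of matching documents, best first."""
    scores = bm25_scores(docs, query_text)
    return sorted(scores, key=scores.get, reverse=True)[:limit]


docs = {
    "a": "the quick brown fox",
    "b": "lazy dog sleeps",
    "c": "quick quick fox jumps",
}
ranking = bm25_rank(docs, "quick fox")  # "b" shares no term, so it is absent
```

The whole corpus is re-tokenized on every call, which is what makes the cost linear in collection size and why a pluggable `lexical_search=` hook is useful for larger collections.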

Still deferred from #11 (after this PR)

  • pinecone / turbopuffer adapter verification (cloud accounts).
  • Native hybrid implementations on milvus, mongodb, pinecone, turbopuffer, qdrant, lancedb (follow-up issue to be opened).
  • AsyncClient / AsyncCollection parallel protocols.
  • LangChain adapter shims + the F() filter builder.

@thorwhalen
Member Author

CI is failing on GitHub-side infrastructure — the runs alternate between:

  1. codeload.github.com returning 404 on the astral-sh/setup-uv@v7 tarball download.
  2. github.com returning HTTP 403 Your account is suspended on the runner's git fetch (auth from this side is fine; gh auth status is green and local pushes succeed).

Code + the full 355-test suite pass cleanly in the verification venv (see PR description). Will re-run CI once GitHub stabilizes.

Hybrid (dense + lexical) search across all 15 backends via a uniform
contract — the top-level vd.hybrid_search(collection, ...) dispatches to
the collection's native hybrid_search when isinstance(c, SupportsHybrid),
otherwise fuses dense collection.search() with a pure-Python BM25 lexical
scan via Reciprocal Rank Fusion.

Protocol (vd/base.py)
- SupportsHybrid signature refined: query: Union[str, Vector] mirrors
  Collection.search; query_text= overrides the lexical side; k_dense/
  k_lexical control per-side over-fetch; rrf_k tunes RRF. The portable
  contract is RRF; backend-specific weighted-fusion knobs go through
  **kwargs (documented per adapter).
- AbstractCollection._resolve_hybrid_inputs() normalizes (query, query_text)
  into a vetted (vec, text) tuple.
- AbstractCollection._hybrid_via_rrf() — shared orchestration: dense _query
  + adapter's _lexical_query, fused with RRF. Adapters opt into
  SupportsHybrid in ~10 lines.
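
The input normalization described for _resolve_hybrid_inputs() might look roughly like the following. This is a sketch under assumptions: the standalone function, the `embed` callable, and the choice of ValueError are hypothetical; the source only states that a vector query without query_text is an error.

```python
from typing import Callable, List, Optional, Sequence, Tuple, Union

Vector = Sequence[float]


def resolve_hybrid_inputs(
    query: Union[str, Vector],
    query_text: Optional[str] = None,
    *,
    embed: Callable[[str], Vector],
) -> Tuple[List[float], str]:
    """Normalize (query, query_text) into a (vector, text) pair."""
    if isinstance(query, str):
        # A text query feeds both sides; query_text, if given, overrides the lexical side.
        text = query_text if query_text is not None else query
        return list(embed(query)), text
    if query_text is None:
        # A bare vector carries no words for the lexical side to match on.
        raise ValueError("a vector query requires query_text= for the lexical side")
    return list(query), query_text


def toy_embed(text: str) -> Vector:
    """Toy embedding used only for this example."""
    return [float(len(text))]
```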

Top-level helpers (vd/search.py)
- hybrid_search(collection, ...) — single user-facing entry, with
  fallback dispatch.
- bm25_lexical_search(collection, query_text, ...) — pure-Python Okapi
  BM25 over collection.values(); O(N), no new deps. Pluggable via
  lexical_search= callable in hybrid_search.
- _rrf_fuse() — internal, used by both the fallback and adapter paths
  so RRF semantics are identical.
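
Reciprocal Rank Fusion itself is small enough to show in full. The sketch below is not the PR's _rrf_fuse(); the function name, the tie-break on id, and the default of 60 (the value commonly used for RRF) are assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, rrf_k=60, limit=10):
    """Fuse ranked id lists: each list adds 1 / (rrf_k + rank) to an id's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (rrf_k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:limit]


dense = ["a", "b", "c"]   # ids ranked by vector similarity
lexical = ["b", "d"]      # ids ranked by lexical score
fused = rrf_fuse([dense, lexical])
fused_ids = [doc_id for doc_id, _ in fused]
```

Only ranks enter the formula, never raw scores, which is why the same fusion can be shared by the fallback and adapter paths regardless of how each backend scales its similarity or BM25 values.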

Native adapters (3)
- weaviate, elasticsearch, redis — each defines _lexical_query (BM25 via
  collection.query.bm25 / ES match / RediSearch FT.SEARCH) and a
  hybrid_search method that calls _hybrid_via_rrf. These were chosen
  because their adapters already provision text-side indexing at
  create_collection time.
- 6 other "native hybrid" backends (milvus, mongodb, pinecone,
  turbopuffer, qdrant, lancedb) fall back to the client-side BM25 path
  in this PR; each needs its own infra setup (sparse-vec models,
  Atlas Search index, BM25 function fields, FTS index) and is tracked
  for a follow-up.
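
The adapter opt-in pattern (define _lexical_query, then have hybrid_search delegate to a shared orchestrator) can be illustrated with a toy in-memory adapter. Everything here is hypothetical scaffolding: the class names, the method signatures, and the term-overlap "lexical" scoring stand in for vd's real base class and each backend's BM25.

```python
class AbstractCollectionSketch:
    """Hypothetical base class holding the shared dense + lexical + RRF orchestration."""

    def _query(self, vector, limit):
        raise NotImplementedError

    def _lexical_query(self, text, limit):
        raise NotImplementedError

    def _hybrid_via_rrf(self, vector, text, *, limit=10, k_dense=20, k_lexical=20, rrf_k=60):
        dense = self._query(vector, k_dense)            # over-fetch on the dense side
        lexical = self._lexical_query(text, k_lexical)  # over-fetch on the lexical side
        scores = {}
        for ranking in (dense, lexical):
            for rank, doc_id in enumerate(ranking, start=1):
                scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rrf_k + rank)
        return sorted(scores, key=lambda d: (-scores[d], d))[:limit]


class ToyAdapter(AbstractCollectionSketch):
    """Toy adapter over a dict of id -> (vector, text)."""

    def __init__(self, docs):
        self.docs = docs

    def _query(self, vector, limit):
        # Dense side: rank by squared Euclidean distance to the query vector.
        def distance(doc_id):
            stored = self.docs[doc_id][0]
            return sum((a - b) ** 2 for a, b in zip(stored, vector))

        return sorted(self.docs, key=distance)[:limit]

    def _lexical_query(self, text, limit):
        # Lexical side: count matching query terms (a stand-in for backend BM25).
        terms = text.lower().split()
        hits = {d: sum(t in self.docs[d][1].lower().split() for t in terms) for d in self.docs}
        matched = [d for d in sorted(hits, key=lambda d: (-hits[d], d)) if hits[d] > 0]
        return matched[:limit]

    def hybrid_search(self, vector, text, *, limit=10, **kwargs):
        # The opt-in: one method that delegates to the shared orchestrator.
        return self._hybrid_via_rrf(vector, text, limit=limit, **kwargs)


adapter = ToyAdapter({
    "both": ([1.0, 0.0], "red apple"),
    "dense_only": ([0.9, 0.1], "green pear"),
    "lex_only": ([0.0, 1.0], "red apple pie"),
})
result = adapter.hybrid_search([1.0, 0.0], "red apple")
```

The document that scores well on both sides comes out first, which is the fused-ordering property the hybrid tests check.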

Tests
- tests/test_hybrid.py — parametrized over every reachable backend.
  Verifies the contract shape, fused ordering (a doc with both signals
  ranks above single-signal docs), filter pass-through, error on
  vector query without query_text, the native-vs-fallback split via
  isinstance(c, SupportsHybrid), and the custom-lexical_search hook.
- 71 hybrid tests pass across 13 backends (memory, chroma, faiss,
  duckdb, lancedb, qdrant, pgvector, mongodb on fallback; weaviate,
  elasticsearch, redis on native). Full suite: 355 passed, no regressions.

Refs #11, closes #15
thorwhalen force-pushed the claude/hybrid-rrf branch from 3297499 to eedcf63 (May 27, 2026 07:41)
thorwhalen merged commit 254920e into master (May 27, 2026)
12 checks passed
thorwhalen deleted the claude/hybrid-rrf branch (May 27, 2026 07:44)
thorwhalen added a commit that referenced this pull request (May 27, 2026):
Phase 1 of #18 — the fourth deferred item from #11. Adds async/await
support across every vd backend through a universal wrapper, with
opt-in native implementations to follow in per-backend PRs.

Architecture (same pattern as PR #16's SupportsHybrid + RRF fallback):
the universal AsyncClientWrapper/AsyncCollectionWrapper adapts every sync
vd Collection/Client by dispatching every method to asyncio.to_thread.
Adapters with native async SDKs override the wrapper in Phase 2.

vd/base.py
- AsyncClient / AsyncCollection runtime-checkable Protocols mirror the
  Client/Collection contracts. Stdlib has no AsyncMutableMapping; the
  AsyncCollection exposes explicit get/set/delete/keys/count (Motor /
  aiopg convention) instead of dunders.
- SupportsNativeAsync marker — present on both the wrapper (with
  native_async=False) and native adapters (with native_async=True), so
  consumers can prefer truly non-blocking adapters in high-concurrency
  HTTP servers while still treating the wrapper as a valid AsyncClient.
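
One way a consumer could use such a marker is sketched below. The protocol name, the toy classes, and `pick_client` are hypothetical; only the `native_async` attribute and its True/False meaning come from the description above.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsNativeAsyncSketch(Protocol):
    """Hypothetical marker protocol: any object carrying a `native_async` attribute."""

    native_async: bool


class ThreadWrappedClient:
    native_async = False  # async API backed by worker threads


class NativeClient:
    native_async = True   # async API backed by a truly non-blocking SDK


class SyncOnlyClient:
    pass                  # no marker at all


def pick_client(candidates):
    """Prefer a natively async client, else any marked client, else None."""
    marked = [c for c in candidates if isinstance(c, SupportsNativeAsyncSketch)]
    native = [c for c in marked if c.native_async]
    return (native or marked or [None])[0]
```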

vd/asynchronous.py (new)
- AsyncClientWrapper / AsyncCollectionWrapper — thin wrappers over any
  sync Client/Collection; every method dispatches through asyncio.to_thread.
  AsyncCollection.keys() / search() materialize once in a worker thread
  then stream from memory (most backends' sync iter is already O(N)).
  Both wrappers expose `.sync` as a documented escape hatch.
- connect_async(backend, **kwargs) — async sibling of vd.connect(),
  returns an awaitable. Phase 2 will plug per-backend native adapters
  in here; for Phase 1 every backend uses the universal wrapper.
- hybrid_search_async(collection, query, ...) — async sibling of
  vd.hybrid_search; dispatches the whole fused call to a worker thread.
- Module name is `asynchronous` to avoid the `async` keyword.
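
The universal-wrapper idea can be sketched in a few lines with asyncio.to_thread (Python 3.9+). This is an illustration only: `SyncCollection` and `AsyncCollectionSketch` are toy classes, not vd's wrappers, and only a handful of methods are shown.

```python
import asyncio


class SyncCollection:
    """Toy blocking collection standing in for any sync backend."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]

    def keys(self):
        return iter(self._data)


class AsyncCollectionSketch:
    """Async facade that runs every blocking call in a worker thread."""

    native_async = False  # thread-backed, not a native async SDK

    def __init__(self, sync_collection):
        self.sync = sync_collection  # escape hatch back to the sync object

    async def set(self, key, value):
        await asyncio.to_thread(self.sync.set, key, value)

    async def get(self, key):
        return await asyncio.to_thread(self.sync.get, key)

    async def keys(self):
        # Materialize once in a worker thread, then stream from memory.
        materialized = await asyncio.to_thread(lambda: list(self.sync.keys()))
        for key in materialized:
            yield key


async def demo():
    collection = AsyncCollectionSketch(SyncCollection())
    await collection.set("a", 1)
    await collection.set("b", 2)
    keys = [key async for key in collection.keys()]
    return await collection.get("a"), keys
```

Running `asyncio.run(demo())` exercises the set, get, and keys paths. The event loop stays responsive while a call blocks, but each call still occupies a thread, which is the gap native adapters are meant to close.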

Exports
- vd.AsyncClient, vd.AsyncCollection, vd.SupportsNativeAsync.
- vd.AsyncClientWrapper, vd.AsyncCollectionWrapper.
- vd.connect_async, vd.hybrid_search_async.

Tests
- tests/test_async.py — 12 contract tests on the memory backend (no infra
  needed). Cover the entry point + protocol satisfaction, async context
  manager, create/get/delete/list_collections, CRUD on AsyncCollection,
  search with filter + egress, upsert/add_documents, escape hatches,
  hybrid_search_async happy + error paths.
- pyproject.toml — adds pytest-asyncio>=0.23 to dev deps; configures
  `asyncio_mode = "auto"` so `async def test_*` runs without per-test
  decorators.

Verification
- 12 async tests pass.
- Full suite: 262 passed, 189 skipped (server backends require docker).
  No regressions vs the prior 250 sync tests — 12 net new.

Phase 2 (one follow-up issue per backend, to be opened): native async
adapters for chroma, qdrant, weaviate, elasticsearch, redis, mongodb,
lancedb, milvus, pinecone, turbopuffer (the 10 backends with native
async SDKs). Each will set `native_async = True` and bypass to_thread.

Refs #11, closes #18