SupportsHybrid + client-side RRF fallback #16
Merged
CI is failing on GitHub-side infrastructure — the runs alternate between:
Code + the full 355-test suite pass cleanly in the verification venv (see PR description). Will re-run CI once GitHub stabilizes.
Hybrid (dense + lexical) search across all 15 backends via a uniform contract: the top-level vd.hybrid_search(collection, ...) dispatches to the collection's native hybrid_search when isinstance(c, SupportsHybrid), otherwise fuses dense collection.search() with a pure-Python BM25 lexical scan via Reciprocal Rank Fusion.

Protocol (vd/base.py)
- SupportsHybrid signature refined: query: Union[str, Vector] mirrors Collection.search; query_text= overrides the lexical side; k_dense/k_lexical control per-side over-fetch; rrf_k tunes RRF. The portable contract is RRF; backend-specific weighted-fusion knobs go through **kwargs (documented per adapter).
- AbstractCollection._resolve_hybrid_inputs() normalizes (query, query_text) into a vetted (vec, text) tuple.
- AbstractCollection._hybrid_via_rrf() is the shared orchestration: dense _query + adapter's _lexical_query, fused with RRF. Adapters opt into SupportsHybrid in ~10 lines.

Top-level helpers (vd/search.py)
- hybrid_search(collection, ...): single user-facing entry, with fallback dispatch.
- bm25_lexical_search(collection, query_text, ...): pure-Python Okapi BM25 over collection.values(); O(N), no new deps. Pluggable via lexical_search= callable in hybrid_search.
- _rrf_fuse(): internal, used by both the fallback and adapter paths so RRF semantics are identical.

Native adapters (3)
- weaviate, elasticsearch, redis: each defines _lexical_query (BM25 via collection.query.bm25 / ES match / RediSearch FT.SEARCH) and a hybrid_search method that calls _hybrid_via_rrf. These were chosen because their adapters already provision text-side indexing at create_collection time.
- 6 other "native hybrid" backends (milvus, mongodb, pinecone, turbopuffer, qdrant, lancedb) fall back to the client-side BM25 path in this PR; each needs its own infra setup (sparse-vec models, Atlas Search index, BM25 function fields, FTS index) and is tracked for a follow-up.

Tests
- tests/test_hybrid.py: parametrized over every reachable backend. Verifies the contract shape, fused ordering (a doc with both signals ranks above single-signal docs), filter pass-through, error on vector query without query_text, the native-vs-fallback split via isinstance(c, SupportsHybrid), and the custom-lexical_search hook.
- 71 hybrid tests pass across 13 backends (memory, chroma, faiss, duckdb, lancedb, qdrant, pgvector, mongodb on fallback; weaviate, elasticsearch, redis on native). Full suite: 355 passed, no regressions.

Refs #11, closes #15
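The fusion step named in the commit message can be illustrated with a minimal sketch of Reciprocal Rank Fusion. This is not the PR's `_rrf_fuse`; the function name `rrf_fuse`, its signature, and the default `rrf_k=60` are assumptions chosen for illustration. Each ranked list contributes `1 / (rrf_k + rank)` per document, with ranks starting at 1.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Optional, Sequence, Tuple


def rrf_fuse(
    rankings: Iterable[Sequence[Hashable]],
    rrf_k: int = 60,  # assumed default; the PR only says rrf_k "tunes RRF"
    limit: Optional[int] = None,
) -> List[Tuple[Hashable, float]]:
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (rrf_k + rank)."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (rrf_k + rank)
    # Highest fused score first; sorted() is stable, so ties keep first-seen order.
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return fused if limit is None else fused[:limit]


dense_ids = ["a", "b", "c"]    # hypothetical dense ranking
lexical_ids = ["b", "d", "a"]  # hypothetical lexical ranking
fused = rrf_fuse([dense_ids, lexical_ids])
fused_ids = [doc_id for doc_id, _ in fused]
```

Documents that appear in both lists ("a" and "b") accumulate two contributions and outrank documents seen by only one side, which matches the fused-ordering property the tests above check.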
3297499 to eedcf63
This was referenced May 27, 2026
thorwhalen added a commit that referenced this pull request May 27, 2026
Phase 1 of #18, the fourth deferred item from #11. Adds async/await support across every vd backend through a universal wrapper, with opt-in native implementations to follow in per-backend PRs.

Architecture (same pattern as PR #16's SupportsHybrid + RRF fallback): the universal AsyncClientWrapper/AsyncCollectionWrapper adapts every sync vd Collection/Client by dispatching every method to asyncio.to_thread. Adapters with native async SDKs override the wrapper in Phase 2.

vd/base.py
- AsyncClient / AsyncCollection runtime-checkable Protocols mirror the Client/Collection contracts. Stdlib has no AsyncMutableMapping; the AsyncCollection exposes explicit get/set/delete/keys/count (Motor / aiopg convention) instead of dunders.
- SupportsNativeAsync marker: present on both the wrapper (with native_async=False) and native adapters (with native_async=True), so consumers can prefer truly non-blocking adapters in high-concurrency HTTP servers while still treating the wrapper as a valid AsyncClient.

vd/asynchronous.py (new)
- AsyncClientWrapper / AsyncCollectionWrapper: thin wrappers over any sync Client/Collection; every method dispatches through asyncio.to_thread. AsyncCollection.keys() / search() materialize once in a worker thread then stream from memory (most backends' sync iter is already O(N)). Both wrappers expose `.sync` as a documented escape hatch.
- connect_async(backend, **kwargs): async sibling of vd.connect(), returns an awaitable. Phase 2 will plug per-backend native adapters in here; for Phase 1 every backend uses the universal wrapper.
- hybrid_search_async(collection, query, ...): async sibling of vd.hybrid_search; dispatches the whole fused call to a worker thread.
- Module name is `asynchronous` to avoid the `async` keyword.

Exports
- vd.AsyncClient, vd.AsyncCollection, vd.SupportsNativeAsync.
- vd.AsyncClientWrapper, vd.AsyncCollectionWrapper.
- vd.connect_async, vd.hybrid_search_async.

Tests
- tests/test_async.py: 12 contract tests on the memory backend (no infra needed). Cover the entry point + protocol satisfaction, async context manager, create/get/delete/list_collections, CRUD on AsyncCollection, search with filter + egress, upsert/add_documents, escape hatches, hybrid_search_async happy + error paths.
- pyproject.toml: adds pytest-asyncio>=0.23 to dev deps; configures `asyncio_mode = "auto"` so `async def test_*` runs without per-test decorators.

Verification
- 12 async tests pass.
- Full suite: 262 passed, 189 skipped (server backends require docker). No regressions vs the prior 250 sync tests; 12 net new.

Phase 2 (one follow-up issue per backend, to be opened): native async adapters for chroma, qdrant, weaviate, elasticsearch, redis, mongodb, lancedb, milvus, pinecone, turbopuffer (the 10 backends with native async SDKs). Each will set `native_async = True` and bypass to_thread.

Refs #11, closes #18
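The universal-wrapper idea described in this commit can be sketched in a few lines. The classes below (`SyncCollection`, `AsyncCollectionSketch`) are hypothetical stand-ins, not vd's `AsyncCollectionWrapper`; they only show the pattern of dispatching each sync method through `asyncio.to_thread` and materializing `keys()` once before streaming.

```python
import asyncio
from typing import AsyncIterator, Dict, List


class SyncCollection:
    """Hypothetical synchronous collection (illustration only, not vd's class)."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self._docs[key] = value

    def get(self, key: str) -> str:
        return self._docs[key]

    def keys(self) -> List[str]:
        return list(self._docs)


class AsyncCollectionSketch:
    """Hypothetical wrapper: every call runs the sync method in a worker thread."""

    native_async = False  # mirrors the marker attribute described in the commit

    def __init__(self, sync: SyncCollection) -> None:
        self.sync = sync  # escape hatch back to the wrapped sync object

    async def set(self, key: str, value: str) -> None:
        await asyncio.to_thread(self.sync.set, key, value)

    async def get(self, key: str) -> str:
        return await asyncio.to_thread(self.sync.get, key)

    async def keys(self) -> AsyncIterator[str]:
        # Materialize once in a worker thread, then stream from memory.
        materialized = await asyncio.to_thread(lambda: list(self.sync.keys()))
        for key in materialized:
            yield key


async def demo():
    col = AsyncCollectionSketch(SyncCollection())
    await col.set("doc1", "hello")
    value = await col.get("doc1")
    keys = [key async for key in col.keys()]
    return value, keys


result = asyncio.run(demo())
```

Because the blocking work runs in the default thread pool, the event loop stays responsive, but throughput is still bounded by the pool size; that is the trade-off the `native_async` marker lets consumers detect.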
SupportsHybrid + client-side RRF fallback
Closes #15. Refs #11 (third deferred item).
What ships
- vd.hybrid_search(collection, ...): single user-facing entry that works on every vd backend. Dispatches to a collection's native hybrid_search when isinstance(c, vd.SupportsHybrid); otherwise fuses dense collection.search() with a pure-Python BM25 lexical scan via Reciprocal Rank Fusion.
- SupportsHybrid protocol refined: portable contract is RRF; per-backend weighted-fusion knobs (alpha, etc.) accepted via **kwargs.
- vd.bm25_lexical_search(collection, query_text, ...): exposed as a standalone helper; pluggable into hybrid_search via lexical_search=.
- Three native adapters (weaviate, elasticsearch, redis), each defining _lexical_query using its backend's BM25 and a hybrid_search method that calls the shared AbstractCollection._hybrid_via_rrf orchestrator. Chosen because these adapters already provision text-side indexing at create_collection time.

Verification
- tests/docker-compose.yml (weaviate/ES/redis on real engines).

Design decisions (confirmed with maintainer in #15)
- alpha and other fusion variants go through **kwargs. Their semantics don't agree across backends, so promising them in the protocol would mislead.
- vd.hybrid_search is the user-facing entry, not a baseline method on Collection, so the runtime SupportsHybrid check is meaningful and the fallback path doesn't masquerade as native.
- Pluggable via lexical_search= so users can drop in rank-bm25, a real FTS, or a backend-specific text search.

Still deferred from #11 (after this PR)
- AsyncClient/AsyncCollection parallel protocols.
- F() filter builder.
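The dispatch rule from "What ships" and the design decisions can be sketched as follows. Everything here is hypothetical: `SupportsHybridSketch`, the toy collections, the `hybrid_search` signature, and `term_overlap_search` (a simple term-overlap scan used in place of BM25) are illustrative assumptions, not vd's API. The sketch shows a runtime-checkable protocol selecting the native path, and a swappable `lexical_search` callable feeding the RRF fallback.

```python
from typing import Callable, Dict, List, Protocol, runtime_checkable


@runtime_checkable
class SupportsHybridSketch(Protocol):
    """Hypothetical stand-in for vd.SupportsHybrid (real signature not shown here)."""

    def hybrid_search(self, query: str, *, limit: int = 10) -> List[str]: ...


class DenseOnly:
    """Toy collection with dense search only; 'dense' order is insertion order."""

    def __init__(self, docs: Dict[str, str]) -> None:
        self.docs = docs

    def search(self, query: str, *, limit: int = 10) -> List[str]:
        return list(self.docs)[:limit]


class NativeHybrid(DenseOnly):
    """Toy collection that also satisfies the protocol."""

    def hybrid_search(self, query: str, *, limit: int = 10) -> List[str]:
        return ["native:" + key for key in list(self.docs)[:limit]]


def term_overlap_search(collection: DenseOnly, query_text: str, limit: int = 10) -> List[str]:
    """Toy lexical scan (term overlap, not BM25) standing in for the fallback."""
    terms = set(query_text.lower().split())
    hits = []
    for key, text in collection.docs.items():
        overlap = len(terms & set(text.lower().split()))
        if overlap:
            hits.append((overlap, key))
    hits.sort(key=lambda pair: pair[0], reverse=True)  # stable for ties
    return [key for _, key in hits][:limit]


def hybrid_search(collection, query: str, *, limit: int = 10, rrf_k: int = 60,
                  lexical_search: Callable[..., List[str]] = term_overlap_search) -> List[str]:
    # Native path: the runtime protocol check decides, so the fallback never poses as native.
    if isinstance(collection, SupportsHybridSketch):
        return collection.hybrid_search(query, limit=limit)
    # Fallback path: fuse the dense and lexical rankings with RRF.
    scores: Dict[str, float] = {}
    for ranking in (collection.search(query, limit=limit),
                    lexical_search(collection, query, limit=limit)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rrf_k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)[:limit]


docs = {"a": "red apple", "b": "green pear", "c": "green apple pie"}
fallback = hybrid_search(DenseOnly(docs), "green apple")
native = hybrid_search(NativeHybrid(docs), "green apple")
custom = hybrid_search(DenseOnly(docs), "green apple",
                       lexical_search=lambda c, q, limit=10: ["b"])
```

Keeping the entry point as a free function rather than a method on every collection is what makes the `isinstance` check informative: a collection only passes it if its adapter actually defines `hybrid_search`.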