Feature Request: Core support for Semantic Caching (Redis HNSW / Vector Search) #570
AnkitNeupane007 started this conversation in Ideas
Replies: 0 comments
With the growing adoption of LLM/RAG applications built on FastAPI, I’ve noticed that traditional exact-key caching becomes less effective for prompt-driven workloads.
For example:
These produce semantically equivalent responses, but result in separate cache entries under exact-match caching.
I’ve been experimenting with a semantic caching layer built on top of Redis vector search (HNSW + cosine similarity), and I think this capability could fit nicely as an optional extension to `fastapi-cache`.
Conceptually, the flow looks like:
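As a rough illustration of a flow of this kind (embed the prompt, find the nearest stored vector, return the cached response only if the similarity clears a threshold), here is a minimal in-memory sketch. It is not the prototype: it assumes a linear scan in place of Redis HNSW search, and `InMemorySemanticCache`, `toy_embed`, and the `threshold` value are hypothetical names and settings chosen for the example.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemorySemanticCache:
    """Hypothetical stand-in for a vector index: linear scan, no TTL, no eviction."""

    def __init__(self, embed: Callable[[str], Vector], threshold: float = 0.9) -> None:
        self._embed = embed          # user-supplied embedding function
        self._threshold = threshold  # minimum similarity that counts as a hit
        self._entries: List[Tuple[Vector, str]] = []

    def get(self, prompt: str) -> Optional[str]:
        # 1. Embed the incoming prompt.
        query = self._embed(prompt)
        # 2. Find the most similar stored vector.
        best_score, best_value = -1.0, None
        for vector, value in self._entries:
            score = cosine_similarity(query, vector)
            if score > best_score:
                best_score, best_value = score, value
        # 3. Return the cached value only if it is similar enough.
        return best_value if best_score >= self._threshold else None

    def set(self, prompt: str, response: str) -> None:
        # On a miss, the caller computes the response and stores it with its vector.
        self._entries.append((self._embed(prompt), response))


def toy_embed(text: str) -> List[float]:
    """Toy letter-count embedding, for demonstration only (not a real model)."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts
```

A real deployment would replace the linear scan with a vector index query and would need to tune the threshold per embedding model, since a value that is too low returns answers to the wrong question.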
I currently have a working prototype using `redis.commands.search` and would be interested in adapting it into a PR if the maintainers think this aligns with the project direction.
A few architectural questions before proceeding:
1. Scope
Would semantic caching be considered within the scope of `fastapi-cache`, or would maintainers prefer it live as a separate extension package?
2. Backend API
The current `Backend` abstraction is key-oriented (`get_with_ttl(key: str)`). Since semantic caching requires vector queries + similarity scores, would a separate interface (e.g. `SemanticBackend`) make more sense to avoid complicating existing backends?
3. Decorator/API Surface
Since embeddings must be generated before lookup, would a dedicated decorator such as `@semantic_cache(...)` be preferable over extending `@cache`?
4. Dependencies
My assumption is that embedding generation should remain entirely user-supplied (callable injection/configuration) so the library itself avoids heavyweight ML dependencies.
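To make questions 2 to 4 more concrete, here is one possible shape for a separate interface and a dedicated decorator with an injected embedding callable. This is only a sketch under stated assumptions: the names `SemanticBackend` and `semantic_cache` come from the questions above, but the method names (`nearest`, `store`), their signatures, and the `threshold`/`ttl` parameters are hypothetical and not part of `fastapi-cache`.

```python
import functools
from typing import Any, Awaitable, Callable, Optional, Protocol, Sequence, Tuple

Vector = Sequence[float]
Embedder = Callable[[str], Vector]  # supplied by the user; no ML dependency here


class SemanticBackend(Protocol):
    """Hypothetical vector-oriented counterpart to a key-oriented backend."""

    async def nearest(self, vector: Vector) -> Optional[Tuple[float, Any]]:
        """Return (similarity, cached_value) for the closest entry, or None."""
        ...

    async def store(self, vector: Vector, value: Any, ttl: Optional[int] = None) -> None:
        """Persist a value alongside its embedding vector."""
        ...


def semantic_cache(
    *,
    embed: Embedder,
    backend: SemanticBackend,
    threshold: float = 0.9,
    ttl: Optional[int] = None,
) -> Callable[[Callable[[str], Awaitable[Any]]], Callable[[str], Awaitable[Any]]]:
    """Hypothetical decorator: caches an async function of a single prompt string."""

    def decorator(func: Callable[[str], Awaitable[Any]]) -> Callable[[str], Awaitable[Any]]:
        @functools.wraps(func)
        async def wrapper(prompt: str) -> Any:
            # The embedding is computed before lookup, using the injected callable.
            vector = embed(prompt)
            hit = await backend.nearest(vector)
            if hit is not None and hit[0] >= threshold:
                return hit[1]  # similar enough: serve the cached value
            # Miss: call the wrapped function and store the result with its vector.
            result = await func(prompt)
            await backend.store(vector, result, ttl)
            return result

        return wrapper

    return decorator
```

Keeping `embed` as a plain callable means the caller decides which model or API produces vectors, and existing key-oriented backends would not need to change.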
Happy to share implementation details or open a draft PR if there’s interest.