Create cache in sdk on behalf of semantic tool filtering policy. #822
base: main
Conversation
- Introduced the semantic-tool-filtering policy to dynamically filter tools based on semantic relevance to user queries.
- Updated policy manifests and configuration files to include the new policy.
- Implemented an embedding cache to store and manage embeddings for tools, enhancing performance and efficiency.
- Added documentation for the new policy and its configuration parameters.
Add semantic tool filtering policy and embedding cache implementation
Remove semantic tool filtering policy and update embedding cache impl…
Remove semantic tool filtering documentation
NaveenSandaruwan seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have already signed the CLA but the status is still pending? Let us recheck it.
Walkthrough

Adds four embedding-provider config keys to the gateway configuration and introduces a new thread-safe, per-API singleton embedding cache (SHA-256 keyed) with operations to get, add, remove, and clear entries and report stats, all guarded by an RWMutex.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Cache as EmbeddingCacheStore
    participant Storage as APIEmbeddingCache
    Client->>Cache: GetEntryByDescription(apiId, description)
    Cache->>Cache: HashDescription(description)
    Cache->>Storage: lookup by hashKey (read lock)
    Storage-->>Cache: entry or nil
    Cache-->>Client: return entry or nil
    Client->>Cache: AddEntryByDescription(apiId, description, name, embedding)
    Cache->>Cache: HashDescription(description)
    Cache->>Cache: ensure API cache exists (write lock)
    Cache->>Storage: remove existing by name, insert entry
    Cache-->>Client: ack
```
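The diagram above shows both lookup paths going through `HashDescription` to derive the SHA-256 cache key. A minimal sketch of such a key function; the name comes from the diagram, but this particular implementation (hex-encoded SHA-256 of the raw description) is an assumption, not the PR's code:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// HashDescription derives a stable cache key from a tool description.
// Equal descriptions always map to the same fixed-length key, so the
// cache never has to compare or store long description strings as keys.
func HashDescription(description string) string {
	sum := sha256.Sum256([]byte(description))
	return hex.EncodeToString(sum[:])
}

func main() {
	k1 := HashDescription("search the product catalog")
	k2 := HashDescription("search the product catalog")
	fmt.Println(k1 == k2, len(k1)) // identical inputs yield identical 64-char keys
}
```

Because the key is a digest, two tools with byte-identical descriptions will collide on the same entry, which is why the store also deduplicates by tool name on insert.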
Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@sdk/gateway/policy/v1alpha/embedding_cache.go`:
- Around line 56-66: GetAPICache currently returns a direct reference to the
internal APIEmbeddingCache map which can be mutated concurrently by AddEntry or
RemoveAPI after the RLock is released; change EmbeddingCacheStore.GetAPICache to
build and return a defensive shallow copy of the APIEmbeddingCache (iterate over
ecs.cache[apiId] while holding the RLock and copy key/value pairs into a new
map) so callers can safely iterate/use the returned map without needing external
synchronization; alternatively, if you prefer iteration-with-lock semantics, add
a new method that accepts a callback to be executed under the lock, but do not
return the internal map directly from GetAPICache.
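The fix described above can be sketched as follows. This is a minimal illustration assuming field names like `mu` and `cache`; the actual struct layout in `embedding_cache.go` may differ:

```go
package main

import (
	"fmt"
	"sync"
)

// EmbeddingEntry and EmbeddingCacheStore mirror the types discussed in
// the review; field names here are illustrative assumptions.
type EmbeddingEntry struct {
	Name      string
	Embedding []float32
}

type EmbeddingCacheStore struct {
	mu    sync.RWMutex
	cache map[string]map[string]*EmbeddingEntry // apiId -> hashKey -> entry
}

// GetAPICache returns a defensive shallow copy of the per-API map, built
// while holding the read lock, so callers can iterate it after the lock
// is released without racing concurrent AddEntry/RemoveAPI calls.
func (ecs *EmbeddingCacheStore) GetAPICache(apiID string) map[string]*EmbeddingEntry {
	ecs.mu.RLock()
	defer ecs.mu.RUnlock()
	src, ok := ecs.cache[apiID]
	if !ok {
		return nil
	}
	out := make(map[string]*EmbeddingEntry, len(src))
	for k, v := range src {
		out[k] = v
	}
	return out
}

func main() {
	store := &EmbeddingCacheStore{cache: map[string]map[string]*EmbeddingEntry{
		"api-1": {"h1": {Name: "searchTool"}},
	}}
	copyMap := store.GetAPICache("api-1")
	delete(store.cache["api-1"], "h1") // mutating the store...
	fmt.Println(len(copyMap))          // ...does not shrink the returned copy
}
```

Note this copy is shallow: the map structure is safe to iterate, but the `*EmbeddingEntry` values are still shared pointers, which ties into the separate nitpick about mutable returned entries below in the review.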
🧹 Nitpick comments (4)
gateway/configs/config.yaml (1)
413-417: Consider adding inline documentation for the new embedding provider configuration.

The placement at root level is correct per the gateway-level infrastructure configuration pattern. However, unlike other configuration sections in this file, these new fields lack inline comments explaining their purpose and valid values.
Based on learnings, this configuration is shared across multiple policies (e.g., semantic-tool-filtering).
📝 Suggested documentation

```diff
+# Embedding provider configuration
+# Used by policies that require semantic/embedding capabilities (e.g., semantic-tool-filtering)
+# Supported providers: "OPENAI", "AZURE_OPENAI", "MISTRAL"
 embedding_provider: "MISTRAL"
+# API key for the embedding provider
 embedding_provider_api_key: "<YOUR_MISTRAL_API_KEY>"
+# Embedding model name (not required for AZURE_OPENAI which uses deployment name in endpoint)
 embedding_provider_model: "mistral-embed"
+# Embedding provider API endpoint
 embedding_provider_endpoint: "https://api.mistral.ai/v1/embeddings"
```

sdk/gateway/policy/v1alpha/embedding_cache.go (3)
79-98: Minor: returned `*EmbeddingEntry` can be mutated by callers.

`GetEntry` returns a pointer to the cached `EmbeddingEntry`. Callers could mutate `Name` or the `Embedding` slice contents without synchronization. If callers are trusted to treat this as read-only, this is acceptable. Otherwise, consider returning a copy.
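If callers cannot be trusted to treat the entry as read-only, the lookup path could hand back a deep copy instead. A sketch of such a helper, with type and field names assumed from the review discussion:

```go
package main

import "fmt"

// EmbeddingEntry mirrors the cached type discussed in the review.
type EmbeddingEntry struct {
	Name      string
	Embedding []float32
}

// copyEntry returns a deep copy: the embedding slice is duplicated so
// callers can mutate the result without corrupting the cached entry.
func copyEntry(e *EmbeddingEntry) *EmbeddingEntry {
	if e == nil {
		return nil
	}
	emb := make([]float32, len(e.Embedding))
	copy(emb, e.Embedding)
	return &EmbeddingEntry{Name: e.Name, Embedding: emb}
}

func main() {
	cached := &EmbeddingEntry{Name: "tool", Embedding: []float32{0.1, 0.2}}
	c := copyEntry(cached)
	c.Embedding[0] = 9               // mutating the copy...
	fmt.Println(cached.Embedding[0]) // ...leaves the cached slice untouched
}
```

The trade-off is one slice allocation per lookup; for embedding vectors of a few hundred floats that is usually cheap relative to the embedding-provider call the cache avoids.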
114-120: Performance: O(n) scan per `AddEntry` for name deduplication.

The loop scans all entries to find and remove an existing entry with the same `Name`. For APIs with many tools, this becomes expensive. Consider maintaining a secondary index (name → hashKey) if this becomes a bottleneck.

♻️ Optional: Add secondary index for O(1) name lookup

```go
// In EmbeddingCacheStore, add:
type APIEmbeddingCache struct {
	entries   map[string]*EmbeddingEntry // hashKey → entry
	nameIndex map[string]string          // name → hashKey
}
```

This would allow O(1) lookup and removal by name, but adds complexity. Only pursue if profiling shows this as a bottleneck.
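To make the secondary-index suggestion concrete, here is a sketch of how insertion would use it to evict a same-named entry in O(1); the `Add` method and its behavior are illustrative assumptions, not the PR's implementation:

```go
package main

import "fmt"

type EmbeddingEntry struct {
	Name      string
	Embedding []float32
}

// APIEmbeddingCache keeps a name index alongside the primary hash-keyed
// map, trading a little bookkeeping for O(1) dedup by tool name.
type APIEmbeddingCache struct {
	entries   map[string]*EmbeddingEntry // hashKey -> entry
	nameIndex map[string]string          // name -> hashKey
}

// Add inserts an entry, evicting any previous entry with the same name
// via the index instead of scanning every entry.
func (c *APIEmbeddingCache) Add(hashKey string, e *EmbeddingEntry) {
	if oldKey, ok := c.nameIndex[e.Name]; ok {
		delete(c.entries, oldKey)
	}
	c.entries[hashKey] = e
	c.nameIndex[e.Name] = hashKey
}

func main() {
	c := &APIEmbeddingCache{
		entries:   map[string]*EmbeddingEntry{},
		nameIndex: map[string]string{},
	}
	c.Add("h1", &EmbeddingEntry{Name: "searchTool"})
	c.Add("h2", &EmbeddingEntry{Name: "searchTool"}) // same name, new description hash
	fmt.Println(len(c.entries))                      // old entry evicted, one remains
}
```

The extra map must be updated on every remove and clear as well, which is the complexity cost the review mentions.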
19-23: Consider adding cache size limits or an eviction policy.

The cache can grow unbounded as more APIs and tool embeddings are added. Each embedding vector (typically 384-1536 float32s) consumes 1.5-6 KB. With many APIs, memory usage could become significant.

Consider:
- A maximum number of entries per API or in total
- LRU eviction for least-recently-used entries
- Metrics/alerts on cache size via `GetCacheStats`

This may be acceptable if APIs are removed via `RemoveAPI` when undeployed, but worth monitoring in production.
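As one possible mitigation from the list above, a hard per-API cap with simple FIFO eviction looks like this. The cap value and the eviction policy are illustrative assumptions; a production version would more likely use LRU:

```go
package main

import "fmt"

// cappedCache sketches a bounded embedding cache: once max entries
// exist, the oldest insertion is evicted before a new key is added.
type cappedCache struct {
	max     int
	entries map[string][]float32
	order   []string // insertion order, used for FIFO eviction
}

func (c *cappedCache) Add(key string, emb []float32) {
	_, exists := c.entries[key]
	if !exists && len(c.order) >= c.max {
		oldest := c.order[0]
		c.order = c.order[1:]
		delete(c.entries, oldest)
	}
	if !exists {
		c.order = append(c.order, key)
	}
	c.entries[key] = emb
}

func main() {
	c := &cappedCache{max: 2, entries: map[string][]float32{}}
	c.Add("a", []float32{1})
	c.Add("b", []float32{2})
	c.Add("c", []float32{3}) // cap reached: evicts "a"
	_, aLive := c.entries["a"]
	fmt.Println(len(c.entries), aLive)
}
```

FIFO keeps the bookkeeping to a single slice; upgrading to LRU would mean moving a key to the back of `order` on every cache hit as well.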
returning internal map reference.

GetAPICache returns a direct reference to the internal APIEmbeddingCache map. After the RLock is released, callers can access this map without synchronization while concurrent AddEntry or RemoveAPI calls modify it, causing a data race.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
```yaml
max_export_batch_size: 512
sampling_rate: 1.0

embedding_provider: "MISTRAL"
```
Let's remove these. They are not needed by default
```diff
@@ -0,0 +1,166 @@
+package policyv1alpha
```
Add missing license header
```go
// Singleton instance for EmbeddingCacheStore
var (
	embeddingCacheInstance *EmbeddingCacheStore
```
If this is only required for a single guardrail policy, we can move this to the policy implementation. So this will be a package level cache in the policy package.
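Wherever the singleton ends up living (SDK or policy package), the idiomatic Go pattern for a lazily initialized, thread-safe instance is `sync.Once`. A sketch, with accessor and field names assumed; the PR's actual initialization code may differ:

```go
package main

import (
	"fmt"
	"sync"
)

// EmbeddingCacheStore is a minimal stand-in for the type in the PR.
type EmbeddingCacheStore struct {
	mu    sync.RWMutex
	cache map[string]map[string][]float32 // apiId -> hashKey -> embedding
}

var (
	embeddingCacheInstance *EmbeddingCacheStore
	embeddingCacheOnce     sync.Once
)

// GetEmbeddingCacheStore initializes the singleton exactly once, even
// when first accessed from multiple goroutines concurrently.
func GetEmbeddingCacheStore() *EmbeddingCacheStore {
	embeddingCacheOnce.Do(func() {
		embeddingCacheInstance = &EmbeddingCacheStore{
			cache: make(map[string]map[string][]float32),
		}
	})
	return embeddingCacheInstance
}

func main() {
	a := GetEmbeddingCacheStore()
	b := GetEmbeddingCacheStore()
	fmt.Println(a == b) // both calls return the same instance
}
```

Moving this into the policy package, as the comment suggests, would keep the same pattern but make the variables unexported package-level state of that policy rather than of the SDK.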
Purpose
Goals
Approach
Summary by CodeRabbit
New Features
Chores