
Conversation


@NaveenSandaruwan NaveenSandaruwan commented Jan 27, 2026

Purpose

This caching mechanism stores embeddings of tool descriptions for the semantic tool filtering policy.

Goals

This improves performance and reduces embedding model usage, which in turn reduces latency.

Approach

A double hash map is used. The outer map is keyed by APIID and holds one hash map per API; the inner map is keyed by a hash of the tool description and returns the cached embedding for that tool.
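The double-hash-map layout described above can be sketched as follows. This is a minimal illustration, not the actual SDK code: the type names mirror the file summary (EmbeddingEntry, EmbeddingCacheStore), but the method names and fields here are assumptions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// EmbeddingEntry holds one tool's cached embedding (illustrative).
type EmbeddingEntry struct {
	Name      string
	Embedding []float32
}

// EmbeddingCacheStore: outer map keyed by API ID; each inner map is
// keyed by the SHA-256 hash of the tool description.
type EmbeddingCacheStore struct {
	mu    sync.RWMutex
	cache map[string]map[string]*EmbeddingEntry
}

// hashDescription derives the inner-map key from a tool description.
func hashDescription(desc string) string {
	sum := sha256.Sum256([]byte(desc))
	return hex.EncodeToString(sum[:])
}

// Add inserts or replaces the embedding for (apiID, description).
func (s *EmbeddingCacheStore) Add(apiID, desc, name string, emb []float32) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.cache[apiID] == nil {
		s.cache[apiID] = make(map[string]*EmbeddingEntry)
	}
	s.cache[apiID][hashDescription(desc)] = &EmbeddingEntry{Name: name, Embedding: emb}
}

// Get looks up a cached embedding by API ID and tool description.
func (s *EmbeddingCacheStore) Get(apiID, desc string) (*EmbeddingEntry, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	e, ok := s.cache[apiID][hashDescription(desc)]
	return e, ok
}

func main() {
	store := &EmbeddingCacheStore{cache: make(map[string]map[string]*EmbeddingEntry)}
	store.Add("api-1", "lists open orders", "listOrders", []float32{0.1, 0.2})
	if e, ok := store.Get("api-1", "lists open orders"); ok {
		fmt.Println(e.Name) // prints listOrders
	}
}
```

A cache hit avoids re-calling the embedding provider for a tool description that has not changed, which is where the latency saving comes from.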

Summary by CodeRabbit

  • New Features

    • Added configuration for embedding provider (provider, API key, model, and endpoint) to support external embedding services.
  • Chores

    • Added a global, thread-safe embedding cache to improve performance and concurrent embedding lookups across APIs.


NaveenSandaruwan and others added 6 commits January 27, 2026 12:39
- Introduced the semantic-tool-filtering policy to dynamically filter tools based on semantic relevance to user queries.
- Updated policy manifests and configuration files to include the new policy.
- Implemented an embedding cache to store and manage embeddings for tools, enhancing performance and efficiency.
- Added documentation for the new policy and its configuration parameters.
Add semantic tool filtering policy and embedding cache implementation
Remove semantic tool filtering policy and update embedding cache impl…
Remove semantic tool filtering documentation
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


NaveenSandaruwan does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@coderabbitai
Contributor

coderabbitai bot commented Jan 27, 2026

Walkthrough

Adds four embedding-provider config keys to gateway configuration and introduces a new thread-safe, per-API singleton embedding cache (SHA-256 keyed) with operations to get, add, remove, clear entries and report stats, all guarded by RWMutex.

Changes

Embedding Provider Configuration — gateway/configs/config.yaml
Added top-level keys: embedding_provider, embedding_provider_api_key, embedding_provider_model, embedding_provider_endpoint.

Embedding Cache Implementation — sdk/gateway/policy/v1alpha/embedding_cache.go
New file: defines EmbeddingEntry, APIEmbeddingCache, EmbeddingCacheStore (singleton). Implements SHA-256 HashDescription, per-API caches, and thread-safe accessors/mutators (HasAPI, GetAPICache, AddAPICache, GetEntry, GetEntryByDescription, AddEntry, AddEntryByDescription, RemoveAPI, ClearAll, GetCacheStats). All operations use RWMutex.
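The walkthrough mentions EmbeddingCacheStore is a singleton, and a later inline comment shows an `embeddingCacheInstance` package variable. A hedged sketch of the usual Go pattern for such a process-wide instance (the accessor name and use of sync.Once are assumptions, not confirmed by the diff):

```go
package main

import (
	"fmt"
	"sync"
)

// EmbeddingCacheStore is the process-wide embedding cache (illustrative shape).
type EmbeddingCacheStore struct {
	mu    sync.RWMutex
	cache map[string]map[string][]float32
}

var (
	embeddingCacheInstance *EmbeddingCacheStore
	once                   sync.Once
)

// GetEmbeddingCacheStore lazily creates the singleton exactly once,
// even under concurrent first calls.
func GetEmbeddingCacheStore() *EmbeddingCacheStore {
	once.Do(func() {
		embeddingCacheInstance = &EmbeddingCacheStore{
			cache: make(map[string]map[string][]float32),
		}
	})
	return embeddingCacheInstance
}

func main() {
	a := GetEmbeddingCacheStore()
	b := GetEmbeddingCacheStore()
	fmt.Println(a == b) // prints true
}
```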

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Cache as EmbeddingCacheStore
  participant Storage as APIEmbeddingCache

  Client->>Cache: GetEntryByDescription(apiId, description)
  Cache->>Cache: HashDescription(description)
  Cache->>Storage: lookup by hashKey (read lock)
  Storage-->>Cache: entry or nil
  Cache-->>Client: return entry or nil

  Client->>Cache: AddEntryByDescription(apiId, description, name, embedding)
  Cache->>Cache: HashDescription(description)
  Cache->>Cache: ensure API cache exists (write lock)
  Cache->>Storage: remove existing by name, insert entry
  Cache-->>Client: ack

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibble hashes in the night,
Storing vectors snug and tight,
Mutex hugs keep races at bay,
Per-API burrows where they stay,
A hop, a stash, embeddings light.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Description check — ⚠️ Warning: The PR description includes Purpose, Goals, and Approach sections as per the template, but is missing several required sections such as Automation tests, Security checks, Documentation, and Test environment. Resolution: add the missing template sections — Documentation, Automation tests (with coverage details), Security checks (with yes/no confirmations), and Test environment details.

✅ Passed checks (2 passed)
  • Title check — ✅ Passed: The title clearly summarizes the main change: implementing a cache for embeddings in the SDK for the semantic tool filtering policy.
  • Docstring Coverage — ✅ Passed: Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@sdk/gateway/policy/v1alpha/embedding_cache.go`:
- Around line 56-66: GetAPICache currently returns a direct reference to the
internal APIEmbeddingCache map which can be mutated concurrently by AddEntry or
RemoveAPI after the RLock is released; change EmbeddingCacheStore.GetAPICache to
build and return a defensive shallow copy of the APIEmbeddingCache (iterate over
ecs.cache[apiId] while holding the RLock and copy key/value pairs into a new
map) so callers can safely iterate/use the returned map without needing external
synchronization; alternatively, if you prefer iteration-with-lock semantics, add
a new method that accepts a callback to be executed under the lock, but do not
return the internal map directly from GetAPICache.
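The defensive-copy fix suggested above could look like the following sketch. The surrounding types are simplified, and the receiver/field names are assumptions based on the review comment, not the actual file:

```go
package main

import (
	"fmt"
	"sync"
)

type EmbeddingEntry struct {
	Name      string
	Embedding []float32
}

type EmbeddingCacheStore struct {
	mu    sync.RWMutex
	cache map[string]map[string]*EmbeddingEntry
}

// GetAPICache returns a defensive shallow copy of the per-API map, built
// while the read lock is held, so callers can iterate the result safely
// even if AddEntry or RemoveAPI runs concurrently afterwards.
func (s *EmbeddingCacheStore) GetAPICache(apiID string) map[string]*EmbeddingEntry {
	s.mu.RLock()
	defer s.mu.RUnlock()
	inner, ok := s.cache[apiID]
	if !ok {
		return nil
	}
	out := make(map[string]*EmbeddingEntry, len(inner))
	for k, v := range inner {
		out[k] = v
	}
	return out
}

func main() {
	s := &EmbeddingCacheStore{cache: map[string]map[string]*EmbeddingEntry{
		"api-1": {"h1": {Name: "listOrders"}},
	}}
	cp := s.GetAPICache("api-1")
	cp["h2"] = &EmbeddingEntry{Name: "added-to-copy"} // does not touch the store
	fmt.Println(len(s.cache["api-1"]), len(cp))       // prints 1 2
}
```

Note this is a shallow copy: the map structure is isolated, but the *EmbeddingEntry pointers are still shared, which is the separate concern raised in the nitpick about mutable returned entries.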
🧹 Nitpick comments (4)
gateway/configs/config.yaml (1)

413-417: Consider adding inline documentation for the new embedding provider configuration.

The placement at root level is correct per the gateway-level infrastructure configuration pattern. However, unlike other configuration sections in this file, these new fields lack inline comments explaining their purpose and valid values.

Based on learnings, this configuration is shared across multiple policies (e.g., semantic-tool-filtering).

📝 Suggested documentation
+# Embedding provider configuration
+# Used by policies that require semantic/embedding capabilities (e.g., semantic-tool-filtering)
+# Supported providers: "OPENAI", "AZURE_OPENAI", "MISTRAL"
 embedding_provider: "MISTRAL"
+# API key for the embedding provider
 embedding_provider_api_key: "<YOUR_MISTRAL_API_KEY>"
+# Embedding model name (not required for AZURE_OPENAI which uses deployment name in endpoint)
 embedding_provider_model: "mistral-embed"
+# Embedding provider API endpoint
 embedding_provider_endpoint: "https://api.mistral.ai/v1/embeddings"
sdk/gateway/policy/v1alpha/embedding_cache.go (3)

79-98: Minor: returned *EmbeddingEntry can be mutated by callers.

GetEntry returns a pointer to the cached EmbeddingEntry. Callers could mutate Name or the Embedding slice contents without synchronization. If callers are trusted to treat this as read-only, this is acceptable. Otherwise, consider returning a copy.
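If callers cannot be trusted to treat the result as read-only, a value-copy accessor is one option. A sketch under assumed types (the method name GetEntryCopy is hypothetical; note the embedding slice must be cloned too, since copying the struct alone would still share the backing array):

```go
package main

import (
	"fmt"
	"sync"
)

type EmbeddingEntry struct {
	Name      string
	Embedding []float32
}

type EmbeddingCacheStore struct {
	mu    sync.RWMutex
	cache map[string]map[string]*EmbeddingEntry
}

// GetEntryCopy returns an independent copy of the cached entry, so
// caller-side mutations cannot corrupt the cache.
func (s *EmbeddingCacheStore) GetEntryCopy(apiID, hashKey string) (EmbeddingEntry, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	e, ok := s.cache[apiID][hashKey]
	if !ok {
		return EmbeddingEntry{}, false
	}
	return EmbeddingEntry{
		Name:      e.Name,
		Embedding: append([]float32(nil), e.Embedding...), // clone the slice
	}, true
}

func main() {
	s := &EmbeddingCacheStore{cache: map[string]map[string]*EmbeddingEntry{
		"api-1": {"h1": {Name: "listOrders", Embedding: []float32{0.1}}},
	}}
	cp, _ := s.GetEntryCopy("api-1", "h1")
	cp.Embedding[0] = 9.9 // mutate the copy only
	fmt.Println(s.cache["api-1"]["h1"].Embedding[0]) // prints 0.1
}
```

The trade-off is an allocation and copy per lookup, which may matter for large embedding vectors on a hot path.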


114-120: Performance: O(n) scan per AddEntry for name deduplication.

The loop scans all entries to find and remove an existing entry with the same Name. For APIs with many tools, this becomes expensive. Consider maintaining a secondary index (name → hashKey) if this becomes a bottleneck.

♻️ Optional: Add secondary index for O(1) name lookup
// In EmbeddingCacheStore, add:
type APIEmbeddingCache struct {
    entries   map[string]*EmbeddingEntry // hashKey → entry
    nameIndex map[string]string          // name → hashKey
}

This would allow O(1) lookup and removal by name, but adds complexity. Only pursue if profiling shows this as a bottleneck.


19-23: Consider adding cache size limits or eviction policy.

The cache can grow unbounded as more APIs and tool embeddings are added. Each embedding vector (typically 384-1536 float32s) consumes 1.5-6KB. With many APIs, memory usage could become significant.

Consider:

  • Maximum entries per API or total
  • LRU eviction for least-recently-used entries
  • Metrics/alerts on cache size via GetCacheStats

This may be acceptable if APIs are removed via RemoveAPI when undeployed, but worth monitoring in production.
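The simplest of the options above is a hard cap per API checked at insert time. A hedged sketch (the constant, method name, and reject-on-full behavior are all illustrative; a real implementation would more likely evict via LRU than reject):

```go
package main

import (
	"fmt"
	"sync"
)

type EmbeddingEntry struct {
	Name      string
	Embedding []float32
}

type EmbeddingCacheStore struct {
	mu    sync.RWMutex
	cache map[string]map[string]*EmbeddingEntry
}

// Tiny limit for illustration; production values would be far larger.
const maxEntriesPerAPI = 2

// AddBounded inserts an entry unless the per-API cache is full.
// Updates to an existing hashKey are always allowed.
func (s *EmbeddingCacheStore) AddBounded(apiID, hashKey string, e *EmbeddingEntry) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	inner := s.cache[apiID]
	if inner == nil {
		inner = make(map[string]*EmbeddingEntry)
		s.cache[apiID] = inner
	}
	if _, exists := inner[hashKey]; !exists && len(inner) >= maxEntriesPerAPI {
		return false // at capacity: reject new keys (an LRU would evict instead)
	}
	inner[hashKey] = e
	return true
}

func main() {
	s := &EmbeddingCacheStore{cache: make(map[string]map[string]*EmbeddingEntry)}
	fmt.Println(s.AddBounded("api-1", "k0", &EmbeddingEntry{})) // prints true
	fmt.Println(s.AddBounded("api-1", "k1", &EmbeddingEntry{})) // prints true
	fmt.Println(s.AddBounded("api-1", "k2", &EmbeddingEntry{})) // prints false
}
```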

Data race: GetAPICache returning internal map reference.

GetAPICache returns a direct reference to the internal APIEmbeddingCache map. After the RLock is released, callers can access this map without synchronization while concurrent AddEntry or RemoveAPI calls modify it, causing a data race.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
max_export_batch_size: 512
sampling_rate: 1.0

embedding_provider: "MISTRAL"
Contributor


Let's remove these. They are not needed by default.

@@ -0,0 +1,166 @@
package policyv1alpha
Contributor


Add missing license header


// Singleton instance for EmbeddingCacheStore
var (
embeddingCacheInstance *EmbeddingCacheStore
Contributor


If this is only required for a single guardrail policy, we can move this to the policy implementation. So this will be a package level cache in the policy package.
