Skip to content

Feat/consumer based rl#170

Open
Saadha123 wants to merge 8 commits intowso2:mainfrom
Saadha123:Feat/consumer-based-rl
Open

Feat/consumer based rl#170
Saadha123 wants to merge 8 commits intowso2:mainfrom
Saadha123:Feat/consumer-based-rl

Conversation

@Saadha123
Copy link
Copy Markdown
Contributor

@Saadha123 Saadha123 commented Apr 9, 2026

Purpose

Adds consumer-based rate limiting support to gateway-controllers policies. Currently, token and cost rate limits are shared across all GenAI applications — if App A exhausts the token budget, App B gets blocked too. This change allows each application to have its own independent counter against the same limit.

Goals

  • Allow per-application rate limiting for token usage, cost, and request count
  • Each GenAI application (identified by x-wso2-application-id) gets its own independent counter
  • The limit value is the same for all apps, but counters are tracked separately per app
  • Support running backend-level and consumer-level limits simultaneously in the same request chain

Approach

api-key-auth

After a successful API key authentication, write the application ID into SharedContext.Metadata (in addition to AuthContext.Properties and AnalyticsMetadata where it was already written). This makes it accessible to downstream rate limiting policies via key extraction.

token-based-ratelimit

  • Added a consumerBased boolean parameter to policy-definition.yaml
  • When consumerBased: true, key extraction switches from routename alone to routename + x-wso2-application-id, giving each application its own counter

llm-cost-based-ratelimit

  • Added a consumerBased boolean parameter to policy-definition.yaml
  • Same key extraction switch as token-based-ratelimit
  • Fixed a metadata key collision: when backend and consumer instances are both in the request chain, the consumer instance was overwriting the backend's delegate reference stored under llm_cost_delegate. The backend's response phase then read the consumer's delegate and drained the consumer counter twice while never touching the backend counter. Fixed by using separate keys: llm_cost_delegate for backend and llm_cost_delegate_consumer for consumer.

advanced-ratelimit

Fixed a metadata key collision that occurs when two instances with different key extraction configs run in the same request chain (e.g. a backend request-count limit and a consumer request-count limit):

  • advanced-ratelimit passes state between its request and response phases via hardcoded metadata keys (ratelimit:keys, ratelimit:result, etc.)
  • The second instance blindly overwrites the first's entry. When both instances share the same key extraction config this is harmless (same value). But when a backend instance writes routename and a consumer instance writes routename:app-id, the backend's response phase reads the consumer's key and deducts from the wrong counter — backend budget is never exhausted
  • Fixed by adding an instanceID field to RateLimitPolicy — a short SHA-256 hash of the quota names and key extraction config. All metadata reads/writes go through p.metaKey(base) which appends the instance ID, namespacing each instance's metadata slots. Limiter objects and the limiter cache are unaffected.

Release note

Added consumer-based rate limiting support to token-based-ratelimit, llm-cost-based-ratelimit, and advanced-ratelimit policies. A new consumerBased parameter allows limits to be tracked independently per GenAI application rather than shared across all applications. Also fixed metadata key collisions in advanced-ratelimit and llm-cost-based-ratelimit that caused incorrect counter updates when backend and consumer instances ran in the same request chain.

Documentation

N/A

Summary by CodeRabbit

  • New Features
    • Per-consumer rate limiting option added to cost-based and token-based rate limit policies (consumerBased parameter)
    • Advanced rate limit now isolates multiple instances in the same request flow to avoid cross-policy conflicts
  • Bug Fixes
    • API key authentication now records the resolved application identifier for downstream policies
    • Rate limit key extraction supports metadata fallbacks (uses default when app id missing)
  • Chores
    • Policy versions updated and tests expanded across ratelimit and auth policies

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 9, 2026

Walkthrough

Advanced-ratelimit namespaced per-instance metadata and supports key-extraction fallbacks; api-key-auth writes resolved ApplicationID into shared metadata; llm-cost-based-ratelimit and token-based-ratelimit gained a consumerBased parameter to optionally include application-id in quota key extraction. Multiple policy versions bumped.

Changes

Cohort / File(s) Summary
Advanced Rate Limit (instance isolation & fallbacks)
policies/advanced-ratelimit/ratelimit.go, policies/advanced-ratelimit/policy-definition.yaml
Added deterministic instanceID (first 8 hex of SHA-256 of quota names+keyExtraction) and metaKey(base) namespacing for all SharedContext.Metadata keys; KeyComponent now supports Fallback and key extraction uses fallback when metadata/header is missing. Version bumped to v1.0.3.
API Key Auth (store ApplicationID in metadata)
policies/api-key-auth/apikey.go, policies/api-key-auth/apikey_test.go, policies/api-key-auth/policy-definition.yaml
On successful authentication, initializes shared.Metadata if nil and stores resolved ApplicationID under applicationIDMetadataKey. Added tests asserting metadata written on success and not written on failure. Version bumped to v1.0.2.
LLM Cost-Based Rate Limit (consumer-based)
policies/llm-cost-based-ratelimit/llm_cost_based_ratelimit.go, policies/llm-cost-based-ratelimit/llm_cost_based_ratelimit_test.go, policies/llm-cost-based-ratelimit/policy-definition.yaml
Added consumerBased boolean parameter (default false). When true, quota keyExtraction includes routename plus metadata x-wso2-application-id with fallback; metadata keys for delegate and scale-factor are consumer-specific. Tests added for consumer-based behavior. Version bumped to v1.0.3.
Token-Based Rate Limit (consumer-based)
policies/token-based-ratelimit/token_based_ratelimit.go, policies/token-based-ratelimit/token_based_ratelimit_test.go, policies/token-based-ratelimit/policy-definition.yaml
Added consumerBased boolean parameter (default false). When true, quota keyExtraction includes routename plus metadata x-wso2-application-id with fallback; tests added for both modes. Version bumped to v1.0.3.
Tests / Integration updates
policies/advanced-ratelimit/ratelimit_integration_test.go
Expanded integration tests to cover metadata/header fallback behavior, parsing of fallback field, and consumer-default vs app-id bucket behavior. Renamed table fields for alignment; added multiple cases.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Poem

🐰 I nibble keys and namespace strings,

instance IDs and fallback things.
App IDs tucked in shared metadata,
per-consumer counters—hop, hooray!
Versioned patches, carrot-bright wings.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 64.29% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'Feat/consumer based rl' is vague and uses abbreviated terms ('rl') that don't clearly convey the pull request's core change to someone scanning history. Use a clearer, more descriptive title such as 'Add consumer-based rate limiting support' or 'Support per-application rate limit counters'.
✅ Passed checks (1 passed)
Check name Status Explanation
Description check ✅ Passed The pull request description is comprehensive and addresses most template sections (Purpose, Goals, Approach, Release note), though Documentation and other sections are marked N/A or omitted.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
policies/llm-cost-based-ratelimit/llm_cost_based_ratelimit_test.go (1)

537-571: Add an explicit test for the “absent consumerBased” path.

This test currently verifies only explicit consumerBased: false, while the test description also claims behavior for the absent case.

Proposed test addition
+func TestTransformToRatelimitParams_ConsumerBased_Absent(t *testing.T) {
+	params := map[string]interface{}{
+		"budgetLimits": []interface{}{
+			map[string]interface{}{"amount": float64(10), "duration": "1h"},
+		},
+	}
+
+	result := transformToRatelimitParams(params)
+	quotas, ok := result["quotas"].([]interface{})
+	if !ok || len(quotas) != 1 {
+		t.Fatal("Expected 1 quota")
+	}
+
+	quota := quotas[0].(map[string]interface{})
+	keyExtraction, ok := quota["keyExtraction"].([]interface{})
+	if !ok || len(keyExtraction) != 1 {
+		t.Fatalf("Expected only routename key extraction, got %v", keyExtraction)
+	}
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@policies/llm-cost-based-ratelimit/llm_cost_based_ratelimit_test.go` around
lines 537 - 571, Update the tests to cover the "consumerBased absent" path by
adding a new test (e.g., TestTransformToRatelimitParams_ConsumerBased_Absent) or
extending TestTransformToRatelimitParams_ConsumerBased_False to call
transformToRatelimitParams with params that omit the "consumerBased" key (keep
the same budgetLimits shape) and assert the resulting quotas[0].keyExtraction
contains exactly one entry with entry["type"] == "routename", mirroring the
existing assertions so the absent-key behavior is explicitly verified.
policies/token-based-ratelimit/token_based_ratelimit_test.go (1)

480-514: Consider adding a dedicated “consumerBased omitted” test.

This test validates explicit false well, but not the omitted-parameter default path mentioned in the comment.

Proposed test addition
+func TestTransformToRatelimitParams_ConsumerBased_Absent(t *testing.T) {
+	params := map[string]interface{}{
+		"totalTokenLimits": []interface{}{
+			map[string]interface{}{"count": float64(1000), "duration": "1h"},
+		},
+	}
+
+	result := transformToRatelimitParams(params, nil)
+	quotas, ok := result["quotas"].([]interface{})
+	if !ok || len(quotas) != 1 {
+		t.Fatal("Expected 1 quota")
+	}
+
+	quota := quotas[0].(map[string]interface{})
+	keyExtraction, ok := quota["keyExtraction"].([]interface{})
+	if !ok || len(keyExtraction) != 1 {
+		t.Fatalf("Expected only routename key extraction, got %v", keyExtraction)
+	}
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@policies/token-based-ratelimit/token_based_ratelimit_test.go` around lines
480 - 514, Add a new unit test (e.g.,
TestTransformToRatelimitParams_ConsumerBased_Omitted) that omits the
"consumerBased" key from the input params and calls
transformToRatelimitParams(nil) to verify the default behavior matches the
existing false case: assert there is exactly one quota, that quota.keyExtraction
is a []interface{} of length 1, and that the single entry has type "routename";
mirror the same assertions used in
TestTransformToRatelimitParams_ConsumerBased_False so the omitted-parameter path
is explicitly covered.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@policies/llm-cost-based-ratelimit/policy-definition.yaml`:
- Around line 45-54: Update the top-level policy description to reflect that
per-consumer spending limits are optional by referencing the consumerBased
boolean flag (consumerBased defaults to false); change language that currently
states the policy "enforces per-consumer spending limits" to something like "can
enforce per-consumer spending limits when consumerBased is true, otherwise a
shared limit applies" and mention the default behavior (consumerBased: false) so
readers know per-consumer enforcement is opt-in.

---

Nitpick comments:
In `@policies/llm-cost-based-ratelimit/llm_cost_based_ratelimit_test.go`:
- Around line 537-571: Update the tests to cover the "consumerBased absent" path
by adding a new test (e.g., TestTransformToRatelimitParams_ConsumerBased_Absent)
or extending TestTransformToRatelimitParams_ConsumerBased_False to call
transformToRatelimitParams with params that omit the "consumerBased" key (keep
the same budgetLimits shape) and assert the resulting quotas[0].keyExtraction
contains exactly one entry with entry["type"] == "routename", mirroring the
existing assertions so the absent-key behavior is explicitly verified.

In `@policies/token-based-ratelimit/token_based_ratelimit_test.go`:
- Around line 480-514: Add a new unit test (e.g.,
TestTransformToRatelimitParams_ConsumerBased_Omitted) that omits the
"consumerBased" key from the input params and calls
transformToRatelimitParams(nil) to verify the default behavior matches the
existing false case: assert there is exactly one quota, that quota.keyExtraction
is a []interface{} of length 1, and that the single entry has type "routename";
mirror the same assertions used in
TestTransformToRatelimitParams_ConsumerBased_False so the omitted-parameter path
is explicitly covered.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b39ba54a-6e4f-46b2-9669-8de56ab93663

📥 Commits

Reviewing files that changed from the base of the PR and between aa18880 and 5bc2958.

📒 Files selected for processing (11)
  • policies/advanced-ratelimit/policy-definition.yaml
  • policies/advanced-ratelimit/ratelimit.go
  • policies/api-key-auth/apikey.go
  • policies/api-key-auth/apikey_test.go
  • policies/api-key-auth/policy-definition.yaml
  • policies/llm-cost-based-ratelimit/llm_cost_based_ratelimit.go
  • policies/llm-cost-based-ratelimit/llm_cost_based_ratelimit_test.go
  • policies/llm-cost-based-ratelimit/policy-definition.yaml
  • policies/token-based-ratelimit/policy-definition.yaml
  • policies/token-based-ratelimit/token_based_ratelimit.go
  • policies/token-based-ratelimit/token_based_ratelimit_test.go

Comment on lines +45 to +54
consumerBased:
type: boolean
x-wso2-policy-advanced-param: false
description: |
When true, rate limits are applied per consumer (GenAI application) identified
by the x-wso2-application-id metadata key set by the api-key-auth policy.
Each application gets its own independent cost counter.
When false (default), a single shared limit applies across all consumers.
default: false

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Align top-level policy description with conditional behavior.

With consumerBased defaulting to false, the file description at Line 6 (“enforces per-consumer spending limits”) is now too absolute. Please make it explicit that per-consumer enforcement is optional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@policies/llm-cost-based-ratelimit/policy-definition.yaml` around lines 45 -
54, Update the top-level policy description to reflect that per-consumer
spending limits are optional by referencing the consumerBased boolean flag
(consumerBased defaults to false); change language that currently states the
policy "enforces per-consumer spending limits" to something like "can enforce
per-consumer spending limits when consumerBased is true, otherwise a shared
limit applies" and mention the default behavior (consumerBased: false) so
readers know per-consumer enforcement is opt-in.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@policies/advanced-ratelimit/ratelimit.go`:
- Around line 861-868: The parser now reads keyExtraction[*].fallback into
comp.Fallback, but the policy schema still disallows that field; update
policies/advanced-ratelimit/policy-definition.yaml to add a "fallback" property
(string) to the keyExtraction item schema (or allow it via properties instead of
being blocked by additionalProperties: false) so configurations pass validation
at policy creation/attachment; ensure the schema entry for the keyExtraction
array includes the new "fallback" property with type: string (and any desired
default/description) to match the parser's expectation.
- Around line 394-410: The instanceID currently built from idBuf/quotas only
includes quota.Name and KeyExtraction fields, causing collisions when other
policy parameters differ; extend the deterministic hash input to include all
distinguishing configuration used by advanced-ratelimit (e.g., each quota's CEL
expressions, MetadataFallback values, limit values, cost extraction
identifiers/flags, and any other per-quota settings) before computing idHash so
instanceID uniquely namespaces SharedContext.Metadata for different policy
instances; update the loop that builds idBuf (and any other place around
instanceID/idHash usage such as the similar block at lines ~434-438) to append
those additional fields in a stable order prior to sha256.Sum256 and
hex.EncodeToString.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 146e3b18-e269-4aa6-8799-d952ea2010cd

📥 Commits

Reviewing files that changed from the base of the PR and between 5bc2958 and b859d51.

📒 Files selected for processing (6)
  • policies/advanced-ratelimit/ratelimit.go
  • policies/advanced-ratelimit/ratelimit_integration_test.go
  • policies/llm-cost-based-ratelimit/llm_cost_based_ratelimit.go
  • policies/llm-cost-based-ratelimit/llm_cost_based_ratelimit_test.go
  • policies/token-based-ratelimit/token_based_ratelimit.go
  • policies/token-based-ratelimit/token_based_ratelimit_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • policies/llm-cost-based-ratelimit/llm_cost_based_ratelimit_test.go
  • policies/llm-cost-based-ratelimit/llm_cost_based_ratelimit.go
  • policies/token-based-ratelimit/token_based_ratelimit.go

Comment on lines +394 to +410
// Compute a short deterministic instance ID from the quota names and key extraction
// types. This namespaces all SharedContext.Metadata keys so that multiple
// advanced-ratelimit instances in the same request chain do not overwrite each other.
var idBuf strings.Builder
for i, q := range quotas {
idBuf.WriteString(q.Name)
idBuf.WriteString(":")
for _, ke := range q.KeyExtraction {
idBuf.WriteString(ke.Type)
idBuf.WriteString(ke.Key)
}
if i < len(quotas)-1 {
idBuf.WriteString("|")
}
}
idHash := sha256.Sum256([]byte(idBuf.String()))
instanceID := hex.EncodeToString(idHash[:])[:8]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Broaden the instanceID hash; it still collides for distinct policy instances.

instanceID is derived only from quota name plus keyExtraction type/key. Two advanced-ratelimit instances with the same quota names and key types/keys but different CEL expressions, metadata fallbacks, limits, or cost-extraction behavior will still share the same metadata namespace and overwrite each other’s per-request state. That brings back the collision bug for a pretty realistic API-level + route-level combination.

💡 Suggested direction
 	var idBuf strings.Builder
 	for i, q := range quotas {
-		idBuf.WriteString(q.Name)
-		idBuf.WriteString(":")
+		idBuf.WriteString(fmt.Sprintf("[%d]%s:", i, q.Name))
+		for _, lim := range q.Limits {
+			idBuf.WriteString(fmt.Sprintf("l=%d,d=%s,b=%d;", lim.Limit, lim.Duration, lim.Burst))
+		}
 		for _, ke := range q.KeyExtraction {
-			idBuf.WriteString(ke.Type)
-			idBuf.WriteString(ke.Key)
+			idBuf.WriteString(ke.Type)
+			idBuf.WriteString("|")
+			idBuf.WriteString(ke.Key)
+			idBuf.WriteString("|")
+			idBuf.WriteString(ke.Expression)
+			idBuf.WriteString("|")
+			idBuf.WriteString(ke.Fallback)
+			idBuf.WriteString(";")
 		}
+		if q.CostExtractionEnabled && q.CostExtractor != nil {
+			idBuf.WriteString("cost-extraction-enabled;")
+		}
 		if i < len(quotas)-1 {
 			idBuf.WriteString("|")
 		}
 	}

Also applies to: 434-438

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@policies/advanced-ratelimit/ratelimit.go` around lines 394 - 410, The
instanceID currently built from idBuf/quotas only includes quota.Name and
KeyExtraction fields, causing collisions when other policy parameters differ;
extend the deterministic hash input to include all distinguishing configuration
used by advanced-ratelimit (e.g., each quota's CEL expressions, MetadataFallback
values, limit values, cost extraction identifiers/flags, and any other per-quota
settings) before computing idHash so instanceID uniquely namespaces
SharedContext.Metadata for different policy instances; update the loop that
builds idBuf (and any other place around instanceID/idHash usage such as the
similar block at lines ~434-438) to append those additional fields in a stable
order prior to sha256.Sum256 and hex.EncodeToString.

Comment on lines +861 to +868
// Parse optional fallback value
if fallbackRaw, ok := compMap["fallback"]; ok {
if fallbackStr, ok := fallbackRaw.(string); ok {
comp.Fallback = fallbackStr
} else {
return nil, fmt.Errorf("keyExtraction[%d].fallback must be a string", i)
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

fallback is still unconfigurable through the policy schema.

The parser now accepts keyExtraction[*].fallback, but policies/advanced-ratelimit/policy-definition.yaml still rejects that field (additionalProperties: false and no fallback property). In practice, any policy using this new behavior will be rejected before GetPolicy() runs, so the feature is not actually usable yet. Please update the schema in the same PR.

Based on learnings, policy configurations are validated against policy-definition.yaml schemas at API creation time or when policies are attached.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@policies/advanced-ratelimit/ratelimit.go` around lines 861 - 868, The parser
now reads keyExtraction[*].fallback into comp.Fallback, but the policy schema
still disallows that field; update
policies/advanced-ratelimit/policy-definition.yaml to add a "fallback" property
(string) to the keyExtraction item schema (or allow it via properties instead of
being blocked by additionalProperties: false) so configurations pass validation
at policy creation/attachment; ensure the schema entry for the keyExtraction
array includes the new "fallback" property with type: string (and any desired
default/description) to match the parser's expectation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant