Fix Application Update event to sync API keys based on events#1659
Fix Application Update event to sync API keys based on events#1659VirajSalaka wants to merge 8 commits intowso2:mainfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughxDS refresh was moved from the control-plane client's application-update path into the event listener. The listener now performs DB-driven synchronization of application→API-key mappings, publishes events with removed key IDs, and triggers per-key xDS refresh during application CREATE/UPDATE/DELETE processing. Changes
Sequence DiagramsequenceDiagram
participant App as Application Event
participant EL as Event Listener
participant DB as Database
participant Store as In-memory Config Store
participant XDS as xDS Manager
rect rgba(100,150,255,0.5)
App->>EL: Application CREATE/UPDATE/DELETE event (optional EventData)
EL->>DB: GetAPIKeysByApplicationUUID(applicationUUID)
DB-->>EL: API key records
EL->>Store: Query in-memory keys & resolve affected set (DB + removed IDs + in-memory matches)
Store-->>EL: API configs/artifacts as needed
loop per affected API key
EL->>Store: Store/refresh API key in in-memory store
alt xDS manager present
EL->>XDS: StoreAPIKey(key, apiName, version)
XDS-->>EL: Ack/Error
end
end
EL-->>App: Publish replica-sync event (with RemovedAPIKeyIDs in EventData)
end
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
gateway/gateway-controller/pkg/eventlistener/subscription_processor.go (1)
131-149: Consider adding a dedicated query for application-specific API keys.
loadApplicationAPIKeysFromDBfetches all API keys viaGetAllAPIKeys()and filters client-side byApplicationID. This works correctly but may be inefficient at scale when there are many API keys across different applications.A more efficient approach would be to add a
GetAPIKeysByApplicationUUID(applicationUUID string)method to the storage interface that filters at the database level.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-controller/pkg/eventlistener/subscription_processor.go` around lines 131 - 149, Replace the client-side filtering in loadApplicationAPIKeysFromDB by adding and using a DB-level query: add GetAPIKeysByApplicationUUID(applicationUUID string) to the storage interface and implement it in the storage backend, then update EventListener.loadApplicationAPIKeysFromDB to call l.db.GetAPIKeysByApplicationUUID(applicationUUID) (removing use of GetAllAPIKeys and the manual loop/filter) and return the mapped results; ensure the new storage method returns []*models.APIKey and preserves nil/empty checks as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@gateway/gateway-controller/pkg/eventlistener/subscription_processor.go`:
- Around line 131-149: Replace the client-side filtering in
loadApplicationAPIKeysFromDB by adding and using a DB-level query: add
GetAPIKeysByApplicationUUID(applicationUUID string) to the storage interface and
implement it in the storage backend, then update
EventListener.loadApplicationAPIKeysFromDB to call
l.db.GetAPIKeysByApplicationUUID(applicationUUID) (removing use of GetAllAPIKeys
and the manual loop/filter) and return the mapped results; ensure the new
storage method returns []*models.APIKey and preserves nil/empty checks as
before.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0d822828-2c56-4f24-b54c-057d8d32ebd4
📒 Files selected for processing (5)
gateway/gateway-controller/pkg/controlplane/api_deleted_test.gogateway/gateway-controller/pkg/controlplane/client.gogateway/gateway-controller/pkg/eventlistener/listener_test.gogateway/gateway-controller/pkg/eventlistener/subscription_processor.gogateway/gateway-controller/pkg/eventlistener/subscription_processor_test.go
💤 Files with no reviewable changes (1)
- gateway/gateway-controller/pkg/controlplane/client.go
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
gateway/gateway-controller/pkg/utils/mock_db_test.go (1)
163-165: Return an empty slice from the stub.
Storage.GetAPIKeysByApplicationUUIDdocuments an empty slice on miss. Matching that here keeps nil-vs-empty behavior from drifting between unit tests and the real storage layer.♻️ Suggested tweak
func (m *testMockDB) GetAPIKeysByApplicationUUID(applicationUUID string) ([]*models.APIKey, error) { - return nil, nil + return []*models.APIKey{}, nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-controller/pkg/utils/mock_db_test.go` around lines 163 - 165, The stub testMockDB.GetAPIKeysByApplicationUUID currently returns (nil, nil); change it to return an empty slice and nil error to match Storage.GetAPIKeysByApplicationUUID's documented behavior (e.g., return []*models.APIKey{} , nil) so tests observe empty-slice-on-miss semantics instead of nil.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gateway/gateway-controller/pkg/eventlistener/subscription_processor.go`:
- Around line 182-190: The code reloads API keys using GetAPIKeyByID which
returns only the bare api_keys row (missing ApplicationID/ApplicationName) so
reloaded keys can lose current mapping; change the reload to use a mapping-aware
read (e.g., call GetAPIKeysByAPI filtered by apiKeyID or add a dedicated lookup
that joins application mapping) before appending to affectedKeys and before
calling StoreAPIKey; update the logic around l.db.GetAPIKeyByID / apiKeyID to
call the join-backed method and keep l.logger error handling unchanged.
---
Nitpick comments:
In `@gateway/gateway-controller/pkg/utils/mock_db_test.go`:
- Around line 163-165: The stub testMockDB.GetAPIKeysByApplicationUUID currently
returns (nil, nil); change it to return an empty slice and nil error to match
Storage.GetAPIKeysByApplicationUUID's documented behavior (e.g., return
[]*models.APIKey{} , nil) so tests observe empty-slice-on-miss semantics instead
of nil.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e8f3e945-500e-4fee-ba64-15a16c713ceb
📒 Files selected for processing (7)
gateway/gateway-controller/pkg/api/handlers/handlers_test.gogateway/gateway-controller/pkg/controlplane/api_deleted_test.gogateway/gateway-controller/pkg/eventlistener/subscription_processor.gogateway/gateway-controller/pkg/storage/interface.gogateway/gateway-controller/pkg/storage/sql_store.gogateway/gateway-controller/pkg/storage/sqlite_test.gogateway/gateway-controller/pkg/utils/mock_db_test.go
✅ Files skipped from review due to trivial changes (1)
- gateway/gateway-controller/pkg/storage/sqlite_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- gateway/gateway-controller/pkg/controlplane/api_deleted_test.go
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
gateway/gateway-controller/pkg/utils/subscription_resource.go (1)
209-229: Remove the deadrefreshSubscriptionSnapshotflag.
persistAndSync(...)still forwardsrefreshSubscriptionSnapshot, butpersistAndSyncWithEventData(...)never reads it. Right now every caller passes a value that has no effect, and the helper below is unreachable from this path.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-controller/pkg/utils/subscription_resource.go` around lines 209 - 229, The refreshSubscriptionSnapshot boolean parameter is dead — remove it from the function signatures and all call sites: delete the refreshSubscriptionSnapshot parameter from persistAndSync and persistAndSyncWithEventData (and from any helper variants like the unreachable helper mentioned), update the single caller chain so persistAndSync no longer forwards that flag, and remove any dead code or logic that referenced refreshSubscriptionSnapshot; ensure all calls to persistAndSync(...) and persistAndSyncWithEventData(...) pass the new parameter list and update their argument lists accordingly (refer to the functions persistAndSync and persistAndSyncWithEventData to locate and modify definitions and callers).gateway/gateway-controller/pkg/eventlistener/subscription_processor.go (1)
202-215: Consider avoiding the full-store scan here.Every application replica-sync event now walks all configs and all in-memory API keys. On larger gateways that makes one application change O(all APIs + all keys) on every replica.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-controller/pkg/eventlistener/subscription_processor.go` around lines 202 - 215, The current code scans all configs via l.store.GetAll and then calls l.store.GetAPIKeysByAPI for each cfg.UUID, causing O(all APIs + all keys) work per application change; replace this with a direct query for keys by application to avoid the full-store scan. Add or use a store method like GetAPIKeysByApplication(applicationUUID) (or GetAPIKeysByApplicationID) that returns only API keys for the given application, then in subscription_processor.go call that method once, iterate returned apiKeys and populate affectedKeyIDs from apiKey.UUID where apiKey != nil and apiKey.UUID != "" and apiKey.ApplicationID == applicationUUID (or omit the last check if store guarantees it). If the store lacks such an index, implement it (or maintain an application->keys index) so the change is O(keys for application) instead of O(all configs + all keys).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gateway/gateway-controller/pkg/eventlistener/subscription_processor.go`:
- Around line 71-78: The parsing of event.EventData via
parseRemovedApplicationAPIKeyIDs may fail but the handler currently only logs
and continues, causing missed removals; change the code so that when
parseRemovedApplicationAPIKeyIDs returns an error you log the error with context
and then stop processing by returning the error (or otherwise marking the
handler as failed) instead of continuing — ensure removedAPIKeyIDs is only used
when parseRemovedApplicationAPIKeyIDs succeeds and that the handler does not
reach any success log or commit path after a parse error.
---
Nitpick comments:
In `@gateway/gateway-controller/pkg/eventlistener/subscription_processor.go`:
- Around line 202-215: The current code scans all configs via l.store.GetAll and
then calls l.store.GetAPIKeysByAPI for each cfg.UUID, causing O(all APIs + all
keys) work per application change; replace this with a direct query for keys by
application to avoid the full-store scan. Add or use a store method like
GetAPIKeysByApplication(applicationUUID) (or GetAPIKeysByApplicationID) that
returns only API keys for the given application, then in
subscription_processor.go call that method once, iterate returned apiKeys and
populate affectedKeyIDs from apiKey.UUID where apiKey != nil and apiKey.UUID !=
"" and apiKey.ApplicationID == applicationUUID (or omit the last check if store
guarantees it). If the store lacks such an index, implement it (or maintain an
application->keys index) so the change is O(keys for application) instead of
O(all configs + all keys).
In `@gateway/gateway-controller/pkg/utils/subscription_resource.go`:
- Around line 209-229: The refreshSubscriptionSnapshot boolean parameter is dead
— remove it from the function signatures and all call sites: delete the
refreshSubscriptionSnapshot parameter from persistAndSync and
persistAndSyncWithEventData (and from any helper variants like the unreachable
helper mentioned), update the single caller chain so persistAndSync no longer
forwards that flag, and remove any dead code or logic that referenced
refreshSubscriptionSnapshot; ensure all calls to persistAndSync(...) and
persistAndSyncWithEventData(...) pass the new parameter list and update their
argument lists accordingly (refer to the functions persistAndSync and
persistAndSyncWithEventData to locate and modify definitions and callers).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ab1ae6b4-1c9e-47a5-aa6c-19b633e793be
📒 Files selected for processing (5)
gateway/gateway-controller/pkg/eventlistener/subscription_processor.gogateway/gateway-controller/pkg/eventlistener/subscription_processor_test.gogateway/gateway-controller/pkg/models/application_event_data.gogateway/gateway-controller/pkg/utils/subscription_resource.gogateway/gateway-controller/pkg/utils/subscription_resource_test.go
✅ Files skipped from review due to trivial changes (1)
- gateway/gateway-controller/pkg/models/application_event_data.go
a3a4bc1 to
d6f716f
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
gateway/gateway-controller/pkg/utils/subscription_resource.go (1)
221-240:⚠️ Potential issue | 🟠 Major
refreshSubscriptionSnapshotis ignored now.
persistAndSyncWithEventData()never consultsrefreshSubscriptionSnapshot, so every caller still passingtruenow skips the local snapshot update and only emits the replica-sync event. That silently changes subscription and subscription-plan writes from immediate local refresh to eventual convergence through the event loop.Suggested fix
func (s *SubscriptionResourceService) persistAndSyncWithEventData( eventType eventhub.EventType, action string, entityID string, eventData string, refreshSubscriptionSnapshot bool, correlationID string, logger *slog.Logger, persist func() error, ) error { if logger == nil { logger = slog.Default() } if err := persist(); err != nil { return err } + if refreshSubscriptionSnapshot { + _ = s.refreshSubscriptionSnapshot(entityID, correlationID, logger) + } + s.publishEvent(eventType, action, entityID, eventData, correlationID, logger) return nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@gateway/gateway-controller/pkg/utils/subscription_resource.go` around lines 221 - 240, The function persistAndSyncWithEventData currently ignores the refreshSubscriptionSnapshot flag; update it to check if refreshSubscriptionSnapshot is true and, if so, perform the local snapshot refresh before publishing the replica-sync event (call the service's local refresh method — e.g. s.refreshSubscriptionSnapshot(entityID) or the appropriate cache/snapshot update function), handle/log any error returned (return it or log and proceed based on existing behavior), then continue to call s.publishEvent(eventType, action, entityID, eventData, correlationID, logger); ensure this uses the existing persistAndSyncWithEventData flow and retains error handling around persist().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@gateway/gateway-controller/pkg/controlplane/api_deleted_test.go`:
- Around line 61-84: The recorder struct recordingControlPlaneXDSManager should
track RefreshSnapshot() calls so the test can detect inline refreshes: add a
refreshCallCount field to the struct and increment it in the RefreshSnapshot
method (similar to storeCallCount/revokeCallCount/removeCallCount), and update
any assertions in
TestClient_handleApplicationUpdatedEvent_DoesNotRefreshXDSInline to check
refreshCallCount instead of relying only on store/revoke/remove counts.
In `@gateway/gateway-controller/pkg/utils/subscription_resource.go`:
- Around line 155-161: The event builds a snapshot via buildApplicationEventData
before calling ReplaceApplicationAPIKeyMappings, which allows race conditions
and stale RemovedAPIKeyIDs; instead, move the diffing into the storage layer so
ReplaceApplicationAPIKeyMappings computes and returns the actually deleted API
key IDs within the same DB transaction (e.g., change
ReplaceApplicationAPIKeyMappings to return (removedIDs []string, err error)),
update calls in this file (and the similar block at lines 169-207) to call
ReplaceApplicationAPIKeyMappings first (or use persistAndSyncWithEventData's
inner func to obtain removedIDs), then construct eventData using those returned
removedIDs and pass that into persistAndSyncWithEventData so the event reflects
the real transactional deletes.
---
Outside diff comments:
In `@gateway/gateway-controller/pkg/utils/subscription_resource.go`:
- Around line 221-240: The function persistAndSyncWithEventData currently
ignores the refreshSubscriptionSnapshot flag; update it to check if
refreshSubscriptionSnapshot is true and, if so, perform the local snapshot
refresh before publishing the replica-sync event (call the service's local
refresh method — e.g. s.refreshSubscriptionSnapshot(entityID) or the appropriate
cache/snapshot update function), handle/log any error returned (return it or log
and proceed based on existing behavior), then continue to call
s.publishEvent(eventType, action, entityID, eventData, correlationID, logger);
ensure this uses the existing persistAndSyncWithEventData flow and retains error
handling around persist().
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 51e20b3b-6361-4830-a7c2-365555b677ab
📒 Files selected for processing (13)
gateway/gateway-controller/pkg/api/handlers/handlers_test.gogateway/gateway-controller/pkg/controlplane/api_deleted_test.gogateway/gateway-controller/pkg/controlplane/client.gogateway/gateway-controller/pkg/eventlistener/listener_test.gogateway/gateway-controller/pkg/eventlistener/subscription_processor.gogateway/gateway-controller/pkg/eventlistener/subscription_processor_test.gogateway/gateway-controller/pkg/models/application_event_data.gogateway/gateway-controller/pkg/storage/interface.gogateway/gateway-controller/pkg/storage/sql_store.gogateway/gateway-controller/pkg/storage/sqlite_test.gogateway/gateway-controller/pkg/utils/mock_db_test.gogateway/gateway-controller/pkg/utils/subscription_resource.gogateway/gateway-controller/pkg/utils/subscription_resource_test.go
💤 Files with no reviewable changes (1)
- gateway/gateway-controller/pkg/controlplane/client.go
✅ Files skipped from review due to trivial changes (3)
- gateway/gateway-controller/pkg/utils/mock_db_test.go
- gateway/gateway-controller/pkg/models/application_event_data.go
- gateway/gateway-controller/pkg/storage/sqlite_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
- gateway/gateway-controller/pkg/eventlistener/listener_test.go
- gateway/gateway-controller/pkg/api/handlers/handlers_test.go
- gateway/gateway-controller/pkg/eventlistener/subscription_processor.go
- gateway/gateway-controller/pkg/storage/sql_store.go
| type recordingControlPlaneXDSManager struct { | ||
| storeCallCount int | ||
| revokeCallCount int | ||
| removeCallCount int | ||
| } | ||
|
|
||
| func (m *recordingControlPlaneXDSManager) StoreAPIKey(string, string, string, *models.APIKey, string) error { | ||
| m.storeCallCount++ | ||
| return nil | ||
| } | ||
|
|
||
| func (m *recordingControlPlaneXDSManager) RevokeAPIKey(string, string, string, string, string) error { | ||
| m.revokeCallCount++ | ||
| return nil | ||
| } | ||
|
|
||
| func (m *recordingControlPlaneXDSManager) RemoveAPIKeysByAPI(string, string, string, string) error { | ||
| m.removeCallCount++ | ||
| return nil | ||
| } | ||
|
|
||
| func (m *recordingControlPlaneXDSManager) RefreshSnapshot() error { | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Track RefreshSnapshot() in this recorder too.
The new test is meant to catch any inline xDS work during handleApplicationUpdatedEvent, but this double only records store/revoke/remove calls. If the client starts calling RefreshSnapshot() inline again, TestClient_handleApplicationUpdatedEvent_DoesNotRefreshXDSInline still passes.
Suggested test fix
type recordingControlPlaneXDSManager struct {
storeCallCount int
revokeCallCount int
removeCallCount int
+ refreshCallCount int
}
@@
func (m *recordingControlPlaneXDSManager) RefreshSnapshot() error {
+ m.refreshCallCount++
return nil
} if xdsManager.removeCallCount != 0 {
t.Fatalf("expected no inline xDS remove calls, got %d", xdsManager.removeCallCount)
}
+ if xdsManager.refreshCallCount != 0 {
+ t.Fatalf("expected no inline xDS refresh calls, got %d", xdsManager.refreshCallCount)
+ }
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@gateway/gateway-controller/pkg/controlplane/api_deleted_test.go` around lines
61 - 84, The recorder struct recordingControlPlaneXDSManager should track
RefreshSnapshot() calls so the test can detect inline refreshes: add a
refreshCallCount field to the struct and increment it in the RefreshSnapshot
method (similar to storeCallCount/revokeCallCount/removeCallCount), and update
any assertions in
TestClient_handleApplicationUpdatedEvent_DoesNotRefreshXDSInline to check
refreshCallCount instead of relying only on store/revoke/remove counts.
d6f716f to
f68c54f
Compare
This pull request refactors how application update events are handled in the gateway controller, ensuring that application-to-API key mappings and xDS state are synchronized from the canonical database rather than being updated inline. The changes improve correctness, reliability, and test coverage for application replica synchronization.
Key changes:
Refactoring application update event handling:
Client.handleApplicationUpdatedEvent, so application updates no longer directly update xDS state; instead, synchronization is handled by the event listener. [1] [2] [3]processApplicationEventto fully synchronize in-memory API key/application state and xDS state from the canonical DB, including error handling and logging for each step.Testing improvements:
TestHandleEvent_ApplicationUpdate_SyncsMemoryAndXDSFromDB) to verify that application update events correctly reconcile memory and xDS state with the database.Supporting changes and cleanup:
recordingControlPlaneXDSManagerfor more granular test verification.These changes collectively ensure that application and API key state is reliably synchronized across replicas, with robust test coverage to catch regressions.## Purpose
Goals
Approach
User stories
Documentation
Automation tests
Security checks
Samples
Related PRs
Test environment
Summary by CodeRabbit
Refactor
New Features
Tests