feat(sqs): listenless SQSServer for admin-only deployments#740
feat(sqs): listenless SQSServer for admin-only deployments#740
Conversation
Before: a build started without --sqsAddress short-circuited startSQSServer to (nil, nil); newSqsQueuesSource then received nil and admin.NewServer left /admin/api/v1/sqs/* off the wire, so operators who wanted admin visibility into queues without exposing the public SigV4 surface had no way to do that. After: startSQSServer constructs the SQSServer even when sqsAddr is empty, passing a nil net.Listener through to NewSQSServer. (*SQSServer).Run() detects the nil listener and blocks on reaperCtx.Done() instead of httpServer.Serve, so: - The public SigV4 HTTP listener is suppressed (matches the prior behaviour for that surface). - The reaper and throttle-sweep goroutines still run, so retention math behind the admin counters stays correct. - The admin bridge in main_admin.go now sees a non-nil *adapter.SQSServer and wires /admin/api/v1/sqs/* on the wire — admin operations (list, describe, delete) work the same way they would on a build with a public listener. Caller / semantic audit: - adapter/sqs_catalog.go queueURL nil-guards s.listen already; no other reads of s.listen exist. - main.go runtime field: comment updated to reflect that r.sqsServer is always non-nil after startup. The actual field type is unchanged; the only change is that its zero- value pathway is no longer reachable. - newSqsQueuesSource still nil-guards its argument as a defensive check for test fixtures that pass nil; no caller in main.go passes nil any more. - Dynamo and S3 adapters keep the prior listener-required semantics. Symmetric refactor for those is a deliberate follow-up — this PR scopes to the immediate ask (SQS admin reachable on listenless builds). Tests: TestSQSServer_ListenlessRunStopRoundTrip pins the nil-listener Run/Stop round trip — Run() must enter the reaperCtx-block branch, Stop() must wind down both reaper and throttle-sweep goroutines, and Run() must return nil within a 2s deadline. TestSQSServer_StopShutsDown stays unchanged, anchoring the listening-path round trip alongside it. Self-review (5 lenses): 1. Data loss — N/A; no storage / Raft / FSM touch. 2. Concurrency — Run() now has two terminating paths (reaperCtx.Done() vs httpServer.Serve return); Stop() cancels the reaper ctx, which both paths observe. 3. Performance — listenless mode adds zero work to the listening path. Listenless construction adds no Raft round-trips on startup. 4. Data consistency — admin handlers continue to take a fresh nextTxnReadTS per call; the listener's presence never affected snapshot semantics. 5. Test coverage — new TestSQSServer_ListenlessRunStopRoundTrip pins the new branch end-to-end; the existing TestSQSServer_StopShutsDown anchors the listening branch so a regression in either does not silently swap them.
|
@claude review |
|
Warning Rate limit exceeded
To continue reviewing without waiting, purchase usage credits in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Claude finished @bootjp's task in 3m 14s —— View job Code Review
The implementation is sound and clean. The nil-listener branch in Behavior change: credential file is now loaded even when
|
There was a problem hiding this comment.
Code Review
This pull request introduces a 'listenless' mode for the SQS adapter, allowing the SQS server to be initialized without a public HTTP listener. This ensures that SQS admin endpoints remain functional even when the public SigV4 surface is disabled. The changes include updates to the Run method to handle nil listeners, a new regression test for this deployment shape, and modifications to startSQSServer to always instantiate the server. Feedback was provided regarding the silent ignoring of errors when closing the network listener, suggesting that these should be logged to improve visibility into potential resource leaks.
| if err != nil { | ||
| _ = sqsL.Close() | ||
| if sqsL != nil { | ||
| _ = sqsL.Close() |
There was a problem hiding this comment.
The error from sqsL.Close() is silently ignored. According to the general rules, errors from Close() methods on resources like network listeners should be logged to ensure that resource leaks or other cleanup problems are visible to operators.
References
- Do not silently ignore errors from Close() methods on resources like network connections. Log them to ensure resource leaks or other cleanup problems are visible.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: edeec0ff3d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| staticCreds, err := loadSigV4StaticCredentialsFile(credentialsFile, "sqs") | ||
| if err != nil { | ||
| _ = sqsL.Close() | ||
| if sqsL != nil { | ||
| _ = sqsL.Close() | ||
| } |
There was a problem hiding this comment.
Skip SQS credentials validation in listenless mode
When --sqsAddress is empty, this function still parses --sqsCredentialsFile and returns an error on any file issue before constructing the listenless server. That makes admin-only SQS deployments fail to start if the credentials path is stale/malformed, even though admin queue handlers do not use SigV4 credentials. This is a behavior regression from the previous "SQS disabled" path and can unintentionally keep /admin/api/v1/sqs/* unavailable in the exact mode this change is meant to support.
Useful? React with 👍 / 👎.
Summary
Make
/admin/api/v1/sqs/*reachable on builds that start without--sqsAddress. The admin handlers (list, describe, delete queues) only need the coordinator and store — not the public SigV4 listener — so this PR teachesstartSQSServerto construct a listenless*adapter.SQSServerinstead of returning(nil, nil)and dropping the admin bridge.Concretely:
startSQSServerno longer short-circuits whensqsAddris empty. It still skipslc.Listen(no port is opened), but it constructsNewSQSServer(nil, …)and threads the result back through the runtime.(*SQSServer).Run()detects the nil listener and blocks onreaperCtx.Done()instead of callinghttpServer.Serve. The reaper and throttle-sweep goroutines still run so retention math behind the admin counters stays correct.main_admin.go'snewSqsQueuesSourcenow sees a non-nil*adapter.SQSServerand wires/admin/api/v1/sqs/*; theif sqsServer == nil { return nil }guard stays as a defensive check for tests that still pass nil.r.sqsServerupdated to reflect that the field is always non-nil after startup.Why this is "option A" of the prior discussion
We discussed three options earlier:
*_proposed_*.mddesign doc per CLAUDE.md.You picked A. This PR scopes to SQS only since that was the immediate ask. Symmetric Dynamo/S3 changes can land as a parallel PR if you want; both follow the same structural pattern (drop early-return in
start*Server, teach(*XxxServer).Run()to handle nil listener) but each adapter'sRun()body has different background work to keep alive (S3 manifest cleanup workers, etc.) so a separate review pass per adapter is the cleaner cut.Caller / semantic audit
Per the standing rule on semantic-changing edits:
adapter/sqs_catalog.go::queueURLalready nil-guardss.listen; no other reads ofs.listenexist in the adapter package (verified via grep).main.goruntime field: only comment updated. The Go type is unchanged; the only change is that the zero-value pathway is no longer reachable.newSqsQueuesSourcestill nil-guards its argument; no caller inmain.gopasses nil any more, but the guard remains for tests that construct admin servers without an SQS adapter.(*SQSServer).Run()now has two terminating paths (reaper-ctx-block andhttpServer.Servereturn).Stop()cancels the reaper ctx, which both paths observe — same shutdown semantics on either branch.Tests
TestSQSServer_ListenlessRunStopRoundTrip(new) — pins the nil-listener Run/Stop round trip. ConstructsNewSQSServer(nil, nil, nil), launchesRun()in a goroutine, callsStop(), and assertsRun()returns nil within 2s. Without the news.listen == nilbranch inRun(),httpServer.Serve(nil)would panic and this test would fail.TestSQSServer_StopShutsDown(existing) — unchanged. Anchors the listening-path round trip so a regression in either branch does not silently swap them.Self-review (5 lenses)
Run()adds a second terminating path; both paths observe the samereaperCtxcancellation thatStop()triggers.nextTxnReadTSper call; the listener's presence never affected snapshot semantics.TestSQSServer_ListenlessRunStopRoundTrippins the new branch; the existingTestSQSServer_StopShutsDownanchors the listening branch.Test plan
go test -race -count=1 -run 'TestSQS' ./adapter/...(all SQS tests pass)golangci-lint --config=.golangci.yaml run ./...cleango build ./...clean--sqsAddress="", confirm/admin/api/v1/sqs/queuesreturns the catalog list (operator-driven, out of scope for the merge gate)Refs
adapter/sqs_admin.go— admin entrypointsmain_admin.go— admin bridge