
feat(s3): add a blob-backed S3-compatible API #53

Open
pthmas wants to merge 24 commits into main from pthmas/s3-blob-export

Conversation


@pthmas pthmas commented Apr 28, 2026

Summary

  • Add SQLite-backed S3-compatible API for bucket and object CRUD
  • Celestia as verification layer, not storage: PutObject submits a small JSON CommitmentEnvelope (~200 bytes) containing SHA-256 + ETag + object metadata to Celestia, not raw data
  • SQLite stores raw object data; GetObject serves from local cache; clients can verify authenticity by hashing the downloaded body and comparing against the on-chain SHA-256
  • All write operations (CreateBucket, DeleteBucket, PutObject, DeleteObject) return ErrReadOnly when no Celestia submitter is configured
  • AWS SigV4 authentication with ±15 minute timestamp skew enforcement; optional — server warns at startup when unconfigured, with a louder warning for non-loopback binds
  • parsePath uses r.URL.RawPath to avoid double URL-decoding of %2F in object keys
  • canonicalQueryString uses r.URL.RawQuery + url.PathUnescape to treat + as literal + (RFC3986), not space — critical for SigV4 correctness
  • DeleteBucket wrapped in a transaction to prevent TOCTOU race between empty-check and delete
  • maxObjectSize = 2 MB, independent of Celestia blob size since only the envelope is submitted
  • Empty objects rejected — no point submitting a commitment with no data
  • S3 socket pre-bound via net.ListenConfig.Listen(ctx, ...) so startup failures surface synchronously; s3Srv closed on all subsequent startup error paths
  • Credential pair validated only when S3 is enabled; credentials are env-var-only (APEX_S3_ACCESS_KEY_ID, APEX_S3_SECRET_ACCESS_KEY), never read from YAML
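The canonicalQueryString point above is subtle: Go's `r.URL.Query()` form-decodes `+` into a space, which corrupts SigV4 signatures for values containing a literal plus. A minimal sketch of the RawQuery-based approach, with illustrative helper names rather than the PR's exact code:

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
	"strings"
)

// awsEncode percent-encodes per RFC 3986 as SigV4 requires: unreserved
// characters stay literal, everything else (including '+' and space)
// is percent-encoded.
func awsEncode(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		c := s[i]
		if c >= 'A' && c <= 'Z' || c >= 'a' && c <= 'z' ||
			c >= '0' && c <= '9' || c == '-' || c == '.' || c == '_' || c == '~' {
			b.WriteByte(c)
		} else {
			fmt.Fprintf(&b, "%%%02X", c)
		}
	}
	return b.String()
}

// canonicalQueryString builds the SigV4 canonical query from the raw
// (still-encoded) query string, so '+' survives as a literal plus.
func canonicalQueryString(rawQuery string) string {
	if rawQuery == "" {
		return ""
	}
	var parts []string
	for _, pair := range strings.Split(rawQuery, "&") {
		k, v, _ := strings.Cut(pair, "=")
		ku, _ := url.PathUnescape(k) // PathUnescape, unlike QueryUnescape, leaves '+' alone
		vu, _ := url.PathUnescape(v)
		parts = append(parts, awsEncode(ku)+"="+awsEncode(vu))
	}
	sort.Strings(parts)
	return strings.Join(parts, "&")
}

func main() {
	// '+' is re-encoded as %2B instead of being decoded to a space.
	fmt.Println(canonicalQueryString("b=2&a=1+1"))
}
```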

Architecture

PUT /bucket/key
  → authenticate (SigV4, optional)
  → read body (max 2 MB), reject empty
  → compute SHA-256 + MD5 (ETag)
  → marshal CommitmentEnvelope JSON
  → submit envelope to Celestia
  → store raw data + metadata in SQLite

GET /bucket/key
  → serve raw data from SQLite
  → client can verify: sha256sum(body) == envelope.sha256 (from Celestia)
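The client-side verification step needs nothing beyond hashing the downloaded body; a sketch (fetching the envelope from Celestia is out of scope here and assumed done by the caller):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// verifyBody checks a downloaded object body against the sha256 field
// of the CommitmentEnvelope retrieved from Celestia.
func verifyBody(body []byte, envelopeSHA256 string) bool {
	sum := sha256.Sum256(body)
	return hex.EncodeToString(sum[:]) == envelopeSHA256
}

func main() {
	body := []byte("hello")
	// In practice envelopeSHA256 comes from the on-chain envelope;
	// here we compute it locally just to demonstrate the comparison.
	sum := sha256.Sum256(body)
	fmt.Println(verifyBody(body, hex.EncodeToString(sum[:]))) // true
}
```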

CommitmentEnvelope (submitted to Celestia)

{
  "version": 1,
  "bucket": "my-bucket",
  "key": "path/to/file.txt",
  "content_type": "text/plain",
  "size": 1234,
  "sha256": "e3b0c44298fc1c149afbf4c8996fb924...",
  "etag": "d41d8cd98f00b204e9800998ecf8427e"
}

New files

| File | Purpose |
| --- | --- |
| pkg/s3/types.go | Domain types: Object, Bucket, CommitmentEnvelope, ListObjectsResult |
| pkg/s3/service.go | Business logic: ObjectStore interface, Service, validateBucketName |
| pkg/s3/auth.go | Full SigV4 implementation: parse, validate, canonical request, signing key derivation |
| pkg/s3/server.go | HTTP handler: routing, XML marshaling, S3 error responses |
| pkg/store/object.go | SQLite ObjectStore implementation: upsert, list with prefix/delimiter/pagination |
| pkg/store/migrations/004_s3_objects.sql | s3_buckets + s3_objects tables, composite unique index |

Test plan

  • just test passes with -race
  • TestS3ObjectLifecycle e2e: full bucket+object lifecycle against real chain, verifies CommitmentEnvelope appears indexed on Celestia
  • TestService_PutObject_WithSubmitter verifies submitted blob is a CommitmentEnvelope, not raw data
  • TestObjectStore_ObjectCRUD verifies SHA-256 and ETag round-trip through SQLite
  • Integration tests: SigV4 auth, list pagination, delimiter grouping, empty object rejection, oversized object rejection
  • PUT with no submitter → ErrReadOnly
  • PUT object > 2 MB → 413 Request Entity Too Large
  • PUT empty object → 411 Length Required
  • PUT key > 1024 bytes → 400 Bad Request
  • Bucket name validation (3–63 chars, lowercase alphanum+hyphen, not an IP)
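The bucket-name rules in the last bullet can be expressed as a small validator; this is an illustrative sketch, not the PR's validateBucketName:

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"regexp"
)

// 3-63 chars, lowercase alphanumerics and hyphens, no leading or
// trailing hyphen.
var bucketNameRE = regexp.MustCompile(`^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$`)

var errInvalidBucketName = errors.New("invalid bucket name")

func validateBucketName(name string) error {
	if !bucketNameRE.MatchString(name) {
		return errInvalidBucketName
	}
	// Defensive IP check; dotted IPv4 names already fail the regex
	// since '.' is not in the allowed character class.
	if net.ParseIP(name) != nil {
		return errInvalidBucketName
	}
	return nil
}

func main() {
	fmt.Println(validateBucketName("my-bucket")) // <nil>
	fmt.Println(validateBucketName("-bad"))      // invalid bucket name
	fmt.Println(validateBucketName("ab"))        // invalid bucket name (too short)
}
```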

Summary by CodeRabbit

  • New Features

    • S3-compatible HTTP API (configurable address/region) with full bucket/object lifecycle and SigV4 auth.
    • S3 uploads are submitted for verifiable auditing and require the SQLite-backed storage backend.
    • Runtime config and validation for S3 settings (namespace, credentials, enablement) and optional server startup.
  • Migrations

    • Database migration added to store S3 buckets and objects.
  • Tests

    • Extensive unit, integration, and end-to-end tests plus an e2e-only test recipe.

@pthmas pthmas self-assigned this Apr 28, 2026

coderabbitai Bot commented Apr 28, 2026

No actionable comments were generated in the recent review. 🎉


📥 Commits

Reviewing files that changed from the base of the PR and between 83ecea4 and 9451b9a.

📒 Files selected for processing (3)
  • cmd/apex/main.go
  • cmd/apex/main_test.go
  • justfile
✅ Files skipped from review due to trivial changes (1)
  • cmd/apex/main_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • cmd/apex/main.go

📝 Walkthrough

Adds a new S3-compatible API: config, SigV4 auth, SQLite-backed object store and migration, an S3 service that submits Celestia commitments, HTTP server wiring into apex startup/shutdown, and comprehensive unit/integration/e2e tests.

Changes

S3 API Implementation

| Layer / File(s) | Summary |
| --- | --- |
| Data Models & Limits: pkg/s3/types.go | Adds S3 models: CommitmentEnvelope, Bucket, Object, ListObjectsResult, ObjectInfo and internal limits (maxObjectSize, maxKeyLength). |
| Database Migration: pkg/store/migrations/004_s3_objects.sql, pkg/store/sqlite.go | Adds s3_buckets and s3_objects tables and registers migration v4. |
| Persistence Layer: pkg/store/object.go, pkg/store/object_test.go | Implements the SQLite-backed ObjectStore with bucket/object CRUD, listing (prefix/delimiter), pagination, upsert semantics, and error mappings; tests exercise lifecycle, listing, and delimiter/pagination behavior. |
| Service Layer: pkg/s3/service.go, pkg/s3/service_test.go | Adds Service wrapping ObjectStore and a submit.Submitter: validates bucket names/keys, enforces size/empty/key-length rules, builds the CommitmentEnvelope, submits the blob to Celestia, and persists the object only after successful submission; supports read-only mode when the submitter is nil; tests cover submission/rollback and errors. |
| HTTP Server: pkg/s3/server.go, pkg/s3/server_test.go | Implements the S3-compatible Server with routing for service/buckets/objects, List v1/v2, XML responses, S3 error mapping, body-size enforcement, URL decoding, and extensive tests (including auth and payload-hash mismatch). |
| Authentication: pkg/s3/auth.go | Implements AWS SigV4 validation: parse Authorization, validate credential scope and X-Amz-Date skew, build the canonical request/string-to-sign, derive the signing key, compute/compare the signature, and return structured S3 error codes. |
| Commitment Building: pkg/submit/celestia_blob.go, pkg/submit/submit_test.go | Adds BuildBlob to construct a Square blob and compute a Celestia inclusion commitment using go-square/merkle; tests validate success and reject invalid share versions and the reserved namespace. |
| Command Wiring: cmd/apex/main.go, cmd/apex/main_test.go | Opens the blob submitter earlier, adds setupS3Server and isLoopbackBindAddr, passes the submitter into the API service, conditionally starts the S3 HTTP server, and wires it into the startup/shutdown lifecycle; adds loopback bind address tests. |
| Configuration: config/config.go, config/load.go, config/load_test.go | Adds S3APIConfig to config and defaults; templates the s3: section; env-var overrides for credentials; validateS3API enforces the sqlite backend, credential pairing, namespace validation, and the submission interlock; tests added. |
| Integration Tests: pkg/s3/integration_test.go | Adds HTTP integration tests using an httptest server and a mock submitter covering the full lifecycle, XML structure, errors, empty-object rejection, and delimiter/pagination behavior. |
| End-to-end tests & tooling: e2e/s3_test.go, e2e/submission_test.go, e2e/go.mod, justfile | Adds an e2e test that runs APEX with S3 enabled and exercises the S3 lifecycle via the AWS SDK; extends e2e helpers, promotes AWS SDK modules, and adds an e2e-s3 justfile recipe. |
| Dependency updates: go.mod, e2e/go.mod | Adds/adjusts direct dependencies (e.g., go-square/merkle) and AWS SDK modules for e2e testing. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant S3Server as S3 HTTP Server
    participant Service as S3 Service
    participant ObjectStore as ObjectStore (SQLite)
    participant Submitter as Submitter (Celestia)

    Client->>S3Server: PUT /bucket/key (SigV4, body)
    activate S3Server
    S3Server->>S3Server: Authenticate SigV4, validate headers
    S3Server->>Service: PutObject(bucket,key,data,contentType)
    deactivate S3Server

    activate Service
    Service->>Service: Compute SHA256, MD5(ETag), build CommitmentEnvelope
    Service->>Submitter: Submit(blob)
    activate Submitter
    Submitter-->>Service: success
    deactivate Submitter
    Service->>ObjectStore: PutObject(...) (persist metadata + data)
    activate ObjectStore
    ObjectStore-->>Service: persisted
    deactivate ObjectStore
    Service-->>S3Server: ETag / 200 OK
    deactivate Service
    S3Server-->>Client: 200 OK (ETag)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • evstack/apex#27: touches SQLiteStore and cmd/apex wiring used by this PR.
  • evstack/apex#45: related submitter/commitment submission implementation integrated by this PR.
  • evstack/apex#30: earlier startup/shutdown wiring changes in cmd/apex/main.go that may overlap.

Poem

🐰 I nibble keys and sign with glee,

Buckets sprout where bytes roam free.
I hop from Put to Get with cheer,
Commitments anchored, far and near.
Hooray — a rabbit built S3 here!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 8.11%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (4 passed)
  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title clearly and concisely summarizes the main feature being added: an S3-compatible API backed by blobs (Celestia commitments). It is specific, avoids vague terminology, and directly reflects the primary change across the changeset.
  • Linked Issues check: ✅ Passed. Check skipped; no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped; no linked issues were found for this pull request.




@pthmas pthmas changed the title from "feat(s3): add e2e coverage for blob-backed API" to "feat(s3): add a blob-backed S3-compatible API" Apr 29, 2026
pthmas and others added 16 commits April 29, 2026 10:59
Write operations (CreateBucket, DeleteBucket, PutObject, DeleteObject)
now return ErrReadOnly immediately when no submitter is wired in.
Previously PutObject would silently write to SQLite with no Celestia
anchor (height=0, empty commitments), producing orphaned data.

ErrReadOnly maps to 405 MethodNotAllowed in the HTTP error handler.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- object.go: escape LIKE wildcards (%, _, \) in ListObjects prefix to
  prevent unintended pattern matching on user-supplied keys
- object.go: populate Object.Namespace from o.ns on read instead of
  scanning per-row DB value, eliminating stale-config drift
- server.go: remove dead handleBucket wrapper, route GET bucket
  directly to handleListObjects
- server.go: clamp max-keys to 1000 per S3 spec
- server.go: apply http.MaxBytesReader before reading PUT body so
  oversized requests are rejected at the network layer, not after
  buffering the full payload in memory
- service.go: detect http.MaxBytesError from MaxBytesReader and map
  to ErrObjectTooLarge
- auth.go: validate X-Amz-Date is within ±15 min to prevent replay
  attacks with captured signed requests
- tests: update to use mockSubmitter for write ops, fix stale
  hardcoded SigV4 timestamp, add TestService_ReadOnly coverage

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Empty PUTs skip Celestia submission and store locally with Height=0
and no commitments. This is intentional to preserve S3 tool
compatibility (e.g. folder placeholder keys like "prefix/").

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Critical fixes:
- server.go: parsePath now uses r.URL.RawPath to avoid double-decoding;
  percent-encoded characters in keys (e.g. %2F) are preserved through
  path splitting and decoded per-segment
- server.go: remove query-param priority over HTTP method in bucket
  router; DELETE/PUT/HEAD on a bucket with query params now routes
  correctly instead of falling through to handleListObjects
- object.go: wrap DeleteBucket count check and delete in a single
  transaction to close TOCTOU race where a concurrent write could sneak
  in between the two separate queries

Medium fixes:
- object.go: remove redundant GetBucket call from PutObject; the SQLite
  FK constraint enforces bucket existence and is detected via
  isSQLiteFKConstraint → ErrBucketNotFound
- migrations/005: drop idx_s3_objects_bucket; the composite index on
  (bucket, key) already covers all bucket-only lookups
- service.go: validate bucket names (3-63 chars, lowercase alphanum +
  hyphen, no leading/trailing hyphen, not an IP address) and key length
  (max 1024 bytes); new ErrInvalidBucketName and ErrKeyTooLong errors
  map to 400 in the HTTP layer

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- auth.go: refactor authenticateRequest into focused helpers
  (parseAuthorizationHeader, validateCredentialScope, validateAmzDate,
  payloadHashFromRequest) for readability and testability
- server.go: additional hardening from review
- service.go: minor fix
- integration_test.go: expand S3 integration coverage
- object.go: additional store hardening
- object_test.go: expand ObjectStore test coverage

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Celestia now serves as a verification layer, not storage. PutObject
computes SHA256 of the object, builds a ~200-byte CommitmentEnvelope
(bucket, key, size, sha256, etag), and submits that JSON to Celestia.
Raw object data remains cached in SQLite and served from there.

- Add CommitmentEnvelope struct and SHA256 field to Object
- Update ObjectStore.PutObject interface to carry sha256 param
- Squash migrations 004+005 into single final schema with sha256 column
- Update store and all tests for new interface and envelope assertions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Verify X-Amz-Content-Sha256 header matches actual body on PutObject
  (SigV4 compliance; prevents body substitution after signing)
- Handle content-length as canonical header for SigV4 compatibility
- Check bucket existence before submitting to Celestia to avoid
  orphaned on-chain commitments for non-existent buckets
- Add readRequestBody helper to buffer+restore body for hash check

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- ListObjects: add LIMIT maxKeys+1 to flat (no-delimiter) queries so
  SQLite stops scanning after the page boundary instead of reading
  all remaining rows
- ListObjects: remove dead `key <= marker` guard (SQL already filters
  `key > marker`); the existing `commonPrefix <= marker` guard is
  correct and retained
- Config: move S3 SigV4 credentials out of YAML into env vars only
  (APEX_S3_ACCESS_KEY_ID / APEX_S3_SECRET_ACCESS_KEY), matching the
  APEX_AUTH_TOKEN convention; yaml:"-" prevents yaml decode
- Simplify isLoopbackBindAddr from 33 lines to 15 by removing
  scheme-stripping logic that doesn't apply to bind addresses
- Update e2e tests to pass credentials via env vars on the subprocess
- Update config load test to use t.Setenv for partial-credential check

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
S3 PutObject submits a CommitmentEnvelope JSON to Celestia, not the
raw object bytes. The previous test computed the commitment of raw
data and passed it to waitForIndexedBlob — that commitment would
never exist on-chain so the test would always timeout.

Fix: compute the same CommitmentEnvelope (version, bucket, key,
content_type, size, sha256, etag) that service.go builds, marshal it,
derive its blob commitment, and verify that envelope lands on-chain.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Drop Height, Namespace, Commitments from Object struct and PutObject interface;
  these were write-only and never surfaced in HTTP responses or used downstream
- Pass pre-computed etag through PutObject to avoid double MD5 computation
- Remove ns field from ObjectStore; namespace tracking was write-only
- Remove ensureBucketExists one-liner wrapper; inline GetBucket at call sites
- Remove dead method guards in handleCreateBucket/DeleteBucket/HeadBucket;
  ServeHTTP already routes by method
- Extract xmlContents/xmlCommonPrefix to package level (shared by V1 and V2)
- Move readRequestBody from auth.go to server.go where it is used

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
height, namespace, commitments were added speculatively but are never
written or read by the Go code. Remove from schema while still in PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…a commitment

Empty objects bypassed Celestia submission and were stored locally with no
SHA256 or ETag, diverging from S3 semantics. Reject them with ErrEmptyObject
(400 InvalidRequest) so every stored object has a verifiable on-chain commitment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Importing pkg/s3 pulled in pkg/submit → Cosmos SDK gRPC code, which
registered coin.proto twice alongside the e2e module's own Tastora deps.
Duplicate proto registration panics at init. Fix by duplicating the
CommitmentEnvelope struct locally in the test file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- config/config.go: remove field alignment (gofumpt)
- pkg/s3/types.go: remove trailing spaces in struct tag (gofumpt)
- pkg/s3/auth.go: replace fmt.Sprintf with strconv.FormatInt (perfsprint)
- pkg/s3/server_test.go: hardcode auth creds in test helper (unparam)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@pthmas pthmas marked this pull request as ready for review May 4, 2026 20:48

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
e2e/submission_test.go (1)

299-333: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard proc.logs against concurrent access.

cmd.Stdout/cmd.Stderr write into logs from background goroutines while waitForApexHTTP, waitForS3HTTP, waitForIndexedBlob, doRPC, and Stop read proc.logs.String() before the child exits. A plain bytes.Buffer is not thread-safe for this concurrent access pattern, causing data races flagged by go test -race.

Suggested fix
+type safeBuffer struct {
+	mu sync.Mutex
+	bytes.Buffer
+}
+
+func (b *safeBuffer) Write(p []byte) (int, error) {
+	b.mu.Lock()
+	defer b.mu.Unlock()
+	return b.Buffer.Write(p)
+}
+
+func (b *safeBuffer) String() string {
+	b.mu.Lock()
+	defer b.mu.Unlock()
+	return b.Buffer.String()
+}
+
 type apexProcess struct {
 	cmd     *exec.Cmd
 	done    chan struct{}
 	waitErr error
-	logs    *bytes.Buffer
+	logs    *safeBuffer
 }
 
 func startApexProcess(t *testing.T, binaryPath string, configPath string, envVars ...string) *apexProcess {
 	t.Helper()
@@
-	var logs bytes.Buffer
+	var logs safeBuffer
 	cmd.Stdout = &logs
 	cmd.Stderr = &logs
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e/submission_test.go` around lines 299 - 333, The bytes.Buffer in
apexProcess is written concurrently by cmd.Stdout/Stderr while other helpers
(waitForApexHTTP, waitForS3HTTP, waitForIndexedBlob, doRPC, Stop) read
proc.logs, causing data races; fix by making access to logs thread-safe: add a
sync.Mutex to apexProcess and either (a) create a small lockedWriter type whose
Write method locks the mutex and forwards to the underlying bytes.Buffer and set
cmd.Stdout/cmd.Stderr to that writer in startApexProcess, and expose a readLogs
method on apexProcess that locks the mutex and returns logs.String(), or (b) add
a read/write wrapper methods that lock around reads and writes and ensure all
code (startApexProcess, Stop, waitFor..., doRPC) uses those wrappers instead of
accessing proc.logs directly.
🧹 Nitpick comments (3)
config/load_test.go (1)

355-505: ⚡ Quick win

Consider collapsing these S3 validation cases into one table-driven test.

These cases all follow the same write/load/assert flow with only the YAML and expected error changing. One table would keep the file shorter and make the next S3 validation case cheaper to add.

As per coding guidelines, "Use table-driven tests in Go test files."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@config/load_test.go` around lines 355 - 505, The four near-duplicate tests
(TestLoadRejectsS3EnabledWithNonSQLiteStorage,
TestLoadRejectsS3SubmissionWithoutNamespace, TestLoadRejectsInvalidS3Namespace,
TestLoadRejectsPartialS3Credentials) should be collapsed into a single
table-driven test (e.g., TestLoadRejectsS3Validations) that iterates cases
containing: name, YAML content, optional env setup function (for the
partial-credentials case), and expected error substring; for each case create a
temp config file, apply env setup, call Load, and assert the error contains the
expected substring (use t.Run for subtests and only call t.Parallel() where
safe, avoid parallelizing the case that sets env vars). Reference the existing
Load function and the four test names when locating the code to replace.
pkg/store/object.go (1)

286-291: ⚡ Quick win

Use typed SQLite constraint codes here instead of matching error strings.

These helpers are coupled to the exact modernc.org/sqlite error text. If that wording changes, bucket/object constraint violations will start leaking as generic internal errors instead of ErrBucketAlreadyExists / ErrBucketNotFound. Prefer errors.As on the driver's error type and branch on the constraint code/subcode.

As per coding guidelines, **/*.go: Use modernc.org/sqlite for SQLite database operations (CGo-free implementation).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/store/object.go` around lines 286 - 291, The helpers
isSQLiteUniqueConstraint and isSQLiteFKConstraint currently match error strings;
change them to use errors.As to detect the modernc.org/sqlite error type (the
driver's error struct) and inspect its constraint/extended code (or Code/SubCode
fields) to determine UNIQUE vs FOREIGN KEY violations so that callers can
reliably map to ErrBucketAlreadyExists and ErrBucketNotFound; update imports to
include modernc.org/sqlite and errors, replace string.Contains checks with typed
checks against the sqlite error code constants, and preserve nil-safety and
behavior otherwise.
pkg/s3/server_test.go (1)

34-378: 🏗️ Heavy lift

Collapse the repeated HTTP cases into table-driven tests.

The coverage is solid, but most of this file repeats the same server setup, request construction, and status/body assertions in one-off tests. Converting the CRUD/auth permutations into table-driven cases will make this suite much easier to extend and keeps it aligned with the repo’s test conventions.

As per coding guidelines, **/*_test.go: Use table-driven tests pattern for test cases.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/s3/server_test.go` around lines 34 - 378, Many tests in this file repeat
the same setup, request construction, and assertions; convert them to
table-driven tests to reduce duplication. Create one or more test tables
(structs with fields like name, useAuth(bool), setupActions (e.g., call
setupHTTPTestServer or setupHTTPTestServerWithAuth and run PutBucket/PutObject
via store), method, path, headers, body, callHandler (ServeHTTP vs
handlePutObject), expectedStatus, expectedBodySubstrings) and iterate with
t.Run; reuse helpers setupHTTPTestServer, setupHTTPTestServerWithAuth,
signRequest, and server.handlePutObject to perform setup and execution, and
assert expectedStatus and substring assertions for responses (e.g., check
"<Code>...", "ListBucketResult", ETag, Location, Content-Type" etc.) so each
original test case (TestServer_ListBuckets, TestServer_CreateBucket,
TestServer_CreateBucket_AlreadyExists, TestServer_DeleteBucket,
TestServer_DeleteBucket_NotEmpty, TestServer_PutGetObject,
TestServer_PutObject_EmptyRejected, TestServer_GetObject_NotFound,
TestServer_HeadObject, TestServer_DeleteObject, TestServer_ListObjects,
TestServer_HeadBucket, TestServer_HeadBucket_NotFound, TestServer_ErrorFormat,
TestServer_URLDecoding, TestServer_AuthRejectsMissingAuthorization,
TestServer_AuthAcceptsSignedRequest, TestServer_AuthRejectsWrongSecret,
TestServer_HandlePutObjectRejectsPayloadHashMismatch) becomes an entry in one or
more concise table-driven tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/apex/main.go`:
- Around line 251-262: The code starts httpSrv.ListenAndServe in a goroutine
which delays socket bind errors until after setupS3Server returns; change it to
bind the socket synchronously first by creating a net.Listener for
cfg.S3.ListenAddr, return any bind error immediately, then start
httpSrv.Serve(listener) in the goroutine and log the actual bound address
(listener.Addr().String()); reference httpSrv, cfg.S3.ListenAddr and replace the
current httpSrv.ListenAndServe usage with a net.Listen + httpSrv.Serve flow and
ensure the listener is closed on shutdown.

In `@config/load.go`:
- Around line 416-420: The pair-validation of S3 credentials runs before
checking if S3 is enabled, causing startup to fail when stray
APEX_S3_ACCESS_KEY_ID or APEX_S3_SECRET_ACCESS_KEY are present even if
s3cfg.Enabled is false; update the logic in the load/config initialization so
the check for (s3cfg.AccessKeyID == "") != (s3cfg.SecretAccessKey == "") is
performed only when s3cfg.Enabled is true (i.e., move or guard the
pair-validation after the if !s3cfg.Enabled early return or wrap it with if
s3cfg.Enabled { ... }), keeping the existing error message and using the s3cfg
symbol to locate the code to change.

In `@e2e/go.mod`:
- Line 26: Update the vulnerable gRPC module version in go.mod by changing the
dependency declaration for google.golang.org/grpc from v1.79.1 to v1.79.3 (or
any later patched release) and then run the Go module update commands (e.g., go
get google.golang.org/grpc@v1.79.3 and go mod tidy) to ensure the new version is
pulled and the lockfile is updated.

In `@e2e/s3_test.go`:
- Around line 87-197: The test currently calls the AWS SDK with
context.Background() (e.g., client.ListBuckets, CreateBucket, PutObject,
HeadObject, GetObject, ListObjectsV2, DeleteObject, DeleteBucket) which ignores
the test timeout; update each call to use the test's timeout-aware ctx (or
derive per-call contexts with a short deadline) instead of context.Background()
so each SDK operation is bounded by the test timeout; ensure you pass ctx to
every AWS call and for longer operations derive ctxWithTimeout :=
context.WithTimeout(ctx, X) and defer cancel() where appropriate.

In `@pkg/s3/auth.go`:
- Around line 197-253: The current canonicalQueryString uses r.URL.Query() which
form-decodes '+' to space; change it to parse r.URL.RawQuery directly (call
sites: where canonicalQueryString(r.URL.Query()) is invoked, pass r.URL.RawQuery
instead) and update canonicalQueryString to split the raw query on '&' and each
pair on the first '=' to obtain raw key and raw value (allow empty values),
without using url.Query() or url.Values; keep building the pairs slice, sort by
key then value, and call awsEncode on the raw key and raw value when
constructing parts so '+' remains '%2B' and SigV4 canonicalization is
RFC3986-compliant.

In `@pkg/store/object_test.go`:
- Around line 71-75: The test fixture setup currently ignores errors from
PutBucket and PutObject (calls to PutBucket(ctx, "bucket") and PutObject(ctx,
"bucket", "greeting.txt", ...)), which can mask setup failures; update each
setup to capture the returned error and fail the test immediately (use t.Fatal
or t.Fatalf) when PutBucket or PutObject returns a non-nil error so the test
stops with the real setup error; apply the same fix to the other occurrences
noted (the PutBucket/PutObject calls around lines referenced) so all fixture
writes fail fast instead of discarding errors.

---

Outside diff comments:
In `@e2e/submission_test.go`:
- Around line 299-333: The bytes.Buffer in apexProcess is written concurrently
by cmd.Stdout/Stderr while other helpers (waitForApexHTTP, waitForS3HTTP,
waitForIndexedBlob, doRPC, Stop) read proc.logs, causing data races; fix by
making access to logs thread-safe: add a sync.Mutex to apexProcess and either
(a) create a small lockedWriter type whose Write method locks the mutex and
forwards to the underlying bytes.Buffer and set cmd.Stdout/cmd.Stderr to that
writer in startApexProcess, and expose a readLogs method on apexProcess that
locks the mutex and returns logs.String(), or (b) add a read/write wrapper
methods that lock around reads and writes and ensure all code (startApexProcess,
Stop, waitFor..., doRPC) uses those wrappers instead of accessing proc.logs
directly.

---

Nitpick comments:
In `@config/load_test.go`:
- Around line 355-505: The four near-duplicate tests
(TestLoadRejectsS3EnabledWithNonSQLiteStorage,
TestLoadRejectsS3SubmissionWithoutNamespace, TestLoadRejectsInvalidS3Namespace,
TestLoadRejectsPartialS3Credentials) should be collapsed into a single
table-driven test (e.g., TestLoadRejectsS3Validations) that iterates cases
containing: name, YAML content, optional env setup function (for the
partial-credentials case), and expected error substring; for each case create a
temp config file, apply env setup, call Load, and assert the error contains the
expected substring (use t.Run for subtests and only call t.Parallel() where
safe, avoid parallelizing the case that sets env vars). Reference the existing
Load function and the four test names when locating the code to replace.

In `@pkg/s3/server_test.go`:
- Around line 34-378: Many tests in this file repeat the same setup, request
construction, and assertions; convert them to table-driven tests to reduce
duplication. Create one or more test tables (structs with fields like name,
useAuth(bool), setupActions (e.g., call setupHTTPTestServer or
setupHTTPTestServerWithAuth and run PutBucket/PutObject via store), method,
path, headers, body, callHandler (ServeHTTP vs handlePutObject), expectedStatus,
expectedBodySubstrings) and iterate with t.Run; reuse helpers
setupHTTPTestServer, setupHTTPTestServerWithAuth, signRequest, and
server.handlePutObject to perform setup and execution, and assert expectedStatus
and substring assertions for responses (e.g., check "<Code>...",
"ListBucketResult", ETag, Location, Content-Type, etc.) so each original test
case (TestServer_ListBuckets, TestServer_CreateBucket,
TestServer_CreateBucket_AlreadyExists, TestServer_DeleteBucket,
TestServer_DeleteBucket_NotEmpty, TestServer_PutGetObject,
TestServer_PutObject_EmptyRejected, TestServer_GetObject_NotFound,
TestServer_HeadObject, TestServer_DeleteObject, TestServer_ListObjects,
TestServer_HeadBucket, TestServer_HeadBucket_NotFound, TestServer_ErrorFormat,
TestServer_URLDecoding, TestServer_AuthRejectsMissingAuthorization,
TestServer_AuthAcceptsSignedRequest, TestServer_AuthRejectsWrongSecret,
TestServer_HandlePutObjectRejectsPayloadHashMismatch) becomes an entry in one or
more concise table-driven tests.

In `@pkg/store/object.go`:
- Around line 286-291: The helpers isSQLiteUniqueConstraint and
isSQLiteFKConstraint currently match error strings; change them to use errors.As
to detect the modernc.org/sqlite error type (the driver's error struct) and
inspect its constraint/extended code (or Code/SubCode fields) to determine
UNIQUE vs FOREIGN KEY violations so that callers can reliably map to
ErrBucketAlreadyExists and ErrBucketNotFound; update imports to include
modernc.org/sqlite and errors, replace strings.Contains checks with typed checks
against the sqlite error code constants, and preserve nil-safety and behavior
otherwise.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cdf414fd-3ddb-4736-a4fc-4414b8032b06

📥 Commits

Reviewing files that changed from the base of the PR and between 4bbc118 and 5a3d2e0.

⛔ Files ignored due to path filters (2)
  • e2e/go.sum is excluded by !**/*.sum
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (22)
  • cmd/apex/main.go
  • cmd/apex/main_test.go
  • config/config.go
  • config/load.go
  • config/load_test.go
  • e2e/go.mod
  • e2e/s3_test.go
  • e2e/submission_test.go
  • go.mod
  • pkg/s3/auth.go
  • pkg/s3/integration_test.go
  • pkg/s3/server.go
  • pkg/s3/server_test.go
  • pkg/s3/service.go
  • pkg/s3/service_test.go
  • pkg/s3/types.go
  • pkg/store/migrations/004_s3_objects.sql
  • pkg/store/object.go
  • pkg/store/object_test.go
  • pkg/store/sqlite.go
  • pkg/submit/celestia_blob.go
  • pkg/submit/submit_test.go

Comment thread cmd/apex/main.go
Comment thread config/load.go Outdated
Comment thread e2e/go.mod Outdated
Comment thread e2e/s3_test.go Outdated
Comment thread pkg/s3/auth.go Outdated
Comment thread pkg/s3/service.go
Comment thread pkg/store/object_test.go Outdated
- Pre-bind S3 listener socket so startup failures surface synchronously
- Move S3 credential validation after the !Enabled early return
- Fix SigV4 canonical query string to use RawQuery + PathUnescape,
  avoiding form-decoding of '+' as space (RFC3986 compliance)
- Pass test context to all AWS SDK calls in e2e test
- Fail fast on fixture errors in object store tests
- Bump e2e grpc dependency v1.79.1 → v1.79.3

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/apex/main.go`:
- Around line 373-376: After setupS3Server returns successfully, the code must
close the running s3Srv if subsequent startup steps fail; add a deferred cleanup
that closes s3Srv on error paths (e.g., defer func() { if err != nil { _ =
s3Srv.Close() } }() or use a boolean startedOK flag and set it true only after
startup completes) immediately after calling setupS3Server so that failures in
buildCoordinatorOptions, gRPC listener bind, or in runIndexer will close the S3
server; refer to setupS3Server and the local s3Srv variable, and ensure the
defer runs before any early returns from the function.
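The named-return-plus-defer variant of the fix can be sketched as follows; runIndexer, setupS3Server, and the failure points are stand-ins mirroring the comment, not the actual apex code:

```go
package main

import (
	"errors"
	"fmt"
)

type server struct{ closed bool }

func (s *server) Close() error { s.closed = true; return nil }

// runIndexer demonstrates the cleanup pattern: once the S3 server is
// serving, a named return error plus a deferred closure guarantees it
// is shut down on every subsequent startup failure path, including
// early returns.
func runIndexer(failLater bool) (s3Srv *server, err error) {
	s3Srv = &server{} // stands in for a successful setupS3Server(...)
	defer func() {
		// err is the named return value, so this closure observes the
		// final error of whichever return path below is taken.
		if err != nil && s3Srv != nil {
			_ = s3Srv.Close()
		}
	}()

	if failLater { // e.g. buildCoordinatorOptions or the gRPC bind failing
		return s3Srv, errors.New("grpc listen failed")
	}
	return s3Srv, nil
}

func main() {
	srv, err := runIndexer(true)
	fmt.Println(err != nil, srv.closed) // true true: closed on failure
	srv, err = runIndexer(false)
	fmt.Println(err == nil, srv.closed) // true false: left running on success
}
```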
- Around line 251-253: The listener is created with net.Listen which violates
the noctx rule; change setupS3Server to accept a context parameter (ctx) and
replace net.Listen with a context-aware call using
net.ListenConfig{}.Listen(ctx, "tcp", cfg.S3.ListenAddr), then update the
caller runIndexer to pass its ctx into setupS3Server (mirroring the gRPC
listener pattern used elsewhere) so the original context is threaded through.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6d43f828-e459-45b7-bb56-8de608a7ccdf

📥 Commits

Reviewing files that changed from the base of the PR and between 5a3d2e0 and c3bd226.

⛔ Files ignored due to path filters (1)
  • e2e/go.sum is excluded by !**/*.sum
📒 Files selected for processing (6)
  • cmd/apex/main.go
  • config/load.go
  • e2e/go.mod
  • e2e/s3_test.go
  • pkg/s3/auth.go
  • pkg/store/object_test.go
✅ Files skipped from review due to trivial changes (1)
  • pkg/store/object_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • e2e/s3_test.go
  • config/load.go
  • pkg/s3/auth.go
  • e2e/go.mod

Comment thread cmd/apex/main.go Outdated
Comment thread cmd/apex/main.go Outdated
pthmas added 3 commits May 5, 2026 11:39
Use ListenConfig.Listen for noctx compliance. Close s3Srv if
buildCoordinatorOptions or gRPC listen fails after S3 is already
serving.
runIndexer was at complexity 17 (limit 15). Extract HTTP mux +
gRPC listener startup into startRPCServers so the complexity
count drops to 13.

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
cmd/apex/main.go (1)

219-219: 💤 Low value

Rename the log parameter to avoid shadowing the imported log package.

The log zerolog.Logger parameter shadows github.com/rs/zerolog/log for the entire function body, so log.Info()/log.Warn()/log.Error() inside setupS3Server and the goroutine resolve to the parameter rather than the package. It works (both expose Info/Warn/Error), but it's a readability footgun and inconsistent with the rest of the file which uses the package log directly. Rename to logger (matching the convention used in maybeBackfillSourceOption).

♻️ Proposed rename
-func setupS3Server(ctx context.Context, cfg *config.Config, db store.Store, blobSubmitter submit.Submitter, log zerolog.Logger) (*http.Server, error) {
+func setupS3Server(ctx context.Context, cfg *config.Config, db store.Store, blobSubmitter submit.Submitter, logger zerolog.Logger) (*http.Server, error) {
 	if !cfg.S3.Enabled {
 		return nil, nil
 	}
 
 	if cfg.S3.AccessKeyID == "" && cfg.S3.SecretAccessKey == "" {
-		warnLog := log.Warn().Str("addr", cfg.S3.ListenAddr)
+		warnLog := logger.Warn().Str("addr", cfg.S3.ListenAddr)
 		if isLoopbackBindAddr(cfg.S3.ListenAddr) {
@@
-	s3Srv := apexs3.NewServer(s3Svc, cfg.S3.Region, cfg.S3.AccessKeyID, cfg.S3.SecretAccessKey, log)
+	s3Srv := apexs3.NewServer(s3Svc, cfg.S3.Region, cfg.S3.AccessKeyID, cfg.S3.SecretAccessKey, logger)
@@
 	go func() {
-		log.Info().Str("addr", cfg.S3.ListenAddr).Msg("S3 API server listening")
+		logger.Info().Str("addr", cfg.S3.ListenAddr).Msg("S3 API server listening")
 		if err := httpSrv.Serve(lis); err != nil && !errors.Is(err, http.ErrServerClosed) {
-			log.Error().Err(err).Msg("S3 API server error")
+			logger.Error().Err(err).Msg("S3 API server error")
 		}
 	}()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/apex/main.go` at line 219, The parameter name log in the function
setupS3Server shadows the imported package github.com/rs/zerolog/log; rename the
parameter to logger to match existing conventions (e.g.,
maybeBackfillSourceOption) and update all usages inside setupS3Server (including
any goroutines) to use logger instead of log so package-level log remains
accessible and readability is preserved.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/apex/main.go`:
- Around line 364-378: openBlobSubmitter currently returns a
*submit.DirectSubmitter which can be a nil pointer that becomes a typed-nil when
boxed into submit.Submitter, causing downstream nil checks in setupS3Server and
setupAPIService to fail; fix by ensuring you never pass a typed-nil interface:
either change openBlobSubmitter to return the submit.Submitter interface (so a
disabled case returns a plain nil interface) or, if keeping the pointer return,
normalize before boxing (e.g. create a local var of type submit.Submitter and
only assign the *submit.DirectSubmitter to it when blobSubmitter != nil) and
pass that submit.Submitter to setupS3Server and setupAPIService so their nil
guards (and ErrReadOnly/ErrDisabled flows) work correctly.
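The typed-nil pitfall described above is easy to reproduce in isolation. Submitter, DirectSubmitter, and openBlobSubmitter below are illustrative stand-ins named after the comment, not the real submit package:

```go
package main

import "fmt"

// Submitter stands in for submit.Submitter.
type Submitter interface{ Submit() error }

// DirectSubmitter stands in for submit.DirectSubmitter.
type DirectSubmitter struct{}

func (d *DirectSubmitter) Submit() error { return nil }

// openBlobSubmitter returns a concrete pointer that may be nil when
// submission is disabled.
func openBlobSubmitter(enabled bool) *DirectSubmitter {
	if !enabled {
		return nil
	}
	return &DirectSubmitter{}
}

func main() {
	// Boxing the nil *DirectSubmitter into the interface produces a
	// typed nil: the interface holds (type, nil pointer) and compares
	// non-nil, defeating downstream `if submitter == nil` guards.
	var s Submitter = openBlobSubmitter(false)
	fmt.Println(s == nil) // false

	// Normalizing before boxing keeps the interface a plain nil when
	// the pointer is nil, so the nil guard works as intended.
	var safe Submitter
	if ds := openBlobSubmitter(false); ds != nil {
		safe = ds
	}
	fmt.Println(safe == nil) // true
}
```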

---

Nitpick comments:
In `@cmd/apex/main.go`:
- Line 219: The parameter name log in the function setupS3Server shadows the
imported package github.com/rs/zerolog/log; rename the parameter to logger to
match existing conventions (e.g., maybeBackfillSourceOption) and update all
usages inside setupS3Server (including any goroutines) to use logger instead of
log so package-level log remains accessible and readability is preserved.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e439a55e-c7e0-4e38-bfd1-db828ae05098

📥 Commits

Reviewing files that changed from the base of the PR and between c3bd226 and 83ecea4.

📒 Files selected for processing (1)
  • cmd/apex/main.go

Comment thread cmd/apex/main.go Outdated
@pthmas
Collaborator Author

pthmas commented May 5, 2026

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 5, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.
