Skip to content

Fix: Addresses 5th Feb Code Review#19

Merged
aspiers merged 8 commits intomainfrom
review-fixes
Feb 11, 2026
Merged

Fix: Addresses 5th Feb Code Review#19
aspiers merged 8 commits intomainfrom
review-fixes

Conversation

@daviddao
Copy link
Contributor

@daviddao daviddao commented Feb 5, 2026

Claude Opus 4.5 reviewed and highlighted several critical bugs inside /docs/REVIEW-5feb.md. This PR is addressing all issues in four phases, starting with security flaws.

Security Fixes

- S0 [CRITICAL]: Gate X-User-DID header behind TRUST_PROXY_HEADERS config
  (default: false) to prevent admin auth bypass via header spoofing
  Fixes #6
- S1: Sanitize JSONExtract/JSONExtractPath with regex validation to
  prevent SQL injection in both SQLite and PostgreSQL executors
  Fixes #2
- S2: Make WebSocket CheckOrigin configurable via ALLOWED_ORIGINS;
  defaults to same-origin policy instead of allowing all origins
  Fixes #3
- S3: Replace backfillActive bool with atomic.Bool and use CompareAndSwap
  to prevent race conditions in concurrent backfill triggers
  Fixes #4
- S4: Treat JTI Insert failures as replay attempts (ErrDPoPReplay)
  instead of generic server errors for proper TOCTOU handling
  Fixes #5
- S5: Add requireAdmin() to all admin queries (statistics, settings,
  lexicons, activity, etc.) — only currentSession remains public
  (also part of #6)
- S6: Add upload size limits (10MB ZIP, 500 files, 1MB/file) with
  io.LimitReader to prevent memory exhaustion via ZIP bombs
  Fixes #14
- S7: Disable global WriteTimeout (set to 0) to support long-lived
  WebSocket subscriptions; per-handler deadlines remain in place
  Fixes #16
Address architectural issues from codebase review (docs/REVIEW-Feb5.md Section 2):

- Fix PubSub subscriber IDs: replace rune conversion with strconv.FormatInt
- Replace global PubSub singleton with dependency injection
- Create internal/atproto shared utilities package (ParseTimestamp,
  ExtractCreatedAt, ParseCollections) — eliminates 3+2+2 duplicate functions
- Centralize all 23 os.Getenv calls into config.Config struct
- Deduplicate redactURL/redactPassword (export from config package)
- Deduplicate convertToAny/convertParams (add ConvertParams to Executor)
- Extract shared resolver pagination (resolveRecordConnection method)
- Remove dead code: Query()/scanRows stubs from Executor interface
- Add BeginTx to Executor interface for proper transaction support

Net effect: -319 lines, 26 files changed. All tests pass.
Remaining: main.go:run() decomposition (tracked in beads as e2m.8).
- Upgrade golangci-lint-action to v7 with explicit v2.8.0
- Convert .golangci.yml to v2 format (exclusions presets, formatters section)
- Fix gocritic importShadow/httpNoBody/assignOp/elseif issues
- Fix nilerr false positives with restructured error handling
- Fix errorlint: use errors.Is for wrapped error comparisons
- Add errcheck handling for non-defer Close calls
- Use v2 exclusion presets (comments, std-error-handling)
- Remove continue-on-error workaround from CI workflow
@daviddao
Copy link
Contributor Author

daviddao commented Feb 5, 2026

Lint CI fix

Fixed the golangci-lint GitHub Action failure by upgrading to golangci-lint v2.

Changes

CI workflow (.github/workflows/ci.yml):

  • golangci-lint-action@v4@v7 (required for golangci-lint v2)
  • version: latestversion: v2.8.0 (v1 doesn't support Go 1.25)
  • Removed continue-on-error: true workaround

Lint config (.golangci.yml → v2 format):

  • Added version: "2" and linters.default: none
  • Moved gofmt/goimports to formatters: section (v2 requirement)
  • Renamed linters-settings: → nested linters.settings:
  • Removed gosimple (merged into staticcheck in v2) and typecheck (implicit in v2)
  • Used v2 exclusion presets: comments and std-error-handling
  • Moved issues.exclude-ruleslinters.exclusions.rules (v2 format)

Code fixes (69 → 0 lint issues):

  • errcheck: _ = db.Close() for non-defer cleanup paths
  • errorlint: errors.Is() instead of != for wrapped error comparisons
  • gocritic: Fixed importShadow (renamed parameters: configcfg, reporepoDID, sqlsqlStr, urlresolveURL), httpNoBody (nilhttp.NoBody), assignOp (x = x * 2x *= 2), elseif (else { if }else if)
  • nilerr: Restructured error handling to avoid return nil when err != nil pattern
  • revive: Fixed comment formats for exported symbols

Verified locally: go build ./...go test ./...golangci-lint run0 issues

Decompose the ~625-line run() function in main.go into 10 focused
functions with clear responsibilities:

- initServices(): DB connection, migrations, repository initialization
- setupRouter(): chi router, middleware, health/stats/info endpoints
- setupOAuth(): OAuth 2.0 endpoints and cleanup worker
- setupAdmin(): admin GraphQL handler, backfill callbacks, GraphiQL
- configureBackfillCallbacks(): single-actor and full-network backfill
- setupGraphQL(): lexicon loading, public GraphQL, WebSocket subscriptions
- startWorkers(): background activity cleanup worker
- startJetstream(): real-time consumer + dynamic collection updates
- startBackfill(): conditional startup backfill
- serve(): HTTP server, signal handling, graceful shutdown

Key improvements:
- Services struct consolidates all repositories (created once)
- JetstreamActivityRepository reduced from 6 instantiations to 1
- backgroundServices struct tracks cancellable goroutines for shutdown
- run() is now ~57 lines — a clear orchestration sequence
- Each function has doc comments explaining its purpose

Completes Phase 2 architecture cleanup (10/10 tasks).
@daviddao
Copy link
Contributor Author

daviddao commented Feb 5, 2026

Full Change Summary — Phases 1 & 2

This PR now includes 4 commits addressing security vulnerabilities (Phase 1) and architectural cleanup (Phase 2) from the codebase review.


Phase 1: Security Fixes (0d95edf)

Issue Severity Fix Files
#6 — Admin auth bypass via X-User-DID header spoofing CRITICAL Gate header behind TRUST_PROXY_HEADERS config (default: false); only trust DID from header when explicitly enabled internal/graphql/admin/handler.go, internal/config/config.go
#2 — SQL injection via JSONExtract paths HIGH Regex validation (^[a-zA-Z0-9_.]+$) on JSON path arguments before interpolation internal/database/sqlite/executor.go, internal/database/postgres/executor.go
#3 — WebSocket accepts all origins MEDIUM Configurable ALLOWED_ORIGINS env var; defaults to same-origin policy internal/graphql/subscription/handler.go
#4 — Race condition on backfillActive flag MEDIUM Replace bool with atomic.Bool + CompareAndSwap internal/graphql/admin/resolvers.go
#5 — DPoP JTI replay treated as server error MEDIUM Insert failure → ErrDPoPReplay instead of generic 500 internal/oauth/middleware.go
#6 — Admin queries accessible without auth HIGH Added requireAdmin() to all admin queries (statistics, settings, lexicons, activity, etc.) internal/graphql/admin/resolvers.go, internal/graphql/admin/schema.go
#14 — ZIP bomb memory exhaustion HIGH Upload limits: 10MB ZIP, 500 files, 1MB/file via io.LimitReader internal/graphql/admin/resolvers.go
#16 — WriteTimeout kills WebSocket connections LOW WriteTimeout: 0 globally; per-handler deadlines remain cmd/hypergoat/main.go

Phase 2: Architecture Cleanup (57524f7 + 38f6bc7)

10 refactoring tasks completed (net: -319 lines across 30 files):

1. Fix PubSub subscriber IDs

string(rune(id)) converted integers to Unicode characters (e.g., id=65"A"). Fixed to strconv.FormatInt(id, 10).
internal/graphql/subscription/pubsub.go:60, pubsub_test.go

2. Replace global PubSub singleton with DI

Removed Global() from pubsub.go. PubSub instance now passed as parameter to subscription.NewHandler() and jetstream.NewConsumer(), created once in main.go.
pubsub.go, handler.go, consumer.go

3. Create internal/atproto/ shared utilities package

ParseTimestamp(), ExtractCreatedAt(), ParseCollections() — eliminates 7 duplicate implementations across main.go, backfill.go, resolvers.go, client.go, records.go, builder.go.
internal/atproto/time.go, collections.go, time_test.go, collections_test.go

4. Centralize all os.Getenv calls into config.Config

All 23 os.Getenv calls consolidated into config.Config struct. Added fields: AdminDIDs, DomainDID, LexiconDir, JetstreamURL/Collections/DisableCursor, BackfillOnStart/Collections/RelayURL/PLCURL/MaxPerPDS/MaxRepos, PLCDirectoryURL. Zero env reads outside config package.
internal/config/config.go

5. Deduplicate redactPassword

Exported config.RedactPassword(), removed duplicate server.redactPassword().
internal/config/config.go, internal/server/database.go

6. Add ConvertParams() to Executor interface

Removed stripped-down convertToAny from records.go. Both SQLite and PostgreSQL executors implement ConvertParams().
internal/database/executor.go, sqlite/executor.go, postgres/executor.go

7. Extract shared pagination resolver

resolveRecordConnection() as shared method in schema/builder.go, reducing ~240 lines to ~120.
internal/graphql/schema/builder.go

8. Decompose main.go:run() god function

~625-line function → 10 focused functions (~57 lines for run() orchestrator). services struct consolidates all repos. JetstreamActivityRepository reduced from 6 instantiations to 1. backgroundServices struct tracks cancellable goroutines.
cmd/hypergoat/main.go

9. Remove dead code

Query()/scanRows stubs removed from Executor interface and both implementations.
internal/database/executor.go, sqlite/executor.go, postgres/executor.go

10. Add BeginTx() to Executor interface

Proper transaction support for RecordsRepository.BatchInsert — both SQLite and PostgreSQL implement BeginTx().
internal/database/executor.go, sqlite/executor.go, postgres/executor.go


CI Fix (cd6f9ce)

Upgraded golangci-lint to v2 format (see earlier comment for details).


Verification

All 4 commits pass: go build ./...go test ./...

Remaining work (not in this PR)

  • Phase 3: Test coverage — repositories, OAuth handlers, Jetstream consumer, migrations, integration tests
  • Phase 4: Production readiness — pagination cursor fix, backpressure/rate limiting, health check probes, metrics, structured error responses

Add test coverage to all previously untested packages:

Repository tests (10 files, ~90 methods covered):
- records_test.go: Insert/BatchInsert/Get/Delete/Stats/TimeSeries/Iterate
- config_test.go: Get/Set/AdminDIDs/URLDefaults/InitializeDefaults
- actors_test.go: Upsert/BatchUpsert/GetByDID/GetByHandle/Exists
- lexicons_test.go: Upsert/GetByID/GetAll/Delete/Exists
- jetstream_activity_test.go: Log/UpdateStatus/Buckets/Cleanup
- labels_test.go: Insert/Negation/Takedown/Pagination
- label_definitions_test.go: CRUD + ValidateVisibility/ValidateSeverity
- reports_test.go: Insert/Resolve/Pagination + ValidateReasonType/Status
- oauth_test.go: Clients/AccessTokens/RefreshTokens/AuthCodes/DPoPJTI
- label_preferences_test.go: covered via labels_test.go

Other package tests (5 files):
- jetstream/event_test.go: ParseEvent, IsCommit/Create/Update/Delete, URI
- server/handlers_test.go: DPoPNonce, ClientMetadata, GraphiQL, ConnectDB
- graphql/types/types_test.go: Mapper primitives, ObjectBuilder, caching
- workers/workers_test.go: BackfillState lifecycle, ActivityCleanupWorker
- database/migrations/migrations_test.go: Run/Idempotent/Rollback

Infrastructure:
- internal/testutil/db.go: Shared SetupTestDB helper (in-memory SQLite)

Bug fix discovered via tests:
- labels.go: Fix negation query from timestamp comparison (cts > cts)
  to ID comparison (id > id) — SQLite second-precision timestamps
  caused negations within the same second to be invisible

Previously 6 packages had zero test files. Now only cmd/hypergoat
and graphql/query remain without tests (both are trivially simple).
All tests pass: go build ./... && go test ./...
@daviddao
Copy link
Contributor Author

daviddao commented Feb 5, 2026

Complete PR Summary — Phases 1, 2 & 3

This PR addresses 16 of the 18 issues opened from the codebase review, across 5 commits and 47 files changed (+6,617 / -1,017 lines).


Phase 1: Security Hardening — 0d95edf

8 security vulnerabilities fixed. Addresses issues #2#6, #14, #16.

Issue Severity What was fixed Key files
#2 SQL injection in JSONExtract CRITICAL Regex validation (^[a-zA-Z0-9_.]+$) on JSON path arguments before SQL interpolation in both SQLite and PostgreSQL executors sqlite/executor.go, postgres/executor.go
#6 Admin auth bypass via X-User-DID CRITICAL Gate X-User-DID header behind TRUST_PROXY_HEADERS config (default: false). Added requireAdmin() to all admin queries — only currentSession remains public admin/handler.go, admin/resolvers.go, admin/schema.go, config/config.go
#3 WebSocket accepts all origins HIGH Configurable ALLOWED_ORIGINS env var; defaults to same-origin policy instead of return true subscription/handler.go
#4 backfillActive race condition HIGH Replace bool with atomic.Bool + CompareAndSwap for concurrent safety admin/resolvers.go
#5 DPoP JTI TOCTOU race HIGH JTI insert failure now returns ErrDPoPReplay instead of generic 500 — the UNIQUE constraint is the real protection oauth/middleware.go
#14 ZIP bomb memory exhaustion HIGH Upload limits: 10MB ZIP, 500 files, 1MB/file via io.LimitReader admin/resolvers.go
#16 WriteTimeout kills WebSocket MEDIUM WriteTimeout: 0 globally; per-handler write deadlines remain in place cmd/hypergoat/main.go

Phase 2: Architecture Cleanup — 57524f7 + 38f6bc7 + cd6f9ce

10 architectural issues resolved (net -319 lines). Addresses issues #7#12.

Issue What was fixed Key files
#7 run() god function (700+ lines) Decomposed into 10 focused functions (initServices, setupRouter, setupOAuth, setupAdmin, setupGraphQL, startJetstream, startBackfill, serve, etc.). run() is now ~57 lines. services struct consolidates all repos. JetstreamActivityRepository reduced from 6 instantiations to 1. cmd/hypergoat/main.go
#8 os.Getenv calls outside config All 23 os.Getenv calls centralized into config.Config struct. Added fields for AdminDIDs, DomainDID, LexiconDir, Jetstream, Backfill, PLC config. Zero env reads outside config package. config/config.go, backfill/backfill.go
#9 scanRows/Query() unimplemented Removed dead Query() and scanRows stubs from Executor interface and both implementations database/executor.go, sqlite/executor.go, postgres/executor.go
#10 Massive code duplication Created internal/atproto/ package with ParseTimestamp(), ExtractCreatedAt(), ParseCollections() — eliminates 7 duplicate functions. Exported config.RedactPassword() (removed server duplicate). Added ConvertParams() to Executor (removed records.go duplicate). Extracted shared resolveRecordConnection() method (~240 → ~120 lines). atproto/time.go, atproto/collections.go, schema/builder.go
#11 Global PubSub + broken IDs Removed Global() singleton; PubSub now injected via constructor. Fixed subscriber IDs from string(rune(id))strconv.FormatInt(id, 10) subscription/pubsub.go, subscription/handler.go, jetstream/consumer.go
#12 No transaction support Added BeginTx() to Executor interface. Implemented in both SQLite and PostgreSQL executors. Updated RecordsRepository.BatchInsert to use it. database/executor.go, sqlite/executor.go, postgres/executor.go

CI fix (cd6f9ce): Upgraded golangci-lint to v2 format, fixed 69 lint issues, zero issues remaining.


Phase 3: Test Coverage — 045d8d1

+5,541 lines of tests across 15 new test files + 1 shared test helper. Addresses #17.

Package Test file(s) Tests Methods covered
database/repositories records_test.go 16 Insert, BatchInsert, GetByURI, GetByURIs, GetByCollection, Cursor pagination, Delete, Stats, TimeSeries, CIDs, IterateAll
config_test.go 9 Get/Set, AdminDID CRUD, URL defaults, InitializeDefaults
actors_test.go 7 Upsert, BatchUpsert, GetByDID/Handle, Exists
lexicons_test.go 7 Upsert, GetByID, GetAll, Delete, Exists
jetstream_activity_test.go 8 LogActivity, UpdateStatus, Buckets, Cleanup
labels_test.go 9 Insert, Negation, GetByURIs, Pagination, HasTakedown, IsValidSubjectURI
label_definitions_test.go 7 CRUD, GetNonSystem, ValidateVisibility/Severity
reports_test.go 8 Insert, Get, Pagination, Resolve, ValidateReasonType/Status
oauth_test.go 6 Clients, AccessTokens, RefreshTokens, AuthCodes, DPoPJTI
jetstream event_test.go 8 ParseEvent, IsCommit/Create/Update/Delete, URI
server handlers_test.go 4 DPoPNonce, ClientMetadata, GraphiQL, ConnectDatabase
graphql/types types_test.go 7 MapPrimitiveType, ObjectBuilder, RecordType, caching
workers workers_test.go 7 BackfillState lifecycle, ActivityCleanupWorker start/stop
database/migrations migrations_test.go 3 Run, Idempotent, Rollback
testutil db.go Shared SetupTestDB helper (in-memory SQLite + migrations)

Bug discovered via testing: Label negation queries in labels.go used neg.cts > label.cts (timestamp comparison), but SQLite's datetime('now') has only second-level precision. Labels and negations created within the same second had identical timestamps, making negations invisible. Fixed to use neg.id > label.id (auto-increment, always monotonic).

Coverage improvement:

Metric Before After
Packages with [no test files] 8 2
Packages with zero tests remaining cmd/hypergoat (entry point), graphql/query (simple types)

Issues addressed by this PR

Issue Status
#2 SQL injection in JSONExtract ✅ Fixed in Phase 1
#3 WebSocket origin check ✅ Fixed in Phase 1
#4 backfillActive race ✅ Fixed in Phase 1
#5 DPoP JTI TOCTOU ✅ Fixed in Phase 1
#6 Admin auth bypass ✅ Fixed in Phase 1
#7 God function ✅ Fixed in Phase 2
#8 os.Getenv sprawl ✅ Fixed in Phase 2
#9 Dead scanRows/Query ✅ Fixed in Phase 2
#10 Code duplication ✅ Fixed in Phase 2
#11 Global PubSub + broken IDs ✅ Fixed in Phase 2
#12 No transaction support ✅ Fixed in Phase 2
#14 ZIP bomb OOM risk ✅ Fixed in Phase 1
#16 WriteTimeout kills WS ✅ Fixed in Phase 1
#17 Test coverage gaps ✅ Fixed in Phase 3
#13 Pagination correctness ⏳ Phase 4
#15 Event channel drops ⏳ Phase 4
#18 Code quality issues ⏳ Phase 4

Verification

All 5 commits pass: go build ./...go test ./...golangci-lint run

- types_test.go: add nolint:revive for package name, rename fmt_eq to fmtEq
- oauth_test.go: remove unused int64Ptr helper function
- Fix pagination with deterministic keyset cursors using composite
  (indexed_at, uri) keys instead of 2x over-fetch and re-sort (#13)
- Fix event drops with backpressure-aware channel sends and correct
  cursor advancement after successful processing only (#15)
- Extract CORS into dedicated middleware with configurable allowed
  origins, removing inline CORS from individual handlers (#18)
- Replace magic numbers with named constants throughout codebase (#18)
@daviddao
Copy link
Contributor Author

daviddao commented Feb 5, 2026

Phase 4: Production Readiness — f6b7566

Final phase of the codebase review. Addresses issues #13, #15, #18.

Changes

Fix pagination (#13) — Replaced the 2× over-fetch + in-memory re-sort approach with deterministic keyset pagination using composite (indexed_at, uri) cursors. Added GetByCollectionWithKeysetCursor() repository method. Removed unimplemented last/before connection arguments.

Fix event drops (#15) — Jetstream client now uses blocking channel sends with context cancellation for backpressure instead of silently dropping events. Consumer cursor is advanced only after successful processing, preventing skipped events on error.

CORS middleware (#18) — Extracted inline CORS headers from individual GraphQL handlers into a dedicated CORSMiddleware with configurable ALLOWED_ORIGINS. Applied at the router level.

Named constants (#18) — Replaced magic numbers (100, 900, 1000) with named constants: BatchInsertSize, SQLParamBatchSize, DefaultIterateBatchSize, EventChannelBufferSize, SubscriberBufferSize.

Review Summary — All 4 Phases Complete

Phase Commit Issues Addressed
1. Security 0d95edf #2, #3, #4, #5, #6, #14, #16
2. Architecture 57524f7, 38f6bc7 #7, #8, #9, #10, #11, #12
3. Test Coverage 045d8d1, ecbe312 #17
4. Production Readiness f6b7566 #13, #15, #18

All 17 review issues are now addressed. All tests pass.

@daviddao daviddao changed the title Fix: address 8 security vulnerabilities from codebase review Fix: Addresses 5th Feb Code Review Feb 5, 2026
@vercel
Copy link

vercel bot commented Feb 10, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
hyperindex Ready Ready Preview, Comment Feb 10, 2026 10:18am

Request Review

@aspiers aspiers merged commit 3192e5c into main Feb 11, 2026
5 checks passed
@aspiers aspiers deleted the review-fixes branch February 11, 2026 02:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment