Sketch: admin rate-limits dashboard architecture (CF Access + Tunnel) by ptrlrd · Pull Request #308 · ptrlrd/spire-codex

ptrlrd · 2026-05-20T08:00:15Z

Summary

Scope-locking sketch, no live behavior change. Establishes the architecture for runtime-tunable rate limits so the next PR can be a focused "fill in TODOs" change.

What it looks like end-to-end

You (browser)
  │ OAuth (Google/GitHub via CF Access)
  ▼
Cloudflare edge — terminates TLS, checks Access policy
  │
  ▼
cloudflared (sidecar)   — outbound HTTPS to CF, NO published port
  │
  ▼
admin-dashboard         — tiny static UI
  │ X-Admin-Token header
  ▼
spire-codex-backend /api/admin/rate-limits/*
  │ (5s TTL cache)
  ▼
Mongo `rate_limits` collection — source of truth

Defense in depth

Layer 1 — CF Access OAuth at the edge. Only operators in the access policy can reach admin.spire-codex.com at all.
Layer 2 — X-Admin-Token in the backend. Survives single-layer misconfiguration of CF Access.
Layer 3 — No published port. admin-dashboard has zero port mappings; only path in is cloudflared's outbound connection to CF. Nothing for the public internet to scan.
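The Layer 2 check can be sketched as below. This is a minimal illustration, not the code in `admin_rate_limits.py`: the function name `token_matches` is hypothetical, and the fail-closed behaviour for an unset token is an assumption about how the backend should treat a missing `ADMIN_TOKEN`.

```python
import hmac


def token_matches(presented, expected):
    """Return True only if the presented X-Admin-Token equals the expected one.

    `expected` stands in for the configured ADMIN_TOKEN (assumption: the
    caller loads it from the environment or a secrets store).
    """
    if not expected or not presented:
        # Fail closed: an unset or empty token must never grant access.
        return False
    # compare_digest runs in time independent of where the strings differ.
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

The constant-time compare matters less behind CF Access than it would on a public endpoint, but it costs nothing and keeps Layer 2 sound if Layer 1 is ever misconfigured.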

What this PR contains

| File | Status |
| --- | --- |
| `app/services/rate_limits_store.py` | Skeleton with 5s TTL cache, TODOs for Mongo I/O |
| `app/routers/admin_rate_limits.py` | GET/PUT/DELETE endpoints that check the token, then return 501 |
| `docker-compose.admin.yml` | Two-container deploy: admin UI + cloudflared sidecar |
| `playbooks/admin-install.yml` | Runbook header documenting one-time CF dashboard setup |

Per-request cost

get_limit(slug, default) is the hot path, called from every @limiter.limit(...) evaluation. With the TTL cache, each worker makes 1 Mongo hit per 5s, i.e. ~12 reads/min per worker; the fleet total is 12 × the number of workers, independent of request volume. Per-request overhead is a dict lookup and a monotonic-time compare.
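A minimal sketch of that hot path, assuming a single-threaded worker and a hypothetical `loader` callable that stands in for the still-unimplemented Mongo read. The class name `RateLimitCache` and the injectable `clock` are illustrative and are not taken from `rate_limits_store.py`.

```python
import time

CACHE_TTL_SECONDS = 5.0  # the 5s TTL from the PR description


class RateLimitCache:
    """In-process TTL cache in front of a slower override store."""

    def __init__(self, loader, ttl=CACHE_TTL_SECONDS, clock=time.monotonic):
        self._loader = loader      # hypothetical: returns {slug: limit_string}
        self._ttl = ttl
        self._clock = clock        # injectable so the expiry logic is testable
        self._overrides = {}
        self._expires_at = None    # None forces a load on first use

    def get_limit(self, slug, default):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Slow path: at most once per TTL window per worker.
            self._overrides = dict(self._loader())
            self._expires_at = now + self._ttl
        # Fast path: one dict lookup, falling back to the hardcoded default.
        return self._overrides.get(slug, default)
```

Because the whole override map is reloaded in one call, an admin write becomes visible to a worker no later than one TTL window after it lands in the store.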

Follow-up PR scope (separate)

- Implement _refresh_cache + Mongo collection
- Implement set_override / clear_override + validation via slowapi's limit-string parser
- Populate REGISTRY with the actual slugs in use today
- Flip one decorator (submit_run) from a hardcoded string to get_limit("submit_run", "3000/hour") as the smoke test
- Build the static admin UI (one HTML file is probably plenty for v1)
- Push CF Tunnel + Access setup steps as a one-time runbook task
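The validation step in set_override could look roughly like the sketch below. It uses a simplified regex as a stand-in for slowapi's limit-string parser, which the follow-up PR is expected to call instead; the function name `validate_limit_string` and the accepted grammar (a count, `/` or `per`, then a single unit) are assumptions made for illustration only.

```python
import re

# Simplified stand-in grammar: "<count>/<unit>" or "<count> per <unit>".
_LIMIT_RE = re.compile(
    r"^\s*(\d+)\s*(?:/|per)\s*(second|minute|hour|day|month|year)s?\s*$",
    re.IGNORECASE,
)


def validate_limit_string(value):
    """Return a normalised "<count>/<unit>" string or raise ValueError."""
    match = _LIMIT_RE.match(value)
    if match is None:
        raise ValueError(f"not a recognised limit string: {value!r}")
    count = int(match.group(1))
    if count <= 0:
        # A zero limit would lock every caller out of the endpoint.
        raise ValueError("limit count must be positive")
    return f"{count}/{match.group(2).lower()}"
```

Rejecting bad strings at write time keeps a malformed override from ever reaching the hot path, where a parse failure would surface on every request.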

Not deploying anything from this PR

main.py doesn't import the new router. The compose file isn't referenced by any playbook. The endpoints don't exist on prod after this merges. Pure architectural commit.

The 600/hour ceiling was sized for "a few hundred backlog runs on first install" but silently caps users with larger histories — a Discord report of a 1000+ run backlog would have dropped 400 at the old cap. Backend work per submission is ~10ms (Mongo insert + JSON file + metrics) and duplicate-hash dedup short-circuits in ~3ms, so actual write load is bounded by *distinct* runs per uploader rather than raw submission count. 3000/hr leaves headroom for legitimate multi-thousand backlog uploads while still cutting off scraper abuse. Easy revert: bump back down if `spire_codex_api_errors_total{path="/api/runs",status_code="429"}` starts climbing against this endpoint specifically.
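The sizing argument reduces to two small calculations, sketched here under the assumption that one uploader submits an entire backlog inside a single rate-limit window. The helper names are hypothetical; the 10 ms figure is the per-submission cost quoted above.

```python
def rejected_in_first_window(backlog, cap_per_hour):
    """Submissions that hit a 429 if the whole backlog arrives in one window."""
    return max(0, backlog - cap_per_hour)


def worst_case_backend_seconds(cap_per_hour, ms_per_submission=10):
    """Backend time one uploader can consume per hour at the cap."""
    return cap_per_hour * ms_per_submission / 1000


print(rejected_in_first_window(1000, 600))    # old cap: 400 rejected
print(rejected_in_first_window(1000, 3000))   # new cap: 0 rejected
print(worst_case_backend_seconds(3000))       # 30.0 seconds of work per hour
```

Even an uploader who saturates the new cap costs about 30 seconds of backend work per hour, and less when duplicate-hash dedup short-circuits.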

Scope-locking commit, not behavior-changing. Establishes the architecture for a runtime-tunable rate-limit system without implementing the bodies, so the next PR can be a focused "fill in TODOs" rather than a 1000-line design+impl combo.

Pieces:
- `app/services/rate_limits_store.py`: Mongo-backed config with a 5s in-process TTL cache. Per-request lookup stays cheap; admin writes take effect within one cache tick. Mongo I/O stubbed out with explanatory TODOs.
- `app/routers/admin_rate_limits.py`: CRUD endpoints under `/api/admin/rate-limits`. Gated by `X-Admin-Token` header (defense in depth; CF Access is the outer layer). Endpoint bodies are 501s with TODO markers.
- `docker-compose.admin.yml`: sketches a two-container deployment, a static admin UI behind cloudflared. NO published ports on the host; the only ingress is the Tunnel's outbound connection back to CF.
- `playbooks/admin-install.yml`: documents the one-time CF dashboard config (create Tunnel, add Access policy, generate ADMIN_TOKEN) as a runbook in the header. Tasks are placeholders until the image + tunnel actually exist.

Defense-in-depth: CF Access OAuth at the edge + X-Admin-Token at the backend. Either alone is enough to block public traffic; together they survive a single-layer misconfiguration.

No live endpoint behavior changes in this PR. Importing the new router into main.py + flipping any decorator from a hardcoded string to `get_limit(slug, default=...)` is the next PR.

Promotes the `X-Admin-Token` check from `admin_rate_limits.py` to `dependencies.py::require_admin` since every admin router needs it. Drops the inline `require_admin_token` calls from the rate-limits router in favor of `dependencies=[Depends(require_admin)]` at router-construction time: a single declaration that gates every endpoint on the router automatically.

New router skeletons (all 501 stubs, shape-locking only):
- `admin_moderation.py`: soft-delete runs / guides / usernames. Hide pattern (set hidden_at field) instead of hard delete so undo is one update and audit references stay valid.
- `admin_ops.py`: feature flags + CF cache purge + manual data refresh + maintenance banner. Wraps existing service-layer functions (`refresh_stats_summary`, news parser, etc.) plus the existing `playbooks/purge-cache.yml` CF API call.
- `admin_observability.py`: recent errors / rate-limit hits / search. Notes the writer-side work needed: a ring-buffer in RequestLoggingMiddleware for per-request error detail, and a small TTL'd Mongo collection for individual rate-limit events.
- `admin_bulk.py`: long-running ops with a job-id pattern (rehash, dedupe, reattach files, import beta version, recompute scores). Each kicks off a background thread, status pollable via GET /jobs/{id}.
- `admin_audit.py`: append-only read view over the audit_log collection. No delete endpoint, by design.
- `admin_api_keys.py`: issue / list / rotate / revoke. Single-show semantics on plaintext (creation + rotation return it once, never persisted in plaintext). Prefix `sk_codex_` so leaks are scannable via GitHub secret-scanning + grep across logs.

New services (skeletons):
- `services/audit_log.py`: `record()` and `list_recent()`. Best-effort writes (Mongo down ≠ block the admin action).
- `services/api_keys_store.py`: sha256-hashed at rest (high-entropy keys don't need bcrypt), 30s in-process cache for the hot lookup path, soft-revoke, rotate.

Nothing wired into `main.py` yet: bodies all return 501, no router is imported, no compose change. Same scope-lock pattern as the rate-limits skeleton already in this PR: lock the shape, follow-up PRs fill bodies one surface at a time.
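The API-key semantics described for `admin_api_keys.py` and `services/api_keys_store.py` (prefixed plaintext shown once, sha256 digest at rest) can be sketched as follows. Function names and the 32-byte key length are assumptions for illustration; the skeleton in this PR may settle on different ones.

```python
import hashlib
import hmac
import secrets

KEY_PREFIX = "sk_codex_"  # prefix from the PR, chosen so leaks are greppable


def hash_api_key(plaintext):
    """sha256 hex digest; a fast hash is adequate for high-entropy random keys."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def generate_api_key():
    """Return (plaintext, digest). Only the digest should ever be persisted."""
    plaintext = KEY_PREFIX + secrets.token_urlsafe(32)  # 32 random bytes (assumed size)
    return plaintext, hash_api_key(plaintext)


def verify_api_key(presented, stored_digest):
    """Check a presented key against the stored digest in constant time."""
    if not presented.startswith(KEY_PREFIX):
        return False
    return hmac.compare_digest(hash_api_key(presented), stored_digest)
```

Rotation under this scheme is a second call to `generate_api_key()` plus a swap of the stored digest, which is why the plaintext can be returned exactly once on both creation and rotation.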

Sibling to admin_moderation: that surface sets `hidden_at`, this one exposes the population of hidden runs publicly. Mirrored Mongo match clauses on the same field (`{hidden_at: None}` vs `{hidden_at: {$ne: None}}`).

- `backend/app/routers/hall_of_shame.py`: public GET endpoint returning the standard leaderboard response shape + `hidden_at` and `hidden_reason` per entry. Same filter params as the regular leaderboard (category, players, game_mode, character) so users can ask "fastest hidden run on Multi / Custom" etc.
- `frontend/app/leaderboards/hall-of-shame/page.tsx`: public route at /leaderboards/hall-of-shame. Sketch only; the real table lands with the moderation pipeline. Includes the editorial policy inline so visitors see the curation rules without a separate FAQ.

Editorial guard rails baked into the docstrings:
- Strictly admin-curated; no auto-flagging onto this page
- `hidden_reason` is required + visible in every row, so vague judgment calls ("looks suspicious") are caught at moderation time, not after they're public
- `robots: noindex` on the page so flagged usernames don't get indexed by search engines

Pipe is end-to-end stubbed: moderation route sets the field, hall-of-shame route reads it, frontend renders it. Bodies land in the moderation follow-up PR.
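The mirrored match clauses can be sketched as two small filter builders. The function names are hypothetical; the filter keys are the four parameters named above, and the example values in the usage note are placeholders.

```python
ALLOWED_FILTERS = ("category", "players", "game_mode", "character")


def _base_match(params):
    """Keep only the known leaderboard filters that were actually supplied."""
    return {k: params[k] for k in ALLOWED_FILTERS if params.get(k) is not None}


def leaderboard_match(params):
    """Regular leaderboard: visible runs only."""
    match = _base_match(params)
    # In MongoDB, equality with null also matches documents lacking the field,
    # so runs written before hidden_at existed stay visible.
    match["hidden_at"] = None
    return match


def hall_of_shame_match(params):
    """Hall of shame: exactly the runs the regular leaderboard excludes."""
    match = _base_match(params)
    match["hidden_at"] = {"$ne": None}
    return match
```

For example, `hall_of_shame_match({"game_mode": "custom"})` yields `{"game_mode": "custom", "hidden_at": {"$ne": None}}`. Building both clauses from the same base keeps the two populations complementary for any filter combination.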

…, query console

Five more skeletons rounding out the "every routine operator question has a one-click answer" surface:
- `services/umami_client.py` + `routers/admin_umami.py`: pulls Umami's REST API into the admin dashboard. Backend holds the Umami admin creds so the dashboard never sees them; we proxy active/summary/top-pages/referrers/countries/browsers with a 5-60s TTL cache per endpoint. Single pane of glass instead of bouncing between two UIs for the 80%-case glance.
- `routers/admin_integrations.py`: outbound integration health + one-click test fire for Discord webhooks, Resend, Sentry, GitHub App, Cloudflare API, IndexNow. Catches credential rotation breaks before users do; currently the half-life on "token expired, no one noticed" is weeks.
- `routers/admin_schedules.py`: unified view of GH Actions cron workflows (news-refresh, runs-db-backup) + backend in-process daemons (stats_summary refresher, run-entity-stats warmer). Shows last-run-at + result for each. Makes silent cron failure audible.
- `routers/admin_query.py`: locked-down Mongo read-only query console. Whitelists collections + ops (find / find_one / count_documents / aggregate / distinct), forbids dangerous pipeline stages ($out, $merge, $where, $function), 100-doc result cap, 5s maxTimeMS, every query logged to admin_audit. Lets operators answer one-offs without shelling into mongosh.

Refined the `admin_ops.py` banner docstring to explicitly cover announcements (patch news with level=info), maintenance warnings (level=warn), AND incidents (level=error) on the same endpoint / data model. `expires_at` enforces a self-vacating window so an announcement doesn't outlive its relevance if the operator forgets to clear it.

Same scope-lock pattern: every body is 501, no router imported into main.py, no live behavior change. Follow-up PRs fill in one surface at a time.
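The guard rails on the query console might be enforced along the lines of the sketch below. It is an illustration only: `validate_query` is a hypothetical name, the collection whitelist is an assumed example built from two collections mentioned in this PR, and the real `admin_query.py` body is still a 501 stub.

```python
ALLOWED_OPS = {"find", "find_one", "count_documents", "aggregate", "distinct"}
FORBIDDEN_OPERATORS = {"$out", "$merge", "$where", "$function"}
ALLOWED_COLLECTIONS = {"rate_limits", "audit_log"}  # assumed example whitelist
MAX_DOCS = 100
MAX_TIME_MS = 5000


def find_forbidden(node):
    """Return the first forbidden operator found anywhere in the payload, else None."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key in FORBIDDEN_OPERATORS:
                return key
            hit = find_forbidden(value)
            if hit:
                return hit
    elif isinstance(node, (list, tuple)):
        for item in node:
            hit = find_forbidden(item)
            if hit:
                return hit
    return None


def validate_query(collection, op, payload):
    """Raise PermissionError on any violation; otherwise return the caps to apply."""
    if collection not in ALLOWED_COLLECTIONS:
        raise PermissionError(f"collection not whitelisted: {collection}")
    if op not in ALLOWED_OPS:
        raise PermissionError(f"operation not allowed: {op}")
    hit = find_forbidden(payload)
    if hit:
        raise PermissionError(f"forbidden operator: {hit}")
    return {"limit": MAX_DOCS, "max_time_ms": MAX_TIME_MS}
```

The scan is recursive because `$where` and `$function` can sit inside a nested `$match` or `$expr` rather than at the top level of a pipeline, so checking only stage names would miss them.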

ptrlrd added 4 commits May 20, 2026 00:53

ptrlrd mentioned this pull request May 20, 2026

Throttle every endpoint: SlowAPIMiddleware + tight cap on /api/exports #309

Merged

ptrlrd wants to merge 5 commits into main from feat/admin-rate-limits-sketch
