Sketch: admin rate-limits dashboard architecture (CF Access + Tunnel) #308
Draft
ptrlrd wants to merge 5 commits into
The 600/hour ceiling was sized for "a few hundred backlog runs on
first install" but silently caps users with larger histories — a
Discord report of a 1000+ run backlog would have dropped 400 at the
old cap. Backend work per submission is ~10ms (Mongo insert + JSON
file + metrics) and duplicate-hash dedup short-circuits in ~3ms, so
actual write load is bounded by *distinct* runs per uploader rather
than raw submission count. 3000/hr leaves headroom for legitimate
multi-thousand backlog uploads while still cutting off scraper
abuse.
Easy revert: bump back down if
`spire_codex_api_errors_total{path="/api/runs",status_code="429"}`
starts climbing against this endpoint specifically.
Scope-locking commit, not behavior-changing. Establishes the architecture for a runtime-tunable rate-limit system without implementing the bodies, so the next PR can be a focused "fill in TODOs" rather than a 1000-line design+impl combo.
Pieces:
- `app/services/rate_limits_store.py`: Mongo-backed config with a 5s in-process TTL cache. Per-request lookup stays cheap; admin writes take effect within one cache tick. Mongo I/O stubbed out with explanatory TODOs.
- `app/routers/admin_rate_limits.py`: CRUD endpoints under `/api/admin/rate-limits`. Gated by `X-Admin-Token` header (defense in depth; CF Access is the outer layer). Endpoint bodies are 501s with TODO markers.
- `docker-compose.admin.yml`: sketches a two-container deployment: a static admin UI behind cloudflared. NO published ports on the host; the only ingress is the Tunnel's outbound connection back to CF.
- `playbooks/admin-install.yml`: documents the one-time CF dashboard config (create Tunnel, add Access policy, generate ADMIN_TOKEN) as a runbook in the header. Tasks are placeholders until the image + tunnel actually exist.
Defense-in-depth: CF Access OAuth at the edge + X-Admin-Token at the backend. Either alone is enough to block public traffic; together they survive a single-layer misconfiguration.
No live endpoint behavior changes in this PR. Importing the new router into `main.py` + flipping any decorator from a hardcoded string to `get_limit(slug, default=...)` is the next PR.
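To make the 5s TTL cache concrete, here is a minimal sketch of how `get_limit` could sit in front of a slower override source. It is not the PR's implementation (the bodies there are still TODOs): the `RateLimitStore` class, the `loader` callable standing in for the Mongo read, the injectable `clock`, and the choice to keep serving the stale snapshot when the loader fails are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Optional

CACHE_TTL_SECONDS = 5.0  # mirrors the 5s TTL described in the commit message


class RateLimitStore:
    """In-process TTL cache in front of a slower override source (e.g. Mongo)."""

    def __init__(
        self,
        loader: Callable[[], Dict[str, str]],  # hypothetical: returns {slug: "N/period"}
        ttl: float = CACHE_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl
        self._clock = clock
        self._cache: Dict[str, str] = {}
        self._expires_at: Optional[float] = None  # None means "never loaded"

    def get_limit(self, slug: str, default: str) -> str:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            try:
                self._cache = self._loader()
            except Exception:
                # Assumption: if the backing store is down, keep the stale
                # snapshot rather than failing the request.
                pass
            # Push the deadline out even on failure so a broken store is
            # retried once per TTL, not once per request.
            self._expires_at = now + self._ttl
        # Hot path: one dict lookup plus the monotonic-time compare above.
        return self._cache.get(slug, default)
```

A decorator would then read `store.get_limit("submit_run", "3000/hour")`, falling back to the hardcoded default whenever no override exists. The sketch takes no lock, so two threads crossing the deadline together may both reload; that costs one extra read rather than correctness.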
Promotes the `X-Admin-Token` check from `admin_rate_limits.py` to
`dependencies.py::require_admin` since every admin router needs it.
Drops the inline `require_admin_token` calls from the rate-limits
router in favor of `dependencies=[Depends(require_admin)]` at
router-construction time — single declaration, gates every endpoint
on the router automatically.
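A framework-free sketch of the check that `require_admin` performs might look like the following. The `AdminAuthError` exception, the `headers` mapping argument, and reading the secret from an `ADMIN_TOKEN` environment variable are assumptions; in the real FastAPI dependency the header would presumably arrive through `Header(...)` and a failure would be raised as an `HTTPException`.

```python
import hmac
import os
from typing import Mapping, Optional


class AdminAuthError(Exception):
    """Raised when the admin token is missing or wrong (would map to 401/403)."""


def require_admin(headers: Mapping[str, str], expected: Optional[str] = None) -> None:
    # Assumption: the server-side secret lives in the ADMIN_TOKEN env var.
    if expected is None:
        expected = os.environ.get("ADMIN_TOKEN", "")
    supplied = headers.get("X-Admin-Token", "")
    # Fail closed: an unset server token must never match an empty header.
    if not expected:
        raise AdminAuthError("admin token is not configured")
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8")):
        raise AdminAuthError("invalid or missing X-Admin-Token")
```

Declaring the dependency once at router construction, as the commit does with `dependencies=[Depends(require_admin)]`, means a newly added endpoint cannot forget the check.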
New router skeletons (all 501 stubs, shape-locking only):
- `admin_moderation.py` — soft-delete runs / guides / usernames.
Hide pattern (set hidden_at field) instead of hard delete so undo
is one update and audit references stay valid.
- `admin_ops.py` — feature flags + CF cache purge + manual
data refresh + maintenance banner. Wraps existing service-layer
functions (`refresh_stats_summary`, news parser, etc.) plus the
existing `playbooks/purge-cache.yml` CF API call.
- `admin_observability.py` — recent errors / rate-limit hits / search.
Notes the writer-side work needed: a ring-buffer in
RequestLoggingMiddleware for per-request error detail, and a small
TTL'd Mongo collection for individual rate-limit events.
- `admin_bulk.py` — long-running ops with a job-id pattern
(rehash, dedupe, reattach files, import beta version, recompute
scores). Each kicks off a background thread, status pollable via
GET /jobs/{id}.
- `admin_audit.py` — append-only read view over the audit_log
collection. No delete endpoint, by design.
- `admin_api_keys.py` — issue / list / rotate / revoke. Single-show
semantics on plaintext (creation + rotation return it once,
never persisted in plaintext). Prefix `sk_codex_` so leaks are
scannable via GitHub secret-scanning + grep across logs.
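The job-id pattern named for `admin_bulk.py` (kick off a background thread, poll status by id) could be sketched roughly as below. The in-memory `_jobs` registry, the status strings, and the `start_job` / `get_job` names are hypothetical; the PR itself only ships 501 stubs.

```python
import threading
import uuid
from typing import Any, Callable, Dict

# Hypothetical in-memory registry: job_id -> status record.
_jobs: Dict[str, Dict[str, Any]] = {}
_jobs_lock = threading.Lock()


def start_job(work: Callable[[], Any]) -> str:
    """Run `work` on a background thread and return an id to poll."""
    job_id = uuid.uuid4().hex
    with _jobs_lock:
        _jobs[job_id] = {"status": "running", "result": None, "error": None}

    def _run() -> None:
        try:
            update = {"status": "done", "result": work()}
        except Exception as exc:
            # Record the failure instead of letting the thread die silently.
            update = {"status": "failed", "error": repr(exc)}
        with _jobs_lock:
            _jobs[job_id].update(update)

    threading.Thread(target=_run, daemon=True, name=f"job-{job_id}").start()
    return job_id


def get_job(job_id: str) -> Dict[str, Any]:
    """What a GET /jobs/{id} handler would return; KeyError if unknown."""
    with _jobs_lock:
        return dict(_jobs[job_id])
```

A registry like this is per-process, so with several workers a poll could land on a process that never saw the job; persisting job records somewhere shared would be one way around that.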
New services (skeletons):
- `services/audit_log.py` — `record()` and `list_recent()`. Best-effort
writes (Mongo down ≠ block the admin action).
- `services/api_keys_store.py` — sha256-hashed at rest (high-entropy
keys don't need bcrypt), 30s in-process cache for the hot lookup
path, soft-revoke, rotate.
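The key-handling rules above (the `sk_codex_` prefix, plaintext shown once, sha256 at rest) can be illustrated with a short standard-library sketch. The function names and the 32-byte token length are assumptions, not the contents of `api_keys_store.py`.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

KEY_PREFIX = "sk_codex_"  # fixed prefix so leaked keys are easy to scan for


def hash_key(plaintext: str) -> str:
    # A plain sha256 is reasonable here because the key is high-entropy
    # random data, not a human-chosen password.
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def issue_key() -> Tuple[str, str]:
    """Return (plaintext, digest). Only the digest should be persisted."""
    plaintext = KEY_PREFIX + secrets.token_urlsafe(32)  # assumed length
    return plaintext, hash_key(plaintext)


def verify_key(candidate: str, stored_digest: str) -> bool:
    if not candidate.startswith(KEY_PREFIX):
        return False
    # Constant-time compare of the two hex digests.
    return hmac.compare_digest(hash_key(candidate), stored_digest)
```

Under this shape, rotation is just `issue_key()` again with the old digest marked revoked, which keeps the single-show property for both creation and rotation.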
Nothing wired into `main.py` yet — bodies all return 501, no router
is imported, no compose change. Same scope-lock pattern as the
rate-limits skeleton already in this PR: lock the shape, follow-up
PRs fill bodies one surface at a time.
Sibling to admin_moderation: that surface sets `hidden_at`, this
one exposes the population of hidden runs publicly. Mirrored Mongo
match clauses on the same field (`{hidden_at: None}` vs
`{hidden_at: {$ne: None}}`).
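The mirrored match clauses can be written down as two constants merged into whatever filters the request carries. This is only a sketch: `build_match` and `is_hidden` are hypothetical helper names. In MongoDB, `{"hidden_at": None}` matches documents where the field is null or missing, so runs that were never moderated stay on the regular leaderboard without a backfill.

```python
from typing import Any, Dict

VISIBLE_CLAUSE: Dict[str, Any] = {"hidden_at": None}           # regular leaderboard
HIDDEN_CLAUSE: Dict[str, Any] = {"hidden_at": {"$ne": None}}   # hall of shame


def build_match(base_filters: Dict[str, Any], *, hidden: bool) -> Dict[str, Any]:
    """Merge the caller's filters with exactly one of the two clauses."""
    clause = HIDDEN_CLAUSE if hidden else VISIBLE_CLAUSE
    return {**base_filters, **clause}


def is_hidden(run: Dict[str, Any]) -> bool:
    """Pure-Python equivalent of HIDDEN_CLAUSE for one document."""
    return run.get("hidden_at") is not None
```

Because both endpoints share `build_match`, every run matches exactly one of the two views, and undoing a hide is a single update that sets `hidden_at` back to null.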
- `backend/app/routers/hall_of_shame.py` — public GET endpoint
returning the standard leaderboard response shape + `hidden_at`
and `hidden_reason` per entry. Same filter params as the regular
leaderboard (category, players, game_mode, character) so users
can ask "fastest hidden run on Multi / Custom" etc.
- `frontend/app/leaderboards/hall-of-shame/page.tsx` — public route
at /leaderboards/hall-of-shame. Sketch only — real table lands
with the moderation pipeline. Includes the editorial policy
inline so visitors see the curation rules without a separate FAQ.
Editorial guard rails baked into the docstrings:
- Strictly admin-curated; no auto-flagging onto this page
- `hidden_reason` is required + visible in every row, so vague
judgment calls ("looks suspicious") are caught at moderation
time, not after they're public
- `robots: noindex` on the page so flagged usernames don't get
indexed by search engines
Pipe is end-to-end stubbed: moderation route sets the field,
hall-of-shame route reads it, frontend renders it. Bodies land in
the moderation follow-up PR.
…, query console
Five more skeletons rounding out the "every routine operator question has a one-click answer" surface:
- `services/umami_client.py` + `routers/admin_umami.py`: pulls Umami's REST API into the admin dashboard. Backend holds the Umami admin creds so the dashboard never sees them; we proxy active/summary/top-pages/referrers/countries/browsers with a 5-60s TTL cache per endpoint. Single pane of glass instead of bouncing between two UIs for the 80%-case glance.
- `routers/admin_integrations.py`: outbound integration health + one-click test fire for Discord webhooks, Resend, Sentry, GitHub App, Cloudflare API, IndexNow. Catches credential rotation breaks before users do; currently the half-life on "token expired, no one noticed" is weeks.
- `routers/admin_schedules.py`: unified view of GH Actions cron workflows (news-refresh, runs-db-backup) + backend in-process daemons (stats_summary refresher, run-entity-stats warmer). Shows last-run-at + result for each. Makes silent cron failure audible.
- `routers/admin_query.py`: locked-down Mongo read-only query console. Whitelists collections + ops (find / find_one / count_documents / aggregate / distinct), forbids dangerous pipeline stages ($out, $merge, $where, $function), 100-doc result cap, 5s maxTimeMS, every query logged to admin_audit. Lets operators answer one-offs without shelling into mongosh.
Refined the `admin_ops.py` banner docstring to explicitly cover announcements (patch news with level=info), maintenance warnings (level=warn), AND incidents (level=error) on the same endpoint / data model. `expires_at` enforces a self-vacating window so an announcement doesn't outlive its relevance if the operator forgets to clear it.
Same scope-lock pattern: every body is 501, no router imported into `main.py`, no live behavior change. Follow-up PRs fill in one surface at a time.
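The query-console guard rails could be enforced by a validator that runs before anything reaches Mongo. The sketch below is an assumption about how that might look, not the contents of `admin_query.py`: the `validate_query` name, the `QueryRejected` exception, and the `runs` entry in the collection whitelist are hypothetical. The allowed ops, forbidden operators, and caps are taken from the commit message.

```python
from typing import Any, Iterator

ALLOWED_OPS = frozenset({"find", "find_one", "count_documents", "aggregate", "distinct"})
FORBIDDEN_OPERATORS = frozenset({"$out", "$merge", "$where", "$function"})
ALLOWED_COLLECTIONS = frozenset({"runs", "audit_log"})  # hypothetical whitelist
MAX_DOCS = 100        # result cap from the commit message
MAX_TIME_MS = 5000    # 5s maxTimeMS from the commit message


class QueryRejected(ValueError):
    """Raised when a console query breaks one of the guard rails."""


def _walk_keys(node: Any) -> Iterator[Any]:
    """Yield every dict key at any depth of a filter or pipeline."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield key
            yield from _walk_keys(value)
    elif isinstance(node, (list, tuple)):
        for item in node:
            yield from _walk_keys(item)


def validate_query(collection: str, op: str, payload: Any) -> None:
    if collection not in ALLOWED_COLLECTIONS:
        raise QueryRejected(f"collection not whitelisted: {collection}")
    if op not in ALLOWED_OPS:
        raise QueryRejected(f"operation not allowed: {op}")
    # Walk the whole payload so operators nested inside sub-pipelines
    # or $expr blocks are caught too, not just top-level stages.
    bad = sorted(FORBIDDEN_OPERATORS.intersection(_walk_keys(payload)))
    if bad:
        raise QueryRejected("forbidden operator(s): " + ", ".join(bad))
```

Walking every key, rather than only the top-level stage names, is a deliberate choice here: `$where` is a query operator and `$function` an expression, so either can appear well below the first level of a filter or pipeline.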
Summary
Scope-locking sketch — no live behavior change. Establishes the architecture for runtime-tunable rate limits so the next PR can be focused "fill in TODOs."
What it looks like end-to-end
Defense in depth
- … `admin.spire-codex.com` at all.
- `admin-dashboard` has zero port mappings; the only path in is cloudflared's outbound connection to CF. Nothing for the public internet to scan.
What this PR contains
- `app/services/rate_limits_store.py`
- `app/routers/admin_rate_limits.py`
- `docker-compose.admin.yml`
- `playbooks/admin-install.yml`
Per-request cost
`get_limit(slug, default)` is the hot path, called from every `@limiter.limit(...)` evaluation. With the TTL cache: 1 Mongo hit per worker per 5s = ~12 reads/min per worker, so the fleet-wide total scales with the number of workers. Per-request overhead is a dict lookup and a monotonic-time compare.
Follow-up PR scope (separate)
- `_refresh_cache` + Mongo collection
- `set_override` / `clear_override` + validation via slowapi's limit-string parser
- `REGISTRY` with the actual slugs in use today
- … (`submit_run`) from a hardcoded string to `get_limit("submit_run", "3000/hour")` as the smoke test
Not deploying anything from this PR
`main.py` doesn't import the new router. The compose file isn't referenced by any playbook. The endpoints don't exist on prod after this merges. Pure architectural commit.