Fro Bot presence webhook: POST /v1/announce for control-plane events #671

@marcusrbrown

Description

The control plane (fro-bot/.github) needs to post messages in Fronomenal as the Fro Bot user identity when notable autonomous activity happens — surveys completing, collaboration invitations accepted, etc. Discord webhooks aren't an option because they post as a webhook bot, not as the user. The gateway already holds DISCORD_TOKEN and is logged in as Fro Bot via discord.js, so the natural integration is a webhook on the gateway that the control plane signs and POSTs to, and the gateway turns it into a Discord message posted from its existing client.

This issue tracks the gateway-side build. The control-plane side is captured in fro-bot/.github's requirements doc and ships in parallel.

Endpoint

POST /v1/announce — HTTPS, authenticated. Path versioned so future contract changes can land non-disruptively.

Authentication

HMAC-SHA256 over the canonicalized request body with a shared secret. No mTLS, no OAuth — the trust boundary is just the shared secret + replay window.

  • Header X-Gateway-Signature: <hex> carries the lowercase hex-encoded HMAC
  • Header X-Gateway-Timestamp: <iso8601> carries the same timestamp as the body's fired_at field
  • Shared secret comes from GATEWAY_WEBHOOK_SECRET (this follows the existing secret-file convention; the deploy will need to add it via marcusrbrown/infra)
  • Verification MUST use constant-time comparison to avoid timing oracles on the HMAC check
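
For reference, a minimal sketch of the signature check in TypeScript, assuming Node's built-in crypto module. The function name verifySignature and its parameters are hypothetical and not existing gateway code; the input is assumed to be the already-canonicalized body bytes.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: checks the X-Gateway-Signature header value against
// an HMAC-SHA256 computed over the canonical body bytes.
function verifySignature(
  canonicalBody: Buffer,
  signatureHex: string,
  secret: string,
): boolean {
  // A SHA-256 HMAC is 32 bytes, i.e. exactly 64 lowercase hex characters.
  // Rejecting other shapes up front also guarantees equal buffer lengths below.
  if (!/^[0-9a-f]{64}$/.test(signatureHex)) return false;

  const expected = createHmac("sha256", secret).update(canonicalBody).digest();
  const received = Buffer.from(signatureHex, "hex");

  // timingSafeEqual compares in constant time; it throws if lengths differ,
  // which the regex check above rules out.
  return timingSafeEqual(expected, received);
}
```

The format check runs before the comparison because timingSafeEqual throws on buffers of different lengths; the length of a signature is not secret, so rejecting early does not leak anything useful.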

Replay protection

Reject when |now - fired_at| > 5 minutes. Both directions — too old AND too new (clock skew). Return 4xx without posting to Discord.
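
A minimal sketch of the window check, assuming fired_at is an ISO 8601 string that Date.parse understands. The names isWithinReplayWindow and REPLAY_WINDOW_MS are hypothetical.

```typescript
// Five-minute tolerance, applied symmetrically (too old and too new).
const REPLAY_WINDOW_MS = 5 * 60 * 1000;

// Returns true when firedAt parses and lies within the window around `now`.
// An unparseable timestamp is treated as outside the window.
function isWithinReplayWindow(firedAt: string, now: Date = new Date()): boolean {
  const firedMs = Date.parse(firedAt);
  if (Number.isNaN(firedMs)) return false;
  return Math.abs(now.getTime() - firedMs) <= REPLAY_WINDOW_MS;
}
```

Passing `now` as a parameter keeps the check easy to unit-test without mocking the clock.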

If you want to be paranoid, add a small LRU of recent signature+timestamp pairs to reject exact replays within the window. Not strictly required for v1.
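
If the exact-replay cache does get built, something along these lines might be enough. This is a sketch only: ReplayCache, its defaults (5-minute TTL, 1024 entries), and the key format are all assumptions, and it evicts the oldest entry rather than the least recently used one, which is sufficient here because entries are only ever checked once.

```typescript
// Hypothetical bounded cache of signature+timestamp pairs seen recently.
class ReplayCache {
  // key -> expiry time in epoch milliseconds; Map preserves insertion order.
  private readonly seen = new Map<string, number>();
  private readonly ttlMs: number;
  private readonly maxEntries: number;

  constructor(ttlMs: number = 5 * 60 * 1000, maxEntries: number = 1024) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
  }

  // Returns true on first sighting (and records it), false on an exact replay.
  checkAndRecord(signatureHex: string, firedAt: string, nowMs: number = Date.now()): boolean {
    // Drop expired entries; with a constant TTL the oldest entries expire first.
    for (const [key, expiresAt] of this.seen) {
      if (expiresAt > nowMs) break;
      this.seen.delete(key);
    }

    const key = `${signatureHex}:${firedAt}`;
    if (this.seen.has(key)) return false;

    // Keep memory bounded by evicting the oldest entry when full.
    if (this.seen.size >= this.maxEntries) {
      const oldest = this.seen.keys().next().value;
      if (oldest !== undefined) this.seen.delete(oldest);
    }

    this.seen.set(key, nowMs + this.ttlMs);
    return true;
  }
}
```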

Canonicalization

The control plane signs the JSON body encoded with lexicographically sorted keys at every level, no whitespace, as UTF-8 bytes. The gateway must reproduce that exact encoding before HMAC verification. There are two options. The first: parse the incoming body, re-encode it with a canonical-JSON library (or JSON.stringify with sorted keys), then HMAC the resulting bytes. The second: sign and verify the raw body bytes verbatim, which avoids re-encoding but requires that nothing between the two sides (serializer, proxy, body parser) alters the bytes. Lean toward the first.
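
A sketch of the parse-and-re-encode approach, without a third-party library. The names canonicalize and canonicalBytes are hypothetical. It assumes "lexicographically sorted" means JavaScript's default string ordering (UTF-16 code units), and that the signer escapes strings and formats numbers the same way JSON.stringify does; both assumptions need to be confirmed against the control-plane serializer.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serializes a parsed JSON value with sorted keys at every level and no whitespace.
function canonicalize(value: Json): string {
  // Primitives and null: defer to JSON.stringify for escaping and number format.
  if (value === null || typeof value !== "object") return JSON.stringify(value);

  // Arrays keep their order; only object keys are sorted.
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalize(item)).join(",")}]`;
  }

  const entries = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  const members = entries.map(([key, child]) => `${JSON.stringify(key)}:${canonicalize(child)}`);
  return `{${members.join(",")}}`;
}

// Parses the raw request body and returns the UTF-8 bytes to feed into the HMAC.
// Throws a SyntaxError if the body is not valid JSON.
function canonicalBytes(rawBody: string): Buffer {
  return Buffer.from(canonicalize(JSON.parse(rawBody) as Json), "utf8");
}
```

Note that this approach parses the body before the signature is verified, so the body-size limit should be enforced first.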

Payload contract (v1)

{
  "v": 1,
  "event_type": "survey_completed",   // | "invitation_accepted" (v1 set; more coming)
  "fired_at": "2026-05-23T19:30:00Z",
  "context": {
    // event-specific keys; see below
  },
  "rendered_text": null               // v2 forward: pre-composed in-character text. Null in v1.
}

Unknown event_type → 4xx. Don't try to be forward-compatible by accepting unknown types; we'd rather catch contract drift loudly.
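
A sketch of strict envelope validation, using the rejection reasons named under Observability. The type and function names (AnnouncePayload, parseAnnounce) are hypothetical, and treating a missing rendered_text as null is an assumption the contract does not spell out.

```typescript
// The closed v1 set; anything else is rejected rather than passed through.
const KNOWN_EVENT_TYPES = ["survey_completed", "invitation_accepted"] as const;
type EventType = (typeof KNOWN_EVENT_TYPES)[number];

interface AnnouncePayload {
  v: 1;
  event_type: EventType;
  fired_at: string;
  context: Record<string, unknown>;
  rendered_text: string | null;
}

type ParseResult =
  | { ok: true; payload: AnnouncePayload }
  | { ok: false; reason: "malformed_body" | "unknown_event_type" };

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function parseAnnounce(rawBody: string): ParseResult {
  const malformed: ParseResult = { ok: false, reason: "malformed_body" };

  let data: unknown;
  try {
    data = JSON.parse(rawBody);
  } catch {
    return malformed;
  }
  if (!isPlainObject(data)) return malformed;

  const { v, event_type, fired_at, context } = data;
  // Assumption: an absent rendered_text is treated the same as null.
  const renderedText = data.rendered_text ?? null;

  if (v !== 1 || typeof event_type !== "string" || typeof fired_at !== "string") return malformed;
  if (!isPlainObject(context)) return malformed;
  if (renderedText !== null && typeof renderedText !== "string") return malformed;

  // Well-formed envelope, but a type outside the v1 set: fail loudly.
  if (!(KNOWN_EVENT_TYPES as readonly string[]).includes(event_type)) {
    return { ok: false, reason: "unknown_event_type" };
  }

  return {
    ok: true,
    payload: { v: 1, event_type: event_type as EventType, fired_at, context, rendered_text: renderedText },
  };
}
```

Event-specific context keys are not checked here; that would be a second validation step per event type.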

v1 event types

| event_type | context shape | template hint |
| --- | --- | --- |
| invitation_accepted | {"count": number, "repos": [{"owner": string, "name": string}, ...]} | "Just accepted N collaboration invitation(s): repo1, repo2, ..." in-character. Suggest blue embed accent. |
| survey_completed | {"owner": string, "repo": string, "slug": string, "wiki_pages_changed": number} | "Surveyed owner/repo, added N wiki entries" in-character. Suggest green embed accent. |
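
A sketch of what the gateway-side templates could look like for the two v1 types. Everything here is illustrative: the names (renderTemplate, ACCENT_COLORS), the exact wording, the owner/name rendering of repos, and the hex values picked for "blue" and "green" are assumptions, not decisions.

```typescript
interface Repo {
  owner: string;
  name: string;
}

interface InvitationAcceptedContext {
  count: number;
  repos: Repo[];
}

interface SurveyCompletedContext {
  owner: string;
  repo: string;
  slug: string;
  wiki_pages_changed: number;
}

type TemplateInput =
  | { event_type: "invitation_accepted"; context: InvitationAcceptedContext }
  | { event_type: "survey_completed"; context: SurveyCompletedContext };

// Placeholder accent colors; the issue only asks for "blue" and "green".
const ACCENT_COLORS: Record<TemplateInput["event_type"], number> = {
  invitation_accepted: 0x3498db,
  survey_completed: 0x2ecc71,
};

// Renders the fallback text for an event from its context.
function renderTemplate(input: TemplateInput): string {
  switch (input.event_type) {
    case "invitation_accepted": {
      const { count, repos } = input.context;
      const names = repos.map((r) => `${r.owner}/${r.name}`).join(", ");
      return `Just accepted ${count} collaboration invitation${count === 1 ? "" : "s"}: ${names}`;
    }
    case "survey_completed": {
      const { owner, repo, wiki_pages_changed: changed } = input.context;
      return `Surveyed ${owner}/${repo}, added ${changed} wiki ${changed === 1 ? "entry" : "entries"}`;
    }
  }
}
```

Modeling the input as a discriminated union means adding reconcile_notable or wiki_lint_findings later will produce a compile error in the switch until a template exists for it.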

Fast-follower event types (out of scope for this issue, but worth stubbing template slots)

  • reconcile_notable — daily reconcile cron when something interesting happened. Purple accent.
  • wiki_lint_findings — weekly wiki-lint with findings ≥ 1. Yellow accent.

v2 forward compatibility

When the control plane gets an LLM composer (next phase), payloads will populate rendered_text with the in-character message text. Gateway behavior: if rendered_text is non-null, use it verbatim as the Discord message content; if null, fall back to gateway-side template rendering for the event type. This split lets us ship v1 with templates and add the composer later without touching the gateway contract.
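
The fallback rule is small enough to sketch directly. resolveMessageText is a hypothetical name, and the template renderer is passed in so the sketch stays independent of how templates end up being implemented.

```typescript
interface HasRenderedText {
  rendered_text: string | null;
}

// Uses rendered_text verbatim when it is non-null, otherwise falls back to
// the gateway-side template for the payload's event type.
function resolveMessageText<P extends HasRenderedText>(
  payload: P,
  renderFromTemplate: (payload: P) => string,
): string {
  // Per the contract, only null triggers the fallback; an empty string is
  // still "non-null" and would be used as-is.
  return payload.rendered_text !== null ? payload.rendered_text : renderFromTemplate(payload);
}
```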

Channel routing

Single Fronomenal channel for all v1 drops. Channel ID configured via deploy env var — suggest GATEWAY_PRESENCE_CHANNEL_ID. Hard-coded in env, not in the payload (callers shouldn't be able to target arbitrary channels).
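
A sketch of reading the channel ID at startup so a missing or mistyped value fails the deploy rather than the first announce. loadPresenceChannelId is a hypothetical name, and the 17 to 20 digit pattern is an assumption about the shape of Discord snowflake IDs, not a documented guarantee.

```typescript
// Reads and sanity-checks the presence channel ID from the environment.
// The env parameter defaults to process.env but can be overridden in tests.
function loadPresenceChannelId(
  env: Record<string, string | undefined> = process.env,
): string {
  const id = env.GATEWAY_PRESENCE_CHANNEL_ID?.trim();
  // Assumption: Discord IDs are decimal strings of roughly 17-20 digits.
  if (!id || !/^\d{17,20}$/.test(id)) {
    throw new Error("GATEWAY_PRESENCE_CHANNEL_ID is missing or not a valid channel ID");
  }
  return id;
}
```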

Posting behavior

  • Discord message posted via the existing discord.js Client (i.e., AS Fro Bot user) — not via webhook URL
  • Format: Discord embed with event-type-specific accent color, the rendered text as description, optional footer identifying it as control-plane-driven
  • Return 2xx after the Discord post is accepted by the API
  • On Discord API failure: return 5xx to caller (the control plane will retry once)
  • No internal queue or buffer — best-effort delivery. If the gateway is mid-reconnect, returning 5xx is acceptable
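
A sketch of the post-and-map-to-status step. To keep it self-contained it does not import discord.js; SendableChannel is a hypothetical structural interface covering only the send call with an embeds array, which a discord.js text channel is expected to satisfy. The specific status codes (204, 502, 503) and the footer text are assumptions; the issue only requires 2xx and 5xx.

```typescript
interface EmbedLike {
  description: string;
  color: number;
  footer?: { text: string };
}

// Minimal shape of the channel object this sketch needs.
interface SendableChannel {
  send(message: { embeds: EmbedLike[] }): Promise<unknown>;
}

// Posts one embed and returns the HTTP status to give the caller.
// No retry and no queue: delivery is best-effort, the control plane retries.
async function postAnnouncement(
  channel: SendableChannel | null,
  text: string,
  accentColor: number,
): Promise<number> {
  // Channel not resolvable, e.g. the client is mid-reconnect.
  if (channel === null) return 503;

  try {
    await channel.send({
      embeds: [{ description: text, color: accentColor, footer: { text: "control-plane event" } }],
    });
    return 204; // Discord accepted the message.
  } catch {
    return 502; // Discord API call failed.
  }
}
```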

Observability

Log every accepted announce request with a redacted summary: event_type, fired_at, response status from Discord. Don't echo context or rendered_text to logs (those carry repo names and could grow); the redacted summary gives operators an audit trail without ballooning log volume.

Also log rejected requests with the rejection reason (hmac_invalid, timestamp_expired, unknown_event_type, malformed_body) — without echoing the request body. Useful when wiring up the control plane for the first time.
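
A sketch of the two log shapes. The function names and the JSON field names (msg, discord_status) are hypothetical; the point is that the summary is built from an allow-list of fields instead of deleting sensitive ones from the payload.

```typescript
type RejectionReason =
  | "hmac_invalid"
  | "timestamp_expired"
  | "unknown_event_type"
  | "malformed_body";

interface LoggablePayload {
  event_type: string;
  fired_at: string;
  context: unknown;
  rendered_text: string | null;
}

// Builds the log line for an accepted request. Only allow-listed fields are
// copied, so context and rendered_text can never reach the logs by accident.
function acceptedLogLine(payload: LoggablePayload, discordStatus: number): string {
  return JSON.stringify({
    msg: "announce_accepted",
    event_type: payload.event_type,
    fired_at: payload.fired_at,
    discord_status: discordStatus,
  });
}

// Builds the log line for a rejected request. The event type is optional
// because a malformed body may not have one; the body itself is never logged.
function rejectedLogLine(reason: RejectionReason, eventType?: string): string {
  return JSON.stringify({ msg: "announce_rejected", reason, event_type: eventType ?? null });
}
```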

DoS posture

  • Max body size: 8 KB should be plenty for v1 payloads
  • Rate limit: not critical for v1 (control-plane volume is single-digit POSTs per day), but worth adding a per-IP or per-secret-identity cap (e.g., 60 req/min) so a stuck loop on the control-plane side can't burn through the gateway's Discord rate-limit budget
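
A sketch of both guards. The names (MAX_BODY_BYTES, isBodyTooLarge, FixedWindowLimiter) are hypothetical, and a fixed-window counter is just one simple way to get a 60 req/min limit. In a real server the size limit would normally be enforced while reading the request stream rather than after buffering it.

```typescript
const MAX_BODY_BYTES = 8 * 1024;

// True when the body exceeds the 8 KB limit (measured in bytes, not characters).
function isBodyTooLarge(rawBody: string | Buffer): boolean {
  return Buffer.byteLength(rawBody) > MAX_BODY_BYTES;
}

// Hypothetical fixed-window limiter keyed by caller identity (IP or secret ID).
// Keys are never pruned here, which is fine for a handful of callers.
class FixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number = 60, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed and counts it; false if over the limit.
  allow(key: string, nowMs: number = Date.now()): boolean {
    const current = this.windows.get(key);
    // Start a fresh window on first use or once the previous one has elapsed.
    if (current === undefined || nowMs - current.start >= this.windowMs) {
      this.windows.set(key, { start: nowMs, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```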

Deployment

The deploy in marcusrbrown/infra will need:

  • New deploy secret: GATEWAY_WEBHOOK_SECRET
  • New deploy env: GATEWAY_PRESENCE_CHANNEL_ID
  • The deploy gate (currently waits for Discord command registration) should probably also confirm the new webhook endpoint is reachable before declaring success. Up to you.

Out of scope for this issue

  • The control-plane side (event detection, payload construction, HMAC signing, POST + retry) — covered in fro-bot/.github's requirements doc, separate work
  • LLM composition of message text — that's v2 control-plane work
  • High-risk privacy events (visibility transitions, integrity alerts) — those stay on GitHub issue surfaces, not Discord
  • Multi-channel routing — single channel in v1
  • Two-way conversation features — existing @fro-bot mention handling stays, no changes there

Success criteria

  • A signed POST with a known-good payload lands a message in the target channel within ~30 seconds, posted by the Fro Bot user account
  • A signed POST with a wrong signature returns 4xx and produces no Discord post
  • A signed POST with a stale fired_at returns 4xx and produces no Discord post
  • An unsigned or malformed POST returns 4xx and produces no Discord post
  • Logs include the event_type and outcome for every accepted and rejected request, without leaking payload contents

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request)

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
