Queue-based CRL invalidation — close the 5-minute revocation window #167

@vrknetha

Epic: #211 (CF Queues backbone) — Phase 1

Problem

When an agent is revoked, proxies do not know until their next CRL poll (every 5 minutes via CRL_REFRESH_INTERVAL_MS). During that window, the revoked agent can still send messages.

Current Flow

  1. Registry revokes agent → adds to CRL in D1 + publishes agent.auth.revoked to clawdentity-events Queue (already implemented)
  2. Proxies poll /v1/crl every 5 minutes
  3. Proxy refreshes local CRL cache
  4. Next request from revoked agent gets rejected

Gap: Up to 5 minutes of continued access after revocation. The registry already publishes the event to the queue, but nothing consumes it.

Implementation

1. Add Queue consumer binding to proxy

File: apps/proxy/wrangler.jsonc

Add to each env (dev, production, staging):

"queues": {
  "consumers": [
    {
      "queue": "clawdentity-events-dev",  // or "clawdentity-events" for prod
      "max_batch_size": 10,
      "max_batch_timeout": 5,
      "dead_letter_queue": "clawdentity-events-dlq-dev"
    }
  ]
}

2. Implement queue() handler on proxy worker

File: apps/proxy/src/worker.ts

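import { handleRegistryEvent } from "./queue-consumer/registry-events";

export default {
  fetch: ...,  // existing
  async queue(batch: MessageBatch<string>, env: Bindings): Promise<void> {
    // Ack or retry each message individually so one failure does not redeliver the whole batch
    for (const message of batch.messages) {
      try {
        const event = JSON.parse(message.body) as EventEnvelope<Record<string, unknown>>;
        await handleRegistryEvent(event, env);
        message.ack();
      } catch (error) {
        // Log before retrying; messages that keep failing end up in the dead letter queue
        console.error("registry event handling failed", error);
        message.retry();
      }
    }
  },
};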
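
The handler above casts the parsed body straight to EventEnvelope, so a malformed message only fails later inside the handler. A minimal sketch of a runtime guard follows. It assumes the envelope has at least the `type` and `data` fields used in this issue; the `EventEnvelope` interface shown here and the `parseEventEnvelope` helper are hypothetical stand-ins, not the project's actual definitions.

```typescript
// Hypothetical envelope shape: only `type` and `data` are taken from the
// handler code in this issue; other registry fields are not modeled.
interface EventEnvelope<T> {
  type: string;
  data: T;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Accepts a JSON string or an already-deserialized object.
// Returns null instead of throwing when the shape is wrong.
function parseEventEnvelope(
  body: unknown,
): EventEnvelope<Record<string, unknown>> | null {
  let candidate: unknown = body;
  if (typeof body === "string") {
    try {
      candidate = JSON.parse(body);
    } catch {
      return null;
    }
  }
  if (!isRecord(candidate)) {
    return null;
  }
  const { type, data } = candidate;
  if (typeof type !== "string" || !isRecord(data)) {
    return null;
  }
  return { type, data };
}
```

With a guard like this, the consumer could choose to ack a message that can never parse rather than retry it, while still retrying on transient handler failures. Whether malformed messages should instead reach the dead letter queue is a policy decision this sketch does not settle.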

3. Handle revocation events

New file: apps/proxy/src/queue-consumer/registry-events.ts

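export async function handleRegistryEvent(
  event: EventEnvelope<Record<string, unknown>>,
  env: Bindings,
): Promise<void> {
  switch (event.type) {
    case "agent.auth.revoked": {
      const { agentId } = event.data;
      // event.data is untyped, so validate the field before using it
      if (typeof agentId !== "string") {
        throw new Error("agent.auth.revoked event is missing a string agentId");
      }
      // Option A: Invalidate CRL cache globally
      // The middleware recreates CRL cache on next request if stale
      // Set a KV flag that forces CRL refresh on next auth check

      // Option B: Directly mark agent as revoked in trust state DO
      const trustState = env.PROXY_TRUST_STATE.idFromName("global");
      const stub = env.PROXY_TRUST_STATE.get(trustState);
      const response = await stub.fetch(new Request("http://internal/revoke", {
        method: "POST",
        headers: { "content-type": "application/json" },
        body: JSON.stringify({ agentDid: agentId }),
      }));
      // Throw on a failed write so the queue() handler retries the message
      if (!response.ok) {
        throw new Error(`trust state revoke failed with status ${response.status}`);
      }
      break;
    }
    default:
      // Ignore unknown events (forward compat)
      break;
  }
}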

4. CRL cache invalidation path

The proxy auth middleware caches the CRL with a TTL (registryKeysCacheTtlMs). Two options:

Option A (simple): The queue consumer sets a Durable Object flag. The middleware checks the flag before using the cached CRL and force-refreshes the CRL if the flag is set.

Option B (direct): The queue consumer writes the revoked agent JTI directly into the ProxyTrustState DO. The middleware checks the DO for revocations before the CRL check. CRL polling stays as a fallback at a 15-30 minute interval.

Recommendation: Option B. It avoids a full CRL re-fetch on every revocation, and the DO acts as a real-time revocation overlay.
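
To make the overlay idea concrete, here is a minimal sketch of the state Option B would keep. It is an assumption-laden illustration, not the ProxyTrustState implementation: the `RevocationOverlay` class, its method names, and the `did:example:...` identifiers in the usage are hypothetical, and a real Durable Object would persist entries through its storage API rather than a plain in-memory Map.

```typescript
// Hypothetical in-memory overlay keyed by agent identifier.
// A real Durable Object would persist these entries instead of using a Map.
class RevocationOverlay {
  private readonly revokedAtMs = new Map<string, number>();

  // Idempotent: a redelivered queue message keeps the first timestamp.
  revoke(agentDid: string, nowMs: number): void {
    if (!this.revokedAtMs.has(agentDid)) {
      this.revokedAtMs.set(agentDid, nowMs);
    }
  }

  isRevoked(agentDid: string): boolean {
    return this.revokedAtMs.has(agentDid);
  }

  // Removes entries older than maxAgeMs and returns how many were dropped.
  // Assumption: by then the polled CRL already lists the same agents.
  prune(nowMs: number, maxAgeMs: number): number {
    let removed = 0;
    this.revokedAtMs.forEach((revokedAt, agentDid) => {
      if (nowMs - revokedAt > maxAgeMs) {
        this.revokedAtMs.delete(agentDid);
        removed += 1;
      }
    });
    return removed;
  }
}

// Usage: the queue consumer calls revoke(), the middleware calls isRevoked().
const exampleOverlay = new RevocationOverlay();
exampleOverlay.revoke("did:example:agent-1", Date.now());
console.log(exampleOverlay.isRevoked("did:example:agent-1"));
```

Keeping `revoke` idempotent matters because queue delivery is at-least-once, so the same revocation event may arrive more than once. Pruning is optional; if it is used, `maxAgeMs` should comfortably exceed the CRL polling interval so an entry is never dropped before the CRL covers it.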

5. Keep CRL polling as safety net

Increase CRL_REFRESH_INTERVAL_MS from 5 minutes to 15-30 minutes. The queue handles the fast path, and polling catches anything the queue misses (for example, a dropped or dead-lettered message).
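
The check order in the middleware could look roughly like the sketch below: overlay first, then the polled CRL, with a separate signal for when the CRL is due for a refresh. All names here (`CrlSnapshot`, `RevocationDecision`, `checkRevocation`) are hypothetical and the function is deliberately pure so it does not assume anything about how the real middleware fetches or stores the CRL.

```typescript
// Hypothetical view of the last polled CRL.
interface CrlSnapshot {
  revoked: ReadonlySet<string>; // identifiers listed in the CRL
  fetchedAtMs: number; // when the CRL was last fetched
}

interface RevocationDecision {
  revoked: boolean;
  source: "overlay" | "crl" | "none";
  crlNeedsRefresh: boolean; // true once the polling interval has elapsed
}

// Checks the real-time overlay before the polled CRL.
function checkRevocation(
  agentDid: string,
  overlay: ReadonlySet<string>,
  crl: CrlSnapshot,
  nowMs: number,
  refreshIntervalMs: number,
): RevocationDecision {
  const crlNeedsRefresh = nowMs - crl.fetchedAtMs >= refreshIntervalMs;
  if (overlay.has(agentDid)) {
    return { revoked: true, source: "overlay", crlNeedsRefresh };
  }
  if (crl.revoked.has(agentDid)) {
    return { revoked: true, source: "crl", crlNeedsRefresh };
  }
  return { revoked: false, source: "none", crlNeedsRefresh };
}
```

Returning `crlNeedsRefresh` instead of fetching inside the function leaves the caller free to refresh in the background, so a slow registry response does not sit on the request path.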

Files to Change

  • apps/proxy/wrangler.jsonc — add queue consumer binding (all envs)
  • apps/proxy/src/worker.ts — add queue() export
  • New: apps/proxy/src/queue-consumer/registry-events.ts — event handler
  • apps/proxy/src/auth-middleware/middleware.ts — check DO revocation overlay before CRL cache
  • apps/proxy/src/trust-policy.ts — add revocation check to ProxyTrustState DO

Acceptance Criteria

  • Proxy consumes agent.auth.revoked events from clawdentity-events queue
  • Revoked agent is rejected within seconds (not minutes)
  • CRL polling remains as fallback at longer interval
  • Unknown event types are silently ignored (forward compat)
  • Dead letter queue catches consumer failures
  • Test: revoke agent in registry, verify proxy rejects the agent in under 10s
