chronicle: gap analysis & follow-ups vs. OpenAI Codex Chronicle (binary archaeology) #39

## Context

We inspected `/Applications/Codex.app/Contents/Resources/codex_chronicle` (4.2 MB arm64 Rust binary, signed `Developer ID Application: OpenAI OpCo, LLC (2DC432GLL2)`, hardened runtime, notarization stapled). Findings follow, along with what they imply for our roadmap. This is an umbrella issue; line items below should be split into focused PRs/issues as they get picked up.

## What Codex Chronicle actually is

- **Language:** Rust, calling Apple frameworks directly (`ScreenCaptureKit`, `Vision`, `CoreML`, `AVFoundation`).
- **Process model:** main daemon + per-display `--list-displays-child` + `--capture-screenshot-child` + an SCKit child. Single-instance lockfile (`codex_chronicle.lock`) and pidfile (`$TMPDIR/codex_chronicle/chronicle-started.pid`).
- **Sampling primary:** OCR diff. "One JSON object per material text change" in append-only `.ocr.jsonl` sidecars. Not fixed-fps + pHash like ours.
- **Storage:**
  - Ephemeral: `$TMPDIR/chronicle/screen_recording/<segment_ts>-display-<id>-{latest.jpg,.capture,.capture.json,.ocr.jsonl}` plus historical `frame-<idx>-<min_bucket>Z.jpg`.
  - Memories: `~/.codex/memories_extensions/chronicle/{instructions.md, YYYY-MM-DDTHH-MM-SS-XXXX-(10min|6h)-<slug>.md}`.
- **Privacy filter:** window-identity-based only. `PrivacyFilter { signature, stable_observations }` + `BrowserWindowObservation { name, ... }`. Hard-coded rules for Chrome incognito, Safari Private Browsing, Google Meet (`meet.google.com`). No content scrub. There is no secret regex anywhere in the binary.
- **Wire format includes** a per-frame `safe_to_persist` flag on `ScreenshotChildSuccess`.
- **Where data goes:** frames + OCR ship to OpenAI servers via an internal `model_provider="openai-memgen"` (`requires_openai_auth=true`, `supports_websockets=true`, header `X-OpenAI-Memgen-Request: true`). Summaries return as plaintext markdown stored locally.
- **Summarizer:** recursive 10min → 6h with convergence check, run as an ephemeral Codex sub-process with `--ephemeral --ignore-user-config --ignore-rules --sandbox read-only` and almost every feature disabled (`features.memories=false`, `features.apps=false`, `features.plugins=false`, `features.multi_agent=false`, `web_search="disabled"`, `mcp_servers={}`, `analytics.enabled=false`, `otel.exporter="none"`, `skills.config=[{name="chronicle", enabled=false}]`).
- **Prompt-injection defense:** purely prompt-engineering. Embedded `BEGIN UNTRUSTED OBSERVED INPUT` framing plus an explicit attack taxonomy (authority-boundary, role-claim, future-agent-instruction, destructive-cleanup-as-rule, expected-memory-in-fixtures, attorney-client euphemisms, ambiguous sensitive content).
- **Audio:** wired (`captureMicrophone`, `microphoneCaptureDeviceID`, `excludesCurrentProcessAudio`). Info.plist declares `NSMicrophoneUsageDescription` + `NSAudioCaptureUsageDescription`. Not user-facing yet.
- **Entitlements:** `app-sandbox: false`, `allow-jit`, `allow-unsigned-executable-memory`, `device.audio-input`, `network.client`, `files.user-selected.read-write` (Electron parent posture).
- **macOS minimum:** 12.0 with version-gated paths (`captureImageInRect` ≥15.2, `captureScreenshot` ≥26.0).
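
Any tooling on our side that needs to tell 10-minute summaries from 6-hour roll-ups could parse the memory filename pattern listed under Storage. The sketch below is a Python illustration, not code recovered from the binary. `MemoryName` and `parse_memory_name` are hypothetical names, the `XXXX` field is assumed to be four opaque non-hyphen characters, and the timestamp is parsed as naive because the pattern does not state a timezone.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from datetime import datetime

# Assumption: XXXX is four opaque characters with no hyphen. The real
# alphabet and meaning of that field are not known from the binary.
_MEMORY_NAME = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2})"
    r"-(?P<suffix>[^-]{4})"
    r"-(?P<level>10min|6h)"
    r"-(?P<slug>.+)\.md$"
)


@dataclass(frozen=True)
class MemoryName:
    timestamp: datetime  # naive; timezone is not stated by the pattern
    suffix: str          # the opaque XXXX field
    level: str           # "10min" or "6h"
    slug: str


def parse_memory_name(filename: str) -> MemoryName | None:
    """Return the parsed parts, or None for files such as instructions.md."""
    m = _MEMORY_NAME.match(filename)
    if m is None:
        return None
    return MemoryName(
        timestamp=datetime.strptime(m["ts"], "%Y-%m-%dT%H-%M-%S"),
        suffix=m["suffix"],
        level=m["level"],
        slug=m["slug"],
    )
```

Returning `None` for non-matching names keeps `instructions.md` and any unknown file out of the summary set instead of raising.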

## Things to add or harden in `agentd` / Chronicle pipeline

### Capture-side parity (worth taking from them)

- [ ] **Material-text-change OCR-diff gate as a secondary sampler** after pHash. Catches "pHash-identical, text-shifted" runs (modal toggles, focus changes inside the same canvas) that a perceptual hash alone misses. Land it as an optional gate in `CaptureService.swift` writing an analogue of their `.ocr.jsonl` sidecar.
- [ ] **Multi-display concurrent capture.** Builds on existing #34. Match Codex's per-display segment files and combine-by-timestamp pattern, with a per-display `displayId` field on `SubmitBatchRequest` frames so server can fan-in correctly.
- [ ] **Multi-process crash isolation for ScreenCaptureKit.** Spawn SCKit in a child process as Codex does — survives SCKit crashes/leaks without taking the menu-bar app down. Out-of-scope if it's a big lift; at minimum add an SCKit-watchdog that restarts the in-process pipeline.
- [ ] **`safe_to_persist` flag on the wire.** Add an explicit per-frame boolean (and reason enum) to the `chronicle.v1` `Frame`/`SubmitBatchRequest` proto so audit reviewers can see which fail-closed rail fired (secret-scrub-hit, app-deny, path-deny, window-title-pause, server-policy-pause). Belongs in `evalops/platform`; cross-link the proto change.
- [ ] **Pidfile / heartbeat-recency precondition for downstream consumers.** Document that any consumer (pipeline, audit query, future MCP read-surface) MUST verify heartbeat freshness before treating frames as current. Mirrors Codex's pidfile-validity rule. Surface lives server-side; client should expose its `RegisterDevice`/`Heartbeat` last-seen.
- [ ] **Browser-window-aware default deny.** Add the equivalent of Codex's `BrowserWindowObservation` + `chrome_incognito_title` / `safari_private_browsing_title` / `meet.google.com` / `Meet -` detection to default `denyWindowTitlePatterns`. This is a small, well-defined addition that complements (does not replace) our content scrub. Cross-references #36.
- [ ] **`browser_window_unstable` / `browser_window_missing_title` failure-mode handling.** When AX/window observation is unreliable, fall through to fail-closed deny for that frame. Today we likely persist with degraded metadata.
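
To make the OCR-diff gate item concrete, here is a minimal, language-agnostic sketch in Python of a "material text change" check that could run after the pHash gate. The real gate would live in `CaptureService.swift`. Everything here is an assumption for illustration: the `OcrDiffGate` name, the 0.15 change threshold, the whitespace and case normalization, and the sidecar field names (`ts`, `displayId`, `text`). None of it is Codex's actual `.ocr.jsonl` schema or diff rule.

```python
import difflib
import json
import re


def normalize(text: str) -> str:
    """Collapse whitespace and case so OCR jitter does not count as a change."""
    return re.sub(r"\s+", " ", text).strip().lower()


class OcrDiffGate:
    """Emit one JSON line per material text change for a single display."""

    def __init__(self, min_change: float = 0.15) -> None:
        # Hypothetical threshold: fraction of text that must differ.
        self.min_change = min_change
        self._last_emitted = None

    def observe(self, ts: float, display_id: int, ocr_text: str):
        """Return a JSONL record if the text changed materially, else None."""
        current = normalize(ocr_text)
        if self._last_emitted is not None:
            matcher = difflib.SequenceMatcher(
                None, self._last_emitted, current, autojunk=False
            )
            if 1.0 - matcher.ratio() < self.min_change:
                return None  # same text as far as the gate is concerned
        # Compare against the last *emitted* text so slow drift still fires.
        self._last_emitted = current
        return json.dumps(
            {"ts": ts, "displayId": display_id, "text": current},
            sort_keys=True,
        )
```

Comparing against the last emitted text rather than the previous frame means a series of small edits eventually crosses the threshold. `SequenceMatcher` is quadratic in the worst case, so a production gate would likely diff per line or per OCR region instead.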

### Pipeline-side adds (`evalops/platform`)

- [ ] **Recursive chronological summarizer for audit roll-ups.** 10-min summaries → 6-hour roll-ups → daily, with convergence check and provenance back to the originating frames. This is the audit-reviewability story; today reviewers walk raw frames. Server-side only, never on the device.
- [ ] **URL stripping in any agent-readable view of the audit trail.** Codex strips URLs from summaries entirely as a leak-vector + prompt-injection mitigation. Worth a default-redact-with-opt-in posture in audit query results that an internal copilot would consume.
- [ ] **Recursive-summarizer ephemeral-sandbox config posture.** When/if any agent ever touches Chronicle evidence to summarize, run with the equivalent of Codex's `--sandbox read-only --ephemeral --ignore-user-config --ignore-rules`, telemetry/MCP/plugins/multi-agent/web-search all disabled. Document this as the required posture for `evalops/maestro` or any future internal consumer.
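
The deterministic part of the recursive summarizer (time bucketing plus provenance) can be sketched independently of whichever model produces the text. The Python below is an illustration under assumptions: `Summary`, `summarize_frames`, and `roll_up` are hypothetical names, buckets are aligned to UTC, the `summarize` callable stands in for the real server-side summarizer, and the convergence check is left out.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable

TEN_MIN = 10 * 60
SIX_HOURS = 6 * 60 * 60


@dataclass(frozen=True)
class Summary:
    level: str                  # e.g. "10min", "6h"
    window_start: datetime      # UTC start of the bucket
    text: str
    frame_ids: tuple[str, ...]  # provenance back to originating frames


def bucket_start(ts: datetime, width_s: int) -> datetime:
    """Floor a timestamp to the start of its UTC-aligned bucket."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % width_s, tz=timezone.utc)


def summarize_frames(
    frames: Iterable[tuple[str, datetime, str]],
    summarize: Callable[[list[str]], str],
) -> list[Summary]:
    """Group (frame_id, timestamp, text) tuples into 10-minute summaries."""
    buckets = defaultdict(list)
    for frame_id, ts, text in frames:
        buckets[bucket_start(ts, TEN_MIN)].append((frame_id, text))
    return [
        Summary("10min", start, summarize([t for _, t in items]),
                tuple(fid for fid, _ in items))
        for start, items in sorted(buckets.items())
    ]


def roll_up(
    summaries: Iterable[Summary],
    level: str,
    width_s: int,
    summarize: Callable[[list[str]], str],
) -> list[Summary]:
    """Merge lower-level summaries into wider windows, keeping all frame ids."""
    buckets = defaultdict(list)
    for s in summaries:
        buckets[bucket_start(s.window_start, width_s)].append(s)
    return [
        Summary(level, start, summarize([s.text for s in items]),
                tuple(fid for s in items for fid in s.frame_ids))
        for start, items in sorted(buckets.items())
    ]
```

Calling `roll_up` again on the 6-hour output with a 24-hour width gives the daily level. Because every level carries `frame_ids` forward, a reviewer can walk from a daily roll-up back to the raw frames.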

### Things we are ahead on — harden + document

- [ ] **Fail-closed content-aware secret scrub** is category-defining versus Codex Chronicle, the OSS clones, and Microsoft Recall. Move from "a paragraph in the README" to a first-class capability page (`docs/secret-scrub.md`?) with the regex family list, the fail-closed semantics (frame dropped, never partial-redacted), the OCR-text + window-title + document-path coverage, and a table comparing it to Codex's window-identity-only filter and Recall's app exclusions.
- [ ] **Fleet `CapturePolicy` + local hard-deny rails.** Codex Chronicle is single-user-only by construction. Document the `RegisterDevice` / `Heartbeat` / server-pushed policy, and especially the local-hard-deny-wins-over-server-allow rail. This is the IT/security buyer story.
- [ ] **Encryption at rest by default in remote/broker mode** (Keychain-backed AES-GCM, `.agentdbatch` extension). Codex stores plaintext JPEGs + plaintext OCR sidecars + plaintext markdown. Make the default explicit in the README diff.
- [ ] **Optional ASB Secret Broker artifact wrap.** `chronicle_frame_batch_json` artifact, only the artifact ref leaves the device, meterable and revocable through ASB. Codex has nothing analogous.
- [ ] **Hardware-backed permission smoke** (existing #25) and **notarized signed release** (#24) — pull both into the README hero comparison; the OSS clones explicitly cannot ship a notarized binary today.
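
For the capability page, the fail-closed semantics and the local-hard-deny-wins-over-server-allow rail may be easier to document next to a small decision sketch. The Python below is illustrative only and is not the `agentd` implementation. The type names, the rail ordering, the treatment of a server allowlist, and the two secret patterns are assumptions. Only the reason values come from the `safe_to_persist` item earlier in this issue.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from enum import Enum


class DenyReason(Enum):
    SECRET_SCRUB_HIT = "secret-scrub-hit"
    APP_DENY = "app-deny"
    PATH_DENY = "path-deny"
    WINDOW_TITLE_PAUSE = "window-title-pause"
    SERVER_POLICY_PAUSE = "server-policy-pause"


@dataclass(frozen=True)
class Frame:
    app_bundle_id: str
    window_title: str
    document_path: str
    ocr_text: str


@dataclass(frozen=True)
class LocalRails:
    deny_apps: frozenset[str] = frozenset()
    deny_path_prefixes: tuple[str, ...] = ()
    deny_window_title_patterns: tuple[str, ...] = ()


@dataclass(frozen=True)
class ServerPolicy:
    paused: bool = False
    allow_apps: frozenset[str] = frozenset()  # empty means no allowlist


# Illustrative patterns only; not the project's regex family.
SECRET_PATTERNS = (
    re.compile(r"AKIA[0-9A-Z]{16}"),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
)


def decide(frame: Frame, local: LocalRails,
           server: ServerPolicy) -> tuple[bool, DenyReason | None]:
    """Return (safe_to_persist, reason). Any rail that fires drops the frame."""
    # Local hard-deny rails run first, so a server allow cannot override them.
    if frame.app_bundle_id in local.deny_apps:
        return False, DenyReason.APP_DENY
    if any(frame.document_path.startswith(p) for p in local.deny_path_prefixes):
        return False, DenyReason.PATH_DENY
    if any(re.search(p, frame.window_title)
           for p in local.deny_window_title_patterns):
        return False, DenyReason.WINDOW_TITLE_PAUSE
    # Content scrub covers OCR text, window title, and document path.
    fields = (frame.ocr_text, frame.window_title, frame.document_path)
    if any(p.search(f) for p in SECRET_PATTERNS for f in fields):
        return False, DenyReason.SECRET_SCRUB_HIT
    if server.paused:
        return False, DenyReason.SERVER_POLICY_PAUSE
    if server.allow_apps and frame.app_bundle_id not in server.allow_apps:
        return False, DenyReason.APP_DENY
    return True, None
```

The function never returns a partially redacted frame: the result is either persist-as-is or drop-with-reason, which is the property the comparison table would contrast with a window-identity-only filter.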

### Devex / positioning

- [ ] **README hero diff vs. OpenAI Codex Chronicle.** Anyone Googling "Chronicle macOS screen capture" lands on OpenAI first. The agentd README should frame the difference up front: subject of capture (humans-and-agents-at-work vs. "help my Codex agent remember me"), governance posture (fleet-policy + fail-closed vs. single-user opt-in), data plane (self-hosted Connect/proto vs. OpenAI-hosted summarizer round-trip), evidence model (frames + scrub vs. LLM-summarized markdown). One table, no marketing copy.
- [ ] **Comparison page covering Codex Chronicle, Einsia/OpenChronicle, Screenata/open-chronicle, Microsoft Recall.** Same table, axis-by-axis.
- [ ] **Threat-model doc.** Explicitly state that we do not feed observed content into an on-device LLM, so the entire prompt-injection class Codex is patching with words is architecturally absent. This is a real differentiator that's invisible until said out loud.

## Things we deliberately do NOT want to copy

- [ ] **Shipping frames or OCR text to a third-party LLM provider for summarization.** This is the load-bearing reason Codex's posture cannot meet enterprise audit. Any internal summarization must run inside the customer's control plane, with the ephemeral-sandbox config posture above.
- [ ] **Window-identity-only privacy filter without content scrub.** Do not regress to the simpler model just because it's what the market sees as "good enough."
- [ ] **Plaintext markdown memories on disk.** Even for "local-only" demo modes, default-on encrypt-at-rest stays on.
- [ ] **Audio capture.** Stay screen-only until we have a stated audit reason. "Chronicle observes work product, not conversation" is a defensible scope claim worth keeping.
- [ ] **Sparkle-style auto-update from anywhere.** Use a signed update channel only (#33), and never an auto-update path that can land code on the device without re-running the notarization/policy gate.

## Existing related issues

- #24 Developer ID notarization
- #25 Permission-flow smoke test
- #31 Local diagnostics + queued-capture review
- #32 Mock Chronicle + Secret Broker harness
- #33 Launch-at-login + signed update channel
- #34 Multi-display observability + adaptive OCR budgets
- #35 Generated Chronicle client types / drift gate
- #36 Scheduled / policy-driven auto-pause windows

## Source

Findings from local inspection of `/Applications/Codex.app/Contents/Resources/codex_chronicle` (`com.openai.codex` `26.422.30944`, signed Apr 24 2026), cross-checked against [Chronicle – Codex docs](https://developers.openai.com/codex/memories/chronicle).
