feat(agents): drive coder via Claude/Codex SDKs instead of CLI binaries#4263
feat(agents): drive coder via Claude/Codex SDKs instead of CLI binaries#4263
Conversation
The coder entity used to spawn the `claude` / `codex` binaries and mirror events by tailing JSONL files on disk. Replace that with two SDK-driven runners that iterate the official SDK async generators and forward each event through `agent-session-protocol`'s per-event normalisers, so the entity no longer depends on global CLI installs and stops touching the filesystem to discover sessions. - Claude runner adapts each `SDKMessage` to `ClaudeEntry` and routes it through `normalizeClaudeEvent`. - Codex runner translates each completed `ThreadItem` directly into `NormalizedEvent`s — Codex's SDK envelope is higher-level than the rollout JSONL, so the JSONL normaliser doesn't apply. - `CodingSessionCliRunner.run()` gains `onEvent` and `onSessionId` callbacks; the orchestrator loses its file-watcher / pre+post diff plumbing and just forwards events live. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
✅ Deploy Preview for electric-next ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4263 +/- ##
==========================================
+ Coverage 64.61% 64.83% +0.22%
==========================================
Files 144 147 +3
Lines 19308 19373 +65
Branches 4747 4785 +38
==========================================
+ Hits 12476 12561 +85
+ Misses 6829 6807 -22
- Partials 3 5 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
✅ Deploy Preview for electric-next ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Claude Code ReviewSummaryThis PR replaces CLI binary spawning with SDK-driven runners for both Claude and Codex. The latest commit adds env-stripping so the coder subprocess falls back to user-configured credentials ( What's Working Well
Issues FoundSuggestions (Nice to Have)
File: The function is simple but touches security-relevant behavior: accidentally including a key that should be stripped (or stripping one that shouldn't be) would silently change auth behavior. A 5-line test covering the key-removal case and the undefined-value-skipping case would give future maintainers a safety net. it('strips the named key and preserves everything else', () => {
const original = process.env
process.env = { HOME: '/home/user', ANTHROPIC_API_KEY: 'secret', PATH: '/usr/bin' }
const result = subprocessEnvWithoutKey('ANTHROPIC_API_KEY')
expect(result).toEqual({ HOME: '/home/user', PATH: '/usr/bin' })
process.env = original
})
The bump from
File: The current comment explains why the same // Codex doesn't emit a `file_change` item.started event, so we
// synthesise both tool_call and tool_result at completion time.
// Synthesise a tool_call + tool_result pair for the patch as a whole.Behavioral change worth calling out in changeset Stripping Issue ConformanceNo linked issue — same warning as previous iterations. Previous Review StatusAll items from iteration 2 have been addressed or remain as suggestions:
Review iteration: 3 | 2026-05-04 |
The set was populated on item.started but never read — neither guarded duplicate processing nor drove downstream logic, just leaked memory across long sessions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously web_search items emitted a tool_call on item.started but nothing on item.completed, leaving the lifecycle dangling — UIs that render tool calls would show a perpetually-pending web search. Codex's WebSearchItem doesn't expose result content (only `query`), so emit an empty tool_result to honour the contract every other tool-style item already follows. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewer flagged permissionMode + allowDangerouslySkipPermissions as possibly redundant. They aren't — the SDK requires both: the mode selects bypass and the boolean is the explicit acknowledgement gate (SDK throws if missing when the bypass mode is used). Document that in the call site so the next reader doesn't try to drop one. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewer flagged the SDK→NormalizedEvent conversion helpers as the most likely place for silent regressions when the underlying SDKs evolve. They are pure functions and don't need the SDK to be mocked, so cover each branch directly: - `sdkMessageToClaudeEntry` for system/init, system/compact_boundary, user, assistant (with usage + stop_reason), result (renames `duration_ms`), and SDK-only types that should return null. - `threadItemStartedToEvents` and `threadItemCompletedToEvents` for every ThreadItem variant — including the new web_search lifecycle pairing, file_change kind classification (add → file_write, mixed → file_edit), MCP success/failure, and command exit-code → isError. Helpers are exported with a doc comment marking them as test-only entry points so it's clear they're not part of the runner's public surface. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Strip ANTHROPIC_API_KEY / OPENAI_API_KEY from the env passed to the spawned `claude` / `codex` subprocesses so they fall back to whatever the user has set up via `claude login` / `codex login` (OAuth tokens in ~/.claude/.credentials.json and ~/.codex/auth.json). The parent process can still hold those keys for other uses — Horton talks to the Anthropic API directly via @anthropic-ai/sdk and depends on the key being in scope — they just don't leak into the coder's CLI subprocess. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The coder entity used to spawn the
claude/codexbinaries and mirror events by tailing JSONL files on disk. Replace that with two SDK-driven runners that iterate the official SDK async generators and forward each event throughagent-session-protocol's per-event normalisers, so the entity no longer depends on global CLI installs and stops touching the filesystem to discover sessions.SDKMessagetoClaudeEntryand routes it throughnormalizeClaudeEvent.ThreadItemdirectly intoNormalizedEvents — Codex's SDK envelope is higher-level than the rollout JSONL, so the JSONL normaliser doesn't apply.CodingSessionCliRunner.run()gainsonEventandonSessionIdcallbacks; the orchestrator loses its file-watcher / pre+post diff plumbing and just forwards events live.Note: changeset is set as minor, lmk if it needs to be patch instead.