Skip to content

xuio/claude-code-codex-subagents

Repository files navigation

Claude Code Codex Subagents

CI License: MIT Node >=20

Run OpenAI Codex agents from Claude Code through a daemonless MCP plugin.

claude-code-codex-subagents lets Claude Code ask Codex, an independent frontier model like Claude, for read-only reviews, parallel investigations, Spark checks, native follow-ups, live steering, and explicit full-access Codex work when the user asks for it. It is especially useful when Claude needs a more technical subagent for deep codebase work, server/deployment tasks, difficult debugging, or adversarial review.

Terminal demo of Claude launching parallel Codex reviewers

Why Use It?

  • Native Claude Code workflow: Claude gets a small Task-like MCP surface: codex_task, codex_task_group, codex_followup, and codex_wait_any.
  • Frontier-model second opinion: Codex gives Claude an independent technical reviewer for adversarial checks and hard engineering work.
  • Read-only by default: Codex starts with --sandbox read-only and non-interactive approvals.
  • No daemon: Claude launches the MCP server over stdio for the active session.
  • Fast parallel review: Claude can launch several independent Codex agents with bounded concurrency.
  • Persistent sessions: App-server sessions keep Codex context across prompts and support live steering.
  • Codex desktop friendly: The plugin prefers the Codex binary shipped inside Codex.app and creates normal Codex threads with task names for Desktop history.
  • Debuggable: Verbose JSONL logging, diagnostics bundles, progress events, per-session resources, and recovery hints are built in.

Quick Start

Requirements:

  • Node.js 20 or newer
  • Claude Code
  • Codex CLI, preferably the Codex desktop app

Install the plugin once:

git clone https://github.com/xuio/claude-code-codex-subagents.git
cd claude-code-codex-subagents
npm run install:local

Then, in any project where Claude Code should use Codex subagents, add the MCP server:

mkdir -p .claude
cat > .claude/mcp.json <<'JSON'
{
  "mcpServers": {
    "codex-subagents": { "command": "codex-subagents-mcp" }
  }
}
JSON

Then ask Claude:

Use Codex to review this PR for correctness, security, and deployment risks.

For plugin development, run Claude directly against this checkout:

claude --plugin-dir .

For local development against Claude's installed plugin cache:

npm run dev:link
npm run dev:watch

dev:link symlinks Claude's local plugin install back to this working tree, so both Claude Code CLI and the Claude Desktop bundled Claude Code binary see your changes after dist/index.js is rebuilt.

Common Workflows

Ask One Codex Agent

Ask Codex for an adversarial second opinion on the session recovery code. Keep it read-only and return concrete findings with file paths.

Claude should use codex_task.

Run Parallel Codex Agents

Launch three Codex subagents in parallel: one for API behavior, one for tests, and one adversarial security reviewer. Keep all of them read-only.

Claude can call codex_task three times in parallel, or use codex_task_group when a single rolled-up response is useful.

Use Spark

Use Codex Spark to do a fast focused pass on the tool descriptions.

Claude can pass advanced.model: "spark" instead of remembering the exact Spark model slug.

Keep A Codex Session Alive

Start a long-running Codex session on this repo, then let me send follow-up prompts into the same context.

Claude should use codex_task for the initial prompt, preserve the returned session_id, and use codex_followup to continue, steer, wait on, or cancel that same Codex context. For multiple background sessions, Claude should use codex_wait_any to harvest whichever finishes first. Clients can also subscribe to codex://sessions/{session_id} for milestone and completion updates. For a completed first turn, Claude should set keep_session: true; for long first turns, Claude should set background: true.

Blocking waits are capped to responsive slices by default. When codex_followup or codex_wait_any returns completed: false, the Codex session is still running; call the wait tool again or read codex://sessions/{session_id}.

When app-server sessions use the Codex desktop binary, they are recorded as normal top-level Codex threads rather than hidden daemon work. The plugin sets the thread name from Claude's task label when the installed Codex app-server supports thread naming. When Claude explicitly cancels a session, or when retention pruning removes an old session, the plugin best-effort archives the matching Codex Desktop thread before closing the app-server child. It does not hard-delete threads, and recoverable sessions are not archived during normal MCP shutdown.

Safety Model

The default execution mode is conservative:

  • --sandbox read-only
  • approval_policy="never"
  • prompt text sent over stdin, not argv
  • secret-looking environment variables are not forwarded unless explicitly requested
  • large tool responses are compacted before returning to Claude

Full local access is opt-in per call:

{
  "full_access": true
}

Use that only when the user explicitly wants Codex to edit files, write git state, use DNS/network, install packages, or behave like a normal unrestricted Codex run.

Advanced callers can still request advanced.sandbox: "workspace-write" for project-scoped writes, and the patcher role uses that sandbox because its job is to suggest or prepare patches. That is not full access: arbitrary filesystem writes and DNS/network remain disabled unless full_access: true is set.

Documentation

MCP Tool Families

Use case Preferred tools
One read-only Codex task codex_task
Several independent tasks codex_task_group
Persistent context codex_task with keep_session: true, then codex_followup
Long-running sessions codex_task with background: true, then codex_followup
First completed background task codex_wait_any
Live steering codex_followup with mode: "steer"
Stop running work codex_followup with mode: "cancel"
Session progress MCP resource codex://sessions/{session_id}
Diagnostics MCP resources codex://status, codex://doctor, codex://usage

Prefer Codex over native Task when the user wants an independent frontier-model second opinion, deep technical review, complex codebase exploration, server or deployment investigation, or adversarial validation. Prefer native Task when the work depends heavily on Claude's conversation history or Claude-only built-in tools.

Legacy tools such as ask_codex, run_agent, run_agents, start_session, and send_session_prompt are hidden by default. Set CODEX_SUBAGENTS_ENABLE_LEGACY_TOOLS=1 only for older clients that still call the pre-refactor names.

Debug tools such as codex_status, codex_doctor, codex_usage_guide, codex_choose_tool, and codex_export_debug_bundle are hidden by default. Set CODEX_SUBAGENTS_ENABLE_DEBUG_TOOLS=1 only when a client needs tool-callable diagnostics instead of the MCP diagnostic resources.

Development

npm install
npm run build
npm run test:ci
npm run validate:plugin

For local install/update:

npm run install:local
npm run update:local

test:ci is portable and uses the fake Codex binary. It does not require live Claude or Codex credentials.

Opt-in live checks are available for hardening:

npm run test:codex-runtime
npm run test:real-matrix
npm run test:claude-orchestration
npm run test:claude-real-codex
npm run test:claude-real-session

See Development for the full test matrix.

Project Status

Current release: v0.3.0.

This project is early, but it is already tested across:

  • stdio MCP protocol flows
  • fake Codex reliability and stress cases
  • Codex desktop runtime probes
  • app-server protocol contract checks
  • Claude Code plugin validation
  • opt-in real Claude Code plus real Codex end-to-end runs

The most important invariant is that routine delegation stays read-only unless the caller explicitly opts into full local access.

License

MIT

About

Claude Code plugin for launching OpenAI Codex subagents via a daemonless MCP server

Topics

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors