Skip to content

feat: bundle Codex and Claude Code CLIs#1093

Open
Nikhil (shadowfax92) wants to merge 6 commits into
mainfrom
fix/0601-1722-pony
Open

feat: bundle Codex and Claude Code CLIs#1093
Nikhil (shadowfax92) wants to merge 6 commits into
mainfrom
fix/0601-1722-pony

Conversation

@shadowfax92
Copy link
Copy Markdown
Contributor

Summary

  • add R2 upload commands for Codex rust-v0.136.0 and Claude Code 2.1.159 across macOS, Linux, and Windows targets
  • stage packaged Codex/Claude binaries into server resources and register them for macOS/Windows signing
  • prefer packaged native CLIs for health probes and ACP adapter subprocesses while preserving existing bundled Bun support
  • add cross-contract coverage so storage uploader keys match the production server resource manifest

Verification

  • cd packages/browseros && python -m unittest build.cli.storage_test build.common.server_binaries_test
  • cd packages/browseros-agent && bun test scripts/build/server/stage.test.ts apps/server/tests/lib/agents/bundled-bun.test.ts apps/server/tests/lib/agents/bundled-native-binary.test.ts apps/server/tests/lib/agents/adapter-detection.test.ts apps/server/tests/lib/agents/acpx-runtime.test.ts
  • cd packages/browseros-agent && bun run typecheck
  • cd packages/browseros-agent && bunx biome check apps/server/src/lib/agents/host-acp/bundled-bun.ts apps/server/tests/lib/agents/bundled-bun.test.ts scripts/build/config/server-prod-resources.json scripts/build/server/stage.test.ts apps/server/src/lib/agents/host-acp/bundled-native-binary.ts apps/server/src/lib/agents/host-acp/detection.ts apps/server/src/lib/agents/acpx/runtime.ts apps/server/tests/lib/agents/bundled-native-binary.test.ts apps/server/tests/lib/agents/adapter-detection.test.ts apps/server/tests/lib/agents/acpx-runtime.test.ts
  • cd packages/browseros && python -m black --check build/cli/storage.py build/cli/storage_test.py build/common/server_binaries.py build/common/server_binaries_test.py

@shadowfax92
Copy link
Copy Markdown
Contributor Author

Greptile (@greptileai) review

@shadowfax92
Copy link
Copy Markdown
Contributor Author

Claude (@claude) review

@claude
Copy link
Copy Markdown

claude Bot commented Jun 2, 2026

Claude finished Nikhil (@shadowfax92)'s task in 3m 15s —— View job


Code Review

  • Gather context and understand the PR
  • Read changed files
  • Review Python build/storage changes
  • Review TypeScript agent changes
  • Review test coverage
  • Post comprehensive review feedback

Overall this is a solid, well-structured PR. The layering — R2 upload tooling → build manifest → runtime binary resolution — is clean and the test coverage is thorough. A few findings below, ordered by severity.


Bugs / Correctness Issues

1. _extract_bun_file unconditionally chmods Windows .exe files

_extract_bun_file always calls dest.chmod(0o755) after extraction, even for the windows-x64-baseline target whose binary is bun.exe. Contrast this with _process_claude_code_platform, which correctly guards the chmod:

# storage.py — _process_claude_code_platform  ✅
if not binary_name.endswith(".exe"):
    local_path.chmod(0o755)

# storage.py — _extract_bun_file  ❌
dest.chmod(0o755)   # always, even for bun.exe

The chmod succeeds because uploads run from macOS/Linux, but it's inconsistent and would break if this ever ran on Windows. Fix this →


Dead Code

2. CodexPlatform.entrypoint is never read

CodexPlatform defines an entrypoint field (e.g., "bin/codex") for each platform, but _process_codex_platform ignores it — the actual entrypoint is read from codex-package.json inside the archive via _read_codex_package_entrypoint. The hardcoded values create a silent inconsistency risk: if Codex ever ships a platform with a different entrypoint, the dataclass will silently lie.

Options: (a) use the hardcoded value for validation/assertion against what the metadata says, or (b) remove the field entirely and trust the archive metadata. Fix this →


Minor / Nits

3. _build_claude_code_manifest sets binary_shas_upstream and r2_object_shas to the same dict

def _build_claude_code_manifest(...):
    return {
        ...
        "binary_shas_upstream": binary_shas,
        "r2_object_shas": binary_shas,   # same object
    }

For Claude Code this is technically correct (no zip extraction — upstream binary == R2 object), but it's subtly inconsistent with the Codex manifest where the two fields are semantically distinct (package_shas_upstream are tarball SHAs, r2_object_shas are extracted binary SHAs). A short comment explaining why they're the same here would prevent future confusion.

4. reasonFor "needs-install" message doesn't account for bundled native CLIs

packages/browseros-agent/apps/server/src/lib/agents/host-acp/detection.ts:302:

case 'needs-install':
  return `${input.displayName} adapter cannot launch because neither bundled Bun nor npx is available.`

This message is emitted when launch.source === 'none', which can now happen even when a bundled native CLI is present (the native CLI resolves the health probe, but the ACP adapter package still needs Bun or npx to launch). The message is technically accurate but misleading — a user sees "Codex cannot launch, no Bun or npx" while a Codex binary is sitting in bin/third_party. Worth clarifying the message or adding context.

5. nativeCli resolution IIFE is hard to read

detection.ts:100-107:

(async () =>
  resolveBundledNative({...}) ?? (await resolveBinary(config.nativeBinary)))().catch(() => null),

The closing () after the IIFE body is easy to miss. A named helper or a two-step const would make the fallback logic clearer:

async function resolveNativeCli(...): Promise<ResolvedHostBinary | null> {
  return resolveBundledNative({...}) ?? resolveBinary(config.nativeBinary)
}

Test Coverage

Coverage is good overall. A few gaps worth noting:

  • macOS native CLI staging not tested in stage.test.ts — Linux and Windows paths are exercised, but there's no equivalent test for darwin-arm64 / darwin-x64 native CLIs in the stage pipeline.
  • withBundledNativeBinaryPath with a missing directory is tested implicitly but not explicitly (the statSync catch path). Trivial to add.
  • The new bundled-native-binary.test.ts is well-structured and the Windows PATH key case sensitivity test is a nice touch.

Overall Assessment

The approach is sound: bundled native CLIs are preferred for health probes (avoiding PATH pollution) while Bun/npx remains the adapter launch mechanism. The upload pipeline for Claude Code and Codex follows the same pattern as Bun, with appropriate per-tool customization. The cross-contract storage key test in stage.test.ts is a great addition.

The blocking issue is the chmod inconsistency for Bun on Windows; the dead entrypoint field is a code quality concern worth fixing before merge.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Jun 2, 2026

Greptile Summary

This PR bundles Codex (rust-v0.136.0) and Claude Code (2.1.159) native CLIs alongside the existing Bun binary for macOS, Linux, and Windows, and wires them into the ACP adapter runtime so health probes and subprocesses prefer the packaged binaries before falling back to the host PATH.

  • Upload tooling (storage.py): New codex and claude-code CLI commands download release packages, verify SHA-256/size from upstream manifests, extract the native binary, and upload to R2 with rollback on failure; Bun upload is extended from macOS-only to Linux/Windows targets.
  • Runtime integration (bundled-native-binary.ts, detection.ts, runtime.ts): resolveBundledNativeBinary resolves the packaged binary before host-PATH lookup during adapter detection; withBundledNativeBinaryPath prepends the bin/third_party directory to the subprocess environment for all ACP adapter launches.
  • Build config (server-prod-resources.json, server_binaries.py): Resource manifest gains twelve new R2-sourced entries (Codex + Claude Code across five targets); both binaries are registered for macOS hardened-runtime signing and Windows Authenticode signing.

Confidence Score: 4/5

The change is safe to merge; all new code paths are covered by unit and integration tests, SHA verification guards every upload, and rollback logic protects R2 on partial failures.

The core runtime, detection, and upload logic is well-structured and well-tested. The two findings are a dead entrypoint field on CodexPlatform (never read; the extractor reads entrypoints dynamically from package metadata) and a manifest builder that aliases the same dict to two semantically distinct fields. Neither affects runtime correctness, but the dead field violates the repo's cleanup rule and the aliased dict contradicts the pattern established by the Bun and Codex manifests.

packages/browseros/build/cli/storage.py — unused CodexPlatform.entrypoint field and the _build_claude_code_manifest dict aliasing are both in this file.

Important Files Changed

Filename Overview
packages/browseros-agent/apps/server/src/lib/agents/host-acp/bundled-native-binary.ts New module to resolve and prepend bundled Codex/Claude native binaries to PATH, with correct Windows path key discovery and duplicate-path deduplication.
packages/browseros-agent/apps/server/src/lib/agents/host-acp/bundled-bun.ts Extended Bun resolver to support Linux and Windows (bun.exe) in addition to macOS, with platform-aware executable-bit check skipped on win32.
packages/browseros-agent/apps/server/src/lib/agents/host-acp/detection.ts Detection now prefers bundled native CLIs over host PATH resolution via an async IIFE with nullish-coalescing fallback; logic is correct.
packages/browseros-agent/apps/server/src/lib/agents/acpx/runtime.ts Prepends bundled native CLI directory to commandEnv before wrapping ACP adapter commands; correctly threads updated env through the Bun env path.
packages/browseros-agent/scripts/build/config/server-prod-resources.json Adds R2 resource entries for Bun (Linux/Windows), Codex, and Claude Code across macOS/Linux/Windows targets with correct OS/arch filters and executable flags.
packages/browseros/build/cli/storage.py Adds upload_codex and upload_claude_code CLI commands with SHA/size verification, rollback support, and cross-contract test; CodexPlatform.entrypoint field is defined but never read.
packages/browseros/build/common/server_binaries.py Adds codex and claude to macOS signing dict (runtime, no extra entitlements) and codex.exe/claude.exe to Windows signing list; matches resource manifest layout.

Sequence Diagram

sequenceDiagram
    participant RT as AcpxRuntime
    participant NB as withBundledNativeBinaryPath
    participant BN as resolveBundledNativeBinary
    participant DH as detectHostAdapter
    participant RB as resolveHostBinary (fallback)

    RT->>NB: prepend bin/third_party to PATH (commandEnv)
    NB-->>RT: updated env

    RT->>DH: detect adapter (resourcesDir, env)
    DH->>BN: resolveBundledNativeBinary(adapter, resourcesDir)
    alt bundled binary found
        BN-->>DH: "{path, env} with prepended dir"
    else not found
        BN-->>DH: null
        DH->>RB: resolveHostBinary(config.nativeBinary)
        RB-->>DH: "{path, env} or null"
    end
    DH-->>RT: nativeCli + launch config

    RT->>RT: wrapCommandWithEnv(launch.command, commandEnv)
Loading
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
packages/browseros/build/cli/storage.py:101-136
The `entrypoint` field is declared in `CodexPlatform` and populated in every `CODEX_PLATFORMS` entry, but it is never read anywhere in the codebase — `_extract_codex_file` discovers the entrypoint dynamically by reading `codex-package.json` from the tarball rather than using this static value. Per the repo rule on removing unused/dead code, the field should be dropped.

```suggestion
@dataclass(frozen=True)
class CodexPlatform:
    """BrowserOS target plus the upstream Codex package coordinates."""

    target: str
    upstream: str

CODEX_PLATFORMS: Tuple[CodexPlatform, ...] = (
    CodexPlatform(
        target="darwin-arm64",
        upstream="aarch64-apple-darwin",
    ),
    CodexPlatform(
        target="darwin-x64",
        upstream="x86_64-apple-darwin",
    ),
    CodexPlatform(
        target="linux-arm64",
        upstream="aarch64-unknown-linux-musl",
    ),
    CodexPlatform(
        target="linux-x64",
        upstream="x86_64-unknown-linux-musl",
    ),
    CodexPlatform(
        target="windows-x64",
        upstream="x86_64-pc-windows-msvc",
    ),
)
```

### Issue 2 of 2
packages/browseros/build/cli/storage.py:1650-1657
`binary_shas_upstream` and `r2_object_shas` are assigned the same dictionary reference. For Bun and Codex the analogous fields carry *different* values (archive checksum vs extracted binary checksum), so a reader will expect these to differ here too. While they happen to be equal for Claude Code (the binary is uploaded without transformation), sharing the same object makes the manifest structure misleading and means any future in-place mutation of one key silently affects the other.

Reviews (1): Last reviewed commit: "fix(build): preserve bundled Bun platfor..." | Re-trigger Greptile

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Jun 2, 2026

Greptile Summary

This PR bundles Codex (rust-v0.136.0) and Claude Code (2.1.159) CLIs alongside the existing Bun runtime, extending the existing bundled-binary infrastructure to cover Linux and Windows in addition to macOS. The runtime now prepends bin/third_party to the PATH of every ACP adapter subprocess and prefers the bundled binary over the host PATH for health probes and version checks.

  • Python upload tooling (storage.py): new codex and claude-code CLI commands download verified upstream releases, extract the platform binary, and push it to R2; Bun handling is refactored from BunArch to BunTarget to support the same multi-platform model.
  • TypeScript server runtime (bundled-native-binary.ts, detection.ts, runtime.ts): new resolveBundledNativeBinary and withBundledNativeBinaryPath helpers inject the packaged CLIs into subprocess environments; detectHostAdapter now prefers bundled binaries before falling back to host PATH resolution.
  • Build config (server-prod-resources.json, server_binaries.py): all new binary artefacts are registered for staging, macOS signing, and Windows code-signing.

Confidence Score: 4/5

Safe to merge; all new code paths have tests and the bundled-binary fallback is non-destructive.

The TypeScript changes are clean and well-tested. The main concerns are in storage.py: CodexPlatform.entrypoint is a dead field never read by the extraction logic, and _extract_bun_file / _extract_codex_file inconsistently apply chmod to Windows .exe files while the Claude Code path correctly skips it. Neither causes a runtime failure.

packages/browseros/build/cli/storage.py — dead CodexPlatform.entrypoint field and inconsistent chmod calls for Windows EXEs.

Important Files Changed

Filename Overview
packages/browseros/build/cli/storage.py Adds Codex and Claude Code upload commands; CodexPlatform.entrypoint is dead code, chmod inconsistency for .exe files, and binary_shas_upstream/r2_object_shas aliasing.
packages/browseros-agent/apps/server/src/lib/agents/host-acp/bundled-native-binary.ts New module: resolves and injects bundled native Codex/Claude binaries into PATH; well-implemented with Windows PATH key detection and deduplication.
packages/browseros-agent/apps/server/src/lib/agents/host-acp/detection.ts Prefers bundled native binary over host PATH before falling back to resolveBinary; async IIFE with .catch is correct.
packages/browseros-agent/apps/server/src/lib/agents/acpx/runtime.ts Injects bundled native CLI directory into commandEnv before launching host ACP adapters; layering with bundled Bun env is correct.
packages/browseros-agent/scripts/build/config/server-prod-resources.json Adds Linux/Windows Bun entries and Codex/Claude Code resources for all platforms; R2 keys align with Python upload tool expected paths.
packages/browseros/build/common/server_binaries.py Registers codex and claude for macOS signing (plain hardened runtime, no entitlements) and adds Windows EXE paths.

Sequence Diagram

sequenceDiagram
    participant Runtime as AcpxRuntime
    participant Detection as detectHostAdapter
    participant BundledNative as resolveBundledNativeBinary
    participant BundledBun as resolveBundledBun
    participant HostPath as resolveBinary (host PATH)
    participant Subprocess as ACP Adapter Process

    Runtime->>Detection: "detectHostAdapter(adapter, {resourcesDir, ...})"
    Detection->>BundledNative: "resolveBundledNativeBinary({adapter, resourcesDir})"
    alt bundled binary present and executable
        BundledNative-->>Detection: "{path, env with dir prepended}"
        Detection-->>Runtime: "nativeCli = bundled binary"
    else bundled binary absent
        BundledNative-->>Detection: null
        Detection->>HostPath: resolveBinary(config.nativeBinary)
        HostPath-->>Detection: nativeCli from host PATH
        Detection-->>Runtime: "nativeCli = host binary"
    end
    Runtime->>BundledNative: "withBundledNativeBinaryPath({env, resourcesDir})"
    Note over Runtime: Prepends bin/third_party to commandEnv PATH
    alt addBundledBunAdapterEnv
        Runtime->>BundledBun: withBundledBunAcpAdapterEnv(commandEnv, browserosDir)
        BundledBun-->>Runtime: env with bundled bun prepended
    end
    Runtime->>Subprocess: spawn(command, enriched env)
Loading
Prompt To Fix All With AI
Fix the following 4 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 4
packages/browseros/build/cli/storage.py:102-108
**Unused `entrypoint` field on `CodexPlatform`**

`CodexPlatform.entrypoint` is declared but never read anywhere in the codebase. `_extract_codex_file` derives the entrypoint by parsing `codex-package.json` inside the tarball via `_read_codex_package_entrypoint`, so the field on the dataclass is completely ignored at runtime. Every instantiation of `CodexPlatform` passes a value for it (e.g. `entrypoint="bin/codex"`) that is silently discarded, creating a misleading contract. The field should be removed.

### Issue 2 of 4
packages/browseros/build/cli/storage.py:747-749
`_extract_bun_file` unconditionally calls `dest.chmod(0o755)`, including when `binary_name` is `bun.exe` for the Windows target. Only `_process_claude_code_platform` correctly guards against setting execute bits on `.exe` files. While harmless on a Unix build server, this is inconsistent with the rest of the codebase.

```suggestion
            if not str(dest).endswith(".exe"):
                dest.chmod(0o755)
            return
    raise RuntimeError(f"{binary_name} not found in Bun zip")
```

### Issue 3 of 4
packages/browseros/build/cli/storage.py:769-771
`_extract_codex_file` always applies `dest.chmod(0o755)`, including for `codex-windows-x64.exe`. This is inconsistent with how `_process_claude_code_platform` handles Windows EXEs. The entrypoint string returned by `_read_codex_package_entrypoint` already ends in `.exe` for Windows packages, so the same guard can be applied here.

```suggestion
            if not entrypoint.endswith(".exe"):
                dest.chmod(0o755)
            return
    raise RuntimeError(f"Codex entrypoint {entrypoint} not found in package")
```

### Issue 4 of 4
packages/browseros/build/cli/storage.py:912-916
**`binary_shas_upstream` and `r2_object_shas` reference the same dict object**

Both manifest keys are assigned the same `binary_shas` dict. For Claude Code the values happen to be identical (no transformation), but aliasing the same object means any future in-place mutation would silently update both fields. Passing separate copies would make the intent clearer.

Reviews (2): Last reviewed commit: "fix(build): preserve bundled Bun platfor..." | Re-trigger Greptile

Comment thread packages/browseros/build/cli/storage.py
@shadowfax92
Copy link
Copy Markdown
Contributor Author

Nikhil (shadowfax92) commented Jun 2, 2026

Addressed review feedback in 19888b3:

  • Removed unused CodexPlatform.entrypoint.
  • Skipped chmod for Bun/Codex Windows .exe extraction and added tests.
  • Copied Claude manifest dicts instead of aliasing shared objects.
  • Extracted bundled/native CLI resolution into a named helper.
  • Clarified the missing adapter package runner reason when a native CLI is present.

Verification:

  • cd packages/browseros && python -m unittest build.cli.storage_test build.common.server_binaries_test
  • cd packages/browseros-agent && bun test scripts/build/server/stage.test.ts apps/server/tests/lib/agents/bundled-bun.test.ts apps/server/tests/lib/agents/bundled-native-binary.test.ts apps/server/tests/lib/agents/adapter-detection.test.ts apps/server/tests/lib/agents/acpx-runtime.test.ts
  • cd packages/browseros-agent && bun run typecheck
  • cd packages/browseros-agent && bunx biome check scripts/build/server/stage.ts scripts/build/server/stage.test.ts apps/server/src/lib/agents/host-acp/detection.ts apps/server/src/lib/agents/host-acp/bundled-bun.ts apps/server/src/lib/agents/host-acp/bundled-native-binary.ts apps/server/tests/lib/agents/adapter-detection.test.ts apps/server/tests/lib/agents/bundled-bun.test.ts apps/server/tests/lib/agents/bundled-native-binary.test.ts apps/server/tests/lib/agents/acpx-runtime.test.ts
  • cd packages/browseros && python -m black --check build/cli/storage.py build/cli/storage_test.py build/common/server_binaries.py build/common/server_binaries_test.py
  • git diff --check

Also rebased on latest main to pick up BrowserOS#1090 and BrowserOS#1094; Git dropped the redundant bundled-Bun compatibility commit because it is now upstream.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 2, 2026

✅ Tests passed — 1112/1117

Suite Passed Failed Skipped
agent 99/99 0 0
build 17/17 0 0
eval 95/95 0 0
server-agent 246/246 0 0
server-api 83/83 0 0
server-browser 4/4 0 0
server-integration 9/10 0 1
server-lib 271/272 0 1
server-root 59/62 0 3
server-tools 229/229 0 0

View workflow run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant