devbox is an isolated, containerized development environment. Each project runs in its own Docker container with strict network enforcement, API observability, and all AI/dev tools pre-installed. The container is infrastructure — users exec into it to run claude, opencode, nvim, or any other tool.
- Isolation by default: each project gets its own container with no host filesystem access beyond the project directory.
- Defense in depth: dual-layer network enforcement (iptables + mitmproxy) ensures no unapproved egress, even from compromised agents or plugins.
- Tool-agnostic: Claude Code, OpenCode, Gemini CLI, Codex, and nvim all run inside the same isolated environment. No tool is privileged over another.
- Observable: every API call is logged to SQLite, queryable via `devbox logs`.
- Zero trust for agents: credentials injected at runtime, config mounted read-only, network policy controlled exclusively from the host.
devbox builds on three open-source projects (all MIT-licensed):
- claudebox (RchGrav) — profile system, per-project container architecture, allowlist CLI, DX patterns.
- agent-sandbox (mattolson) — dual-layer network enforcement: mitmproxy sidecar + iptables.
- claude-container (nezhar) — API logging proxy pattern and SQLite observability.
Host (macOS / Linux)
│
├── devbox CLI (bash) ← orchestration, secrets, allowlist
│ │
│ ├── Docker Compose
│ │ │
│ │ ▼
│ │ ┌─────────────────────────────────────────────────┐
│ │ │ sandbox network (internal: true) │
│ │ │ │
│ │ │ ┌───────────────────────────────────────────┐ │
│ │ │ │ Agent Container (per-project) │ │
│ │ │ │ │ │
│ │ │ │ Tools: claude, opencode, nvim, gemini, │ │
│ │ │ │ codex, gh, zsh, tmux │ │
│ │ │ │ │ │
│ │ │ │ iptables: OUTPUT → DROP │ │
│ │ │ │ except → bridge:8080 (proxy) │ │
│ │ │ │ except → 127.0.0.11:53 (DNS) │ │
│ │ │ │ │ │
│ │ │ │ Mounts: │ │
│ │ │ │ /workspace ← project dir (rw) │ │
│ │ │ │ /devbox ← global config (ro) │ │
│ │ │ └──────────────┬────────────────────────────┘ │
│ │ │ │ HTTP_PROXY / HTTPS_PROXY │
│ │ │ ▼ │
│ │ │ ┌───────────────────────────────────────────┐ │
│ │ │ │ Proxy Sidecar │ │
│ │ │ │ │ │
│ │ │ │ mitmproxy:8080 │ │
│ │ │ │ ├── enforcer.py (allowlist) │ │
│ │ │ │ ├── injector.py (credentials) │ │
│ │ │ │ ├── notifier.py (cmux) │ │
│ │ │ │ └── logger.py (SQLite) │ │
│ │ │ │ │ │
│ │ │ │ Mounts: │ │
│ │ │ │ /proxy/policy.yml ← allowlist (ro) │ │
│ │ │ │ /data/api.db ← API log (rw) │ │
│ │ │ └───────────────────────────────────────────┘ │
│ │ │ │ │
│ │ └─────────────────┼───────────────────────────────┘
│ │ │
│ │ ┌─────────────────▼───────────────────────────────┐
│ │ │ external network │
│ │ │ (proxy only — internet access) │
│ │ └─────────────────────────────────────────────────┘
│ │
│ └── devbox logs, devbox allowlist, devbox secrets
│ ↕ direct host filesystem (SQLite, policy.yml, .env)
│
└── ~/.devbox/<hash>/ ← per-project data (logs, history, secrets)
~/.config/devbox/ ← global config (opencode, .private/)
Every project runs exactly two containers orchestrated by Docker Compose:
- Agent container (`devbox-agent`): Ubuntu 24.04 base with all dev tools. Lives exclusively on the `sandbox` network (`internal: true`), meaning it has no route to the internet. All outbound traffic is further locked down by iptables. Users exec into this container via `devbox resume`.
- Proxy sidecar (`devbox-proxy`): Python 3.12-slim with mitmproxy. Bridges the `sandbox` and `external` networks and is the sole egress path for the agent. Runs four chained addons: domain enforcement, credential injection, cmux notification, and request logging.
Docker Compose defines two networks:
- `sandbox`: `internal: true` bridge. Only the agent and proxy join. The `internal` flag means Docker does not create a gateway, so containers on this network cannot reach the internet even without iptables.
- `external`: standard bridge. Only the proxy joins. This gives the proxy (and only the proxy) internet access.
The agent container connects to sandbox only. The proxy connects to both. This is the foundation of the isolation model — even if iptables is somehow bypassed, Docker's network topology prevents direct egress.
The agent container is not ephemeral. It starts once per devbox start, holds itself open with tail -f /dev/null, and accepts multiple concurrent exec sessions. Users open shells with devbox resume <name> (which runs docker compose exec agent gosu devbox zsh). This "exec-in" model means:
- Multiple tmux panes can each `devbox resume <name>` into the same environment
- All tools share the same firewall, proxy, secrets, and filesystem
- The container persists until `devbox stop`: workspace state, installed packages, and shell history survive across shell sessions
- No port mapping, no SSH, no serve/attach complexity
On macOS, OrbStack is the recommended Docker runtime over Docker Desktop. The key reason: OrbStack's Linux VM uses a shared filesystem that correctly handles Unix domain sockets, which Docker Desktop's VirtioFS/gRPC-FUSE layer does not.
This matters for devbox because tools like cmux (the Claude Code multiplexer) communicate via Unix sockets. When running under Docker Desktop, these sockets silently fail or hang because VirtioFS doesn't fully support AF_UNIX over the VM boundary. OrbStack uses a purpose-built filesystem layer that handles sockets natively, so cmux sessions work correctly inside devbox containers.
Additional OrbStack advantages:
- Lower resource usage: smaller memory footprint than Docker Desktop's VM
- Faster startup: containers launch in ~1s vs. 3-5s
- Native `docker` CLI: drop-in compatible, no wrapper shims
- Rosetta x86 emulation: transparent emulation for x86 images on Apple Silicon
If you encounter hanging or broken tool sessions inside devbox on macOS, switching from Docker Desktop to OrbStack is the first troubleshooting step.
Standard Docker Engine (24.0+) with Compose v2 works without modifications. No VM layer means Unix sockets, iptables, and all kernel interfaces work natively.
The project directory and a small set of state directories are mounted read-write. Everything else is read-only or ephemeral:
| Mount | Access | Purpose |
|---|---|---|
| `/workspace` | rw | Project source code (bind-mount from host) |
| `/devbox` | ro | Global config: OpenCode config, private overlay |
| `/devbox/.private` | ro | Private config overlay |
| `/run/proxy-ca` | ro | Shared proxy CA certificate (Docker volume) |
| `/data/history` | rw | Persistent shell history |
| `/home/devbox/.claude` | rw | Claude Code state (credentials, conversations) |
| `/home/devbox/.opencode-mem/project` | rw | OpenCode project memory |
| `/home/devbox/.opencode-mem/shared` | rw | Shared OpenCode memory (Docker volume) |
| `/tmp` | rw (tmpfs) | Ephemeral temp, 256 MB limit |
No access to ~/.ssh, ~/.aws, ~/.config, or any other host directory. The container cannot read or modify host state beyond the project and the state mounts listed above.
Two independent mechanisms block unauthorized egress:
The agent container's entrypoint initializes iptables before any user code runs. All three chains are locked down:
INPUT chain: DROP (default deny inbound)
1. lo — loopback
2. ESTABLISHED — responses to outbound connections
FORWARD chain: DROP (prevents use as network gateway)
OUTPUT chain: DROP (default deny outbound)
1. lo — loopback (localhost)
2. ESTABLISHED — responses to accepted connections
3. bridge:8080 — proxy sidecar (the only egress path)
4. 127.0.0.11:53 — Docker's embedded DNS resolver (UDP + TCP)
5. ICMP → DROP — blocks covert channels and network reconnaissance
IPv6: all chains DROP (fail-closed — if ip6tables setup fails, firewall_init fails)
Rule order matters — loopback first (tools need localhost), then conntrack for performance, then the proxy exception, then DNS. ICMP is explicitly dropped last (it would be caught by the default DROP policy, but explicit rules are self-documenting and survive policy changes). The bridge subnet is auto-detected from ip route at container startup and validated against a strict CIDR pattern with octet range checking.
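The two-step subnet validation described above (a format pattern followed by a numeric range check) can be sketched as follows. This is an illustrative Python version, not the project's bash code; the function name `is_valid_ipv4_cidr` and the exact pattern are assumptions.

```python
import re

# Hypothetical format pattern: four dotted decimal groups and a prefix length.
_CIDR_RE = re.compile(r"([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})/([0-9]{1,2})")


def is_valid_ipv4_cidr(value: str) -> bool:
    """Return True only for a well-formed IPv4 CIDR such as 172.18.0.0/16."""
    # Step 1: format check. fullmatch rejects trailing newlines and extra text.
    match = _CIDR_RE.fullmatch(value)
    if match is None:
        return False
    # Step 2: semantic check. Octets must fit in a byte, the prefix in 0..32.
    *octets, prefix = (int(group) for group in match.groups())
    return all(octet <= 255 for octet in octets) and prefix <= 32
```

The range check matters because a pattern alone accepts strings like `999.1.1.1/40`, which would then be handed to iptables as a rule argument.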
INPUT DROP prevents external connections from reaching services inside the container if network configuration is ever misconfigured. FORWARD DROP prevents the container from being used as a network gateway. These are defense-in-depth — the internal: true network should prevent both scenarios, but firewall rules survive Docker bugs.
iptables runs as root during Phase 1 of the entrypoint, before dropping to the unprivileged devbox user. The NET_ADMIN capability is required for this — it's the only elevated capability the container has (all others are dropped via cap_drop: ALL).
IPv6 fail-closed: If ip6tables -P OUTPUT DROP fails, firewall_init returns non-zero and the container refuses to start. Earlier versions silently continued with partial enforcement — this was changed to prevent IPv6 bypass.
Health check: The agent's Docker health check verifies the iptables DNS rule exists (iptables -C OUTPUT -d 127.0.0.11 -p udp --dport 53 -j ACCEPT). If iptables isn't active, the container reports unhealthy. Note: iptables -C requires NET_ADMIN. The health check runs via docker exec (not as a child of the entrypoint), so it gets the container's original capability set — unaffected by the setpriv bounding set drop in the main process tree.
The proxy sidecar runs enforcer.py, which intercepts every HTTP request and HTTPS CONNECT tunnel:
- Reads the allowlist from `/proxy/policy.yml` (YAML format)
- For each request, checks if `flow.request.pretty_host` (and optionally port) matches any allowed entry
- Entry syntax:
  - `api.github.com`: hostname, any port
  - `host.docker.internal:11434`: hostname, specific port only
  - `*.github.com`: wildcard, any port (also matches `github.com` itself)
  - `*.github.com:443`: wildcard, specific port only
  - `[::1]:8080`: bracketed IPv6 with port
- Non-matching requests get a `403 Forbidden` with body: `BLOCKED by devbox enforcer: <host>:<port> is not in the allowlist`
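The entry syntax above can be illustrated with a small matcher. This is a minimal sketch that assumes entries are plain strings in the five forms listed; the helper names (`split_host_port`, `entry_matches`, `is_allowed`) are hypothetical and are not taken from `enforcer.py`.

```python
from typing import List, Optional, Tuple


def split_host_port(entry: str) -> Tuple[str, Optional[int]]:
    """Split 'host', 'host:port', or '[v6]:port' into (host, port or None)."""
    entry = entry.strip().lower()
    if entry.startswith("["):  # bracketed IPv6, e.g. [::1]:8080
        host, _, rest = entry[1:].partition("]")
        return host, int(rest[1:]) if rest.startswith(":") else None
    host, sep, port = entry.rpartition(":")
    if sep and port.isdigit():
        return host, int(port)
    return entry, None  # no port given: any port is acceptable


def entry_matches(entry: str, host: str, port: int) -> bool:
    pattern, allowed_port = split_host_port(entry)
    host = host.lower()  # matching is case-insensitive
    if allowed_port is not None and allowed_port != port:
        return False
    if pattern.startswith("*."):
        base = pattern[2:]
        # A wildcard matches the bare domain and any subdomain of it.
        return host == base or host.endswith("." + base)
    return host == pattern


def is_allowed(allowlist: List[str], host: str, port: int) -> bool:
    # An empty allowlist denies everything, which keeps the check fail-closed.
    return any(entry_matches(entry, host, port) for entry in allowlist)
```

Comparing against `"." + base` rather than `base` alone is what keeps `evilgithub.com` from matching `*.github.com`.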
The proxy CA certificate is generated on first run. The private key is persisted on a proxy-only volume (proxy-ca-keypair), while only the public certificate is shared with the agent via a separate volume (proxy-ca-cert, mounted read-only). The agent entrypoint installs this certificate into the system trust store. This gives mitmproxy full HTTPS visibility — it can inspect, log, and enforce even for TLS traffic — without exposing the CA private key to the agent.
Policy hot-reload: The enforcer checks the policy file's mtime every DEVBOX_RELOAD_INTERVAL seconds (default: 30). When devbox allowlist add modifies the file on the host, the proxy picks up the change without restart.
| Process behavior | Stopped by |
|---|---|
| Respects `HTTP_PROXY` env | Proxy enforcer |
| Ignores proxy, connects directly | iptables OUTPUT DROP |
| Attempts DNS to external resolver | iptables OUTPUT DROP (only 127.0.0.11 allowed) |
| Attempts ICMP covert channel | iptables ICMP DROP |
| Attempts IPv6 bypass | ip6tables OUTPUT DROP (fail-closed) |
| Listens for inbound connections | iptables INPUT DROP |
| Attempts network forwarding | iptables FORWARD DROP |
| Container escape (kernel exploit) | Defense-in-depth: read_only rootfs, cap_drop: ALL, NET_ADMIN dropped from bounding set after init. Keep Docker current. |
API keys are never baked into Docker images. They're injected at runtime via Docker Compose env_file:
env_file:
- ${DEVBOX_SECRETS_FILE} # global secrets (~/.devbox/secrets/.env)
- ${DEVBOX_PROJECT_SECRETS_FILE} # per-project override

Secrets files are created with umask 077 (mode 600). The CLI validates permissions and warns if they have been loosened. File locking (flock) prevents concurrent modification races.
When API keys are present in the user's secrets files, devbox automatically activates proxy-layer credential injection:
- Real API keys are passed only to the proxy sidecar (via `DEVBOX_INJECT_*` env vars)
- The agent container receives phantom tokens (`sk-devbox-phantom-not-a-real-key`) that satisfy tool startup checks but have no real value
- The proxy's `injector.py` addon strips any auth headers the agent sends and injects real credentials based on the destination domain
This means a compromised agent cannot exfiltrate API keys — it literally does not possess them. The provider-to-header mapping is hardcoded in the injector (not configurable by the agent), preventing credential routing to arbitrary domains.
Supported providers: Anthropic (x-api-key), OpenAI (Authorization: Bearer), Gemini (x-goog-api-key), OpenRouter (Authorization: Bearer), GitHub API (Authorization: Bearer).
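A rough sketch of the strip-and-inject step, written against plain dictionaries rather than mitmproxy flow objects. The header names come from the provider list above. The hostnames are assumed from the default allowlist, and the function name, the table layout, and the choice to leave unmapped hosts untouched are all assumptions rather than details of `injector.py`.

```python
# Hypothetical fixed mapping: destination host -> (header name, value template).
PROVIDER_HEADERS = {
    "api.anthropic.com": ("x-api-key", "{key}"),
    "api.openai.com": ("authorization", "Bearer {key}"),
    "generativelanguage.googleapis.com": ("x-goog-api-key", "{key}"),
    "openrouter.ai": ("authorization", "Bearer {key}"),
    "api.github.com": ("authorization", "Bearer {key}"),
}
AUTH_HEADERS = {"authorization", "x-api-key", "x-goog-api-key"}


def inject_credentials(host: str, headers: dict, real_keys: dict) -> dict:
    """Return a copy of headers with agent auth replaced by the real credential."""
    rule = PROVIDER_HEADERS.get(host.lower())
    key = real_keys.get(host.lower())
    if rule is None or not key:
        return dict(headers)  # assumption: unmapped hosts pass through unchanged
    # Drop whatever auth the agent sent (for example a phantom token).
    cleaned = {k: v for k, v in headers.items() if k.lower() not in AUTH_HEADERS}
    name, template = rule
    cleaned[name] = template.format(key=key)
    return cleaned
```

Because the mapping is keyed by destination host and lives only in the proxy, the agent cannot steer a real key toward a domain of its choosing.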
GH_TOKEN exception: GH_TOKEN remains in the agent environment because git credential helpers need it client-side for HTTPS operations. The injector still handles api.github.com requests, but the token is also available to the agent. This is an accepted trade-off — git operations require the token, and github.com is already in the allowlist.
Escape hatch: Set DEVBOX_CREDENTIAL_INJECTION=false in the environment or .devboxrc to disable injection and pass real keys to the agent (pre-v0.4 behavior).
The logging proxy captures all outbound API request/response bodies. If a prompt injection tricks an agent into exfiltrating data in a request body, the log records it — visible via devbox logs.
cap_drop: [ALL] # Drop all Linux capabilities
cap_add: [NET_ADMIN, SETUID, SETGID, SETPCAP] # Firewall, gosu, cap drop
security_opt: [no-new-privileges:true] # Prevent privilege escalation
read_only: true # Immutable root filesystem
tmpfs: /tmp (256MB) # Ephemeral temp, size-limited
pids: 4096 # Prevent fork bombs (allows concurrent AI tools)
memory: 8G (configurable) # OOM protection
cpus: 4.0 (configurable) # CPU quota
restart: unless-stopped # Auto-recover from crashes

SETUID and SETGID are required for gosu (the entrypoint drops from root to the devbox user after firewall setup). SETPCAP is required for setpriv to drop NET_ADMIN from the bounding set after firewall init. no-new-privileges prevents any process from gaining capabilities beyond what it was started with.
Capability drop after init: After firewall_init() completes, the entrypoint uses setpriv --bounding-set -net_admin --inh-caps -net_admin to irrevocably drop NET_ADMIN from the bounding set before switching to the unprivileged user. Once removed from the bounding set, no child process can ever regain NET_ADMIN — the kernel enforces this. The iptables rules become immutable from inside the container's main process tree.
| Container path | Host path | Mode | Purpose |
|---|---|---|---|
| `/workspace` | `~/projects/<name>` | rw | Project source code |
| `/devbox` | `~/.config/devbox` | ro | Global config, OpenCode |
| `/devbox/.private` | `~/configs/devbox` | ro | Private overlay (settings, hooks, skills) |
| `/data/history` | `~/.devbox/<hash>/history` | rw | Shell history |
| `/home/devbox/.claude` | `~/.devbox/claude` | rw | Claude Code state (credentials, conversations, plugins) |
| `/home/devbox/.opencode-mem/project` | `~/.devbox/<hash>/memory` | rw | OpenCode memory |
| `/home/devbox/.opencode-mem/shared` | Docker volume | rw | Shared OpenCode memory |
| `/run/proxy-ca` | Docker volume | ro | Proxy CA certificate |
Read-only rootfs (read_only: true) prevents persistence attacks — a compromised agent cannot modify system binaries, install backdoors, or tamper with the firewall scripts. Writable paths are constrained to tmpfs mounts and the bind mounts above.
| Threat | Mitigation | Residual risk |
|---|---|---|
| Agent exfiltrates data | Dual-layer network + domain allowlist | Can still send to allowed domains |
| Prompt injection via project files | Network layer limits destinations; credentials not in agent env | In-boundary actions cannot be prevented |
| Credential theft (API keys) | Proxy-layer injection: agent gets phantom tokens, real keys never enter container | GH_TOKEN remains in agent for git operations |
| Claude OAuth token access | `~/.claude/.credentials.json` is rw, so the agent can read OAuth tokens | Accepted: agent already has API access via the proxy; the token grants no additional access beyond what the proxy allows |
| Malware persistence | `read_only: true` rootfs: system paths immutable | Agent can persist in writable mounts (`~/.claude/`, `/workspace`) |
| Git token over-scope | N/A (user responsibility) | Token may grant access beyond mounted project |
| Container escape | `cap_drop: ALL`, `no-new-privileges`, NET_ADMIN dropped after init | Kernel exploits remain possible |
| Firewall modification | NET_ADMIN irrevocably dropped from bounding set after firewall init | `docker exec -u 0` retains container-level caps |
| DNS tunneling | DNS restricted to Docker resolver (127.0.0.11) | Docker resolver is trusted |
| IPv6 bypass | `ip6tables -P OUTPUT DROP` (fail-closed: container refuses to start on failure) | Requires ip6tables binary in container |
| Project file tampering | `/workspace` is rw by design (agent needs to edit code) | Malicious commits, git hook injection possible |
| Claude hooks/settings tampering | `~/.claude/` is rw, so the agent can modify hooks | Private overlay re-applies settings on restart |
mitmproxy loads four addons in order:
1. `enforcer.py`: Domain allowlist enforcement
2. `injector.py`: Proxy-layer credential injection (strip agent auth headers, inject real credentials)
3. `notifier.py`: cmux integration (sidebar status, notifications via host relay)
4. `logger.py`: SQLite request/response logging
All are loaded via mitmdump -s enforcer.py -s injector.py -s notifier.py -s logger.py. Order matters: the enforcer blocks disallowed domains before the injector touches headers (blocked flows never receive credentials). The logger runs last, recording the final state.
Policy loading: Reads /proxy/policy.yml using PyYAML's safe_load. Validates file size (max 1 MB), structure (allowed key must be a list), and wildcard patterns (only *. prefix allowed, no multi-wildcard). Returns empty list on any error — fail-closed design.
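The validation rules in this paragraph can be sketched as below. The sketch assumes the file has already been read as bytes and parsed (for example with PyYAML's `safe_load`) into `parsed`, and that a single bad pattern rejects the whole policy; the function names are hypothetical.

```python
MAX_POLICY_BYTES = 1024 * 1024  # 1 MB size limit from the text


def valid_pattern(entry) -> bool:
    """Accept plain hostnames and patterns with a single leading '*.'."""
    if not isinstance(entry, str) or not entry:
        return False
    if "*" not in entry:
        return True
    return entry.startswith("*.") and entry.count("*") == 1


def validate_policy(raw: bytes, parsed) -> list:
    """Return the allowed entries, or [] on any problem (fail-closed)."""
    if len(raw) > MAX_POLICY_BYTES:
        return []
    if not isinstance(parsed, dict):
        return []
    allowed = parsed.get("allowed")
    if not isinstance(allowed, list):
        return []
    if not all(valid_pattern(entry) for entry in allowed):
        return []
    return allowed
```

Returning an empty list on every error path means a corrupted or oversized policy blocks all traffic instead of allowing it.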
Domain matching: Case-insensitive. Two modes:
- Exact: `host == pattern`
- Wildcard: `*.example.com` matches `example.com` and `sub.example.com`
Internal endpoints: Requests to /_devbox/* paths are skipped by the enforcer — these are handled by the notifier addon for cmux integration and never leave the proxy.
Blocking: Both request() (HTTP) and http_connect() (HTTPS CONNECT) hooks check the allowlist. Blocked responses include the host (truncated to 253 chars to prevent oversized responses from malicious Host headers).
Hot-reload: The _maybe_reload() method, called on every request, checks if RELOAD_INTERVAL seconds have passed since the last mtime check. If the file's mtime changed, it reloads. This means devbox allowlist add takes effect within one interval without restarting the proxy.
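A minimal sketch of an interval-gated mtime check of this kind. The class name `ReloadingFile`, the injectable `clock`, and the `reloads` counter are illustrative assumptions; a real addon would re-parse the policy where the counter is incremented.

```python
import os
import time


class ReloadingFile:
    """Re-check a file's mtime at most once per interval."""

    def __init__(self, path, interval=30.0, clock=time.monotonic):
        self.path = path
        self.interval = interval
        self.clock = clock
        self.last_check = clock()
        self.mtime = os.stat(path).st_mtime_ns
        self.reloads = 0

    def maybe_reload(self) -> bool:
        """Cheap enough to call on every request; True when a reload happened."""
        now = self.clock()
        if now - self.last_check < self.interval:
            return False  # interval not elapsed: skip the stat call
        self.last_check = now
        mtime = os.stat(self.path).st_mtime_ns
        if mtime == self.mtime:
            return False  # file unchanged since the last check
        self.mtime = mtime
        self.reloads += 1  # placeholder for re-parsing the policy
        return True
```

Gating the `stat` call on elapsed time keeps the per-request cost to a clock read, at the price of changes taking up to one interval to apply.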
The agent container cannot reach the host directly (internal-only network). cmux runs on the host and communicates via a Unix socket that can't bridge into containers. The notification system uses a multi-hop relay:
┌─────────────────────────────────────┐
│ Agent Container (sandbox network) │
│ │
│ Claude Code fires hook │
│ │ │
│ │ stdin JSON │
│ ▼ │
│ devbox-claude-hook <event> │
│ │ │
│ │ POST /_devbox/claude-hook │
│ │ (HTTP via proxy env vars) │
└───────┼─────────────────────────────┘
│ sandbox network (internal)
┌───────▼─────────────────────────────┐
│ Proxy Sidecar (sandbox + external) │
│ │
│ notifier.py intercepts /_devbox/* │
│ │ │
│ │ TCP to host.docker.internal │
│ │ JSON-RPC: claude-hook.stop │
└───────┼─────────────────────────────┘
│ external network → host
┌───────▼─────────────────────────────┐
│ Host (macOS) │
│ │
│ cmux-proxy.py (TCP relay daemon) │
│ │ │
│ │ inline cmux socket protocol │
│ │ (set_status, notify_target, │
│ │ clear_notifications, etc.) │
│ ▼ │
│ cmux app (Unix socket) │
│ → sidebar status pills │
│ → desktop notifications │
│ → session tracking │
└─────────────────────────────────────┘
Why this chain: Each hop crosses one isolation boundary. The agent can only reach the proxy (iptables). The proxy can reach the host (external network). The host daemon has cmux socket access. No firewall exceptions needed on the agent.
What cmux handles: All rendering — sidebar status pills ("Working", "Idle", "Needs input"), desktop notifications with Claude's actual response text, session tracking. The hooks pipe raw JSON from Claude Code; cmux interprets it natively via cmux claude-hook.
Internal endpoints: The notifier intercepts requests to /_devbox/* paths:
- `/_devbox/claude-hook`: forwards Claude Code hook JSON to the host proxy for cmux sidebar/notification updates
- `/_devbox/notify`: sends a notification via `notification.create` JSON-RPC
- `/_devbox/status`: sets sidebar status via text protocol
Host-side proxy (cmux-proxy.py): TCP relay started by devbox when cmux is detected. Listens on fixed port 19876, filters commands against an allowlist, and forwards to the cmux Unix socket. Claude Code hook events are handled inline using the cmux socket protocol (no subprocess calls). The socket connection is established at startup while the proxy is still in the cmux process tree (required for cmux's process-lineage auth). The proxy auto-restarts on the next devbox command if it dies.
Workspace binding: each devbox session's proxy sidecar is passed CMUX_WORKSPACE_ID at container start. notifier.py attaches this to every forwarded request and strips any agent-supplied --tab= / --workspace= / params.workspace_id. The host proxy treats the sidecar's value as authoritative — so sessions started in different cmux workspaces always route to the right workspace, regardless of which workspace the host daemon was first spawned from.
Known limitation — local trust: the host proxy binds 127.0.0.1:19876 with no authentication. Any process on the Mac can send commands that pass the allowlist (sidebar status, notifications, claude-hook events). The blast radius is limited to cmux sidebar/notification manipulation — no data exfiltration, no code execution, no container access. An attacker with local code execution has equivalent capabilities through native macOS APIs (osascript, NSUserNotification) without this proxy. A shared-secret token scheme would close this but adds compose plumbing; treated as a defense-in-depth follow-up.
Database: SQLite at /data/api.db with WAL mode for concurrent read/write. Schema:
CREATE TABLE requests (
id INTEGER PRIMARY KEY AUTOINCREMENT,
timestamp TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%f', 'now')),
method TEXT NOT NULL,
url TEXT NOT NULL,
host TEXT NOT NULL,
status INTEGER,
request_content_type TEXT,
request_body TEXT,
response_content_type TEXT,
response_body TEXT,
duration_ms INTEGER
);

Body truncation: Request and response bodies are truncated at 64 KB to prevent unbounded storage growth. Truncated entries are marked with [TRUNCATED by devbox logger at 64KB].
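Truncation of this kind might look like the sketch below. It assumes the limit is measured in bytes before decoding and that the marker is appended on its own line; neither detail is stated above, and the function name is hypothetical.

```python
MAX_BODY_BYTES = 64 * 1024
MARKER = "[TRUNCATED by devbox logger at 64KB]"


def truncate_body(body: bytes) -> str:
    """Decode at most 64 KB of a body and flag it if anything was cut."""
    # errors="replace" keeps a multi-byte character split at the cut from raising.
    text = body[:MAX_BODY_BYTES].decode("utf-8", errors="replace")
    if len(body) > MAX_BODY_BYTES:
        text += "\n" + MARKER
    return text
```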
Retention: Configurable via environment variables:
- `DEVBOX_LOG_MAX_AGE_DAYS`: delete rows older than N days (default: 90)
- `DEVBOX_LOG_MAX_ROWS`: keep at most N rows (default: 100,000)
- Pruning runs at startup and every 1,000 inserts
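The two retention rules can be expressed as two `DELETE` statements against the `requests` table. This is a sketch using Python's standard `sqlite3` module; the function name `prune` and the exact SQL are assumptions, not the logger's actual queries.

```python
import sqlite3


def prune(conn: sqlite3.Connection, max_age_days: int = 90, max_rows: int = 100_000) -> None:
    """Apply age-based, then count-based, retention to the requests table."""
    # Timestamps are ISO-8601 text, so string comparison orders them correctly.
    conn.execute(
        "DELETE FROM requests WHERE timestamp < "
        "strftime('%Y-%m-%dT%H:%M:%f', 'now', ?)",
        (f"-{int(max_age_days)} days",),
    )
    # Keep only the newest max_rows rows, judged by the autoincrement id.
    conn.execute(
        "DELETE FROM requests WHERE id NOT IN "
        "(SELECT id FROM requests ORDER BY id DESC LIMIT ?)",
        (int(max_rows),),
    )
    conn.commit()
```

A caller would run this once at startup and again after every 1,000 inserts, matching the schedule in the list above.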
Querying: The host-side devbox logs command reads the SQLite database directly (or falls back to running sqlite3 inside the container if the host lacks it). Supports filters: --errors, --blocked, --slow, --hosts, --since, --until.
1. Proxy starts → generates CA keypair in /home/devbox/.mitmproxy/
(flock serializes concurrent proxy starts on the same volume)
2. Exports mitmproxy-ca-cert.pem to /ca/ (shared Docker volume)
3. Agent entrypoint waits for /run/proxy-ca/mitmproxy-ca-cert.pem (up to 60s)
4. Copies to /usr/local/share/ca-certificates/mitmproxy-ca.crt
5. Runs update-ca-certificates --fresh
6. Sets NODE_EXTRA_CA_CERTS, REQUESTS_CA_BUNDLE, SSL_CERT_FILE
This ensures all TLS libraries (OpenSSL, Node.js, Python requests) trust the proxy's CA, allowing full HTTPS inspection.
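Step 3 of the flow above is a bounded wait for a file to appear. A minimal sketch of that pattern follows; the function name, the poll interval, and the non-empty check are assumptions, and the actual entrypoint is a shell script.

```python
import os
import time


def wait_for_file(path: str, timeout: float = 60.0, poll: float = 0.5) -> bool:
    """Poll until path exists and is non-empty; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            return True
        if time.monotonic() >= deadline:
            return False  # caller should refuse to continue without the CA
        time.sleep(poll)
```

Checking for a non-empty file guards against reading the certificate while the proxy is still writing it.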
The agent container uses a split entrypoint to minimize privilege exposure:
Phase 1 — Root (entrypoint.sh):
- Detects the Docker bridge subnet from `ip route`
- Validates the CIDR (format regex + semantic octet/prefix range check)
- Sources and runs `firewall_init()`: mandatory, container refuses to start on failure
- Waits for proxy CA certificate, installs into system trust store
- Sets TLS environment variables
Phase 2 — Unprivileged (user-setup.sh via gosu devbox):
- Configures git identity from `GIT_AUTHOR_NAME` / `GIT_AUTHOR_EMAIL` env vars
- Links OpenCode configuration from read-only mount
- Copies private config overlay from `/devbox/.private/` into user home
- Sets up tmux symlinks for version compatibility
- Changes to `/workspace` and holds open with `tail -f /dev/null`
The gosu call (exec gosu devbox ...) replaces the root process entirely — no root shell remains running.
The agent image uses a two-stage Docker build:
Stage 1 (builder): Installs everything that requires build tools or network access:
- Node.js 22 LTS (via NodeSource apt repo)
- npm global packages: opencode-ai, @google/gemini-cli, @openai/codex, @anthropic-ai/claude-code, gsd-opencode
- GitHub CLI (via apt repo)
- uv (Python toolchain manager, copied from official image)
- Oh My Zsh + Powerlevel10k + plugins (git clones — least cacheable, ordered last)
Stage 2 (runtime): Copies only artifacts from the builder, installs minimal runtime packages:
- `COPY --from=builder` for npm packages, gh, uv, Oh My Zsh
- Runtime packages: bash, delta, fzf, git, gosu, iproute2, iptables, jq, neovim, python3, sqlite3, tmux, zsh
- Shell configs from `templates/`
- Firewall script and language profiles from `lib/` and `tooling/profiles/`
- Entrypoint scripts
Layer ordering is optimized for Docker build cache — apt packages and npm installs (stable) come before git clones (change frequently).
Profiles are self-contained bash scripts in tooling/profiles/ that install language toolchains inside the running container:
| Profile | What it installs | Variants |
|---|---|---|
| `rust` | rustup, cargo, clippy, rustfmt, cargo-watch, cargo-edit | `wasm` (wasm-pack, wasm32 target) |
| `python` | uv, python3, ruff, mypy, pytest | `ml` (numpy, pandas, scikit-learn), `api` (fastapi, httpx) |
| `node` | Node.js LTS, pnpm, typescript, eslint, prettier | `bun` (Bun runtime) |
| `go` | Go toolchain, golangci-lint, delve, gopls | none |
Profiles run via docker compose exec agent gosu devbox bash -c 'source "/usr/local/lib/devbox/profiles/$1.sh"' _ "<name>". The profile name is validated against ^[a-zA-Z0-9_-]+$ before execution. Variants are validated against the profile's declared # VARIANTS: header.
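The two validation steps can be sketched as follows. The name pattern is the one quoted above. The layout of the `# VARIANTS:` header (a single comment line listing variants separated by spaces or commas) is an assumption, as are the function names.

```python
import re

_NAME_RE = re.compile(r"[a-zA-Z0-9_-]+")  # same character class as ^[a-zA-Z0-9_-]+$


def valid_profile_name(name: str) -> bool:
    # fullmatch rejects path separators, spaces, and shell metacharacters.
    return _NAME_RE.fullmatch(name) is not None


def declared_variants(script_text: str) -> set:
    """Collect variants from a '# VARIANTS:' header line (assumed format)."""
    prefix = "# VARIANTS:"
    for line in script_text.splitlines():
        if line.startswith(prefix):
            return set(line[len(prefix):].replace(",", " ").split())
    return set()
```

Validating the name before it reaches `bash -c` is what keeps a value such as `../x` or `rust; id` from being interpolated into a path or command.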
Each project gets its own policy.yml, stored at ~/.devbox/<hash>/policy.yml on the host and mounted read-only into the proxy at /proxy/policy.yml. The agent container cannot access or modify this file.
The default policy (templates/policy.yml) allows:
- Model APIs: api.anthropic.com, openrouter.ai, generativelanguage.googleapis.com, api.openai.com
- Package registries: crates.io, registry.npmjs.org, pypi.org, files.pythonhosted.org, storage.googleapis.com
- Code hosting: github.com, api.github.com, *.githubusercontent.com
- Language toolchains: sh.rustup.rs, go.dev, dl.google.com, *.golang.org, astral.sh
- Documentation: docs.rs, docs.python.org, developer.mozilla.org
- System updates: security.ubuntu.com, archive.ubuntu.com
Users manage the allowlist via devbox allowlist add|remove|reset. All modifications use file locking (flock) to prevent concurrent write races.
The devbox script is a bash CLI that sources library modules from lib/:
devbox (entry point)
├── lib/commands.sh — command handlers (start, stop, profile, logs, etc.)
├── lib/container.sh — Docker Compose lifecycle (build, start, shell, status)
├── lib/secrets.sh — secrets management (set, show, edit, remove)
├── lib/allowlist.sh — allowlist CRUD (add, remove, reset, show)
├── lib/mount.sh — per-project volume mount management
├── lib/profile.sh — profile discovery, validation, menus
├── lib/firewall.sh — iptables rules (sourced inside container, not on host)
└── lib/ui.sh — TUI helpers (info, warn, error, confirm, spinner)
Each project is identified by a 16-character SHA-256 hash of its absolute path:
echo -n "/home/user/projects/my-app" | sha256sum | cut -c1-16
# → a1b2c3d4e5f67890

This hash is used as the data directory name (for example, ~/.devbox/bf341fbe16930634/). The Docker Compose project name includes both the human-friendly name and the hash for uniqueness, for example devbox-ralph-bf341fbe16930634. The project name defaults to the directory basename but can be overridden via DEVBOX_NAME in .devboxrc. The full path is stored in .project_path and the name in .project_name.
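For reference, the same derivation in Python: hash the path bytes with SHA-256 and keep the first 16 hex characters, which mirrors the `sha256sum | cut -c1-16` pipeline. The function names are hypothetical, and `compose_project_name` assumes the `devbox-<name>-<hash>` layout shown in the example above.

```python
import hashlib
import os
from typing import Optional


def project_hash(path: str) -> str:
    """First 16 hex characters of the SHA-256 of the absolute project path."""
    return hashlib.sha256(path.encode("utf-8")).hexdigest()[:16]


def compose_project_name(path: str, name: Optional[str] = None) -> str:
    # The name defaults to the directory basename unless overridden.
    name = name or os.path.basename(path.rstrip("/"))
    return f"devbox-{name}-{project_hash(path)}"
```

Hashing the path without a trailing newline (the `echo -n` in the shell version) is what makes the two implementations agree.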
Projects can include a .devboxrc file with whitelisted variables:
| Variable | Type | Default | Purpose |
|---|---|---|---|
| `DEVBOX_MEMORY` | Docker memory (e.g., `12G`) | `8G` | Agent container memory limit |
| `DEVBOX_CPUS` | Decimal (e.g., `4.0`) | `4.0` | Agent container CPU limit |
| `DEVBOX_BRIDGE_SUBNET` | CIDR (e.g., `172.18.0.0/16`) | Auto-detected | Docker bridge subnet for firewall |
| `DEVBOX_RELOAD_INTERVAL` | Integer (seconds) | `30` | Policy file hot-reload interval |
| `DEVBOX_PRIVATE_CONFIGS` | Git URL or path | none | Private config overlay source |
| `DEVBOX_NAME` | Alphanumeric + hyphens (max 32) | Directory basename | Human-friendly project name |
Only these variable names are accepted — arbitrary keys are rejected. Values are validated by type. Environment variables take precedence over file values.
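A whitelist parser along these lines is sketched below. It assumes `.devboxrc` holds `KEY=VALUE` lines with `#` comments, and the per-type patterns are guesses derived from the table, not the CLI's actual validators. The point is that the file is parsed as data and never sourced as shell.

```python
import re

# Hypothetical per-variable patterns based on the "Type" column above.
VALIDATORS = {
    "DEVBOX_MEMORY": re.compile(r"[0-9]+[bkmgBKMG]?"),
    "DEVBOX_CPUS": re.compile(r"[0-9]+(\.[0-9]+)?"),
    "DEVBOX_BRIDGE_SUBNET": re.compile(r"[0-9]{1,3}(\.[0-9]{1,3}){3}/[0-9]{1,2}"),
    "DEVBOX_RELOAD_INTERVAL": re.compile(r"[0-9]+"),
    "DEVBOX_PRIVATE_CONFIGS": re.compile(r"\S+"),
    "DEVBOX_NAME": re.compile(r"[A-Za-z0-9-]{1,32}"),
}


def load_devboxrc(text: str, env: dict) -> dict:
    """Parse whitelisted KEY=VALUE lines; environment values win over the file."""
    config = {}
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip().strip('"')
        if not sep or key not in VALIDATORS:
            raise ValueError(f"line {lineno}: key {key!r} is not allowed")
        if VALIDATORS[key].fullmatch(value) is None:
            raise ValueError(f"line {lineno}: invalid value for {key}")
        config[key] = value
    for key in VALIDATORS:
        if key in env:
            config[key] = env[key]  # environment takes precedence
    return config
```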
Users overlay their local dev environment into the container without committing configs to the public repo. The goal: the container should feel identical to your local machine.
Set DEVBOX_PRIVATE_CONFIGS to a local directory path or private git URL:
your-configs/
├── Dockerfile # Optional: FROM devbox-agent:latest, pre-build plugins
├── claude/ # → ~/.claude/ (settings.json, hooks, skills)
├── opencode/ # → ~/.config/opencode/ (merged with defaults)
├── nvim/ # → ~/.config/nvim/ (init.lua, lua/, lazy-lock.json)
├── tmux/ # → ~/.config/tmux/ + ~/.tmux.conf (symlinked)
└── .zshrc # → ~/.zshrc (replaces default devbox zshrc)
- Host sync: `sync_private_configs()` symlinks a local directory or shallow-clones a git repo to `~/.config/devbox/.private/`. Local paths are symlinked (zero-copy); git repos use `--depth=1 --single-branch`.
- Image build (optional, cached): if `.private/Dockerfile` exists, `container_build()` runs `docker build -f .private/Dockerfile -t devbox-agent:latest`. This layers heavy installs (nvim plugins, LSP servers) on top of the base image. Docker build cache means plugins only reinstall when lock files change (e.g., `lazy-lock.json`).
- Startup overlay: Phase 2 of the entrypoint (`user-setup.sh`) copies configs from the read-only mount (`/devbox/.private/`) into the user's home. This runs every start, so config file changes take effect immediately without rebuilding.
- Read-only mount + copy: the host directory is mounted `:ro` so the container cannot modify host state. Configs are copied into user home for tools that need write access (e.g., nvim's lazy-lock).
- Build cache for plugins: nvim Lazy sync, LSP installs, and similar heavy operations are baked into the image layer. They only re-run when the relevant COPY layer changes.
- Overlay on every start: even with pre-built images, the entrypoint re-copies configs. Editing a config file takes effect on next `devbox start` without rebuilding.
- Private repo isolation: cloned to `~/.config/devbox/.private/`, never referenced in the public devbox repo or committed to images by default.
When using OpenCode inside the container, PAL MCP + clink dispatch is available:
| Task | Dispatch to | Reason |
|---|---|---|
| Security audit | Codex | Fresh context, isolated, no side effects |
| Full codebase review | Gemini CLI | 1M context window |
| Complex debugging | Codex | Strong reasoning, clean slate |
| Architecture decision | PAL consensus | Multi-model cross-check |
| Large log analysis | Gemini CLI | 1M context, fast |
| Quick analysis | Direct (OpenCode) | No overhead |
This is optional — users who prefer Claude Code or direct tool use are not required to use OpenCode's dispatch model.
~/.devbox/ # Per-user runtime data (DEVBOX_DATA)
├── secrets/.env # Global API keys (mode 600)
├── <project-hash>/
│ ├── .project_path # Absolute path for display
│ ├── policy.yml # Project network policy
│ ├── secrets/.env # Per-project secrets override (mode 600)
│ ├── history/ # Persistent shell history
│ ├── memory/ # OpenCode project memory
│ └── logs/
│ └── api.db # SQLite API log
~/.config/devbox/ # Global config (DEVBOX_CONFIG, mounted ro)
├── opencode/ # OpenCode config (opencode.json, pal/, etc.)
└── .private/ # Private config overlay (git or symlink)
├── Dockerfile # Optional image layer
├── claude/ # → ~/.claude/
├── opencode/ # → ~/.config/opencode/
├── nvim/ # → ~/.config/nvim/
├── tmux/ # → ~/.config/tmux/
└── .zshrc # → ~/.zshrc
| Host path | Container path | Mode | Purpose |
|---|---|---|---|
| `~/projects/my-app/` | `/workspace` | rw | Project source |
| `~/.config/devbox/` | `/devbox` | ro | Config + private overlay |
| `~/.devbox/<hash>/history/` | `/data/history` | rw | Shell history |
| `~/.devbox/<hash>/memory/` | `~/.opencode-mem/project` | rw | OpenCode memory |
| `~/.devbox/<hash>/logs/` | `/data` (proxy) | rw | API logs |
| `~/.devbox/<hash>/policy.yml` | `/proxy/policy.yml` (proxy) | ro | Allowlist |
| `~/.devbox/<hash>/secrets/.env` | env-file | n/a | Per-project secrets |
| `~/.devbox/secrets/.env` | env-file | n/a | Global secrets |
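
As a rough illustration of how the bind mounts in this table fit together, the sketch below assembles `host:container:mode` strings per service. The function name, the service keys `agent` and `proxy`, and the in-container home `/home/devbox` are assumptions made for the example; how the project hash is derived is not covered here, so it is passed in as a parameter.

```python
from pathlib import PurePosixPath


def bind_mounts(project_dir: str, project_hash: str, home: str) -> dict[str, list[str]]:
    """Return 'host:container:mode' bind-mount strings grouped by service."""
    data = PurePosixPath(home) / ".devbox" / project_hash
    config = PurePosixPath(home) / ".config" / "devbox"
    # Assumption: the in-container home is /home/devbox, so the table's
    # "~/.opencode-mem/project" expands to the path used below.
    container_home = PurePosixPath("/home/devbox")
    return {
        "agent": [
            f"{project_dir}:/workspace:rw",
            f"{config}:/devbox:ro",
            f"{data}/history:/data/history:rw",
            f"{data}/memory:{container_home}/.opencode-mem/project:rw",
        ],
        "proxy": [
            f"{data}/logs:/data:rw",
            f"{data}/policy.yml:/proxy/policy.yml:ro",
        ],
    }
```

The two secrets files are left out on purpose: the table lists them as env-files, not mounts.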
Docker-managed volumes:
| Volume | Container path | Purpose |
|---|---|---|
| `proxy-ca-keypair` | `/ca` (proxy only, rw) | CA private key + cert (persists across restarts) |
| `proxy-ca-cert` | `/run/proxy-ca` (agent, ro) and `/ca-cert` (proxy, rw) | Public CA certificate only (shared with agent) |
| `devbox-shared-memory` | `~/.opencode-mem/shared` | Cross-project OpenCode memory |
devbox/
├── devbox # Main CLI entry point (bash)
├── main.sh # Bootstrap / symlink installer
├── Dockerfile # Agent container image (multi-stage)
├── entrypoint.sh # Agent entrypoint (Phase 1: root, firewall + CA)
├── user-setup.sh # Agent entrypoint (Phase 2: devbox user, config overlay)
├── docker-compose.yml # Container + sidecar stack definition
├── proxy/
│ ├── Dockerfile # Proxy sidecar image (Python 3.12-slim)
│ ├── entrypoint.sh # Proxy entrypoint (CA gen + mitmdump)
│ ├── enforcer.py # Domain allowlist addon
│ ├── injector.py # Credential injection addon
│ ├── notifier.py # cmux notification addon
│ └── logger.py # SQLite logging addon
├── lib/
│ ├── commands.sh # CLI command handlers and helpers
│ ├── container.sh # Container lifecycle (build, start, shell, status)
│ ├── firewall.sh # iptables + ip6tables setup
│ ├── mount.sh # Volume mount management
│ ├── secrets.sh # Secrets management (set/show/edit/remove)
│ ├── profile.sh # Profile management and menus
│ ├── allowlist.sh # Allowlist CRUD operations
│ └── ui.sh # TUI helpers (info, warn, error, confirm, spinner)
├── tooling/
│ ├── completions.bash # Bash tab completion
│ ├── completions.zsh # Zsh tab completion
│ └── profiles/
│ ├── _common.sh # Shared profile helpers
│ ├── rust.sh # Rust toolchain profile
│ ├── python.sh # Python toolchain profile
│ ├── node.sh # Node.js toolchain profile
│ └── go.sh # Go toolchain profile
├── config/
│ └── opencode/ # Default OpenCode config
│ ├── opencode.json
│ ├── agents/
│ ├── pal/systemprompts/clink/
│ └── skills/
├── templates/
│ ├── policy.yml # Default network policy
│ ├── claude-hooks.json # Default Claude Code hooks/settings
│ ├── AGENTS.md # Per-project agent docs template
│ ├── zshrc # Container zsh config
│ ├── tmux.conf # Container tmux config
│ └── private-overlay.Dockerfile # Template for private config Dockerfile
├── tests/
│ ├── bats/ # Bash integration tests (BATS)
│ └── pytest/ # Python unit tests (enforcer, logger)
├── docs/
│ ├── DESIGN.md # This file
│ └── PLAN.md # Implementation plan and changelog
├── .github/workflows/ci.yml # CI: lint + build + smoke test
├── .pre-commit-config.yaml # Linter config (shellcheck, hadolint, ruff, etc.)
├── CREDITS.md # Attribution
├── LICENSE # MIT
└── README.md # Quickstart and user documentation
| Decision | Choice | Rationale |
|---|---|---|
| New project vs fork | New project | Scope of changes makes it genuinely distinct |
| Licensing | MIT | Compatible with all source projects |
| Container model | Isolated environment (exec-in) | Tool-agnostic; user runs whatever they want per-pane. No serve/attach complexity, no port mapping. |
| macOS runtime | OrbStack recommended | Docker Desktop's VirtioFS breaks Unix domain sockets (cmux). OrbStack handles them natively. |
| Network enforcement | Dual-layer (mitmproxy + iptables) | Single-layer iptables bypassable via hardcoded IPs. Single-layer proxy bypassable by ignoring env vars. Both together close the gap. |
| Network topology | `internal: true` + separate external | Even without iptables, Docker's network isolation prevents direct egress |
| Observability | SQLite + CLI queries | Zero extra infrastructure, queryable, persistent, works offline |
| Config storage | Host-mounted read-only | Updates without rebuilds; agent cannot tamper |
| Secrets | `--env-file` at runtime | Never baked into image. File locking prevents races. |
| Container scope | One per project | True isolation, per-project network policy |
| Private configs | Git repo + optional Dockerfile overlay | Configs stay private; heavy installs baked into cached image layer |
| Credential injection | Environment only | No SSH key copying, no volume-mounting credential files |
| Policy file location | Host-mounted read-only | Agent cannot modify its own network policy |
| Entrypoint split | Two phases (root → gosu devbox) | Minimizes root exposure. Phase 1: firewall + CA. Phase 2: user config + hold open. |
| IPv6 | DROP all if ip6tables available | Prevents bypass via IPv6 tunneling |
| DNS | Restricted to 127.0.0.11 | Prevents DNS tunneling to external resolvers |
| Rate limiting | Intentionally not implemented | Single-user tool; user controls agent and API keys. Rate limiting adds config complexity and risks interrupting legitimate sessions. For runaway spend, set budget alerts on your API provider's dashboard. |
| Fail-closed enforcer | Empty allowlist on policy error | Security default: if the policy file is missing or malformed, block all traffic rather than allow all |
| CIDR validation | Regex + semantic check | Regex validates format, separate function validates octets ≤ 255 and prefix ≤ 32 |
| Default seccomp | Docker's built-in profile | Blocks ~44 dangerous syscalls (ptrace, bpf, etc.) out of the box. Custom profile not needed given cap_drop: ALL. |
| Read-only rootfs | `read_only: true` | Root filesystem is immutable. Writable paths use tmpfs mounts (`/home/devbox`, `/tmp`, `/run`, `/var/log`, CA cert dirs). Oh My Zsh lives at `/opt/oh-my-zsh` on the read-only rootfs; user home is populated from `/etc/skel` at startup. |
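
The two-stage CIDR check named in the table (a regex for format, then a separate semantic check) might look like the following Python sketch. The names `CIDR_FORMAT`, `cidr_semantics_ok`, and `is_valid_cidr` are hypothetical; the project's own validation code is not shown in this document, so treat this as one possible reading of that row.

```python
import re

# Stage 1: format only. Four dot-separated groups of 1-3 digits and a
# 1-2 digit prefix, so "999.1.1.1/99" still passes this stage.
CIDR_FORMAT = re.compile(
    r"([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})/([0-9]{1,2})"
)


def cidr_semantics_ok(octets: list[int], prefix: int) -> bool:
    """Stage 2: each octet must be <= 255 and the prefix <= 32."""
    return all(0 <= o <= 255 for o in octets) and 0 <= prefix <= 32


def is_valid_cidr(value: str) -> bool:
    """Accept an IPv4 CIDR only if it passes both stages."""
    # fullmatch rejects trailing characters, including a stray newline.
    match = CIDR_FORMAT.fullmatch(value)
    if match is None:
        return False
    *octets, prefix = (int(group) for group in match.groups())
    return cidr_semantics_ok(octets, prefix)
```

Python's standard `ipaddress.ip_network` covers the same ground in a single call; the split version is shown only because it mirrors the two steps the table describes.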
devbox's architecture was evaluated against production AI isolation tools. Key comparisons:
| Tool | Isolation level | Network egress control | Request-level logging | Local/self-hosted |
|---|---|---|---|---|
| E2B | Firecracker microVM (KVM) | SNI/Host header filtering | No | Cloud-first |
| Daytona | Docker (+ optional Kata) | API-driven allow/block | No | Yes |
| Coder | Process-level (nsjail/Landlock) | Domain + HTTP method + path | Audit logs only | Yes |
| Gitpod | K8s pods + VM | VPC network policies | No | Enterprise |
| Dev Containers | Docker container | None by default | No | Yes |
| devbox | Docker container | iptables + mitmproxy allowlist | Full HTTP req/resp to SQLite | Yes |
Request-level observability — no other tool in this landscape logs full HTTP request/response bodies to a queryable store. Coder's Agent Boundaries come closest with audit logs, but at process level, not container level.
Dual-layer network enforcement: most tools use either iptables or a proxy. devbox uses both, closing the gap that either alone leaves open (processes that ignore the proxy environment variables on one side, hardcoded IPs that bypass iptables domain rules on the other).
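
The proxy half of this enforcement relies on an allowlist that, per the design decisions table, fails closed: a missing or malformed policy yields an empty allowlist. A minimal Python sketch of that behaviour follows. The key name `allowed_domains`, the function names, and the injected `parse` callable are all assumptions for illustration; the real `enforcer.py` and the `policy.yml` schema are not shown in this document.

```python
from pathlib import Path
from typing import Callable


def load_allowlist(path: str, parse: Callable[[str], object]) -> frozenset[str]:
    """Return the allowed domains, or an empty set if the policy cannot be trusted.

    `parse` turns file text into a mapping (for YAML this could be
    yaml.safe_load; it is injected so the sketch has no third-party imports).
    """
    try:
        policy = parse(Path(path).read_text())
        domains = policy["allowed_domains"]  # hypothetical key name
        if not isinstance(domains, list) or not all(isinstance(d, str) for d in domains):
            raise ValueError("allowed_domains must be a list of strings")
        return frozenset(d.lower() for d in domains)
    except Exception:
        # Fail closed: a missing file, a parse error, or an unexpected
        # shape all produce an empty allowlist, which blocks everything.
        return frozenset()


def is_allowed(host: str, allowlist: frozenset[str]) -> bool:
    """Exact-match check; wildcard or subdomain rules are out of scope here."""
    return host.lower() in allowlist
```

The broad `except` is deliberate in this sketch: any failure to produce a well-formed list is treated the same way, so no error path can accidentally result in "allow all".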
Isolation depth — E2B and Firecracker use hardware virtualization (KVM), providing a fundamentally stronger boundary than Docker namespaces/cgroups. A container escape gives host access; a VM escape is orders of magnitude harder. gVisor sits in between as a potential drop-in upgrade.
Credential brokering: E2B's `egressTransform` injects Authorization headers at the proxy layer so secrets never enter the sandbox at all. Coder provisions per-workspace ephemeral credentials via Vault. devbox now uses a similar proxy-layer injection pattern (`injector.py`) for supported API providers, though `GH_TOKEN` remains in the agent environment for git credential helper compatibility.
Read-only root filesystem: production containers typically use `read_only: true` to prevent malware persistence. devbox now uses `read_only: true` with tmpfs mounts for writable paths and Oh My Zsh installed to `/opt/oh-my-zsh` on the read-only rootfs.
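
The proxy-layer injection pattern mentioned under credential brokering can be sketched as a pure function over request headers. This is an illustration under stated assumptions, not the contents of `injector.py`: the `PROVIDERS` table, the secret names, and the function name are hypothetical, and only two providers are shown.

```python
# Hypothetical provider table:
# upstream host -> (header name, secret name, header value template).
PROVIDERS = {
    "api.anthropic.com": ("x-api-key", "ANTHROPIC_API_KEY", "{key}"),
    "api.openai.com": ("authorization", "OPENAI_API_KEY", "Bearer {key}"),
}


def inject_credentials(
    host: str, headers: dict[str, str], secrets: dict[str, str]
) -> dict[str, str]:
    """Return a copy of headers with the real key swapped in for known hosts.

    The agent sends a placeholder token; only the proxy side holds `secrets`.
    """
    rule = PROVIDERS.get(host.lower())
    if rule is None:
        return dict(headers)  # unknown host: pass through untouched
    header, secret_name, template = rule
    real_key = secrets.get(secret_name)
    if not real_key:
        return dict(headers)  # nothing to inject; upstream will reject the placeholder
    # Drop any existing copy of the header (case-insensitively), then set it.
    out = {k: v for k, v in headers.items() if k.lower() != header}
    out[header] = template.format(key=real_key)
    return out
```

In a real proxy addon this logic would run inside the request hook, after the allowlist check, so that a real key is only ever attached to a request bound for its own provider.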
These are deliberate design choices, not oversights:
- Docker over Firecracker: devbox is a single-user local tool. Docker is universally available; Firecracker requires KVM and custom orchestration. The threat model is "prevent accidental data exfiltration by AI agents," not "multi-tenant hostile workloads."
- Proxy-layer credential injection with `GH_TOKEN` exception: API keys for model providers (Anthropic, OpenAI, Gemini, OpenRouter) are injected at the proxy layer via `injector.py`, so the agent only receives phantom tokens. `GH_TOKEN` is the exception: git credential helpers need it client-side, so it remains in the agent environment. Disable injection entirely with `DEVBOX_CREDENTIAL_INJECTION=false`.
- SQLite over encrypted storage: API logs contain request/response bodies (potentially sensitive). SQLite stores them in plaintext on the host. Encryption (SQLCipher) would add a dependency and key management complexity. For a single-user tool, host filesystem permissions (umask 077) are sufficient; the threat is agent exfiltration, not host compromise.
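
To make the last tradeoff concrete, the sketch below creates a plaintext SQLite log under umask 077 and runs one query against it. The table schema and the function names are assumptions for illustration; the actual schema used by `logger.py` and the queries behind `devbox logs` are not shown in this document.

```python
import os
import sqlite3
import time


def open_log(path: str) -> sqlite3.Connection:
    """Create (or open) the log database so only the owner can read it."""
    old_umask = os.umask(0o077)  # files created in this window get mode 0600
    try:
        conn = sqlite3.connect(path)
        # Hypothetical schema: one row per proxied request.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS requests ("
            "ts REAL, method TEXT, host TEXT, path TEXT, status INTEGER, "
            "request_body TEXT, response_body TEXT)"
        )
        conn.commit()
    finally:
        os.umask(old_umask)  # restore the process umask either way
    return conn


def log_request(conn, method, host, path, status, request_body="", response_body=""):
    """Append one request/response pair; bodies are stored as plaintext."""
    conn.execute(
        "INSERT INTO requests VALUES (?, ?, ?, ?, ?, ?, ?)",
        (time.time(), method, host, path, status, request_body, response_body),
    )
    conn.commit()


def requests_per_host(conn):
    """Example query: request count per host, ordered by host name."""
    return conn.execute(
        "SELECT host, COUNT(*) FROM requests GROUP BY host ORDER BY host"
    ).fetchall()
```

File permissions protect the log from other local users only; anyone who can already read the owner's files can read the bodies, which is the tradeoff this item accepts.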