Cleanroom runs untrusted code in microVMs with deny-by-default network policy. It is self-hosted, enforces repository-scoped egress rules, and keeps credentials on the host side of the VM boundary.
Agent sandboxing tools are proliferating fast. Most focus on isolation alone. Cleanroom adds policy-controlled network access so you decide exactly what the sandbox can reach.
Deny-by-default egress. A cleanroom.yaml policy file in your repo controls which hosts the sandbox may reach. Current hostname-based rules are enforced from observed DNS answers plus destination IP:port, so co-hosted services on the same IP:port are not distinguished. Everything else is blocked.
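A minimal sketch of such a policy file (the host entry is illustrative; the field names match the full policy example later in this README):

```yaml
# cleanroom.yaml -- deny all egress except the npm registry over HTTPS
version: 1
sandbox:
  network:
    default: deny
    allow:
      - host: registry.npmjs.org
        ports: [443]
```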
MicroVM isolation. Each sandbox is a hardware-virtualized microVM (Firecracker on Linux, Virtualization.framework on macOS), not a container. A VM boundary is stronger than namespaces, seccomp, or gVisor -- a kernel vulnerability in the guest doesn't compromise the host.
Self-hosted. Runs on your infrastructure. Your code and data never leave your machines.
Credentials stay on the host. A host-side gateway rewrites git traffic through Cleanroom-owned routes and keeps upstream credentials on the host side of the boundary. The same gateway embeds content-cache, providing cache-backed handling of git, OCI, Go module, RubyGems, and immutable downloads.
Standard OCI images. Use any OCI image from any registry as your sandbox base. Digest-pinned in policy for reproducibility. No custom VM image format or vendor-specific base images. Same image works across backends.
Docker inside the sandbox. Enable a guest Docker daemon with a single policy flag (services.docker.required: true). Docker Hub pulls are mirrored through the host gateway cache, and you can build and run containers inside the microVM.
Coming soon: broader guest-side package-manager rewrites with lockfile enforcement, broader non-Docker-Hub registry caching, hermetic offline build flows, and richer audit surfaces. See the spec for the full roadmap.
Install the latest release:
```shell
curl -fsSL https://raw.githubusercontent.com/buildkite/cleanroom/main/scripts/install.sh | bash
```

Install a specific version:

```shell
curl -fsSL https://raw.githubusercontent.com/buildkite/cleanroom/main/scripts/install.sh | \
  bash -s -- --version vX.Y.Z
```

By default this installs to `/usr/local/bin`. Override with `--install-dir` or `CLEANROOM_INSTALL_DIR`.
On macOS, install.sh now prefers the signed, notarized .pkg when using the default install flow. It falls back to the release tarball when you request a custom install directory or helper customization.
Install the locally built binaries from this checkout into /usr/local/bin:
```shell
mise run install:global
```

Initialize runtime config and check host prerequisites:
```shell
cleanroom config init
cleanroom config validate
cleanroom doctor
```

Start the server (all CLI commands need a running server):

```shell
cleanroom serve &
```

The server listens on `unix://$XDG_RUNTIME_DIR/cleanroom/cleanroom.sock` by default.
When observability is enabled, cleanroom serve also prints startup status for
trace export, sampling, and whether direct trace links are configured.
For observability setup, local Grafana/Tempo/Prometheus development, runtime config examples, and trace diagnostics, see docs/observability.md.
Install as a daemon:
```shell
# macOS: installs a user LaunchAgent (user-scope only)
cleanroom daemon install --restart

# Linux (systemd)
sudo cleanroom daemon install --restart
```

Use `cleanroom daemon install --init-config --restart` for first-run bootstrap when the runtime config file does not exist yet.

Use `cleanroom daemon restart --force` to start the daemon again if it is currently stopped. On macOS, `--system` is unsupported; `--user` is accepted for explicitness.
Manage the daemon lifecycle:
```shell
cleanroom daemon status
cleanroom daemon start
cleanroom daemon stop
cleanroom daemon restart
cleanroom daemon uninstall
```

The system daemon socket is root-owned (`unix:///var/run/cleanroom/cleanroom.sock`), so client commands against that daemon should be run with `sudo` unless you configure an alternate endpoint. User-scope daemons listen on the runtime socket (`unix://$XDG_RUNTIME_DIR/cleanroom/cleanroom.sock` when `XDG_RUNTIME_DIR` is set).
Run a command in a sandbox:
```shell
cleanroom exec -- npm test
cleanroom exec --tty -e OPENAI_API_KEY -- codex app-server
```

When `cleanroom.yaml` includes a repository bootstrap block, the top-level commands become repo-aware: Cleanroom resolves the current git remote and local HEAD, materializes that checkout in the sandbox, and starts commands in the configured guest path. Cleanroom no longer auto-detects or auto-wraps commands for mise; if you want mise, run it explicitly in the command you execute, or in `sandbox.dependencies.command` or `sandbox.services.command` so it can participate in create-time stage caching.
You can also define create-time and per-execution setup:
```yaml
sandbox:
  dependencies:
    command: bundle install
    key:
      files: [Gemfile.lock]
    # Optional: reuse dependency state across source-only commits when the
    # dependency outputs survive a checkout refresh.
    # reuse: portable
  services:
    docker:
      required: true
    command: |
      docker compose up -d postgres valkey
      bin/rails db:prepare
      docker compose stop postgres valkey
    key:
      files: [docker-compose.yml, db/schema.rb]
  run:
    before: docker compose up -d postgres valkey
```

Use `sandbox.dependencies.command` for deterministic repo-local bootstrap, `sandbox.services.command` for snapshotable on-disk service preparation, and `sandbox.run.before` for live startup that must happen before each execution. Set `sandbox.services.docker.required: true` when the services bootstrap needs the guest Docker daemon.

`sandbox.dependencies.command` and `sandbox.services.command` support either a shell string or an argv sequence. Prefer the string form unless you specifically need exact argv semantics. `sandbox.run.before` always runs through `sh -lc`.
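As a sketch of the two forms, the argv sequence swaps the shell string for an explicit argument list (the `--jobs 4` flag here is illustrative, not part of the example above):

```yaml
sandbox:
  dependencies:
    # string form: command: bundle install --jobs 4
    # argv form: exact argument boundaries, no shell word-splitting
    command: ["bundle", "install", "--jobs", "4"]
    key:
      files: [Gemfile.lock]
```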
Pre-create a long-running sandbox without running a command:
```shell
SANDBOX_ID="$(cleanroom create)"
cleanroom exec --in "$SANDBOX_ID" -- npm run lint
```

Override the sandbox image per command (remote tag/digest or local Docker image name):

```shell
cleanroom sandbox create --image ghcr.io/buildkite/cleanroom-base/debian:latest
cleanroom exec --image ghcr.io/buildkite/cleanroom-base/debian:latest -- npm test
cleanroom console --image my-local-image:dev -- sh
cleanroom exec -e OPENAI_API_KEY -e CODEX_HOME=/workspace/.codex -- codex app-server
```

Equivalent namespaced command:

```shell
cleanroom sandbox create
```

`cleanroom sandbox create` stays generic. It does not inspect the local git repository or infer a checkout from `cleanroom.yaml`.

`cleanroom exec` and `cleanroom console` create ephemeral sandboxes by default. Reuse an existing sandbox with `--in`, or keep a newly created sandbox with `--keep`.
List sandboxes and run more commands:
```shell
cleanroom sandbox ls
cleanroom exec --in <id> -- npm run lint
cleanroom exec --in <id> -- npm run build
```

Copy a one-off file into or out of a kept sandbox:

```shell
cleanroom cp ./fixture.json <id>:/tmp/fixture.json
cleanroom cp <id>:/tmp/result.json ./result.json
```

Keep a sandbox created by exec:

```shell
cleanroom exec --keep -- npm test
```

Run against a snapshot:

```shell
cleanroom exec --from snap_... -- npm test
cleanroom console --from snap_...
```

Interactive console:

```shell
cleanroom console -- bash
cleanroom exec --tty -- bash
```

A `cleanroom.yaml` in your repo defines the sandbox policy. Cleanroom also checks `.buildkite/cleanroom.yaml` as a fallback.
```yaml
version: 1
sandbox:
  image:
    ref: ghcr.io/buildkite/cleanroom-base/debian@sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
  resources:
    vcpus: 4
    memory: 8GiB
    disk: 16GiB
  network:
    default: deny
    allow:
      - host: api.github.com
        ports: [443]
      - host: registry.npmjs.org
        ports: [443]
```

`sandbox.resources` is optional and declares backend-neutral minimum workload requirements. `vcpus` is a positive integer, while `memory` and `disk` accept raw bytes or human-friendly sizes such as `4096MiB`, `8GiB`, or `16GiB`. Cleanroom raises the selected backend runtime config to meet these minimums, but does not lower larger host defaults.
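For instance, under the stated parsing rules the two notations can be mixed freely; the raw-byte value below is simply 16 × 1024³:

```yaml
sandbox:
  resources:
    memory: 8GiB           # human-friendly form
    disk: 17179869184      # raw bytes; equivalent to 16GiB
```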
Enable Docker as a guest service:
```yaml
sandbox:
  services:
    docker:
      required: true
```

Validate policy without running anything:

```shell
cleanroom policy validate
```

Repository-aware bootstrap is the default for the top-level commands when you run them from inside a git repository.
The implicit defaults are:
```yaml
repository:
  remote: origin
  path: /workspace
  submodules: false
```

Use the optional repository block only to override those defaults or disable the behavior:

```yaml
repository:
  enabled: false
```

or:

```yaml
repository:
  path: /work
  submodules: true
```

With the default behavior:

- `cleanroom create` creates a sandbox with the current repo checked out at local HEAD
- `cleanroom exec -- <cmd>` checks out the repo, runs `<cmd>` from `/workspace`, and tears the sandbox down unless `--keep` is set
- `cleanroom exec --tty -- <cmd>` runs `<cmd>` from `/workspace` with a real tty and tears the sandbox down unless `--keep` is set
- `cleanroom console -- bash` opens a shell in `/workspace` using the same interactive tty transport and tears the sandbox down unless `--keep` is set
- dirty working trees print a warning and use committed HEAD; uncommitted changes are not copied in
- `cleanroom sandbox create` remains explicit and repo-agnostic

Repository bootstrap needs the remote host in `sandbox.network.allow`, for example:

```yaml
sandbox:
  network:
    default: deny
    allow:
      - host: github.com
        ports: [443]
```

| Host OS | Backend | Status | Notes |
|---|---|---|---|
| Linux | `firecracker` | Full support | Persistent sandboxes, per-sandbox TAP + guest IP identity, file copy, egress allowlist enforcement |
| macOS | `darwin-vz` | Supported with gaps | Persistent sandboxes, file copy, filehandle networking with allowlist egress filtering, no TAP parity |
Backend capabilities are exposed in cleanroom doctor --json under capabilities. See isolation model for enforcement and persistence details.
File copy uses streaming file primitives, and the API also exposes path and archive primitives for larger sync and diff workflows.
Network model differs significantly by backend:
- `firecracker` creates a dedicated TAP interface and host/guest IP pair per sandbox, which enables host-side identity and firewall enforcement.
- `darwin-vz` uses `filehandle` networking on macOS so deny-by-default policies can use the Cleanroom-owned gateway path for allowlisted egress.
- `darwin-vz` still does not expose Firecracker-style TAP devices or host firewall enforcement semantics.
Select a backend explicitly:
```shell
cleanroom exec --backend firecracker -- npm test
cleanroom exec --backend darwin-vz -- npm test
```

- Server: `cleanroom serve` (required for all operations)
- Client: CLI and ConnectRPC clients
- Transport: unix socket (default), HTTPS with mTLS, or Tailscale
- RPC services: `cleanroom.v1.SandboxService`, `cleanroom.v1.ExecutionService` (API design)
Use github.com/buildkite/cleanroom/client from external Go modules.
```go
import (
    "context"
    "os"

    "github.com/buildkite/cleanroom/client"
)

func example() error {
    c := client.Must(client.NewFromEnv())
    sb, err := c.EnsureSandbox(context.Background(), "thread:abc123", client.EnsureSandboxOptions{
        Backend: "firecracker",
        Policy: client.PolicyFromAllowlist(
            "ghcr.io/buildkite/cleanroom-base/debian@sha256:...",
            "sha256:...",
            client.Allow("api.github.com", 443),
            client.Allow("registry.npmjs.org", 443),
        ),
    })
    if err != nil {
        return err
    }
    result, err := c.ExecAndWait(context.Background(), sb.ID, []string{"bash", "-lc", "echo hello"}, client.ExecOptions{
        Stdout: os.Stdout,
        Stderr: os.Stderr,
    })
    if err != nil {
        return err
    }
    _ = result
    return nil
}
```

`client` exposes:

- `client.Client` for RPC calls
- protobuf request/response/event types (for example `client.CreateExecutionRequest`)
- status enums (`client.SandboxStatus_*`, `client.ExecutionStatus_*`)
- ergonomic wrappers (`client.NewFromEnv`, `client.EnsureSandbox`, `client.ExecAndWait`)

`client.ExecAndWait` is the batch-oriented helper. Interactive attach flows use the lower-level execution RPCs (`CreateExecution`, `AttachExecution`, and related methods).
Cleanroom uses digest-pinned OCI images as sandbox bases. Images are pulled from any OCI registry and materialized into ext4 rootfs files for the VM backend.
```shell
cleanroom image pull ghcr.io/buildkite/cleanroom-base/debian@sha256:...
cleanroom image ls
cleanroom image rm sha256:...
cleanroom image import ghcr.io/buildkite/cleanroom-base/debian@sha256:... ./rootfs.tar.gz
cleanroom image bump-ref ghcr.io/buildkite/cleanroom-base/debian:latest
# resolve :latest tag to digest and update cleanroom.yaml
```

Recommended defaults are the Debian-based images: ghcr.io/buildkite/cleanroom-base/debian, ghcr.io/buildkite/cleanroom-base/debian-ruby, ghcr.io/buildkite/cleanroom-base/debian-docker, and ghcr.io/buildkite/cleanroom-base/debian-agents.
The Alpine variants remain available as smaller musl-based alternatives: ghcr.io/buildkite/cleanroom-base/alpine, ghcr.io/buildkite/cleanroom-base/alpine-docker, and ghcr.io/buildkite/cleanroom-base/alpine-agents.
Build these locally with mise:
```shell
mise run build:images
# or individually:
mise run build:image:debian
mise run build:image:debian-ruby
mise run build:image:debian-docker
mise run build:image:debian-agents
mise run build:image:alpine
mise run build:image:alpine-docker
mise run build:image:alpine-agents
```

Config path: `$XDG_CONFIG_HOME/cleanroom/config.yaml` (typically `~/.config/cleanroom/config.yaml`).
```shell
cleanroom config init
cleanroom config validate
```

On macOS this defaults `default_backend` to `darwin-vz`. On Linux it defaults to `firecracker`.
If default_backend is omitted or blank in an existing config, Cleanroom falls back to the same host default at load time.
Optional endpoint override precedence is --host, then CLEANROOM_HOST, then control_host from runtime config, then defaults (macOS: user runtime socket; Linux: system socket when present, otherwise user runtime socket).
```yaml
default_backend: firecracker
control_host: "" # optional override for client endpoint resolution
backends:
  firecracker:
    binary_path: firecracker
    kernel_image: "" # auto-managed when unset
    privileged_helper_path: /usr/local/sbin/cleanroom-root-helper
    vcpus: 2
    memory_mib: 1024
    launch_seconds: 30
  darwin-vz:
    kernel_image: "" # auto-managed when unset
    rootfs: "" # derived from sandbox.image.ref when unset
    network:
      mode: filehandle # optional; this is the only supported darwin-vz mode
    vcpus: 2
    memory_mib: 1024
    launch_seconds: 30
```

When `kernel_image` is unset, Cleanroom auto-downloads a managed kernel. Set it explicitly for offline operation.
When rootfs is unset, Cleanroom derives one from sandbox.image.ref and injects the guest runtime. This requires mkfs.ext4 and debugfs on the host (macOS: brew install e2fsprogs).
Linux (firecracker):
- `/dev/kvm` available and writable
- Firecracker binary installed
- `mkfs.ext4` for OCI-to-ext4 materialization
- `sudo -n` access to `/usr/local/sbin/cleanroom-root-helper`

macOS (darwin-vz):

- `cleanroom-darwin-vz` helper signed with the `com.apple.security.virtualization` entitlement
- `mkfs.ext4` and `debugfs` (`brew install e2fsprogs`)
```shell
cleanroom doctor          # check host prerequisites
cleanroom doctor --json   # machine-readable with capabilities map
cleanroom sandbox inspect <sandbox-id>
cleanroom sandbox inspect --last
cleanroom execution ls    # list active executions
cleanroom execution inspect --last
cleanroom execution inspect --sandbox-id <sandbox-id> --last
cleanroom execution inspect <execution-id>
cleanroom status --last   # browse the newest retained execution artifacts
cleanroom status --execution-id <execution-id>
cleanroom version
```

Failure flow:
- `cleanroom exec` and `cleanroom console` keep failure stderr focused on streamed guest output; they do not append `sandbox_id`, `execution_id`, `trace_id`, or `trace_url` footers automatically.
- Use `--print-sandbox-id` when you need to correlate a kept or reused sandbox, and use `cleanroom status --last` or `cleanroom execution inspect ...` for retained diagnostics.
- Attached `cleanroom exec` and `cleanroom console` streams may print warning notices on stderr for policy observations such as blocked connections or disallowed DNS lookups.
- `cleanroom sandbox inspect <sandbox-id>` and `cleanroom sandbox inspect --last` show sandbox state plus `last_execution_id` and `active_execution_id`.
- `cleanroom execution ls` lists active executions by default; add `--all` to include finished executions that are still known to the control plane.
- `cleanroom execution inspect ...` is the control-plane view for execution status, retained stdout/stderr, image metadata, `trace_id`, optional `trace_url`, and observability.
- `cleanroom status ...` is the local artifact view under `$XDG_STATE_HOME/cleanroom/executions`.
Terraform provisioning and private host bootstrap automation now live in the
private sibling repo ../cleanroom-ops.
- research.md -- backend and tooling evaluation notes
- benchmarks.md -- TTI measurement and results
- ci.md -- Buildkite pipeline and base image workflow
- spec.md -- full specification and roadmap
- tls.md -- certificate bootstrap, auto-discovery, HTTPS transport
- gateway.md -- host-side git/registry proxy and credential injection
- remote-access.md -- Tailscale and HTTP listeners
- isolation.md -- enforcement details and persistence behavior
- api.md -- ConnectRPC surface and proto sketch
- observability.md -- OTLP config, local stack, and trace diagnostics
- vsock.md -- guest execution protocol
- backend/firecracker.md -- Firecracker backend design
- backend/darwin-vz.md -- macOS backend and helper design