
👩‍🔬 Cleanroom

Cleanroom runs untrusted code in microVMs with deny-by-default network policy. It is self-hosted, enforces repository-scoped egress rules, and keeps credentials on the host side of the VM boundary.

Agent sandboxing tools are proliferating fast. Most focus on isolation alone. Cleanroom adds policy-controlled network access so you decide exactly what the sandbox can reach.

Why Cleanroom?

Deny-by-default egress. A cleanroom.yaml policy file in your repo controls which hosts the sandbox may reach. Hostname-based rules are currently enforced from observed DNS answers plus destination IP:port, so co-hosted services sharing the same IP:port are not distinguished. Everything else is blocked.

MicroVM isolation. Each sandbox is a hardware-virtualized microVM (Firecracker on Linux, Virtualization.framework on macOS), not a container. A VM boundary is stronger than namespaces, seccomp, or gVisor: a kernel vulnerability in the guest doesn't compromise the host.

Self-hosted. Runs on your infrastructure. Your code and data never leave your machines.

Credentials stay on the host. A host-side gateway rewrites git traffic through Cleanroom-owned routes and keeps upstream credentials on the host side of the boundary. The same gateway embeds content-cache, providing cache-backed handling of git, OCI, Go module, RubyGems, and immutable downloads.

Standard OCI images. Use any OCI image from any registry as your sandbox base. Digest-pinned in policy for reproducibility. No custom VM image format or vendor-specific base images. Same image works across backends.

Docker inside the sandbox. Enable a guest Docker daemon with a single policy flag (services.docker.required: true). Docker Hub pulls are mirrored through the host gateway cache, and you can build and run containers inside the microVM.

Coming soon: broader guest-side package-manager rewrites with lockfile enforcement, caching for more non-Docker-Hub registries, hermetic offline build flows, and richer audit surfaces. See the spec for the full roadmap.

Install

Install the latest release:

curl -fsSL https://raw.githubusercontent.com/buildkite/cleanroom/main/scripts/install.sh | bash

Install a specific version:

curl -fsSL https://raw.githubusercontent.com/buildkite/cleanroom/main/scripts/install.sh | \
  bash -s -- --version vX.Y.Z

By default this installs to /usr/local/bin. Override with --install-dir or CLEANROOM_INSTALL_DIR.
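The install-dir defaulting can be sketched as a shell parameter expansion; the /usr/local/bin default and the CLEANROOM_INSTALL_DIR variable are from the text above:

```shell
# CLEANROOM_INSTALL_DIR overrides the default install directory when set.
INSTALL_DIR="${CLEANROOM_INSTALL_DIR:-/usr/local/bin}"
echo "$INSTALL_DIR"
```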

On macOS, install.sh now prefers the signed, notarized .pkg when using the default install flow. It falls back to the release tarball when you request a custom install directory or helper customization.

Install the locally built binaries from this checkout into /usr/local/bin:

mise run install:global

Quick start

Initialize runtime config and check host prerequisites:

cleanroom config init
cleanroom config validate
cleanroom doctor

Start the server (all CLI commands need a running server):

cleanroom serve &

The server listens on unix://$XDG_RUNTIME_DIR/cleanroom/cleanroom.sock by default. When observability is enabled, cleanroom serve also prints startup status for trace export, sampling, and whether direct trace links are configured.

For observability setup, local Grafana/Tempo/Prometheus development, runtime config examples, and trace diagnostics, see docs/observability.md.

Install as a daemon:

# macOS: installs a user LaunchAgent (user-scope only)
cleanroom daemon install --restart

# Linux (systemd)
sudo cleanroom daemon install --restart

Use cleanroom daemon install --init-config --restart for first-run bootstrap when the runtime config file does not exist yet. Use cleanroom daemon restart --force to start the daemon again if it is currently stopped. On macOS, --system is unsupported; --user is accepted for explicitness.

Manage the daemon lifecycle:

cleanroom daemon status
cleanroom daemon start
cleanroom daemon stop
cleanroom daemon restart
cleanroom daemon uninstall

The system daemon socket is root-owned (unix:///var/run/cleanroom/cleanroom.sock), so client commands against that daemon should be run with sudo unless you configure an alternate endpoint. User-scope daemons listen on the runtime socket (unix://$XDG_RUNTIME_DIR/cleanroom/cleanroom.sock when XDG_RUNTIME_DIR is set).

Run a command in a sandbox:

cleanroom exec -- npm test
cleanroom exec --tty -e OPENAI_API_KEY -- codex app-server

When cleanroom.yaml includes a repository bootstrap block, the top-level commands become repo-aware: Cleanroom resolves the current git remote and local HEAD, materializes that checkout in the sandbox, and starts commands in the configured guest path. Cleanroom no longer auto-detects or auto-wraps commands for mise; if you want mise, run it explicitly, either in the command you execute or in sandbox.dependencies.command or sandbox.services.command, so it can participate in create-time stage caching.
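For example, a repo that wants mise in its create-time bootstrap can invoke it explicitly; the setup task name and the mise.toml key file here are illustrative, not prescribed by Cleanroom:

```yaml
sandbox:
  dependencies:
    # Invoke mise explicitly so it participates in create-time stage caching.
    command: mise install && mise run setup   # "setup" is a hypothetical task
    key:
      files: [mise.toml]
```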

You can also define create-time and per-execution setup:

sandbox:
  dependencies:
    command: bundle install
    key:
      files: [Gemfile.lock]
    # Optional: reuse dependency state across source-only commits when the
    # dependency outputs survive a checkout refresh.
    # reuse: portable
  services:
    docker:
      required: true
    command: |
      docker compose up -d postgres valkey
      bin/rails db:prepare
      docker compose stop postgres valkey
    key:
      files: [docker-compose.yml, db/schema.rb]
  run:
    before: docker compose up -d postgres valkey

Use sandbox.dependencies.command for deterministic repo-local bootstrap, sandbox.services.command for snapshotable on-disk service preparation, and sandbox.run.before for live startup that must happen before each execution. Set sandbox.services.docker.required: true when the services bootstrap needs the guest Docker daemon.

sandbox.dependencies.command and sandbox.services.command support either a shell string or an argv sequence. Prefer the string form unless you specifically need exact argv semantics. sandbox.run.before always runs through sh -lc.
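A sketch of the two forms (the script names are hypothetical):

```yaml
sandbox:
  dependencies:
    # Shell string form: runs through a shell, so chaining and expansion work.
    command: bundle install && bin/setup
  services:
    # Argv sequence form: exact argv semantics, no shell interpretation.
    command: ["bin/prepare-services", "--ci"]
```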

Pre-create a long-running sandbox without running a command:

SANDBOX_ID="$(cleanroom create)"
cleanroom exec --in "$SANDBOX_ID" -- npm run lint

Override the sandbox image per command (remote tag/digest or local Docker image name):

cleanroom sandbox create --image ghcr.io/buildkite/cleanroom-base/debian:latest
cleanroom exec --image ghcr.io/buildkite/cleanroom-base/debian:latest -- npm test
cleanroom console --image my-local-image:dev -- sh
cleanroom exec -e OPENAI_API_KEY -e CODEX_HOME=/workspace/.codex -- codex app-server

Equivalent namespaced command:

cleanroom sandbox create

cleanroom sandbox create stays generic. It does not inspect the local git repository or infer a checkout from cleanroom.yaml.

cleanroom exec and cleanroom console create ephemeral sandboxes by default. Reuse an existing sandbox with --in, or keep a newly created sandbox with --keep.

List sandboxes and run more commands:

cleanroom sandbox ls
cleanroom exec --in <id> -- npm run lint
cleanroom exec --in <id> -- npm run build

Copy a one-off file into or out of a kept sandbox:

cleanroom cp ./fixture.json <id>:/tmp/fixture.json
cleanroom cp <id>:/tmp/result.json ./result.json

Keep a sandbox created by exec:

cleanroom exec --keep -- npm test

Run against a snapshot:

cleanroom exec --from snap_... -- npm test
cleanroom console --from snap_...

Interactive console:

cleanroom console -- bash
cleanroom exec --tty -- bash

Policy file

A cleanroom.yaml in your repo defines the sandbox policy. Cleanroom also checks .buildkite/cleanroom.yaml as a fallback.

version: 1
sandbox:
  image:
    ref: ghcr.io/buildkite/cleanroom-base/debian@sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
  resources:
    vcpus: 4
    memory: 8GiB
    disk: 16GiB
  network:
    default: deny
    allow:
      - host: api.github.com
        ports: [443]
      - host: registry.npmjs.org
        ports: [443]

sandbox.resources is optional and declares backend-neutral minimum workload requirements. vcpus is a positive integer, while memory and disk accept raw bytes or human-friendly sizes such as 4096MiB, 8GiB, or 16GiB. Cleanroom raises the selected backend's runtime config to meet these minimums, but never lowers larger host defaults.
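As a rough sketch of that size grammar in shell (assuming only integer MiB/GiB suffixes plus raw bytes, which may be narrower than what Cleanroom actually accepts):

```shell
# Convert a human-friendly size to bytes (integer MiB/GiB suffixes only).
to_bytes() {
  case "$1" in
    *GiB) echo $(( ${1%GiB} * 1024 * 1024 * 1024 )) ;;
    *MiB) echo $(( ${1%MiB} * 1024 * 1024 )) ;;
    *)    echo "$1" ;;  # raw bytes pass through unchanged
  esac
}

to_bytes 8GiB     # 8589934592
to_bytes 4096MiB  # 4294967296
```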

Enable Docker as a guest service:

sandbox:
  services:
    docker:
      required: true

Validate policy without running anything:

cleanroom policy validate

Repository-aware bootstrap is the default for the top-level commands when you run them from inside a git repository.

The implicit defaults are:

repository:
  remote: origin
  path: /workspace
  submodules: false

Use the optional repository block only to override those defaults or disable the behavior:

repository:
  enabled: false

or:

repository:
  path: /work
  submodules: true

With the default behavior:

  • cleanroom create creates a sandbox with the current repo checked out at local HEAD
  • cleanroom exec -- <cmd> checks out the repo, runs <cmd> from /workspace, and tears the sandbox down unless --keep is set
  • cleanroom exec --tty -- <cmd> runs <cmd> from /workspace with a real tty and tears the sandbox down unless --keep is set
  • cleanroom console -- bash opens a shell in /workspace using the same interactive tty transport and tears the sandbox down unless --keep is set
  • dirty working trees print a warning and use committed HEAD; uncommitted changes are not copied in
  • cleanroom sandbox create remains explicit and repo-agnostic

Repository bootstrap needs the remote host in sandbox.network.allow, for example:

sandbox:
  network:
    default: deny
    allow:
      - host: github.com
        ports: [443]

Backend support

Host OS  Backend      Status               Notes
Linux    firecracker  Full support         Persistent sandboxes, per-sandbox TAP + guest IP identity, file copy, egress allowlist enforcement
macOS    darwin-vz    Supported with gaps  Persistent sandboxes, file copy, filehandle networking with allowlist egress filtering, no TAP parity

Backend capabilities are exposed in cleanroom doctor --json under capabilities. See isolation model for enforcement and persistence details. File copy uses streaming file primitives, and the API also exposes path and archive primitives for larger sync and diff workflows.

Network model differs significantly by backend:

  • firecracker creates a dedicated TAP interface and host/guest IP pair per sandbox, which enables host-side identity and firewall enforcement.
  • darwin-vz uses filehandle networking on macOS so deny-by-default policies can use the Cleanroom-owned gateway path for allowlisted egress.
  • darwin-vz still does not expose Firecracker-style TAP devices or host firewall enforcement semantics.

Select a backend explicitly:

cleanroom exec --backend firecracker -- npm test
cleanroom exec --backend darwin-vz -- npm test

Architecture

  • Server: cleanroom serve (required for all operations)
  • Client: CLI and ConnectRPC clients
  • Transport: unix socket (default), HTTPS with mTLS, or Tailscale
  • RPC services: cleanroom.v1.SandboxService, cleanroom.v1.ExecutionService (API design)

Go Client (Public API)

Use github.com/buildkite/cleanroom/client from external Go modules.

import (
  "context"
  "os"

  "github.com/buildkite/cleanroom/client"
)

func example() error {
  c := client.Must(client.NewFromEnv())

  sb, err := c.EnsureSandbox(context.Background(), "thread:abc123", client.EnsureSandboxOptions{
    Backend: "firecracker",
    Policy: client.PolicyFromAllowlist(
      "ghcr.io/buildkite/cleanroom-base/debian@sha256:...",
      "sha256:...",
      client.Allow("api.github.com", 443),
      client.Allow("registry.npmjs.org", 443),
    ),
  })
  if err != nil { return err }

  result, err := c.ExecAndWait(context.Background(), sb.ID, []string{"bash", "-lc", "echo hello"}, client.ExecOptions{
    Stdout: os.Stdout,
    Stderr: os.Stderr,
  })
  if err != nil { return err }
  _ = result
  return nil
}

client exposes:

  • client.Client for RPC calls
  • protobuf request/response/event types (for example client.CreateExecutionRequest)
  • status enums (client.SandboxStatus_*, client.ExecutionStatus_*)
  • ergonomic wrappers (client.NewFromEnv, client.EnsureSandbox, client.ExecAndWait)

client.ExecAndWait is the batch-oriented helper. Interactive attach flows use the lower-level execution RPCs (CreateExecution, AttachExecution, and related methods).

Images

Cleanroom uses digest-pinned OCI images as sandbox bases. Images are pulled from any OCI registry and materialized into ext4 rootfs files for the VM backend.

cleanroom image pull ghcr.io/buildkite/cleanroom-base/debian@sha256:...
cleanroom image ls
cleanroom image rm sha256:...
cleanroom image import ghcr.io/buildkite/cleanroom-base/debian@sha256:... ./rootfs.tar.gz
cleanroom image bump-ref ghcr.io/buildkite/cleanroom-base/debian:latest
                           # resolve :latest tag to digest and update cleanroom.yaml

Recommended defaults are the Debian-based images: ghcr.io/buildkite/cleanroom-base/debian, ghcr.io/buildkite/cleanroom-base/debian-ruby, ghcr.io/buildkite/cleanroom-base/debian-docker, and ghcr.io/buildkite/cleanroom-base/debian-agents. The Alpine variants remain available as smaller musl-based alternatives: ghcr.io/buildkite/cleanroom-base/alpine, ghcr.io/buildkite/cleanroom-base/alpine-docker, and ghcr.io/buildkite/cleanroom-base/alpine-agents.

Build these locally with mise:

mise run build:images
# or individually:
mise run build:image:debian
mise run build:image:debian-ruby
mise run build:image:debian-docker
mise run build:image:debian-agents
mise run build:image:alpine
mise run build:image:alpine-docker
mise run build:image:alpine-agents

Runtime config

Config path: $XDG_CONFIG_HOME/cleanroom/config.yaml (typically ~/.config/cleanroom/config.yaml).

cleanroom config init
cleanroom config validate

On macOS, default_backend defaults to darwin-vz; on Linux, to firecracker. If default_backend is omitted or blank in an existing config, Cleanroom falls back to the same host default at load time.

Optional endpoint override precedence is --host, then CLEANROOM_HOST, then control_host from runtime config, then defaults (macOS: user runtime socket; Linux: system socket when present, otherwise user runtime socket).
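That precedence chain can be sketched in shell. resolve_endpoint is illustrative, not a real Cleanroom function; the Linux system-socket probe is elided, and the /tmp fallback when XDG_RUNTIME_DIR is unset is an assumption:

```shell
# Illustrative client endpoint resolution: --host flag, then
# CLEANROOM_HOST, then control_host from runtime config, then the
# host default (user runtime socket).
resolve_endpoint() {
  flag_host="$1"            # value of --host, if given
  config_control_host="$2"  # control_host from config.yaml, if set
  if [ -n "$flag_host" ]; then
    echo "$flag_host"
  elif [ -n "${CLEANROOM_HOST:-}" ]; then
    echo "$CLEANROOM_HOST"
  elif [ -n "$config_control_host" ]; then
    echo "$config_control_host"
  else
    echo "unix://${XDG_RUNTIME_DIR:-/tmp}/cleanroom/cleanroom.sock"
  fi
}
```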

default_backend: firecracker
control_host: ""             # optional override for client endpoint resolution
backends:
  firecracker:
    binary_path: firecracker
    kernel_image: ""    # auto-managed when unset
    privileged_helper_path: /usr/local/sbin/cleanroom-root-helper
    vcpus: 2
    memory_mib: 1024
    launch_seconds: 30
  darwin-vz:
    kernel_image: ""    # auto-managed when unset
    rootfs: ""          # derived from sandbox.image.ref when unset
    network:
      mode: filehandle  # optional; this is the only supported darwin-vz mode
    vcpus: 2
    memory_mib: 1024
    launch_seconds: 30

When kernel_image is unset, Cleanroom auto-downloads a managed kernel. Set it explicitly for offline operation.

When rootfs is unset, Cleanroom derives one from sandbox.image.ref and injects the guest runtime. This requires mkfs.ext4 and debugfs on the host (macOS: brew install e2fsprogs).

Host requirements

Linux (firecracker):

  • /dev/kvm available and writable
  • Firecracker binary installed
  • mkfs.ext4 for OCI-to-ext4 materialization
  • sudo -n access to /usr/local/sbin/cleanroom-root-helper

macOS (darwin-vz):

  • cleanroom-darwin-vz helper signed with com.apple.security.virtualization entitlement
  • mkfs.ext4 and debugfs (brew install e2fsprogs)

Diagnostics

cleanroom doctor              # check host prerequisites
cleanroom doctor --json       # machine-readable with capabilities map
cleanroom sandbox inspect <sandbox-id>
cleanroom sandbox inspect --last
cleanroom execution ls        # list active executions
cleanroom execution inspect --last
cleanroom execution inspect --sandbox-id <sandbox-id> --last
cleanroom execution inspect <execution-id>
cleanroom status --last       # browse the newest retained execution artifacts
cleanroom status --execution-id <execution-id>
cleanroom version

Failure flow:

  • cleanroom exec and cleanroom console keep failure stderr focused on streamed guest output; they do not append sandbox_id, execution_id, trace_id, or trace_url footers automatically.
  • Use --print-sandbox-id when you need to correlate a kept or reused sandbox, and use cleanroom status --last or cleanroom execution inspect ... for retained diagnostics.
  • Attached cleanroom exec and cleanroom console streams may print warning notices on stderr for policy observations such as blocked connections or disallowed DNS lookups.
  • cleanroom sandbox inspect <sandbox-id> and cleanroom sandbox inspect --last show sandbox state plus last_execution_id and active_execution_id.
  • cleanroom execution ls lists active executions by default; add --all to include finished executions that are still known to the control plane.
  • cleanroom execution inspect ... is the control-plane view for execution status, retained stdout/stderr, image metadata, trace_id, optional trace_url, and observability.
  • cleanroom status ... is the local artifact view under $XDG_STATE_HOME/cleanroom/executions.
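The artifact directory follows the XDG state convention; a minimal sketch of the path resolution (the ~/.local/state fallback is the standard XDG default, assumed here rather than confirmed for Cleanroom):

```shell
# Where `cleanroom status` reads retained execution artifacts from.
ARTIFACT_DIR="${XDG_STATE_HOME:-$HOME/.local/state}/cleanroom/executions"
echo "$ARTIFACT_DIR"
```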

Further reading

Terraform provisioning and private host bootstrap automation now live in the private sibling repo ../cleanroom-ops.
