Skip to content

Commit 08001ca

Browse files
authored
fix(docker): harden supervisor startup and gateway routing (NVIDIA#1128)
1 parent 721c39f commit 08001ca

22 files changed

Lines changed: 711 additions & 209 deletions

File tree

Cargo.lock

Lines changed: 1 addition & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

architecture/build-containers.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@ The gateway runs the control plane API server. It is deployed as a StatefulSet i
99
- **Docker target**: `gateway` in `deploy/docker/Dockerfile.images`
1010
- **Registry**: `ghcr.io/nvidia/openshell/gateway:latest`
1111
- **Pulled when**: Cluster startup (the Helm chart triggers the pull)
12-
- **Entrypoint**: `openshell-gateway --port 8080` (gRPC + HTTP, mTLS)
12+
- **Entrypoint**: `openshell-gateway --bind-address 0.0.0.0 --port 8080` (gRPC + HTTP, mTLS)
1313

1414
## Cluster (`openshell/cluster`)
1515

architecture/gateway-security.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -304,9 +304,9 @@ Traffic flows through several layers from the host to the gateway process:
304304
| Container | `30051` | Hardcoded in `crates/openshell-bootstrap/src/docker.rs` |
305305
| k3s NodePort | `30051` | `deploy/helm/openshell/values.yaml` (`service.nodePort`) |
306306
| k3s Service | `8080` | `deploy/helm/openshell/values.yaml` (`service.port`) |
307-
| Server bind | `8080` | `--port` flag / `OPENSHELL_SERVER_PORT` env var |
307+
| Server bind | `0.0.0.0:8080` in deployed containers | `--bind-address 0.0.0.0 --port 8080` / `OPENSHELL_BIND_ADDRESS` + `OPENSHELL_SERVER_PORT` |
308308

309-
Docker maps `host_port → 30051/tcp`. Inside k3s, the NodePort service maps `30051 → 8080 (pod port)`. The server binds `0.0.0.0:8080`.
309+
Docker maps `host_port → 30051/tcp`. Inside k3s, the NodePort service maps `30051 → 8080 (pod port)`. The deployed gateway container binds `0.0.0.0:8080` explicitly.
310310

311311
## Security Model Summary
312312

architecture/gateway.md

Lines changed: 12 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -107,23 +107,25 @@ The gateway boots in `cli::run_cli` (`crates/openshell-server/src/cli.rs`) and p
107107
- `docker` constructs `openshell-driver-docker` in-process and manages local containers labeled with the configured sandbox namespace.
108108
- `vm` spawns the standalone `openshell-driver-vm` binary as a local compute-driver process, resolves it from `--driver-dir`, conventional libexec install paths, or a sibling of the gateway binary, connects to it over a Unix domain socket, and keeps the libkrun/rootfs runtime out of the gateway binary.
109109
3. Build `ServerState` (shared via `Arc<ServerState>` across all handlers), including a fresh `SupervisorSessionRegistry`.
110-
4. **Spawn background tasks**:
110+
4. Resume persisted sandboxes that were stopped during the previous gateway shutdown.
111+
5. **Spawn background tasks**:
111112
- `ComputeRuntime::spawn_watchers` -- consumes the compute-driver watch stream, republishes platform events, and runs a periodic `ListSandboxes` snapshot reconcile.
112113
- `ssh_tunnel::spawn_session_reaper` -- sweeps expired or revoked SSH session tokens from the store hourly.
113114
- `supervisor_session::spawn_relay_reaper` -- sweeps orphaned pending relay channels every 30 seconds.
114-
5. Create `MultiplexService`.
115-
6. Bind `TcpListener` on `config.bind_address`.
116-
7. Optionally create `TlsAcceptor` from cert/key files.
117-
8. Enter the accept loop: for each connection, spawn a tokio task that optionally performs a TLS handshake, then calls `MultiplexService::serve()`.
115+
6. Create `MultiplexService`.
116+
7. Bind the primary gateway listener and any compute-driver requested listeners. Docker requests the Docker bridge gateway address with the normal gateway port, so sandbox containers can call back over the bridge without joining the host network.
117+
8. Bind optional health and metrics listeners.
118+
9. Optionally create `TlsAcceptor` from cert/key files.
119+
10. Spawn a task per gateway listener. Each accepted connection optionally performs a TLS handshake, then calls `MultiplexService::serve()`.
118120

119121
## Configuration
120122

121123
All configuration is via CLI flags with environment variable fallbacks. The `--db-url` flag is required.
122124

123125
| Flag | Env Var | Default | Description |
124126
|------|---------|---------|-------------|
127+
| `--bind-address` | `OPENSHELL_BIND_ADDRESS` | `127.0.0.1` | IP address for gateway, health, and metrics listeners. Container deployments pass `0.0.0.0` explicitly. |
125128
| `--port` | `OPENSHELL_SERVER_PORT` | `8080` | TCP listen port |
126-
| `--bind-address` | `OPENSHELL_BIND_ADDRESS` | `0.0.0.0` | Address for the main gateway listener |
127129
| `--log-level` | `OPENSHELL_LOG_LEVEL` | `info` | Tracing log level filter |
128130
| `--tls-cert` | `OPENSHELL_TLS_CERT` | None | Path to PEM certificate file |
129131
| `--tls-key` | `OPENSHELL_TLS_KEY` | None | Path to PEM private key file |
@@ -136,6 +138,7 @@ All configuration is via CLI flags with environment variable fallbacks. The `--d
136138
| `--sandbox-image` | `OPENSHELL_SANDBOX_IMAGE` | None | Default container image for sandbox pods |
137139
| `--grpc-endpoint` | `OPENSHELL_GRPC_ENDPOINT` | None | gRPC endpoint reachable from within the cluster (for supervisor callbacks) |
138140
| `--drivers` | `OPENSHELL_DRIVERS` | `kubernetes` | Compute backend to use. Current options are `kubernetes`, `docker`, and `vm`. |
141+
| `--docker-network-name` | `OPENSHELL_DOCKER_NETWORK_NAME` | `openshell-docker` | Docker bridge network that local Docker sandboxes join |
139142
| `--vm-driver-state-dir` | `OPENSHELL_VM_DRIVER_STATE_DIR` | `target/openshell-vm-driver` | Host directory for VM sandbox rootfs, console logs, and runtime state |
140143
| `--driver-dir` | `OPENSHELL_DRIVER_DIR` | unset | Override directory for `openshell-driver-vm`. When unset, the gateway searches `~/.local/libexec/openshell`, `/usr/libexec/openshell`, `/usr/local/libexec/openshell`, `/usr/local/libexec`, then a sibling binary. |
141144
| `--vm-krun-log-level` | `OPENSHELL_VM_KRUN_LOG_LEVEL` | `1` | libkrun log level for VM helper processes |
@@ -608,6 +611,9 @@ The gateway reaches the sandbox exclusively through the supervisor-initiated `Co
608611
The Docker driver (`crates/openshell-driver-docker/src/lib.rs`) is an in-process compute backend for local standalone gateways. It creates one Docker container per sandbox, labels each container with `openshell.ai/managed-by=openshell`, `openshell.ai/sandbox-id`, `openshell.ai/sandbox-name`, and `openshell.ai/sandbox-namespace`, and bind-mounts a Linux `openshell-sandbox` supervisor binary into the container.
609612

610613
- **Create**: Pulls or validates the sandbox image according to `sandbox_image_pull_policy`, creates a labeled container, mounts the supervisor binary and optional TLS material, and starts the container with the supervisor as entrypoint.
614+
- **Bridge networking**: Ensures a local Docker bridge network exists (`openshell-docker` by default) and starts every sandbox container on that network instead of using `network_mode=host`.
615+
- **Gateway callback routing**: On native Linux Docker, injects `host.openshell.internal` with the bridge gateway IP and reports that bridge gateway IP plus the normal gateway port to `run_server()` as an extra listener. If the primary listener already binds the wildcard address for that port, the extra address is covered and is not bound a second time. On Docker Desktop, the bridge gateway IP belongs to Docker Desktop's VM rather than the macOS/Windows host, so the driver maps `host.openshell.internal` to Docker's `host-gateway` alias and does not request an extra listener. `OPENSHELL_ENDPOINT` inside Docker sandboxes uses the configured scheme and points at `host.openshell.internal:<gateway-port>` in both cases.
616+
- **Environment ownership**: Merges template and spec environment first, then overwrites driver-owned supervisor variables, including `PATH`, `OPENSHELL_ENDPOINT`, `OPENSHELL_SANDBOX_ID`, `OPENSHELL_SSH_SOCKET_PATH`, and `OPENSHELL_SANDBOX_COMMAND`. This keeps privileged supervisor setup from resolving helper binaries through a user-controlled search path.
611617
- **List/Get/Watch**: Reads labeled containers in the configured sandbox namespace and derives driver-native sandbox status from Docker state plus supervisor relay readiness.
612618
- **Stop**: Stops the matching labeled container without deleting it.
613619
- **Delete**: Force-removes the matching labeled container.

architecture/sandbox-custom-containers.md

Lines changed: 5 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@ The `--from` flag accepts four kinds of input:
99
| Input | Example | Behavior |
1010
|-------|---------|----------|
1111
| **Community sandbox name** | `--from openclaw` | Resolves to `ghcr.io/nvidia/openshell-community/sandboxes/openclaw:latest` |
12-
| **Dockerfile path** | `--from ./Dockerfile` | Builds the image, pushes it into the cluster, then creates the sandbox |
12+
| **Dockerfile path** | `--from ./Dockerfile` | Builds the image into the host Docker daemon, then creates the sandbox |
1313
| **Directory with Dockerfile** | `--from ./my-sandbox/` | Uses the directory as the build context |
1414
| **Full image reference** | `--from myregistry.com/img:tag` | Uses the image directly |
1515

@@ -33,8 +33,7 @@ The community registry prefix defaults to `ghcr.io/nvidia/openshell-community/sa
3333
When `--from` points to a Dockerfile or directory, the CLI:
3434

3535
1. Builds the image locally via the Docker daemon (respecting `.dockerignore`).
36-
2. Pushes it into the cluster's containerd runtime using `docker save` / `ctr import`.
37-
3. Creates the sandbox with the resulting image tag.
36+
2. Creates the sandbox with the resulting local image tag.
3837

3938
## How It Works
4039

@@ -107,18 +106,18 @@ The `openshell-sandbox` supervisor adapts to arbitrary environments:
107106
|----------|-----------|
108107
| Unified `--from` flag | Single entry point for community names, Dockerfiles, directories, and image refs — removes the need to know registry paths |
109108
| Community name resolution | Bare names like `openclaw` expand to the GHCR community registry, making the common case simple |
110-
| Auto build+push for Dockerfiles | Eliminates the two-step `image push` + `create` workflow for local development |
109+
| Auto build for Dockerfiles | Eliminates the two-step `docker build` + `create` workflow for local Docker-backed development |
111110
| `OPENSHELL_COMMUNITY_REGISTRY` env var | Allows organizations to host their own community sandbox registry |
112111
| hostPath side-load | Supervisor binary lives on the node filesystem — no init container, no emptyDir, no extra image pull. Faster pod startup. |
113112
| Read-only mount in agent | The supervisor binary is mounted read-only, and the startup seccomp prelude blocks the remount syscalls that would otherwise reopen it for writes once privileged bootstrap has completed. |
114113
| Command override | Ensures `openshell-sandbox` is the entrypoint regardless of the image's default CMD |
115114
| Clear `run_as_user/group` for custom images | Prevents startup failure when the image lacks the default `sandbox` user |
116115
| Non-fatal log file init | `/var/log/openshell.log` may be unwritable in arbitrary images; falls back to stdout |
117-
| `docker save` / `ctr import` for push | Avoids requiring a registry for local dev; images land directly in the k3s containerd store |
116+
| Host Docker image store | Dockerfile sources build into the host Docker daemon and are referenced by local image tag. |
118117
| Optional `iptables` for bypass detection | Core network isolation works via routing alone (`iproute2`); `iptables` only adds fast-fail (`ECONNREFUSED`) and diagnostic LOG entries. Making it optional avoids hard failures in minimal images that lack `iptables` while giving better UX when it is available. |
119118

120119
## Limitations
121120

122121
- Distroless / `FROM scratch` images are not supported (the supervisor needs glibc and `/proc`)
123122
- Missing `iproute2` (or required capabilities) blocks startup in proxy mode because namespace isolation is mandatory
124-
- The supervisor binary must be present on the k3s node at `/opt/openshell/bin/openshell-sandbox` (embedded in the cluster image at build time)
123+
- Kubernetes gateways require images to be available to the cluster runtime. Dockerfile sources build into the host Docker daemon only; use a registry image reference when the selected gateway cannot access the host Docker image store.

crates/openshell-bootstrap/src/build.rs

Lines changed: 4 additions & 32 deletions
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,10 @@
11
// SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
22
// SPDX-License-Identifier: Apache-2.0
33

4-
//! Build and push container images into a k3s gateway.
4+
//! Build container images from Dockerfiles.
55
//!
66
//! This module wraps bollard's `build_image()` API to build a container image
7-
//! from a Dockerfile and build context, then reuses the existing push pipeline
8-
//! to import the image into the gateway's containerd runtime.
7+
//! from a Dockerfile and build context into the local Docker daemon.
98
109
use std::collections::HashMap;
1110
use std::path::Path;
@@ -15,48 +14,21 @@ use bollard::query_parameters::BuildImageOptionsBuilder;
1514
use futures::StreamExt;
1615
use miette::{IntoDiagnostic, Result, WrapErr};
1716

18-
use crate::constants::container_name;
19-
use crate::push::push_local_images;
20-
21-
/// Build a container image from a Dockerfile and push it into the gateway.
22-
///
23-
/// This is used by `openshell sandbox create --from <Dockerfile>`. It:
24-
/// 1. Creates a tar archive of the build context directory.
25-
/// 2. Sends it to the local Docker daemon via `build_image()`.
26-
/// 3. Pushes the resulting image into the gateway's containerd via the
27-
/// existing `push_local_images()` pipeline.
17+
/// Build a container image from a Dockerfile into the local Docker daemon.
2818
#[allow(clippy::implicit_hasher)]
29-
pub async fn build_and_push_image(
19+
pub async fn build_local_image(
3020
dockerfile_path: &Path,
3121
tag: &str,
3222
context_dir: &Path,
33-
gateway_name: &str,
3423
build_args: &HashMap<String, String>,
3524
on_log: &mut impl FnMut(String),
3625
) -> Result<()> {
37-
// 1. Build the image locally.
3826
on_log(format!(
3927
"Building image {tag} from {}",
4028
dockerfile_path.display()
4129
));
4230
build_image(dockerfile_path, tag, context_dir, build_args, on_log).await?;
4331
on_log(format!("Built image {tag}"));
44-
45-
// 2. Push into the gateway.
46-
on_log(format!(
47-
"Pushing image {tag} into gateway \"{gateway_name}\""
48-
));
49-
// Use the long-timeout Docker client so `docker save` of multi-GB images
50-
// doesn't trip the 120s bollard default mid-stream. Override with
51-
// OPENSHELL_DOCKER_TIMEOUT_SECS=<secs>.
52-
let local_docker = crate::docker::connect_local_for_large_transfers()
53-
.into_diagnostic()
54-
.wrap_err("failed to connect to local Docker daemon")?;
55-
let container = container_name(gateway_name);
56-
let images: Vec<&str> = vec![tag];
57-
push_local_images(&local_docker, &local_docker, &container, &images, on_log).await?;
58-
59-
on_log(format!("Image {tag} is available in the gateway."));
6032
Ok(())
6133
}
6234

crates/openshell-cli/src/run.rs

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -2877,19 +2877,18 @@ async fn build_from_dockerfile(
28772877
eprintln!(" {msg}");
28782878
};
28792879

2880-
openshell_bootstrap::build::build_and_push_image(
2880+
openshell_bootstrap::build::build_local_image(
28812881
dockerfile,
28822882
&tag,
28832883
context,
2884-
gateway_name,
28852884
&HashMap::new(),
28862885
&mut on_log,
28872886
)
28882887
.await?;
28892888

28902889
eprintln!();
28912890
eprintln!(
2892-
"{} Image {} is available in the gateway.",
2891+
"{} Image {} is available in the local Docker daemon.",
28932892
"✓".green().bold(),
28942893
tag.cyan(),
28952894
);

crates/openshell-core/src/config.rs

Lines changed: 10 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,9 @@ pub const DEFAULT_SSH_HANDSHAKE_SKEW_SECS: u64 = 300;
3131
/// Default Podman bridge network name.
3232
pub const DEFAULT_NETWORK_NAME: &str = "openshell";
3333

34+
/// Default Docker bridge network name for local sandboxes.
35+
pub const DEFAULT_DOCKER_NETWORK_NAME: &str = "openshell-docker";
36+
3437
/// Default OCI image for the openshell-sandbox supervisor binary.
3538
pub const DEFAULT_SUPERVISOR_IMAGE: &str = "openshell/supervisor:latest";
3639

@@ -515,7 +518,7 @@ impl Config {
515518
}
516519

517520
fn default_bind_address() -> SocketAddr {
518-
"0.0.0.0:8080".parse().expect("valid default address")
521+
"127.0.0.1:8080".parse().expect("valid default address")
519522
}
520523

521524
fn default_log_level() -> String {
@@ -589,6 +592,12 @@ mod tests {
589592
assert!(err.contains("unsupported compute driver 'firecracker'"));
590593
}
591594

595+
#[test]
596+
fn config_defaults_to_loopback_bind_address() {
597+
let expected: SocketAddr = "127.0.0.1:8080".parse().expect("valid address");
598+
assert_eq!(Config::new(None).bind_address, expected);
599+
}
600+
592601
#[test]
593602
fn config_new_disables_health_bind_by_default() {
594603
let cfg = Config::new(None);

crates/openshell-driver-docker/Cargo.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,7 @@ bytes = { workspace = true }
2222
bollard = { version = "0.20" }
2323
tar = "0.4"
2424
tempfile = "3"
25+
url = { workspace = true }
2526

2627
[lints]
2728
workspace = true

0 commit comments

Comments
 (0)