System extension support for managed Docker daemons (Phase 1) and K3s Kubernetes clusters (Phase 2). Phase 3 will add kubeadm + HA control plane.
A NodeInstance with the appropriate runtime module assigned (docker-engine, k3s-server, or k3s-agent) auto-bootstraps the daemon on the next agent heartbeat tick. The agent installs the binary, generates local TLS material (Docker) or captures k3s-generated state (K3s), and posts a runtime/handshake to the platform. The platform creates the corresponding Devops::DockerHost or Devops::KubernetesCluster row. All daemon API traffic flows over the SDWAN overlay /128 — no public daemon sockets.
sequenceDiagram
actor Op as Operator
participant Agent as powernode-agent
participant Plat as Platform
participant CA as InternalCaService
Op->>Plat: 1. assign docker-engine<br/>module to Node
Plat-->>Agent: heartbeat picks up<br/>new assignment
Agent->>Agent: 2. install docker-ce<br/>generate Ed25519 server keypair
Agent->>Plat: 3. POST /runtime/handshake<br/>runtime=docker phase=wants_cert<br/>csr_pem=...
Plat->>CA: 4. sign CSR
CA-->>Plat: signed cert + chain
Plat->>Plat: create managed<br/>Devops::DockerHost<br/>bound to instance
Plat-->>Agent: 5. cert returned
Agent->>Agent: 6. write daemon.json with TLS<br/>+ listen on SDWAN /128<br/>systemctl start docker
Agent->>Plat: 7. POST phase=ready<br/>version=25.0.3
Plat->>Plat: 8. host status:<br/>pending → connected
Op->>Plat: 9. docker_list_containers
Plat->>Agent: over SDWAN /128 (mTLS)
Agent-->>Op: container list
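
The Docker flow above moves the managed host through a small set of statuses as handshake phases arrive. A minimal sketch of that mapping, assuming the phase names (`wants_cert`, `ready`, `stopped`) and statuses (`pending`, `connected`, `disconnected`) used in this document; the type and function names are hypothetical, not the platform's actual model code:

```typescript
// Hypothetical illustration: phase and status names are taken from this
// document, the types and function are not part of the platform API.
type DockerHandshakePhase = "wants_cert" | "ready" | "stopped";
type DockerHostStatus = "pending" | "connected" | "disconnected";

// Map the phase the agent just posted to the resulting host status.
function dockerHostStatusAfterPhase(phase: DockerHandshakePhase): DockerHostStatus {
  switch (phase) {
    case "wants_cert":
      // CSR signed, host row created, daemon not yet started.
      return "pending";
    case "ready":
      // Daemon is listening on the SDWAN /128 with TLS.
      return "connected";
    case "stopped":
      // Module unassigned; the row is kept but marked disconnected.
      return "disconnected";
  }
}
```

A host that stays `pending` therefore means the platform has not yet seen `phase=ready` from the agent.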
sequenceDiagram
actor Op as Operator
participant A1 as Agent<br/>(bootstrap server)
participant Plat as Platform
participant VIP as Sdwan::VirtualIp
participant A2 as Agent<br/>(worker)
Op->>Plat: 1. assign k3s-server module
Plat-->>A1: heartbeat picks up
A1->>A1: 2. apt install k3s,<br/>systemctl start k3s
A1->>A1: 3. k3s generates kubeconfig<br/>+ tokens
A1->>Plat: 4. POST phase=bootstrap
Plat->>VIP: allocate api_endpoint VIP<br/>(slice 3)
VIP-->>Plat: VIP /128 + holders=[A1]
Plat->>Plat: 5. create Devops::KubernetesCluster<br/>+ bootstrap node row<br/>status=bootstrapping<br/>api_endpoint=VIP /128
A1->>Plat: 6. POST phase=ready
Plat->>Plat: cluster status=active
Op->>Plat: 7. assign k3s-agent module
Plat-->>A2: heartbeat picks up
A2->>Plat: 8. POST phase=join_request
Plat-->>A2: 9. api_endpoint VIP /128 + token
A2->>A2: 10. write systemd drop-in<br/>K3S_URL + K3S_TOKEN<br/>systemctl start k3s-agent
A2->>Plat: 11. POST phase=ready
Plat->>Plat: node status=active
Op->>Plat: 12. download kubeconfig
note over VIP,A1: If bootstrap server drains:<br/>VIP migrates to next holder<br/>workers' K3S_URL keeps resolving
In accounts with more than one cluster, k3s-agent module assignments must
carry metadata.target_cluster_id so the agent joins the right cluster.
flowchart TB
Op[Operator]
subgraph A["Account"]
T1[Template: cluster-a-worker<br/>module: k3s-agent<br/>config.target_cluster_id = C1]
T2[Template: cluster-b-worker<br/>module: k3s-agent<br/>config.target_cluster_id = C2]
subgraph C1["KubernetesCluster C1"]
S1[k3s-server #1]
W1[k3s-agent worker]
end
subgraph C2["KubernetesCluster C2"]
S2[k3s-server #2]
W2[k3s-agent worker]
end
end
Op --> T1 --> W1
Op --> T2 --> W2
W1 -. "joins via VIP" .-> S1
W2 -. "joins via VIP" .-> S2
Without target_cluster_id, the agent picks the first cluster the API
returns — wrong in multi-cluster accounts.
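
The selection rule can be sketched as follows. This is an illustration under stated assumptions: `ClusterSummary` and `pickClusterToJoin` are hypothetical names, and rejecting an unknown ID (rather than falling back) is an assumption about the desired behavior.

```typescript
// Hypothetical shape of a cluster entry returned by the platform.
interface ClusterSummary {
  id: string;
  name: string;
}

// Pick the cluster a k3s-agent should join.
// With no target ID, the first listed cluster wins, which is the risky
// default in multi-cluster accounts.
function pickClusterToJoin(clusters: ClusterSummary[], targetClusterId?: string): ClusterSummary {
  const first = clusters[0];
  if (first === undefined) {
    throw new Error("no clusters available to join");
  }
  if (targetClusterId === undefined) {
    return first;
  }
  const match = clusters.filter((c) => c.id === targetClusterId)[0];
  if (match === undefined) {
    // Assumption: an unknown target is an error, never a silent fallback.
    throw new Error(`target cluster ${targetClusterId} not found`);
  }
  return match;
}
```

With clusters `C1` and `C2` listed in that order, omitting the target always yields `C1`, which is why worker templates for `C2` need the explicit ID.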
| Module | Variety | Packages | Purpose |
|---|---|---|---|
| `docker-engine` | subscription | docker-ce, docker-ce-cli, containerd.io, docker-buildx-plugin, docker-compose-plugin | Docker Engine binary install |
| `k3s-server` | subscription | k3s (single binary; bundled containerd + runc) | K3s control plane |
| `k3s-agent` | subscription | k3s (same binary; agent mode) | K3s worker node |
All three live in the Container Runtimes NodeModuleCategory (position=70, between Network Overlay at 60 and userland at 90+).
- The NodeInstance must have at least one `Sdwan::Peer` with an assigned overlay /128. The daemon binds to that address only; provisioning fails with `MissingSdwanPeerError` if no peer is attached.
- The instance's `Node` must have the `docker-engine` module assigned.
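
A minimal sketch of these two preconditions, useful as a pre-flight check before assigning the module. The `SdwanPeerInfo` shape and the function name are hypothetical; only the `MissingSdwanPeerError` name and the `docker-engine` module name come from this document.

```typescript
// Hypothetical peer shape: overlayAddress is null until a /128 is assigned.
interface SdwanPeerInfo {
  overlayAddress: string | null;
}

class MissingSdwanPeerError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "MissingSdwanPeerError";
  }
}

// Return the overlay address the daemon would bind to, or throw when a
// prerequisite is not met.
function resolveDockerBindAddress(peers: SdwanPeerInfo[], assignedModules: string[]): string {
  let address: string | null = null;
  for (const peer of peers) {
    if (peer.overlayAddress !== null) {
      address = peer.overlayAddress;
      break;
    }
  }
  if (address === null) {
    throw new MissingSdwanPeerError("no SDWAN peer with an assigned overlay /128");
  }
  if (assignedModules.indexOf("docker-engine") === -1) {
    throw new Error("docker-engine module is not assigned to the Node");
  }
  return address;
}
```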
# 1. Assign the module via existing API (admin UI or MCP)
curl -X POST /api/v1/system/node_module_assignments \
-H "Authorization: Bearer $JWT" \
-d '{"node_id": "<node-uuid>", "node_module_name": "docker-engine"}'
# 2. Wait ~60s for agent reconcile loop. The managed host appears at:
curl /api/v1/system/managed_docker_hosts -H "Authorization: Bearer $JWT"

platform.system_provision_docker_runtime({
node_instance_id: "0193cdef-..."
})
// → { success: true, host: { id, name, status, api_endpoint, ... } }

This path is what the System Concierge agent invokes when an operator chats "provision docker on instance X".
# Via REST
curl /api/v1/system/managed_docker_hosts/<host-id> -H "Authorization: Bearer $JWT"
# Via MCP — list managed hosts (excludes external operator-registered hosts)
platform.system_list_managed_docker_hosts()
# Inspect containers running on the host (uses encrypted SDWAN overlay)
platform.docker_list_containers({ host_id: "<host-id>" })

Assign k3s-server to the first NodeInstance. The agent installs k3s, captures kubeconfig + tokens from /etc/rancher/k3s/k3s.yaml + /var/lib/rancher/k3s/server/node-token, and posts phase=bootstrap. Cluster appears within ~60s.
platform.kubernetes_list_clusters()
// → { clusters: [{ id, name, flavor: "k3s", status: "bootstrapping", ... }] }

Assign k3s-agent to additional NodeInstances. The agent posts phase=join_request, gets the cluster's api_endpoint + agent_token, writes a systemd drop-in at /etc/systemd/system/k3s-agent.service.d/override.conf with K3S_URL + K3S_TOKEN, then starts k3s-agent.service.
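
For orientation, the drop-in content might look like the output of the sketch below. The exact file layout the agent writes is not documented here, so the `[Service]` / `Environment=` form is an assumption (it is standard systemd syntax), and `renderK3sAgentDropIn` is a hypothetical helper name.

```typescript
// Render a systemd drop-in that passes the join URL and token to k3s-agent.
// Assumption: values are injected through Environment= lines.
function renderK3sAgentDropIn(apiEndpoint: string, agentToken: string): string {
  // Reject characters that would break quoting or trigger systemd
  // specifier expansion inside an Environment= value.
  const unsafe = /["\\\n%]/;
  if (unsafe.test(apiEndpoint) || unsafe.test(agentToken)) {
    throw new Error("value contains characters unsafe for a systemd Environment= line");
  }
  return [
    "[Service]",
    `Environment="K3S_URL=${apiEndpoint}"`,
    `Environment="K3S_TOKEN=${agentToken}"`,
    "",
  ].join("\n");
}
```

Comparing the file on the node against the endpoint and token the platform reports is a quick way to spot a stale drop-in.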
Download the kubeconfig from the platform:
platform.kubernetes_get_kubeconfig({ cluster_id: "<id>" })
// → { kubeconfig: "apiVersion: v1...", api_endpoint: "https://[fd00::]:6443" }

Or via UI: /app/devops/kubernetes → cluster card → "kubeconfig" button.
# Save + use
echo "$KUBECONFIG_YAML" > ~/.kube/powernode-cluster.yaml
kubectl --kubeconfig ~/.kube/powernode-cluster.yaml get nodes

The kubectl traffic flows over the SDWAN overlay, so operators must be on the same SDWAN network or have a federation route.
Three operator vectors:
// 1. MCP — destroys managed host row + Vault TLS material
platform.system_decommission_docker_runtime({ host_id: "<host-id>" })
// 2. Unassign the docker-engine module from the Node
// Agent sees module gone, stops dockerd, posts phase=stopped
// Platform marks host status=disconnected (but keeps row)
// 3. Terminate the NodeInstance — cascade nullifies node_instance_id
// Host row goes orphan; operator can clean via MCP

platform.kubernetes_decommission_cluster({ cluster_id: "<id>" })
// → cascade-deletes all member KubernetesNode rows; underlying
// NodeInstances are NOT terminated. Operator must separately
// unassign k3s-server / k3s-agent modules to fully clean up.

The NodeInstance has no SDWAN peer with an assigned /128. Attach one via:
platform.system_sdwan_attach_peer({
network_id: "<sdwan-net-id>",
node_instance_id: "<instance-id>"
})

The agent has provisioned the host row but hasn't yet reported phase=ready. Common causes:
- `docker-ce` package install in progress (~30-60s on first run)
- TLS cert hasn't been written yet
- systemd unit fails to start (check via SSH: `journalctl -u docker.service -n 50`)
The bootstrap node has installed k3s but the agent hasn't captured + posted state yet. Common causes:
- K3s bootstrap takes ~30-60s; check that `/etc/rancher/k3s/k3s.yaml` exists on the node
- Server token file `/var/lib/rancher/k3s/server/node-token` not yet populated
- The agent's heartbeat tick interval (default 30s) hasn't fired yet
encrypted_kubeconfig is blank because the cluster is still bootstrapping. Wait for cluster_status: active before retrieving.
Symptoms: platform.docker_list_containers returns x509: certificate signed by unknown authority or tls: bad certificate. Common causes:
- Operator's local truststore isn't the issue: the platform manages mTLS internally; the API call from the platform to the daemon uses Vault-stored client certs.
- Server cert was rotated but the agent didn't pick it up: trigger the `system.runtime_docker_tls_rotate` skill (auto-approved), then re-test after one heartbeat tick.
- Server cert was minted by a different InternalCaService root than the platform now trusts (rare; only happens after CA replacement). Decommission via `system_decommission_docker_runtime` and re-provision; the new host gets a fresh cert from the current CA.
- On the node itself: `journalctl -u docker.service | grep -i tls` shows the actual handshake error.
Symptoms: system_provision_docker_runtime succeeds but daemon connections fail. The daemon.json hosts array should contain tcp://[<sdwan-/128>]:2376. If it doesn't:
- Check the agent's reconciler state cache: `cat /var/lib/powernode-agent/reconciler.json` on the node.
- Verify the SDWAN peer is up: `wg show wg-pn` on the node should show a recent `latest handshake`.
- `Sdwan::Peer.host_address` is the source of truth; confirm it via `platform.system_sdwan_get_peer({ id: '<peer-id>' })`.
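
The `hosts` check can be scripted against a copy of `/etc/docker/daemon.json`. A minimal sketch, assuming the file is plain JSON with an optional `hosts` array; the helper names are hypothetical, and the comparison is an exact string match, so the overlay address must be written the same way in both places.

```typescript
// Only the field this check cares about; daemon.json has many more keys.
interface DaemonJsonHosts {
  hosts?: unknown;
}

// Build the listener URL expected for a given SDWAN overlay address.
function overlayDockerEndpoint(overlayAddress: string): string {
  return `tcp://[${overlayAddress}]:2376`;
}

// True when daemon.json lists the overlay endpoint in its hosts array.
function daemonListensOnOverlay(daemonJsonText: string, overlayAddress: string): boolean {
  const parsed = JSON.parse(daemonJsonText) as DaemonJsonHosts;
  const hosts: unknown[] = Array.isArray(parsed.hosts) ? parsed.hosts : [];
  return hosts.indexOf(overlayDockerEndpoint(overlayAddress)) !== -1;
}
```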
Slice 10 introduced config-variety dockerd modules with per-node + per-instance overrides. If overrides aren't being picked up:
- Verify the override module is assigned to a higher-priority slot than the base `docker-engine` module; overrides require greater `effective_priority` per the dependant-modules pattern.
- After assignment, the agent re-renders `daemon.json` on its next reconcile tick (~30 s). To force an immediate re-render, run `systemctl restart powernode-agent` on the node.
- Layer ordering is visible in `cat /etc/docker/daemon.json` after reconcile: keys present in the override module win over the base.
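
The layering rule can be pictured as a merge ordered by priority. The sketch below is an illustration, not the agent's renderer: `ConfigLayer` and `mergeDaemonLayers` are hypothetical names, and the shallow, top-level-key merge is an assumption.

```typescript
// Hypothetical representation of one config-variety module's contribution.
interface ConfigLayer {
  name: string;
  effectivePriority: number;
  keys: Record<string, unknown>;
}

// Apply layers from lowest to highest priority so that keys from
// higher-priority layers overwrite keys from lower ones.
function mergeDaemonLayers(layers: ConfigLayer[]): Record<string, unknown> {
  const ordered = layers.slice().sort((a, b) => a.effectivePriority - b.effectivePriority);
  const merged: Record<string, unknown> = {};
  for (const layer of ordered) {
    for (const key of Object.keys(layer.keys)) {
      merged[key] = layer.keys[key];
    }
  }
  return merged;
}
```

If an override has a priority equal to or lower than the base module, its keys are applied first and then overwritten, which matches the symptom of overrides not being picked up.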
Symptoms: agent posts phase=join_request but cluster fails to add it; agent log shows K3S_URL connection refused or bad token.
- `api_endpoint` mismatch: the K3s api_endpoint uses an SDWAN VIP (slice 3). If the worker isn't on the same SDWAN network as the bootstrap node, the VIP is unreachable. Confirm with `system_sdwan_list_peers` that both peers are on the same network.
- Token mismatch (rare): the platform regenerated the join token but the agent has a stale cache. Force a re-fetch by removing the systemd drop-in `/etc/systemd/system/k3s-agent.service.d/override.conf` and restarting `powernode-agent`.
- Multi-cluster confusion: if `metadata.target_cluster_id` is set on the agent module assignment, validation rejects join requests for any other cluster ID. Set the right ID via `platform.system_assign_module_to_template` or remove the metadata to fall back to "join most recent active cluster."
To retrieve kubelet logs for a managed K3s node:
# Via SSH (node must be reachable on SDWAN /128)
journalctl -u k3s.service -n 200 # k3s-server
journalctl -u k3s-agent.service -n 200 # k3s-agent

Or via the agent task channel (no SSH required):
// ⚠️ aspirational — use system_provision_instance / system_terminate_instance and platform.recent_events for task progress
platform.system_execute_task({
node_instance_id: "...",
command: ["journalctl", "-u", "k3s-agent.service", "-n", "200"]
})

The output streams back through the worker API and lands in the operator dashboard task pane.
K3s clusters today support two CNI plugins, auto-selected from the bootstrap NodeInstance's network_profile and overridable explicitly. The KubernetesClusterProvisionerService enforces the choice + raises CniProfileMismatchError when an operator-supplied override would be incompatible with the host profile.
| `network_profile` | Auto-default CNI | Rationale |
|---|---|---|
| `heavyweight` (≥4 GB RAM, x86/Pi5/server-class) | `ovn_kubernetes` | Headroom for OVN-controller + OVN-K8s; supports encrypted pod-to-pod via OVN tunnels |
| `lightweight` (constrained / Pi-class) | `flannel` | Bundled with K3s; no OVN footprint |
| (anything else / unset) | `flannel` (`DEFAULT_CNI_PLUGIN`) | Conservative fallback |
Explicit operator override (passed at bootstrap via the cni_plugin argument):
platform.system_provision_instance({
node_id: "<bootstrap-node-id>",
// ... other args ...
// For K3s bootstrap, the provisioner reads cni_plugin from the
// template/operator hint and validates against network_profile.
})

If the operator-supplied cni_plugin is ovn_kubernetes and the bootstrap node has network_profile: "lightweight", the service raises CniProfileMismatchError rather than silently degrading. Same check runs at join time: a KubernetesNode whose profile mismatches the cluster's CNI is refused with the same error class. The valid cni_plugin set is the union of the values in NETWORK_PROFILE_TO_CNI plus DEFAULT_CNI_PLUGIN (i.e., flannel, ovn_kubernetes).
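
The selection and validation rules can be sketched as below. This is not the `KubernetesClusterProvisionerService` code: the function name is hypothetical, and only the documented mismatch (`ovn_kubernetes` on a `lightweight` profile) is rejected. How other combinations are treated is an assumption.

```typescript
type CniPluginName = "flannel" | "ovn_kubernetes";

const DEFAULT_CNI_PLUGIN: CniPluginName = "flannel";

class CniProfileMismatchError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "CniProfileMismatchError";
  }
}

// Resolve the CNI for a node: an explicit override wins if compatible,
// otherwise the profile decides, otherwise the conservative default.
function resolveCniPlugin(networkProfile: string | undefined, override?: CniPluginName): CniPluginName {
  if (override !== undefined) {
    if (override === "ovn_kubernetes" && networkProfile === "lightweight") {
      throw new CniProfileMismatchError("ovn_kubernetes is not allowed on a lightweight profile");
    }
    // Assumption: every other override/profile combination is accepted.
    return override;
  }
  if (networkProfile === "heavyweight") {
    return "ovn_kubernetes";
  }
  if (networkProfile === "lightweight") {
    return "flannel";
  }
  return DEFAULT_CNI_PLUGIN;
}
```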
Pod-to-pod encryption posture:
- `flannel` (default for lightweight): VXLAN over the host's primary NIC by default. Optional encryption via the SDWAN overlay: when the cluster's `Sdwan::Network` has `pod_subnet_prefix` set, the provisioner stamps the cluster's bootstrap config with `--flannel-iface=wg-sdwan-<handle>`, `--flannel-backend=host-gw`, and `--cluster-cidr=<pod_subnet_prefix>` so flannel runs in host-gw mode bound to the SDWAN WireGuard interface. Pod traffic between nodes then flows through the existing WireGuard tunnels via the AllowedIPs covering the SDWAN /64. There is no VXLAN encapsulation, no MTU computation, and no nested-header fragmentation: the kernel's per-node /24 routes (installed by flannel host-gw from the K8s API) point at the other node's overlay IP, which WireGuard already routes correctly.
- `ovn_kubernetes` (default for heavyweight): pod traffic flows over OVN tunnels with native encryption support; pairs cleanly with the SDWAN overlay for hub-to-hub paths. `pod_subnet_prefix` is ignored on ovn-K8s clusters (OVN owns its own pod-network layer); the provisioner emits a `system.cluster_bootstrap.pod_subnet_prefix_ignored` warning event when the field is set on an ovn-K8s path.
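
To make the flag stamping concrete, here is a minimal sketch of how the three k3s arguments could be derived. `k3sPodNetworkArgs` is a hypothetical helper; the flag names and the `wg-sdwan-<handle>` interface pattern are the ones listed above.

```typescript
// Build the extra k3s install args for pod traffic over the SDWAN overlay.
// Returns no args when the cluster uses ovn_kubernetes or when the network
// has no pod_subnet_prefix, in which case flannel keeps its defaults.
function k3sPodNetworkArgs(
  cniPlugin: "flannel" | "ovn_kubernetes",
  sdwanHandle: string,
  podSubnetPrefix: string | null,
): string[] {
  if (cniPlugin !== "flannel" || podSubnetPrefix === null) {
    return [];
  }
  return [
    `--flannel-iface=wg-sdwan-${sdwanHandle}`,
    "--flannel-backend=host-gw",
    `--cluster-cidr=${podSubnetPrefix}`,
  ];
}
```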
Operator workflow:
1. Declare the pod CIDR on the SDWAN network at create time (or update an existing network, but see the "Migration" note below):

       platform.system_sdwan_create_network({
         name: "tokyo-edge",
         pod_subnet_prefix: "10.42.0.0/16", // RFC1918 pod CIDR; flannel default size
         // other fields per usual
       })

   The platform validates that `pod_subnet_prefix` does NOT overlap (a) the SDWAN /64, (b) any peer's `lan_subnets`, (c) any VIP /128, (d) any other network's `pod_subnet_prefix` in the same account.

2. Bootstrap the k3s cluster with `cni_plugin: "flannel"` on a NodeInstance attached to the network. The provisioner detects the network's `pod_subnet_prefix`, stamps the cluster's `metadata["pod_cidr"]` + `metadata["sdwan_network_id"]`, and emits a `SubnetAdvertisement(source: "pod_subnet")` row.

3. The agent receives the new flannel args via the bootstrap_config endpoint on its next heartbeat:
   - `flannel_iface: "wg-sdwan-<handle>"`
   - `flannel_backend: "host-gw"`
   - `cluster_cidr: "<pod_subnet_prefix>"`

   The agent appends `--flannel-iface --flannel-backend=host-gw --cluster-cidr` to its k3s install args. Subsequent k3s installs (idempotent on content_hash) carry the new posture.
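
The overlap validation in step 1 can be checked ahead of time for the IPv4 cases (peer `lan_subnets` and other networks' `pod_subnet_prefix`). A minimal sketch with hypothetical helper names; it assumes both inputs are IPv4 CIDRs and does not cover the IPv6 /64 and /128 comparisons.

```typescript
// Parse an IPv4 CIDR into the inclusive numeric range it covers.
function ipv4CidrRange(cidr: string): { start: number; end: number } {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\/(\d{1,2})$/.exec(cidr);
  if (m === null) {
    throw new Error(`not an IPv4 CIDR: ${cidr}`);
  }
  const octets = [m[1], m[2], m[3], m[4]].map(Number);
  const prefix = Number(m[5]);
  if (octets.some((o) => o > 255) || prefix > 32) {
    throw new Error(`not an IPv4 CIDR: ${cidr}`);
  }
  // Plain arithmetic avoids 32-bit signed overflow from bitwise operators.
  const address = octets.reduce((acc, o) => acc * 256 + o, 0);
  const size = Math.pow(2, 32 - prefix);
  const start = Math.floor(address / size) * size;
  return { start, end: start + size - 1 };
}

// Two CIDRs overlap when their numeric ranges intersect.
function ipv4CidrsOverlap(a: string, b: string): boolean {
  const ra = ipv4CidrRange(a);
  const rb = ipv4CidrRange(b);
  return ra.start <= rb.end && rb.start <= ra.end;
}
```

For example, a peer advertising `10.42.5.0/24` as a LAN subnet would collide with the `10.42.0.0/16` pod CIDR used above, while `10.43.0.0/16` would not.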
Migration model for existing clusters: setting pod_subnet_prefix on a network that already has running flannel clusters triggers the agent's reconcile loop to re-install k3s with the new flags on next heartbeat. This causes a brief (~30–60 s per node) pod-network outage as k3s restarts. The operator's act of setting the field IS the explicit opt-in; the platform doesn't auto-migrate without operator action.
Routing topology trade-off:
| Network `routing_protocol` | Pod-traffic path |
|---|---|
| `static` (default; hub-and-spoke) | Spoke A → Hub → Spoke B (hub-hairpinned) |
| `ibgp` (slice 9c) | Spoke A → Spoke B directly (FRR-advertised /24s) |
iBGP is the recommended posture for clusters with non-trivial pod-to-pod traffic; static-mode hub-hairpinning works but adds a round-trip per pod packet.
Immutability: once a Devops::KubernetesCluster references the network, pod_subnet_prefix cannot be changed (k3s pod CIDR is immutable post-bootstrap — same constraint as cni_plugin). Destroy + rebuild the cluster to migrate to a different pod CIDR.
See USE_CASE_MATRIX.md use cases 9 + 10 for the operator-facing summary; runbooks/multi-cluster-k3s.md for per-tenant pod_subnet_prefix isolation patterns; tutorials/04-k3s-cluster.md §"Pod traffic over SDWAN" for the live operator walkthrough.
Symptoms: docker pull fails with timeout or connection refused.
- Most operators use a registry mirror co-located on the SDWAN. Configure via dependant module override:

      # daemon-json-override module
      hosts: ["tcp://[<sdwan-/128>]:2376", "unix:///var/run/docker.sock"]
      registry-mirrors: ["https://registry.<sdwan-domain>"]

- For pulls from `registry.example.com` (Powernode Gitea container registry), the agent injects credentials into `~/.docker/config.json` automatically when the node has a valid Vault token.
Backend (parent repo):
- `server/db/migrate/20260505000100_add_node_instance_to_devops_docker_hosts.rb`
- `server/db/migrate/20260505000200_create_kubernetes_cluster_management_tables.rb`
- `server/app/models/devops/docker_host.rb` (managed/external state machine)
- `server/app/models/devops/kubernetes_cluster.rb`, `kubernetes_node.rb`
- `server/app/services/ai/tools/docker_provisioning_tool.rb`
- `server/app/services/ai/tools/kubernetes_cluster_tool.rb`
- `server/app/services/ai/tools/kubernetes_provisioning_tool.rb`
- `server/app/controllers/api/v1/devops/kubernetes/clusters_controller.rb`
Backend (system extension):
- `extensions/system/server/app/services/system/docker_daemon_provisioner_service.rb`
- `extensions/system/server/app/services/system/kubernetes_cluster_provisioner_service.rb`
- `extensions/system/server/app/controllers/api/v1/system/node_api/runtime_controller.rb`
- `extensions/system/server/db/seeds/docker_runtime_module.rb`
- `extensions/system/server/db/seeds/k3s_modules.rb`
- `extensions/system/server/db/seeds/smoke_test_docker_runtime.rb` (live smoke)
- `extensions/system/server/db/seeds/smoke_test_k3s_runtime.rb` (live smoke)
Agent (Go):
- `extensions/system/agent/internal/dockerd/`: handshake.go, manager.go, applier.go, shell_applier.go, modules.go (Phase 1)
- `extensions/system/agent/internal/k3sd/`: handshake.go, server_manager.go, agent_manager.go, applier.go, shell_applier.go (Phase 2)
- `extensions/system/agent/internal/runtime/service.go`: chains both reconcilers into PostSend after sdwanMgr
Frontend:
- `frontend/src/features/devops/docker/pages/DockerHostsPage.tsx` (Managed badge)
- `frontend/src/features/devops/kubernetes/` (full hub)
- `frontend/src/pages/app/devops/KubernetesHubPage.tsx`

- `extensions/system/docs/SKILL_EXECUTORS.md`: `docker_provision` + `provision_cluster` skill executors
- `extensions/system/docs/ARCHITECTURE.md`: Container Runtimes subsystem entry
- `docs/platform/MCP_TOOL_CATALOG.md`: full action reference