10 changes: 5 additions & 5 deletions docs/cli/Guides/swarm-vllm-s3.md
@@ -64,13 +64,13 @@ In the Super Swarm dashboard, sign in using either Google (recommended) or MetaM

## 5. Provide access to the bucket

**5.1.** In Object Storage, click **Policy Rules**:
**5.1.** In **Object Storage**, click **Policy Rules**:

<img src={require('../images/swarm-object-storage-policy-rules.png').default} width="auto" height="auto" border="1"/>
<br/>
<br/>

**5.2.** Click **+Grant Access** in the top-right corner, select a Service Account, and click **Grant Access**:
**5.2.** Click **+Grant Access** in the top-right corner, select a **Service Account**, and click **Grant Access**:

<img src={require('../images/swarm-policy-rules-grant-access.png').default} width="auto" height="auto" border="1"/>
<br/>
@@ -119,8 +119,8 @@ Ensure `AWS_DEFAULT_REGION` matches the region in the **Connect Info**.

```shell
aws s3 sync ./qwen-1.5b s3://${S3_BUCKET}/models/qwen-1.5b/ \
--endpoint-url ${S3_ENDPOINT} \
--exclude ".cache/*"
--endpoint-url ${S3_ENDPOINT} \
--exclude ".cache/*"
```

**7.4.** Check if the model was uploaded successfully:
@@ -182,7 +182,7 @@ Back in the Super Swarm dashboard, go to **Ingresses** and check the hostname li
<br/>
<br/>

At your DNS provider, add a CNAME record pointing to the hostname and a TXT record for domain verification.
At your DNS provider, add a CNAME record pointing to the hostname and a TXT record for domain verification.

Ensure the statuses have changed to **Verified** and **Delegated**. This may take a couple of minutes.

8 changes: 4 additions & 4 deletions docs/cli/Guides/swarm-vllm.md
@@ -126,9 +126,9 @@ kubectl get ingress

Expected output:

- Two pods in `1/1 Running`
- Two services
- Two ingresses
- A pod in `1/1 Running`
- A service
- An ingress

## 9. Confirm DNS records

@@ -138,7 +138,7 @@ Back in the Super Swarm dashboard, go to **Ingresses** and note the two hostname
<br/>
<br/>

For each hostname, add a CNAME record pointing to it and a TXT record for domain verification at your DNS provider.
At your DNS provider, add a CNAME record pointing to the hostname and a TXT record for domain verification.

Back in the Super Swarm dashboard, ensure the statuses are **Verified** and **Delegated**. This may take a couple of minutes.

Binary file modified docs/cli/images/swarm-ingresses-s3-verified.png
Binary file modified docs/cli/images/swarm-ingresses-s3.png
Binary file modified docs/cli/images/swarm-ingresses-vllm-verified.png
Binary file modified docs/cli/images/swarm-ingresses-vllm.png
Binary file modified docs/cli/images/swarm-policy-rules-grant-access.png
111 changes: 111 additions & 0 deletions docs/fundamentals/swarm-certification.md
@@ -0,0 +1,111 @@
---
id: "swarm-certification"
title: "Super Swarm Certification System"
slug: "/swarm-certification"
sidebar_position: 20
---

When a node (a confidential virtual machine, VM) joins a Super Swarm network, it goes through a cryptographic onboarding process that establishes hardware-backed trust before participating in any cluster operations. This document explains how that process works: what certificates are generated, how they relate to one another, and what exactly is verified.

## swarm-db

One of the key components of Super Swarm is the distributed database `swarm-db`, which handles synchronization between nodes. It is encrypted using a `swarm-key`, which is randomly generated by the bootstrap node at startup and does not change.

The key distribution process completes before the services (`swarm-db`, `swarm-cloud`, etc.) start. Trust between running nodes is established using a Public Key Infrastructure (PKI).

## Node types

Every Super Swarm network starts with one node in a special mode: the *bootstrap node*. Its configuration has no peer addresses to connect to. The bootstrap node serves as the initial source of trust and does not verify itself against any other node.

All subsequent nodes are *worker nodes*. Their configuration includes the network ID, the address of the bootstrap node (or any PKI-capable node already in the network), and the root CA certificate. These parameters allow a worker node to locate and authenticate the network it is joining before committing to it.

Once enough worker nodes have joined, the bootstrap node is no longer special and becomes effectively equal to the others.

## Network modes

Super Swarm supports two network modes:

- `trusted`: The network admits only VMs with hardware confidentiality enabled (currently, Intel TDX or AMD SEV-SNP). The VM's hardware measurements must be present in a registry of known-good values (currently hosted on GitHub). All core components of the VM stack, such as kernel parameters, firmware, and similar settings, must not differ from what is registered. Otherwise, the measurements will not match, and the node will be rejected. Additionally, if a connected GPU has debug mode (`dbgStat`) enabled, the VM is considered untrusted.
- `untrusted`: Any VM can join, with or without hardware confidentiality support. Measurements are not checked. This mode exists for development and debugging only and should never be used in production.

The network mode is recorded in the root certificate, so any connecting node can inspect it and refuse to join an untrusted network if configured to do so.
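
The join decision described above can be sketched as follows. This is a minimal illustration, not the actual implementation: `RootCertInfo`, `should_join`, and the `allow_untrusted` option are hypothetical names, and the sketch assumes the network type has already been parsed out of the root certificate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RootCertInfo:
    """Hypothetical, already-parsed view of a network's root certificate."""
    network_type: str  # "trusted" or "untrusted", as recorded in the root certificate


def should_join(root: RootCertInfo, allow_untrusted: bool = False) -> bool:
    """Decide whether this node is willing to join the network.

    `allow_untrusted` is an assumed local configuration switch; by default
    the node refuses to join an untrusted network.
    """
    if root.network_type == "trusted":
        return True
    if root.network_type == "untrusted":
        return allow_untrusted
    # Unknown values are rejected rather than guessed at.
    return False
```

Defaulting `allow_untrusted` to `False` reflects the guidance that untrusted networks are for development and debugging only.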

## Certificate architecture

The bootstrap node generates the entire certificate hierarchy for the network at startup. There are two parallel chains, one built on RSA cryptography (*Basic*) and one on ECDSA elliptic curves (*Lite*). Each chain has the same structure:

Root CA → Subroot (CA operations) → VM certificate<br/>
Root CA → Subroot (Evidence signing)

This produces the following set of certificates on first boot:

- 2 root certificates (RSA/Basic and ECDSA/Lite)
- 4 subroot certificates (two per chain: one for the CA itself, one for workload evidence signing)
- 1 VM certificate (Basic) for the bootstrap node itself

All of these are generated before the `pki-authority` service starts.
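
As a quick sanity check on the inventory above, the first-boot certificate set can be enumerated in a few lines. The function name and the label strings are illustrative only; the counts come from the list above.

```python
from itertools import product

# The two parallel chains and the two subroot roles named in this document.
CHAINS = ("Basic (RSA)", "Lite (ECDSA)")
SUBROOT_ROLES = ("CA operations", "Evidence signing")


def first_boot_certificates() -> list[tuple[str, str]]:
    """Return (certificate kind, chain) pairs generated on first boot."""
    # One root per chain.
    certs = [("root", chain) for chain in CHAINS]
    # Two subroots per chain: one per role.
    certs += [
        (f"subroot ({role})", chain)
        for chain, role in product(CHAINS, SUBROOT_ROLES)
    ]
    # A single VM certificate, on the Basic chain, for the bootstrap node itself.
    certs.append(("vm (bootstrap node)", CHAINS[0]))
    return certs
```

Calling `first_boot_certificates()` yields seven entries: two roots, four subroots, and one VM certificate.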

### Root CA certificate (Basic)

The RSA root certificate carries several non-standard extensions:

- Confidential environment type: TDX / SEV-SNP / untrusted, etc.; `OID 1.3.6.1.3.8888.1.1`
- Network type: trusted/untrusted; `OID 1.3.6.1.3.8888.4`
- Hardware report: `OID 0.6.9.42.840.113741.1337.6`

This is the most important certificate in the system. Everything else chains to it. Any external party verifying a VM certificate, an evidence signature, or a TLS connection to a published service must ultimately anchor trust in this certificate. The root certificate is intended to be public for any Swarm deployment.

### Subroot certificates

There are two subroot certificates per chain. The first is used by the CA itself to sign VM certificates. The second (*Subroot Evidence Certificate*) is used exclusively for signing Deployment Evidence—runtime reports attached to published workloads.

Subroot certificates share the hardware report and environment type fields with the root, but add the VM measurements (`mrenclave`, `OID 1.3.6.1.3.8888.1.2`) and omit the network type field. Subroot certificates are planned for monthly renewal.

### VM certificate

Each node receives a VM certificate as part of its onboarding. This certificate is signed by the CA subroot and contains a hardware report in which the node's public key is embedded in the `report data` field. This is the attested TLS (aTLS) pattern: the hardware signs the public key, so the CA can verify not just that the certificate is cryptographically valid, but also that the corresponding private key is held inside a TEE.
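
The key-binding check can be sketched as below. The exact encoding of the key inside `report data` is not specified here, so the sketch assumes the field holds a SHA-512 digest of the DER-encoded public key (a 64-byte value). Both function names are hypothetical, and verifying the hardware signature over the report itself is out of scope.

```python
import hashlib
import hmac


def expected_report_data(public_key_der: bytes) -> bytes:
    """Assumed encoding: SHA-512 digest of the DER-encoded public key."""
    return hashlib.sha512(public_key_der).digest()


def key_is_bound_to_report(public_key_der: bytes, report_data: bytes) -> bool:
    """Check that the report's `report data` field commits to this public key.

    This covers only the binding step; the report's own signature must be
    verified separately before the result means anything.
    """
    expected = expected_report_data(public_key_der)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(expected, report_data)
```

If the binding holds and the report signature is valid, a certificate carrying that public key can be tied to the attested hardware.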

## Worker node onboarding

When a worker node starts, a dedicated service (`pki-authority-sync`) runs before any Swarm services come up. Its job is to provision the node with everything it needs to participate in the network.

### Phase 1: Obtaining a VM certificate

<img src={require('./images/swarm-certification-phase1.png').default} width="auto" height="auto"/>
<br/>
<br/>

The worker node generates a hardware attestation report. The report embeds the node's public key in the `report data` field. The node sends this report as a certificate signing request to the `pki-authority` service on the bootstrap node or any reachable CA node.

The CA performs the following checks:

- Network type: Does the request comply with the network mode (`trusted`/`untrusted`)?
- Hardware report integrity: Is the attestation report cryptographically valid?
- Measurements validation: Are the VM's `mrenclave` values present in the trusted registry? (`trusted` mode only)
- GPU state: Are any connected GPUs running in debug mode? (`trusted` mode only)

If all checks pass, the CA issues a VM certificate with the `validated` flag (`OID 1.3.6.1.3.8888.1.6`) set and returns it to the requesting node. The certificate and private key are saved locally on the worker node.
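
The four checks can be expressed as a single decision function. This is a sketch under stated assumptions, not the `pki-authority` code: `JoinRequest` and `failed_checks` are hypothetical names, the request is assumed to be pre-parsed, and the report integrity check is assumed to apply whenever a hardware report is present.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JoinRequest:
    """Hypothetical summary of what the CA extracted from a signing request."""
    report_valid: Optional[bool]  # None: no hardware report (VM without a TEE)
    measurement: str = ""         # stand-in for the VM's `mrenclave` value
    gpu_debug: bool = False       # True if any connected GPU has debug mode enabled


def failed_checks(req: JoinRequest, mode: str, registry: frozenset) -> list:
    """Return the names of the failed checks; an empty list means 'issue'."""
    failures = []
    # Network type: a trusted network only admits VMs that present a hardware report.
    if mode == "trusted" and req.report_valid is None:
        failures.append("network type")
    # Hardware report integrity: a report that is present must verify.
    if req.report_valid is False:
        failures.append("hardware report integrity")
    # The remaining two checks apply in trusted mode only.
    if mode == "trusted":
        if req.measurement not in registry:
            failures.append("measurements")
        if req.gpu_debug:
            failures.append("GPU state")
    return failures
```

Returning the list of failures instead of a bare boolean makes rejections easier to log and diagnose.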

### Phase 2: Receiving secrets

<img src={require('./images/swarm-certification-phase2.png').default} width="auto" height="auto"/>
<br/>
<br/>

With a valid VM certificate in hand, the worker node connects back over mTLS, presenting the certificate it just received. The CA verifies:

- The certificate chain is cryptographically valid (chains back to the root CA).
- The `validated` flag is present.

If both conditions are met, the CA provides the worker node with:

- The `swarm-db` encryption key (`swarm-key`).
- Private keys and certificates already in the network.

In the current architecture, all nodes in a Swarm network share the same root and subroot private keys. Worker nodes retrieve the full set through the `pki-authority-sync` process described above, and from that point hold identical copies.
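
The Phase 2 gate (a valid chain plus the `validated` flag) can be modeled structurally as below. This is a simplified sketch: `Cert`, `chains_to_root`, and `may_receive_secrets` are hypothetical names, and signature verification is replaced by a plain issuer-name lookup, so the code shows only the decision logic and performs no cryptographic validation.

```python
from dataclasses import dataclass

# OID of the `validated` flag, as given in Phase 1.
VALIDATED_OID = "1.3.6.1.3.8888.1.6"


@dataclass(frozen=True)
class Cert:
    """Toy certificate: names only, no keys or signatures."""
    subject: str
    issuer: str
    extensions: frozenset = frozenset()


def chains_to_root(leaf: Cert, pool: dict, root_subject: str) -> bool:
    """Follow issuer links from `leaf` until the root is reached."""
    seen = set()
    current = leaf
    while current.subject != root_subject:
        if current.subject in seen:
            return False  # issuer loop that never reaches the root
        seen.add(current.subject)
        parent = pool.get(current.issuer)
        if parent is None:
            return False  # issuer is not known to this CA
        current = parent
    return True


def may_receive_secrets(leaf: Cert, pool: dict, root_subject: str) -> bool:
    """Both conditions must hold before any secret is released."""
    return chains_to_root(leaf, pool, root_subject) and VALIDATED_OID in leaf.extensions
```

In this model, a certificate that chains correctly but lacks the `validated` flag is refused, matching the two conditions listed above.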

### After onboarding

Once the node has received these secrets, it initializes `swarm-db` using the received key, syncs with the rest of the cluster, and becomes a full peer. At this point, it can also act as a CA for subsequent nodes: it holds the same root certificates and can issue VM certificates. It can also provide secrets to nodes that connect to it rather than to the bootstrap node.

The certificate chains issued by the bootstrap node and by a worker node are identical in structure. All chains terminate at the same root CA certificate.