This guide covers deploying dstack on bare metal TDX hosts.
dstack can be deployed in two ways:
- Dev Deployment: All components run directly on the host. For local development and testing only - no security guarantees.
- Production Deployment: KMS and Gateway run as CVMs with hardware-rooted security. Uses auth server for authorization and OS image whitelisting. Required for any deployment where security matters.
Hardware:
- Bare metal TDX server (setup guide)
- At least 16GB RAM, 100GB free disk space
- Public IPv4 address
- Optional: NVIDIA H100 or Blackwell GPU for Confidential Computing workloads
Network:
- Domain with DNS access (for Gateway TLS)
Note: See Hardware Requirements for server recommendations.
This approach runs all components directly on the host for local development and testing.
Warning: Dev deployment uses KMS in dev mode with no security guarantees. Do NOT use for production.
# Ubuntu 24.04
sudo apt install -y build-essential chrpath diffstat lz4 wireguard-tools xorriso
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

git clone https://github.com/Dstack-TEE/meta-dstack.git --recursive
cd meta-dstack/
mkdir build && cd build
../build.sh hostcfg

Edit the generated build-config.sh for your environment. The minimal required changes are:
| Variable | Description |
|---|---|
| KMS_DOMAIN | DNS domain for KMS RPC (e.g., kms.example.com) |
| GATEWAY_DOMAIN | DNS domain for Gateway RPC (e.g., gateway.example.com) |
| GATEWAY_PUBLIC_DOMAIN | Public base domain for app routing (e.g., apps.example.com) |
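As an illustration, the three required variables might look like this in build-config.sh. The domains are the placeholder examples from the table above, not real defaults, so substitute your own.

```shell
# build-config.sh (excerpt); example values only, replace with your own domains
KMS_DOMAIN=kms.example.com
GATEWAY_DOMAIN=gateway.example.com
GATEWAY_PUBLIC_DOMAIN=apps.example.com
```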
TLS Certificates:
The Gateway requires TLS certificates. Configure Certbot with Cloudflare:
CERTBOT_ENABLED=true
CF_API_TOKEN=<your-cloudflare-token>

The certificates will be obtained automatically via ACME DNS-01 challenge. The KMS auto-generates its own certificates during bootstrap.
Other variables like ports and CID pool settings have sensible defaults.
vim ./build-config.sh
../build.sh hostcfg
../build.sh dl 0.5.5

Start in separate terminals:
- KMS: ./dstack-kms -c kms.toml
- Gateway: sudo ./dstack-gateway -c gateway.toml
- VMM: ./dstack-vmm -c vmm.toml
Note: This deployment uses KMS in dev mode without an auth server. For production deployments with proper security, see Production Deployment below.
For production, deploy KMS and Gateway as CVMs with hardware-rooted security. Production deployments require:
- KMS running in a CVM (not on the host)
- Auth server for authorization (webhook mode)
Required:
- Set up TDX host with dstack-vmm
- Deploy KMS as CVM (with auth server)
- Deploy Gateway as CVM
Optional Add-ons:
- Zero Trust HTTPS
- Certificate Transparency monitoring
- Multi-node deployment
- On-chain governance - Smart contract-based authorization
Clone and build dstack-vmm:
git clone https://github.com/Dstack-TEE/dstack
cd dstack
cargo build --release -p dstack-vmm -p supervisor
mkdir -p vmm-data
cp target/release/dstack-vmm vmm-data/
cp target/release/supervisor vmm-data/
cd vmm-data/

Create vmm.toml:
address = "tcp:0.0.0.0:9080"
reuse = true
image_path = "./images"
run_path = "./run/vm"
[cvm]
kms_urls = []
gateway_urls = []
cid_start = 30000
cid_pool_size = 1000
[cvm.port_mapping]
enabled = true
address = "127.0.0.1"
range = [
{ protocol = "tcp", from = 1, to = 20000 },
{ protocol = "udp", from = 1, to = 20000 },
]
[host_api]
address = "vsock:2"
port = 10000

Download guest images from meta-dstack releases and extract to ./images/.
For reproducible builds and verification, see the Security Model.
Start VMM:
./dstack-vmm -c vmm.toml

Production KMS requires:
- KMS: The key management service inside a CVM
- Auth server: Webhook server that validates boot requests and returns authorization decisions
| Server | Use Case | Configuration |
|---|---|---|
| auth-simple | Config-file-based whitelisting | JSON config file |
| auth-eth | On-chain governance via smart contracts | Ethereum RPC + contract |
| Custom | Your own authorization logic | Implement webhook interface |
All auth servers implement the same webhook interface:
- GET / - Health check
- POST /bootAuth/app - App boot authorization
- POST /bootAuth/kms - KMS boot authorization
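Before pointing KMS at an auth server, it can help to confirm that the health route answers. The sketch below assumes the server is reachable at the base URL you pass in (port 3001 is only an example). The helper names auth_endpoints and auth_health are hypothetical, and the request bodies for the two POST routes are left out because they are not specified here.

```shell
# Hypothetical helpers for poking at an auth server's webhook interface.

# Print the three webhook routes for a given base URL.
auth_endpoints() {
  local base="${1%/}"            # drop a trailing slash, if any
  printf 'GET %s/\n' "$base"
  printf 'POST %s/bootAuth/app\n' "$base"
  printf 'POST %s/bootAuth/kms\n' "$base"
}

# Return 0 if the health check (GET /) answers with a successful status.
auth_health() {
  local base="${1%/}"
  curl -fsS -o /dev/null "$base/"
}
```

For example, `auth_health http://127.0.0.1:3001 && echo "auth server is up"` gives a quick yes/no answer before you continue with the KMS deployment.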
auth-simple validates boot requests against a JSON config file.
Create auth-config.json for initial KMS deployment:
{
"osImages": ["0x<os-image-hash>"],
"kms": { "allowAnyDevice": true },
"apps": {}
}

Run auth-simple:
cd kms/auth-simple
bun install
PORT=3001 AUTH_CONFIG_PATH=/path/to/auth-config.json bun run start

For adding Gateway, apps, and other config fields, see auth-simple Operations Guide.
For decentralized governance via smart contracts, see On-Chain Governance.
The OS image hash is in the digest.txt file inside the guest image tarball:
# Extract hash from release tarball
tar -xzf dstack-0.5.5.tar.gz
cat dstack-0.5.5/digest.txt
# Output: 0b327bcd642788b0517de3ff46d31ebd3847b6c64ea40bacde268bb9f1c8ec83

Add 0x prefix for auth-simple config: 0x0b327bcd...
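The prefixing step can be scripted so the value is ready to paste into auth-config.json. This is a minimal sketch: os_image_hash_0x is a hypothetical helper name, and it assumes digest.txt contains only the hex digest, as in the output shown above.

```shell
# Hypothetical helper: print the digest stored in a digest.txt file
# with the 0x prefix that auth-simple expects.
os_image_hash_0x() {
  local digest
  digest="$(tr -d '[:space:]' < "$1")"   # strip the trailing newline
  printf '0x%s\n' "${digest#0x}"         # avoid a double prefix
}
```

Usage: `os_image_hash_0x dstack-0.5.5/digest.txt`.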
Choose the deployment script based on your auth server:
For auth-simple (external webhook):
auth-simple runs on your infrastructure, outside the CVM.
cd dstack/kms/dstack-app/

Edit .env.simple:
VMM_RPC=http://127.0.0.1:9080
AUTH_WEBHOOK_URL=http://your-auth-server:3001
KMS_RPC_ADDR=0.0.0.0:9201
GUEST_AGENT_ADDR=127.0.0.1:9205
OS_IMAGE=dstack-0.5.5
IMAGE_DOWNLOAD_URL=https://github.com/Dstack-TEE/meta-dstack/releases/download/v0.5.5/dstack-0.5.5.tar.gz

Then run:
./deploy-simple.sh

For auth-eth (on-chain governance):
See On-Chain Governance Guide for deploying KMS with smart contract-based authorization.
Monitor startup:
tail -f ../../vmm-data/run/vm/<vm-id>/serial.log

Wait for [ OK ] Finished App Compose Service.
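If you would rather script the wait than watch the log, a polling loop along these lines may help. It is only a sketch: wait_for_compose is a hypothetical helper, the 300-second default timeout is an arbitrary choice, and it matches only the "Finished App Compose Service" text so that any formatting around "[ OK ]" in the serial log does not matter.

```shell
# Hypothetical helper: wait until the serial log reports that the
# App Compose service has finished, or give up after a timeout (seconds).
wait_for_compose() {
  local log="$1" timeout="${2:-300}" waited=0
  until grep -q 'Finished App Compose Service' "$log" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1                  # timed out
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

Usage: `wait_for_compose ../../vmm-data/run/vm/<vm-id>/serial.log 600 && echo "KMS CVM is up"`.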
Open http://127.0.0.1:9201/ in your browser.
- Click Bootstrap
- Enter the domain for your KMS (e.g., kms.example.com)
- Click Finish setup
The KMS will display its public key and TDX quote.
Before deploying Gateway:
- Register the Gateway app in your auth server config (add to the apps section in auth-config.json)
- Note the App ID you assign - you'll need it for the .env file
For on-chain governance, see On-Chain Governance for registration steps.
cd dstack/gateway/dstack-app/
./deploy-to-vmm.sh

Edit .env with required variables:
# VMM connection (use TCP if VMM is on same host, or remote URL)
VMM_RPC=http://127.0.0.1:9080
# Cloudflare (for DNS-01 ACME challenge)
CF_API_TOKEN=your_cloudflare_api_token
# Domain configuration
SRV_DOMAIN=example.com
PUBLIC_IP=$(curl -s ifconfig.me)
# Gateway app ID (from registration above)
GATEWAY_APP_ID=32467b43BFa67273FC7dDda0999Ee9A12F2AaA08
# Gateway URLs
MY_URL=https://gateway.example.com:9202
BOOTNODE_URL=https://gateway.example.com:9202
# WireGuard (uses same port as RPC)
WG_ADDR=0.0.0.0:9202
# Network settings
SUBNET_INDEX=0
ACME_STAGING=no # Set to 'yes' for testing
OS_IMAGE=dstack-0.5.5

Note on hex formats:
- Gateway .env file: Use raw hex without 0x prefix (e.g., GATEWAY_APP_ID=32467b43...)
- auth-simple config: Use 0x prefix (e.g., "0x32467b43..."). The server normalizes both formats.
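When the same App ID is copied between the two files, the prefix is easy to get wrong. The sketch below shows one way to produce the raw form for the Gateway .env file; strip_0x is a hypothetical helper name, not part of dstack.

```shell
# Hypothetical helper: drop an optional 0x/0X prefix so the value
# matches the raw hex form used in the Gateway .env file.
strip_0x() {
  local v="$1"
  v="${v#0x}"
  v="${v#0X}"
  printf '%s\n' "$v"
}
```

Usage: `GATEWAY_APP_ID=$(strip_0x 0x32467b43BFa67273FC7dDda0999Ee9A12F2AaA08)`. A value that has no prefix passes through unchanged.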
Run the script again:
./deploy-to-vmm.sh

The script will display the compose file and compose hash, then prompt for confirmation:
Docker compose file:
...
Compose hash: 0x700a50336df7c07c82457b116e144f526c29f6d8...
Configuration:
...
Continue? [y/N]
Before pressing 'y', add the compose hash to your auth server whitelist:
- For auth-simple: Add to composeHashes array in auth-config.json
- For auth-eth: Use app:add-hash (see On-Chain Governance)
Then return to the first terminal and press 'y' to deploy.
After Gateway is running, update vmm.toml with KMS and Gateway URLs:
[cvm]
kms_urls = ["https://kms.example.com:9201"]
gateway_urls = ["https://gateway.example.com:9202"]

Restart dstack-vmm to apply changes.
Generate TLS certificates inside the TEE with automatic CAA record management.
Configure in build-config.sh:
GATEWAY_CERT=${CERTBOT_WORKDIR}/live/cert.pem
GATEWAY_KEY=${CERTBOT_WORKDIR}/live/key.pem
CF_API_TOKEN=<your-cloudflare-token>
ACME_URL=https://acme-v02.api.letsencrypt.org/directory

Run certbot:
RUST_LOG=info,certbot=debug ./certbot renew -c certbot.toml

This will:
- Create an ACME account
- Set CAA DNS records on Cloudflare
- Request and auto-renew certificates
Monitor for unauthorized certificates issued to your domain.
cargo build --release -p ct_monitor
./target/release/ct_monitor \
--gateway-uri https://<gateway-domain> \
--domain <your-domain>

How it works:
- Fetches known public keys from Gateway (/acme-info endpoint)
- Queries crt.sh for certificates issued to your domain
- Verifies each certificate's public key matches the known keys
- Logs errors (❌) when certificates are issued to unknown public keys
The monitor runs in a loop, checking every 60 seconds. Integrate with your alerting system by monitoring stderr for error messages.
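One way to wire the monitor into alerting is to filter its stderr for the ❌ marker and forward only those lines. The sketch below is an assumption-laden example: ct_errors_only is a hypothetical helper, and logger is just one possible sink; swap in whatever your alerting system ingests.

```shell
# Hypothetical filter: read log lines on stdin and pass through only
# those that contain the error marker used by ct_monitor.
ct_errors_only() {
  local line
  while IFS= read -r line; do
    case "$line" in
      *❌*) printf '%s\n' "$line" ;;
    esac
  done
}

# Example wiring (not executed here): send stderr into the filter,
# discard stdout, and forward each error line to syslog.
#   ./target/release/ct_monitor --gateway-uri https://<gateway-domain> \
#     --domain <your-domain> 2>&1 >/dev/null \
#     | ct_errors_only \
#     | while IFS= read -r l; do logger -t ct_monitor "$l"; done
```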
Scale by adding VMM nodes and KMS replicas for high availability.
On each additional TDX host:
- Set up dstack-vmm (see step 1)
- Configure vmm.toml with existing KMS/Gateway URLs
- Start VMM
[cvm]
kms_urls = ["https://kms.example.com:9201"]
gateway_urls = ["https://gateway.example.com:9202"]

Additional KMS instances can onboard from an existing KMS to share the same root keys. This enables:
- High availability (multiple KMS nodes)
- Geographic distribution
- Load balancing
How it works:
- New KMS starts in onboard mode (empty auto_bootstrap_domain)
- New KMS calls GetTempCaCert on source KMS
- New KMS generates RA-TLS certificate with TDX quote
- New KMS calls GetKmsKey with mTLS authentication
- Source KMS verifies attestation via bootAuth/kms webhook
- If approved, source KMS returns root keys
- Both KMS instances now derive identical keys
Configure new KMS for onboarding:
[core.onboard]
enabled = true
auto_bootstrap_domain = "" # Empty = onboard mode
quote_enabled = true # Require TDX attestation
address = "0.0.0.0"
port = 9203                     # HTTP port for onboard UI

Trigger onboard via API:
curl -X POST "http://<new-kms>:9203/prpc/Onboard.Onboard?json" \
-H "Content-Type: application/json" \
-d '{"source_url": "https://<existing-kms>:9201/prpc", "domain": "kms2.example.com"}'

Finish and restart:
curl http://<new-kms>:9203/finish
# Restart KMS - it will now serve as a full KMS with shared keys

Note: For KMS onboarding with quote_enabled = true, add the KMS mrAggregated hash to your auth server's kms.mrAggregated whitelist.
After setup, deploy apps via the VMM dashboard or CLI.
Before deploying, register your app in your auth server:
- For auth-simple: See auth-simple Operations Guide
- For auth-eth: See On-Chain Governance
Open http://localhost:9080:
- Select the OS image
- Enter the App ID (from registration above)
- Upload your docker-compose.yaml
After startup, click Dashboard to view:
The CID range conflicts with existing VMs.
- Find used CIDs: ps aux | grep 'guest-cid='
- Update vmm.toml:
[cvm]
cid_start = 33000
cid_pool_size = 1000
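To pick a non-conflicting cid_start, it helps to see the CIDs in use as a sorted list rather than raw process lines. The following is a small sketch; used_cids is a hypothetical helper, and it assumes the CID appears on the command line as guest-cid=<number>, as in the grep above.

```shell
# Hypothetical helper: read a process listing on stdin and print the
# guest CIDs in use, one per line, sorted numerically without duplicates.
used_cids() {
  grep -o 'guest-cid=[0-9][0-9]*' | cut -d= -f2 | sort -n -u
}
```

Usage: `ps aux | used_cids | tail -n 1` prints the highest CID currently in use; choose a cid_start above it.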
When running Gateway with many concurrent connections (>100K), the host's conntrack table may fill up, causing silent packet drops:
dmesg: nf_conntrack: table full, dropping packet
Each proxied connection creates multiple conntrack entries (client→gateway, gateway→WireGuard→backend). The default nf_conntrack_max (typically 262,144) is insufficient for high-concurrency gateways.
Fix:
# Check current limit
sysctl net.netfilter.nf_conntrack_max
# Increase for production (persistent)
echo "net.netfilter.nf_conntrack_max = 1048576" | sudo tee -a /etc/sysctl.d/99-dstack.conf
echo "net.netfilter.nf_conntrack_buckets = 262144" | sudo tee -a /etc/sysctl.d/99-dstack.conf
sudo sysctl -p /etc/sysctl.d/99-dstack.conf

Also increase inside bridge-mode CVMs if they handle many connections:
sysctl -w net.netfilter.nf_conntrack_max=524288

Sizing rule of thumb: Set nf_conntrack_max to at least 4× your target concurrent connection count (each connection may use 2-3 conntrack entries across NAT/bridge layers).
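The rule of thumb is simple arithmetic, sketched here as two hypothetical helpers. conntrack_max_for applies the 4× multiplier; conntrack_buckets_for divides by four, which only mirrors the ratio between the two values in the example above and is an assumption rather than a documented requirement.

```shell
# Hypothetical helper: suggested nf_conntrack_max for a target number
# of concurrent connections, using the 4x rule of thumb.
conntrack_max_for() {
  echo $(( $1 * 4 ))
}

# Hypothetical helper: bucket count at one quarter of the table size,
# matching the ratio in the example values (1048576 / 262144).
conntrack_buckets_for() {
  echo $(( $1 / 4 ))
}
```

For a target of 250,000 concurrent connections, `conntrack_max_for 250000` prints 1000000, which is in line with the 1048576 used in the fix above.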
Ubuntu 23.10+ restricts unprivileged user namespaces:
sudo sysctl kernel.apparmor_restrict_unprivileged_userns=0


