51 changes: 42 additions & 9 deletions README.md
@@ -2,21 +2,23 @@

Validated pattern for deploying confidential containers on OpenShift using the [Validated Patterns](https://validatedpatterns.io/) framework.

Confidential containers use hardware-backed Trusted Execution Environments (TEEs) to isolate workloads from cluster and hypervisor administrators. This pattern deploys and configures the Red Hat CoCo stack — including the sandboxed containers operator, Trustee (Key Broker Service), and peer-pod infrastructure — on Azure.
Confidential containers use hardware-backed Trusted Execution Environments (TEEs) to isolate workloads from cluster and hypervisor administrators. This pattern deploys and configures the Red Hat CoCo stack — including the sandboxed containers operator, Trustee (Key Broker Service), and peer-pod infrastructure — on Azure and bare metal.

## Topologies

The pattern provides two deployment topologies:
The pattern provides three deployment topologies:

1. **Single cluster** (`simple` clusterGroup) — deploys all components (Trustee, Vault, ACM, sandboxed containers, workloads) in one cluster. This breaks the RACI separation expected in a remote attestation architecture but simplifies testing and demonstrations.
1. **Single cluster** (`simple` clusterGroup) — deploys all components (Trustee, Vault, ACM, sandboxed containers, workloads) in one cluster on Azure. This breaks the RACI separation expected in a remote attestation architecture but simplifies testing and demonstrations.

2. **Multi-cluster** (`trusted-hub` + `spoke` clusterGroups) — separates the trusted zone from the untrusted workload zone:
   - **Hub** (`trusted-hub`): Runs Trustee (KBS + attestation service), HashiCorp Vault, ACM, and cert-manager. This cluster is the trust anchor.
   - **Spoke** (`spoke`): Runs the sandboxed containers operator and confidential workloads. The spoke is imported into ACM and managed from the hub.

3. **Bare metal** (`baremetal` clusterGroup) — deploys all components on bare metal hardware with Intel TDX or AMD SEV-SNP support. NFD (Node Feature Discovery) detects which TEE the hardware provides, and the matching Kata runtime is configured. Supports SNO (Single Node OpenShift) and multi-node clusters.

The topology is controlled by the `main.clusterGroupName` field in `values-global.yaml`.
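For example, switching an existing checkout to the bare metal topology is a one-line edit (illustrative snippet; the real `values-global.yaml` carries more fields than shown here):

```shell
# Illustrative only: the real values-global.yaml contains additional fields.
cat > /tmp/values-global.yaml <<'EOF'
main:
  clusterGroupName: simple
EOF

# Flip the topology to the bare metal clusterGroup.
sed -i 's/^\(  clusterGroupName:\).*/\1 baremetal/' /tmp/values-global.yaml
grep 'clusterGroupName' /tmp/values-global.yaml
```

Commit and push the change to your fork afterwards, since ArgoCD reconciles against the remote.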

Currently supports Azure via peer-pods. Peer-pods provision confidential VMs (`Standard_DCas_v5` family) directly on the Azure hypervisor rather than nesting VMs inside worker nodes.
Azure deployments use peer-pods, which provision confidential VMs (`Standard_DCas_v5` family) directly on the Azure hypervisor. Bare metal deployments use layered images and hardware TEE features directly.

## Current version (4.*)

@@ -42,9 +44,21 @@ All previous versions used pre-GA (Technology Preview) releases of Trustee:

### Prerequisites

**Azure deployments:**

- OpenShift 4.17+ cluster on Azure (self-managed via `openshift-install` or ARO)
- Azure `Standard_DCas_v5` VM quota in your target region (these are confidential computing VMs and are not available in all regions). See the note below for more details.
- Azure DNS hosting the cluster's DNS zone

**Bare metal deployments:**

- OpenShift 4.17+ cluster on bare metal with Intel TDX or AMD SEV-SNP hardware
- BIOS/firmware configured to enable TDX or SEV-SNP
- Available block devices for LVMS storage (auto-discovered)
- For Intel TDX: an Intel PCS API key from [api.portal.trustedservices.intel.com](https://api.portal.trustedservices.intel.com/)

**Common:**

- Tools on your workstation: `podman`, `yq`, `jq`, `skopeo`
- OpenShift pull secret saved at `~/pull-secret.json` (download from [console.redhat.com](https://console.redhat.com/openshift/downloads))
- Fork the repository — ArgoCD reconciles cluster state against your fork, so changes must be pushed to your remote
@@ -53,20 +67,20 @@ All previous versions used pre-GA (Technology Preview) releases of Trustee:

These scripts generate the cryptographic material and attestation measurements needed by Trustee and the peer-pod VMs. Run them once before your first deployment.

1. `bash scripts/gen-secrets.sh` — generates KBS key pairs, attestation policy seeds, and copies `values-secret.yaml.template` to `~/values-secret-coco-pattern.yaml`
2. `bash scripts/get-pcr.sh` — retrieves PCR measurements from the peer-pod VM image and stores them at `~/.coco-pattern/measurements.json` (requires `podman`, `skopeo`, and `~/pull-secret.json`)
3. Review and customise `~/values-secret-coco-pattern.yaml` — this file is loaded into Vault and provides secrets to the pattern
1. `bash scripts/gen-secrets.sh` — generates KBS key pairs, PCCS certificates/tokens (for bare metal), and copies `values-secret.yaml.template` to `~/values-secret-coco-pattern.yaml`
2. `bash scripts/get-pcr.sh` — retrieves PCR measurements from the peer-pod VM image and stores them at `~/.coco-pattern/measurements.json` (requires `podman`, `skopeo`, and `~/pull-secret.json`). **Not required for bare metal deployments.**
3. Review and customise `~/values-secret-coco-pattern.yaml` — this file is loaded into Vault and provides secrets to the pattern. For bare metal, uncomment the PCCS secrets section and provide your Intel PCS API key.

> **Note:** `gen-secrets.sh` will not overwrite existing secrets. Delete `~/.coco-pattern/` if you need to regenerate.
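The no-overwrite behaviour can be pictured as follows (a hypothetical sketch, not the actual contents of `scripts/gen-secrets.sh`; the key file name and algorithm are assumptions):

```shell
# Hypothetical sketch of gen-secrets.sh's no-overwrite guard:
# generate a KBS signing key only when none exists yet.
gen_kbs_key() {
  dir="$1"
  if [ -e "$dir/kbs.key" ]; then
    echo "kept existing key"
  else
    mkdir -p "$dir"
    # ed25519 is an assumption; fall back to raw random bytes if openssl is absent.
    openssl genpkey -algorithm ed25519 -out "$dir/kbs.key" 2>/dev/null \
      || head -c 32 /dev/urandom > "$dir/kbs.key"
    echo "generated new key"
  fi
}
```

Running it twice against the same directory leaves the first key untouched, which is why `~/.coco-pattern/` must be deleted to force regeneration.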

### Single cluster deployment
### Single cluster deployment (Azure)

1. Set `main.clusterGroupName: simple` in `values-global.yaml`
2. Ensure your Azure configuration is populated in `values-global.yaml` (see `global.azure.*` fields)
3. `./pattern.sh make install`
4. Wait for the cluster to reboot all nodes (the sandboxed containers operator triggers a MachineConfig update). Monitor progress in the ArgoCD UI.

### Multi-cluster deployment
### Multi-cluster deployment (Azure)

1. Set `main.clusterGroupName: trusted-hub` in `values-global.yaml`
2. Deploy the hub cluster: `./pattern.sh make install`
@@ -76,6 +90,25 @@ These scripts generate the cryptographic material and attestation measurements n
   (see [importing a cluster](https://validatedpatterns.io/learn/importing-a-cluster/))
6. ACM will automatically deploy the `spoke` clusterGroup applications (sandboxed containers, workloads) to the imported cluster

### Bare metal deployment

1. Set `main.clusterGroupName: baremetal` in `values-global.yaml`
2. Run `bash scripts/gen-secrets.sh` to generate KBS keys and PCCS secrets
3. For Intel TDX: uncomment the PCCS secrets in `~/values-secret-coco-pattern.yaml` and provide your Intel PCS API key
4. `./pattern.sh make install`
5. Wait for the cluster to reboot nodes (MachineConfig updates for TDX kernel parameters and vsock)
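On AMD SEV-SNP hosts the `kvm_intel.tdx=1` kernel argument is unnecessary; the baremetal chart gates it behind a `tdx.enabled` flag (see `charts/all/baremetal/values.yaml`). An override might look like this sketch (the exact override path depends on how your clusterGroup wires chart values):

```yaml
# Sketch: skip the TDX kernel argument on AMD SEV-SNP hardware.
tdx:
  enabled: false
```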

The system auto-detects your hardware:

- **NFD** discovers Intel TDX or AMD SEV-SNP capabilities and labels nodes
- **LVMS** auto-discovers available block devices for storage
- **RuntimeClass** `kata-cc` is created automatically, pointing to the correct handler (`kata-tdx` or `kata-snp`)
- Both `kata-tdx` and `kata-snp` RuntimeClasses are deployed; only the one matching your hardware has schedulable nodes
- MachineConfigs are deployed for both `master` and `worker` roles (safe on SNO where only master exists)
- PCCS and QGS services deploy unconditionally; DaemonSets only schedule on Intel nodes via NFD labels
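Once nodes are labelled, a confidential workload only needs to request the `kata-cc` RuntimeClass. A minimal pod sketch (name and image are placeholders, not part of the pattern):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cc-demo                          # placeholder name
spec:
  runtimeClassName: kata-cc              # resolves to kata-tdx or kata-snp per hardware
  containers:
    - name: app
      image: registry.access.redhat.com/ubi9/ubi-minimal   # placeholder image
      command: ["sleep", "infinity"]
```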

Optional: pin PCCS to a specific node with `bash scripts/get-pccs-node.sh` and set `baremetal.pccs.nodeSelector` in the baremetal chart values.
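The pin is a plain hostname consumed by the PCCS Deployment's `kubernetes.io/hostname` nodeSelector; for example (hostname hypothetical):

```yaml
baremetal:
  pccs:
    nodeSelector: worker-0.example.lab   # hypothetical output of get-pccs-node.sh
```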

## Sample applications

Two sample applications are deployed on the cluster running confidential workloads (the single cluster in `simple` mode, or the spoke in multi-cluster mode):
9 changes: 9 additions & 0 deletions charts/all/baremetal/Chart.yaml
@@ -0,0 +1,9 @@
apiVersion: v2
description: Bare metal platform configuration (NFD rules, MachineConfigs, RuntimeClasses, Intel device plugin).
keywords:
- pattern
- upstream
- sandbox
- baremetal
name: baremetal
version: 0.0.1
80 changes: 80 additions & 0 deletions charts/all/baremetal/templates/kata-nfd.yaml
@@ -0,0 +1,80 @@
apiVersion: nfd.openshift.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: consolidated-hardware-features
  namespace: openshift-nfd
spec:
  rules:
    - name: "runtime.kata"
      labels:
        feature.node.kubernetes.io/runtime.kata: "true"
      matchAny:
        - matchFeatures:
            - feature: cpu.cpuid
              matchExpressions:
                SSE42: { op: Exists }
                VMX: { op: Exists }
            - feature: kernel.loadedmodule
              matchExpressions:
                kvm: { op: Exists }
                kvm_intel: { op: Exists }
        - matchFeatures:
            - feature: cpu.cpuid
              matchExpressions:
                SSE42: { op: Exists }
                SVM: { op: Exists }
            - feature: kernel.loadedmodule
              matchExpressions:
                kvm: { op: Exists }
                kvm_amd: { op: Exists }

    - name: "amd.sev-snp"
      labels:
        amd.feature.node.kubernetes.io/snp: "true"
      extendedResources:
        sev-snp.amd.com/esids: "@cpu.security.sev.encrypted_state_ids"
      matchFeatures:
        - feature: cpu.cpuid
          matchExpressions:
            SVM: { op: Exists }
        - feature: cpu.security
          matchExpressions:
            sev.snp.enabled: { op: Exists }

    - name: "intel.sgx"
      labels:
        intel.feature.node.kubernetes.io/sgx: "true"
      extendedResources:
        sgx.intel.com/epc: "@cpu.security.sgx.epc"
      matchFeatures:
        - feature: cpu.cpuid
          matchExpressions:
            SGX: { op: Exists }
            SGXLC: { op: Exists }
        - feature: cpu.security
          matchExpressions:
            sgx.enabled: { op: IsTrue }
        - feature: kernel.config
          matchExpressions:
            X86_SGX: { op: Exists }

    - name: "intel.tdx"
      labels:
        intel.feature.node.kubernetes.io/tdx: "true"
      extendedResources:
        tdx.intel.com/keys: "@cpu.security.tdx.total_keys"
      matchFeatures:
        - feature: cpu.cpuid
          matchExpressions:
            VMX: { op: Exists }
        - feature: cpu.security
          matchExpressions:
            tdx.enabled: { op: Exists }

    - name: "ibm.se.enabled"
      labels:
        ibm.feature.node.kubernetes.io/se: "true"
      matchFeatures:
        - feature: cpu.security
          matchExpressions:
            se.enabled: { op: IsTrue }
12 changes: 12 additions & 0 deletions charts/all/baremetal/templates/nfd-instance.yaml
@@ -0,0 +1,12 @@
apiVersion: nfd.openshift.io/v1
kind: NodeFeatureDiscovery
metadata:
  name: nfd-instance
  namespace: openshift-nfd
spec:
  operand:
    image: registry.redhat.io/openshift4/ose-node-feature-discovery-rhel9:v4.20
    imagePullPolicy: Always
    servicePort: 12000
  workerConfig:
    configData: |
24 changes: 24 additions & 0 deletions charts/all/baremetal/templates/vsock-mco.yaml
@@ -0,0 +1,24 @@
{{- range list "master" "worker" }}
---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: {{ . }}
  name: 99-enable-coco-{{ . }}
spec:
  kernelArguments:
    - nohibernate
    {{- if $.Values.tdx.enabled }}
    - kvm_intel.tdx=1
    {{- end }}
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        - path: /etc/modules-load.d/vsock.conf
          mode: 0644
          contents:
            source: data:text/plain;charset=utf-8;base64,dnNvY2stbG9vcGJhY2sK
{{- end }}
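The Ignition payload above is a base64 data URL; decoding it shows the one line the MachineConfig writes to `/etc/modules-load.d/vsock.conf`, which makes systemd load the `vsock-loopback` kernel module at boot:

```shell
# Decode the Ignition data URL payload from the MachineConfig above.
decoded=$(printf '%s' 'dnNvY2stbG9vcGJhY2sK' | base64 -d)
printf '%s\n' "$decoded"   # → vsock-loopback
```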
2 changes: 2 additions & 0 deletions charts/all/baremetal/values.yaml
@@ -0,0 +1,2 @@
tdx:
  enabled: true
10 changes: 10 additions & 0 deletions charts/all/intel-dcap/Chart.yaml
@@ -0,0 +1,10 @@
apiVersion: v2
description: Intel DCAP services (PCCS and QGS) for TDX remote attestation.
keywords:
- pattern
- intel
- tdx
- pccs
- qgs
name: intel-dcap
version: 0.0.1
11 changes: 11 additions & 0 deletions charts/all/intel-dcap/templates/intel-dpo-sgx.yaml
@@ -0,0 +1,11 @@
apiVersion: deviceplugin.intel.com/v1
kind: SgxDevicePlugin
metadata:
  name: sgxdeviceplugin-sample
spec:
  image: registry.connect.redhat.com/intel/intel-sgx-plugin@sha256:f2c77521c6dae6b4db1896a5784ba8b06a5ebb2a01684184fc90143cfcca7bf4
  enclaveLimit: 110
  provisionLimit: 110
  logLevel: 4
  nodeSelector:
    intel.feature.node.kubernetes.io/sgx: "true"
69 changes: 69 additions & 0 deletions charts/all/intel-dcap/templates/pccs-deployment.yaml
@@ -0,0 +1,69 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pccs
  namespace: intel-dcap
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pccs
  template:
    metadata:
      labels:
        app: pccs
        trustedservices.intel.com/cache: pccs
    spec:
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
          operator: Exists
      serviceAccountName: pccs-service-account
      {{- if .Values.baremetal.pccs.nodeSelector }}
      nodeSelector:
        kubernetes.io/hostname: {{ .Values.baremetal.pccs.nodeSelector }}
      {{- end }}
      initContainers:
        - name: init-seclabel
          image: registry.access.redhat.com/ubi9/ubi:9.7-1764578509
          command: ["sh", "-c", "chcon -Rt container_file_t /var/cache/pccs"]
          volumeMounts:
            - name: host-database
              mountPath: /var/cache/pccs
          securityContext:
            runAsUser: 0
            runAsGroup: 0
            privileged: true # Required for chcon to work on host files
      containers:
        - name: pccs
          image: registry.redhat.io/openshift-sandboxed-containers/osc-pccs@sha256:de64fc7b13aaa7e466e825d62207f77e7c63a4f9da98663c3ab06abc45f2334d
          envFrom:
            - secretRef:
                name: pccs-secrets
          env:
            - name: "PCCS_LOG_LEVEL"
              value: "info"
            - name: "CLUSTER_HTTPS_PROXY"
              value: ""
            - name: "PCCS_FILL_MODE"
              value: "LAZY"
          ports:
            - containerPort: 8042
              name: pccs-port
          volumeMounts:
            - name: pccs-tls
              mountPath: /opt/intel/pccs/ssl_key
              readOnly: true
            - name: host-database
              mountPath: /var/cache/pccs/
          securityContext:
            runAsUser: 0
      volumes:
        - name: pccs-tls
          secret:
            secretName: pccs-tls
        - name: host-database
          hostPath:
            path: /var/cache/pccs/
            type: DirectoryOrCreate
49 changes: 49 additions & 0 deletions charts/all/intel-dcap/templates/pccs-rbac.yaml
@@ -0,0 +1,49 @@
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pccs-service-account
  namespace: intel-dcap
---
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: pccs-scc
  annotations:
    kubernetes.io/description: "SCC for Intel DCAP PCCS service requiring privileged access and hostPath volumes"
allowHostDirVolumePlugin: true
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegedContainer: true
allowedCapabilities:
  - DAC_OVERRIDE
  - SETGID
  - SETUID
defaultAddCapabilities: null
fsGroup:
  type: RunAsAny
priority: null
readOnlyRootFilesystem: false
requiredDropCapabilities:
  - KILL
  - MKNOD
  - SETPCAP
  - SYS_CHROOT
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: MustRunAs
supplementalGroups:
  type: RunAsAny
users:
  - system:serviceaccount:intel-dcap:pccs-service-account
volumes:
  - configMap
  - downwardAPI
  - emptyDir
  - hostPath
  - persistentVolumeClaim
  - projected
  - secret