We need to make sure that atelet and ateom only have privileges to act on behalf of actors that are actually assigned to them.
- If you break out of an actor to compromise an atelet / node, then you can modify the snapshots that atelet writes. This gives you RCE on the next node that actor is restored onto.
- I suspect that actor churn will let a compromise of one node pivot quickly to many more, as compromised actors get restored onto different nodes.
- Traditional node isolation (in the k8s sense) will not be a sufficient defense against this, because the tampered snapshots carry the compromise across node boundaries.
- We may need to consider a strategy where we divide the fleet of actors into N bins, and ensure that actors from different bins are never co-scheduled.
- The larger N gets, the more scheduling efficiency small deployments give up, so N will need to be somewhat dynamic.
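A minimal sketch of the binning idea, to make the trade-off concrete. Everything here is an assumption for illustration, not an existing API: the names `choose_bin_count`, `bin_for`, and `can_schedule`, the defaults `min_nodes_per_bin` and `max_bins`, and the choice to derive the bin from a stable hash of the actor ID. In practice the bin would probably be keyed on whatever trust boundary matters (tenant, for example) rather than on the individual actor.

```python
import hashlib


def choose_bin_count(node_count: int, min_nodes_per_bin: int = 4, max_bins: int = 16) -> int:
    """Pick N so every bin keeps at least min_nodes_per_bin nodes.

    Small deployments collapse to a single bin (no isolation, no lost
    efficiency); large ones are capped at max_bins. Both knobs are
    hypothetical defaults.
    """
    return max(1, min(max_bins, node_count // min_nodes_per_bin))


def bin_for(key: str, n_bins: int) -> int:
    """Map an ID to a bin using a stable hash.

    Python's built-in hash() is randomized per process, so it cannot be
    used here; SHA-256 gives the same answer on every scheduler replica.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_bins


def can_schedule(actor_id: str, node_bin: int, n_bins: int) -> bool:
    """Allow placement only when the actor's bin matches the node's bin."""
    return bin_for(actor_id, n_bins) == node_bin
```

With this shape, a node compromise can only spread through snapshots to other nodes in the same bin. The sketch ignores what happens when N changes: a plain modulo remaps most actors to new bins, which would temporarily break the "never co-scheduled" property, so a real design would need a migration or consistent-hashing story.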
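The core requirement, that atelet and ateom can only act for actors currently assigned to them, could be checked along the following lines. This is a sketch under assumptions: `AssignmentTable` and its methods are hypothetical names, and the check is assumed to run in a component the node cannot tamper with (the control plane or the snapshot store, say), since a check performed on the node itself would not survive a node compromise.

```python
from dataclasses import dataclass, field


@dataclass
class AssignmentTable:
    """Hypothetical record of which node each actor is currently assigned to."""

    # actor_id -> node_id
    _by_actor: dict = field(default_factory=dict)

    def assign(self, actor_id: str, node_id: str) -> None:
        """Record that actor_id now runs on node_id, replacing any earlier assignment."""
        self._by_actor[actor_id] = node_id

    def unassign(self, actor_id: str) -> None:
        """Drop the assignment, for example once the actor has been suspended."""
        self._by_actor.pop(actor_id, None)

    def is_authorized(self, node_id: str, actor_id: str) -> bool:
        """True only if actor_id is currently assigned to node_id.

        Unknown or unassigned actors are denied by default.
        """
        return self._by_actor.get(actor_id) == node_id


table = AssignmentTable()
table.assign("actor-1", "node-a")

# node-a may write a snapshot for actor-1; node-b may not.
allowed = table.is_authorized("node-a", "actor-1")
denied = table.is_authorized("node-b", "actor-1")
```

Note that this only limits which actors a compromised node can touch. It does not stop that node from poisoning snapshots of the actors it legitimately holds, which is the pivot described above.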