Environment
- OS: Amazon Linux 2023.11.20260514
- Kernel: 6.12.83-115.161.amzn2023.aarch64 (arm64)
- containerd: 2.2.3+unknown
- Kubernetes: v1.34.8-eks-3385e9b
Describe the bug
When a CreateContainer call times out with context deadline exceeded, containerd leaves a zombie name reservation in its metadata store. All subsequent retries by kubelet fail immediately with:
failed to reserve container name "<name>"; check if another CreateContainer request is in progress:
name "<name>" is reserved for "<original-container-id>"
The pod is permanently stuck — kubelet loops retrying but can never create the container. The only recovery is restarting containerd on the affected node.
Steps to reproduce
- Run a Kubernetes pod under load on an arm64 AL2023 node
- Trigger a CreateContainer timeout (e.g. resource pressure causes the first attempt to exceed the gRPC deadline)
- Observe kubelet retrying and hitting the reservation error indefinitely
Expected behavior
After a timed-out CreateContainer, containerd should clean up or expire the name reservation so retries can succeed.
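To make the expected behavior concrete, here is a toy model of a name reservation table that releases the name when creation fails. It is only a sketch and is not containerd's code: NameRegistry, create_container, the release_on_failure flag, and the IDs "id-1" and "id-2" are all hypothetical names chosen for illustration.

```python
class NameRegistry:
    """Toy name -> container ID reservation table (hypothetical, not containerd's)."""

    def __init__(self):
        self._owner = {}

    def reserve(self, name, container_id):
        # Refuse the reservation if a different ID already holds the name.
        owner = self._owner.get(name)
        if owner is not None and owner != container_id:
            raise RuntimeError(f'name "{name}" is reserved for "{owner}"')
        self._owner[name] = container_id

    def release(self, name):
        # Releasing an unknown name is a no-op.
        self._owner.pop(name, None)


def create_container(registry, name, container_id, do_create, release_on_failure=True):
    """Reserve the name, run do_create, and release the name if creation fails."""
    registry.reserve(name, container_id)
    try:
        do_create()
    except Exception:
        if release_on_failure:
            registry.release(name)
        raise
    return container_id


def timed_out():
    # Stand-in for a create attempt that exceeds its deadline.
    raise TimeoutError("context deadline exceeded")


registry = NameRegistry()
try:
    create_container(registry, "init-permissions", "id-1", timed_out)
except TimeoutError:
    pass

# The failed attempt released the name, so a retry under a new ID succeeds.
retry_id = create_container(registry, "init-permissions", "id-2", lambda: None)
```

Passing release_on_failure=False to the first call models the behavior reported here: the name stays reserved for "id-1" and every retry under a new ID is rejected.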
Actual behavior
The name reservation persists indefinitely. kubectl describe pod shows:
Warning Failed 5m kubelet spec.initContainers{init-permissions}: Error: context deadline exceeded
Warning Failed 5s x25 kubelet spec.initContainers{init-permissions}: Error: failed to reserve container name
"..."; check if another CreateContainer request is in progress:
name "..." is reserved for "<zombie-container-id>"
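For anyone trying to spot affected pods, the event text above can be matched mechanically. The following is a minimal sketch that assumes the message keeps the wording shown in this report; RESERVED_RE and parse_reservation_error are hypothetical helper names, and the sample message uses made-up values.

```python
import re

# Matches the reservation error quoted in this report and captures the
# container name and the ID that currently holds the reservation.
RESERVED_RE = re.compile(
    r'failed to reserve container name "(?P<name>[^"]*)".*?'
    r'is reserved for "(?P<owner>[^"]*)"',
    re.DOTALL,  # the event text may be wrapped across lines
)


def parse_reservation_error(message):
    """Return (name, owner_id) if message is a reservation error, else None."""
    match = RESERVED_RE.search(message)
    if match is None:
        return None
    return match.group("name"), match.group("owner")


# Made-up sample values, only to show the call shape.
sample = (
    'Error: failed to reserve container name "example-name"; '
    "check if another CreateContainer request is in progress: "
    'name "example-name" is reserved for "example-id"'
)
parsed = parse_reservation_error(sample)
```

The captured owner ID is the value to compare against the containers that containerd actually knows about on the node.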
Related upstream issues