This repository was archived by the owner on Dec 31, 2025. It is now read-only.
Kube-apiserver sends two requests to etcd, one second apart, every 10 seconds [0]. When the etcd certs on disk change (e.g. after `etcdadm reset` and `etcdadm init` are invoked), these requests are rejected [1]. This appears to have some impact on etcd performance (still investigating).
When cctl recovers an etcd cluster, it first brings down the existing (potentially degraded) cluster and then brings it back up, which changes the CA certs on disk. When etcd performance is impacted, adding a third member can fail (though it typically succeeds on retry).
The workaround may require
Action Items:
- Investigate whether the rejected requests impact etcd performance. If they do, cctl can stop all kube-apiserver instances before (not after) recovering the etcd cluster.
- Consider adding retries to etcdadm's etcd API calls.
- Modify the recovery test to use three masters instead of two, and use the etcd benchmark tool to increase the size of the database.
[0] etcd-io/etcd#9285.
[1]