fix(reporter):added idempotency gate to prevent API server flooding#263
fix(reporter):added idempotency gate to prevent API server flooding#263LightCreator1007 wants to merge 1 commit into
Conversation
✅ Deploy Preview for node-readiness-controller canceled.
|
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: LightCreator1007 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @LightCreator1007. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Hey @LightCreator1007, PR looks good but the diff only touches Also, a comment noting that skipping the write stops refreshing the heartbeat (so it no longer signals the reporter is alive), would be good |
| // to prevent etcd write amplification and control plane flooding. | ||
| if existingCondition != nil && existingCondition.Status == status && existingCondition.Reason == health.Reason && existingCondition.Message == health.Message { | ||
| // since state did not change return early to avoid unnecessary API call | ||
| return nil |
There was a problem hiding this comment.
with the change we will miss the lastHeartbeatTime update. While this is an optional field. It is an important field to show indicating that the health check is running.
how about we add
if time.Since(existingCondition.LastHeartbeatTime.Time) >= 5*time.Minute {
needsUpdate = true // State is unchanged, but we need to refresh the heartbeat
}
so after 5 min even if there is no update to status, we still update once for the LastHeartbeatTime?
There was a problem hiding this comment.
Yeah, that works well. The 5m floor keeps the heartbeat alive so it still signals the reporter is running, while skipping the redundant writes the rest of the time. +1 from me.
Description
Adds an idempotency gate to
updateNodeConditionin thereadiness-condition-reporter.UpdateStatusis now bypassed if theStatus,Reason, andMessageare unchanged, preventing unnecessary API server flooding andetcdwrite amplification on every tick.Related Issue
NONE
Type of Change
/kind bug
Testing
main_test.golocally to assert against thefake.Clientset's tracked actions, confirming that theUpdateStatuscall is definitively bypassed when the condition state is unchanged.Checklist
make testpassesmake lintpassesDoes this PR introduce a user-facing change?