Skip to content

fix(reporter):added idempotency gate to prevent API server flooding#263

Open
LightCreator1007 wants to merge 1 commit into
kubernetes-sigs:mainfrom
LightCreator1007:fix/has-taint-by-spec-value
Open

fix(reporter):added idempotency gate to prevent API server flooding#263
LightCreator1007 wants to merge 1 commit into
kubernetes-sigs:mainfrom
LightCreator1007:fix/has-taint-by-spec-value

Conversation

@LightCreator1007
Copy link
Copy Markdown

Description

Adds an idempotency gate to updateNodeCondition in the readiness-condition-reporter.
UpdateStatus is now bypassed if the Status, Reason, and Message are unchanged, preventing unnecessary API server flooding and etcd write amplification on every tick.

Related Issue

NONE

Type of Change

/kind bug

Testing

  • Modified main_test.go locally to assert against the fake.Clientset's tracked actions, confirming that the UpdateStatus call is definitively bypassed when the condition state is unchanged.

Checklist

  • make test passes
  • make lint passes

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label May 20, 2026
@netlify
Copy link
Copy Markdown

netlify Bot commented May 20, 2026

Deploy Preview for node-readiness-controller canceled.

Name Link
🔨 Latest commit 0eb7ec2
🔍 Latest deploy log https://app.netlify.com/projects/node-readiness-controller/deploys/6a0dc91fa38ac10008fce857

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: LightCreator1007
Once this PR has been reviewed and has the lgtm label, please assign dchen1107 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 20, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Hi @LightCreator1007. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label May 20, 2026
@sahitya-chandra
Copy link
Copy Markdown
Contributor

sahitya-chandra commented May 20, 2026

Hey @LightCreator1007, PR looks good but the diff only touches main.go, so the skip path isn't covered. Could you add a test where the node already has the matching condition and assert no update is sent?

Also, a comment noting that skipping the write stops refreshing the heartbeat (so it no longer signals the reporter is alive), would be good

// to prevent etcd write amplification and control plane flooding.
if existingCondition != nil && existingCondition.Status == status && existingCondition.Reason == health.Reason && existingCondition.Message == health.Message {
// since state did not change return early to avoid unnecessary API call
return nil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with the change we will miss the lastHeartbeatTime update. While this is an optional field. It is an important field to show indicating that the health check is running.

how about we add

if time.Since(existingCondition.LastHeartbeatTime.Time) >= 5*time.Minute {
  needsUpdate = true // State is unchanged, but we need to refresh the heartbeat
}

so after 5 min even if there is no update to status, we still update once for the LastHeartbeatTime?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that works well. The 5m floor keeps the heartbeat alive so it still signals the reporter is running, while skipping the redundant writes the rest of the time. +1 from me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants