fix: webhook fails closed when rule listing errors#252
Conversation
✅ Deploy Preview for node-readiness-controller canceled.
|
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: dorodb-web22 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @dorodb-web22. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
a39181c to
d439b78
Compare
| // Fail closed: if we can't list rules, we cannot safely validate | ||
| // for conflicts. Reject the request so the client can retry. | ||
| return append(allErrs, field.InternalError( | ||
| field.NewPath("spec", "taint", "key"), |
There was a problem hiding this comment.
I'm not sure about using field.NewPath("spec", "taint", "key") here. A List failure is a systemic/infrastructure issue, not an error with the user's input in that specific field. Attributing it to the taint key could be misleading for users debugging their YAML.
A failure to list existing NodeReadinessRules is a systemic/
infrastructure error, not an error with any specific field in
the user's input. Using field.NewPath("spec","taint","key") as
the path misleads users into thinking their taint key is wrong
when debugging their YAML.
Switch to nil path for field.InternalError so the error message
correctly signals an internal issue, not a validation failure on
a user-controlled field.
Addresses review feedback from AvineshTripathi on PR kubernetes-sigs#252.
There was a problem hiding this comment.
True, @AvineshTripathi! using that path would mislead users into thinking their YAML is malformed when the real problem is on the controller side.
Fixed by switching to nil as the field path in field.InternalError, so the error correctly signals an internal issue rather than a field-level validation failure.
@dorodb-web22 the PR doesnt seem to reflect this change? Could you PTAL. |
|
/assign @dorodb-web22 Please reassign the PR for approval once ready for final review. Thanks |
Fixes #251
When
validateTaintConflictscannot list existing rules, it previously logged the error and allowed the request through (fail-open).This means two rules with the same taint key/effect and overlapping selectors could be created during an API degradation window, causing taints to flip-flop indefinitely on affected nodes.
This change returns an
InternalErrorinstead, rejecting the request with a clear message so the client can retry when the system is healthy.This aligns with the Kubernetes admission webhook best-practice offail-closed validation.