PR openshift#5688 introduced a bug in the rollback logic for systemd unit updates. When an update fails and needs to roll back, the code was using the forward-direction unit diffs (old->new) instead of recalculating the reverse-direction diffs (new->old). This caused incomplete rollbacks that could leave systemd units in an inconsistent state.

This issue manifested during upgrades where kube-apiserver pods failed to terminate gracefully, likely because kubelet or related systemd units weren't properly restored during rollback.

The fix ensures that:
1. In updateOnClusterLayering: configs are properly swapped (was using oldIgnConfig->newIgnConfig, now uses newIgnConfig->oldIgnConfig)
2. In all three update functions: unit diffs are recalculated for the rollback direction (new->old) to ensure only the correct units are written during rollback

Related: OCPBUGS-77221, OCPBUGS-58023
Test failure: periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-upgrade-fips

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
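To illustrate why the diff direction matters, here is a minimal Go sketch. The `unitDiff` type and the `calcUnitDiff` helper are hypothetical simplifications made up for this example, not the actual types or functions in the codebase; unit sets are modeled as plain maps from unit name to file contents.

```go
package main

import (
	"fmt"
	"sort"
)

// unitDiff is a hypothetical, simplified stand-in for the difference between
// two sets of systemd units. It is not the real type used by the daemon.
type unitDiff struct {
	Added   []string // units present only in the target set
	Removed []string // units present only in the source set
	Changed []string // units present in both sets with different contents
}

// calcUnitDiff (hypothetical helper) computes what must change to move from
// the "from" unit set to the "to" unit set. Keys are unit names, values are
// unit file contents.
func calcUnitDiff(from, to map[string]string) unitDiff {
	var d unitDiff
	for name, contents := range to {
		prev, ok := from[name]
		switch {
		case !ok:
			d.Added = append(d.Added, name)
		case prev != contents:
			d.Changed = append(d.Changed, name)
		}
	}
	for name := range from {
		if _, ok := to[name]; !ok {
			d.Removed = append(d.Removed, name)
		}
	}
	// Map iteration order is random in Go, so sort for stable output.
	sort.Strings(d.Added)
	sort.Strings(d.Removed)
	sort.Strings(d.Changed)
	return d
}

func main() {
	oldUnits := map[string]string{"kubelet.service": "v1", "legacy.service": "v1"}
	newUnits := map[string]string{"kubelet.service": "v2", "extra.service": "v1"}

	// Forward direction (old->new): used when applying the update.
	forward := calcUnitDiff(oldUnits, newUnits)
	// Rollback direction (new->old): must be recalculated, not reused.
	rollback := calcUnitDiff(newUnits, oldUnits)

	fmt.Printf("forward:  %+v\n", forward)
	fmt.Printf("rollback: %+v\n", rollback)
}
```

In this toy model the forward diff lists `extra.service` as added and `legacy.service` as removed, while the rollback diff lists them the other way around. Reusing the forward diff during a rollback would therefore act on the wrong units, which is the kind of incomplete rollback described above.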
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-upgrade-fips

@isabella-janssen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/8c879890-22f3-11f1-8dd6-21b60c063f9a-0

/payload-job periodic-ci-openshift-release-main-nightly-4.19-e2e-aws-ovn-upgrade-fips

@isabella-janssen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/933271b0-22f3-11f1-9079-205649f0ebe8-0

[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: isabella-janssen

/payload-aggregate periodic-ci-openshift-release-main-nightly-4.19-e2e-aws-ovn-upgrade-fips 7

@isabella-janssen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/d4ec9fd0-23c6-11f1-9fd3-a09afc387bd4-0