AGENT-1449: Add IRI registry authentication support to MCO#5765
AGENT-1449: Add IRI registry authentication support to MCO#5765rwsu wants to merge 2 commits intoopenshift:mainfrom
Conversation
|
@rwsu: An error was encountered searching for bug AGENT-1449 on the Jira server at https://issues.redhat.com. No known errors were detected, please see the full error message for details. Full error message.
No response returned: Get "https://issues.redhat.com/rest/api/2/issue/AGENT-1449": GET https://issues.redhat.com/rest/api/2/issue/AGENT-1449 giving up after 5 attempt(s)
Please contact an administrator to resolve this issue, then request a bug refresh with DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughThreads a new Internal Release Image auth secret (iriAuthSecret) through bootstrap, controller, and renderer flows; merges IRI registry credentials into the cluster pull secret when present/allowed; surfaces htpasswd data to templates; updates related templates and adds unit tests for auth merge and rendering. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.3)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: rwsu The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
pkg/controller/internalreleaseimage/internalreleaseimage_controller.go (1)
378-383: Consider whether merge failure should be fatal.If the IRI auth secret exists, it indicates auth is configured for the registry. When merging credentials into the pull secret fails, nodes may be unable to pull from the authenticated registry. The current warning-only approach could lead to silent authentication failures at runtime.
Consider whether this should return an error to trigger a retry, or at least emit an event for visibility.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go` around lines 378 - 383, Currently mergeIRIAuthIntoPullSecret failure is only logged with klog.Warningf which can hide critical pull-auth problems; change this to return the error from the surrounding reconcile path so the controller retries (i.e., replace the klog.Warningf branch with a return fmt.Errorf(...) or wrapped err from the Reconcile method when ctrl.mergeIRIAuthIntoPullSecret(cconfig, iriAuthSecret) fails), and additionally emit a Kubernetes event for visibility using the controller's event recorder (e.g., ctrl.recorder.Eventf or ctrl.eventRecorder.Eventf) mentioning the merge failure and the secret name (iriAuthSecret) so operators see the issue in events.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/controller/internalreleaseimage/pullsecret.go`:
- Around line 21-31: The code currently constructs iriRegistryHost using
baseDomain without validation, which allows empty/whitespace domains (producing
"api-int.:22625"); before creating iriRegistryHost validate baseDomain (e.g.,
use strings.TrimSpace(baseDomain) and check for empty), and if invalid return an
error (e.g., fmt.Errorf("empty baseDomain") ) so the function fails fast; update
the section that defines iriRegistryHost to perform this check and only build
iriRegistryHost after validation.
In
`@pkg/controller/internalreleaseimage/templates/master/files/usr-local-bin-load-registry-image-sh.yaml`:
- Line 40: The podman pull call unconditionally uses --authfile
/var/lib/kubelet/config.json which fails if that file is missing; change the
logic around the podman pull with registryImage so you first test for the
authfile's existence (e.g. [ -f /var/lib/kubelet/config.json ] and non-empty)
and only add --authfile /var/lib/kubelet/config.json when present, otherwise
perform an unauthenticated podman pull "${registryImage}" fallback; ensure the
conditional preserves current error handling and quoting of registryImage.
---
Nitpick comments:
In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go`:
- Around line 378-383: Currently mergeIRIAuthIntoPullSecret failure is only
logged with klog.Warningf which can hide critical pull-auth problems; change
this to return the error from the surrounding reconcile path so the controller
retries (i.e., replace the klog.Warningf branch with a return fmt.Errorf(...) or
wrapped err from the Reconcile method when
ctrl.mergeIRIAuthIntoPullSecret(cconfig, iriAuthSecret) fails), and additionally
emit a Kubernetes event for visibility using the controller's event recorder
(e.g., ctrl.recorder.Eventf or ctrl.eventRecorder.Eventf) mentioning the merge
failure and the secret name (iriAuthSecret) so operators see the issue in
events.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 28340cd8-6caf-4ee6-8c24-9145c611edd0
📒 Files selected for processing (13)
pkg/controller/bootstrap/bootstrap.gopkg/controller/common/constants.gopkg/controller/internalreleaseimage/internalreleaseimage_bootstrap.gopkg/controller/internalreleaseimage/internalreleaseimage_bootstrap_test.gopkg/controller/internalreleaseimage/internalreleaseimage_controller.gopkg/controller/internalreleaseimage/internalreleaseimage_controller_test.gopkg/controller/internalreleaseimage/internalreleaseimage_helpers_test.gopkg/controller/internalreleaseimage/internalreleaseimage_renderer.gopkg/controller/internalreleaseimage/pullsecret.gopkg/controller/internalreleaseimage/pullsecret_test.gopkg/controller/internalreleaseimage/templates/master/files/iri-registry-auth-htpasswd.yamlpkg/controller/internalreleaseimage/templates/master/files/usr-local-bin-load-registry-image-sh.yamlpkg/controller/internalreleaseimage/templates/master/units/iri-registry.service.yaml
...roller/internalreleaseimage/templates/master/files/usr-local-bin-load-registry-image-sh.yaml
Outdated
Show resolved
Hide resolved
|
/cc @andfasano |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/controller/internalreleaseimage/internalreleaseimage_controller.go (1)
348-350: Silent error suppression is intentional but consider logging.The comment explains this is for upgrade compatibility, but silently discarding errors could mask issues beyond "not found" (e.g., permission errors). Consider at least logging unexpected errors.
💡 Suggested improvement
// Auth secret may not exist during upgrades from non-auth clusters - iriAuthSecret, _ := ctrl.secretLister.Secrets(ctrlcommon.MCONamespace).Get(ctrlcommon.InternalReleaseImageAuthSecretName) + iriAuthSecret, err := ctrl.secretLister.Secrets(ctrlcommon.MCONamespace).Get(ctrlcommon.InternalReleaseImageAuthSecretName) + if err != nil && !errors.IsNotFound(err) { + klog.V(4).Infof("Could not get IRI auth secret: %v", err) + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go` around lines 348 - 350, The code currently discards the error returned by ctrl.secretLister.Secrets(...).Get when retrieving iriAuthSecret; change it to capture the error (e.g., iriAuthSecret, err := ctrl.secretLister.Secrets(ctrlcommon.MCONamespace).Get(ctrlcommon.InternalReleaseImageAuthSecretName) ) and if err != nil and the error is not a NotFound (use apierrors.IsNotFound(err)) log the unexpected error with the controller logger (e.g., ctrl.Log or the controller's logger instance) including context like "getting InternalReleaseImageAuthSecret" and the err; keep the existing behavior of silently continuing on NotFound to preserve upgrade compatibility.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go`:
- Around line 348-350: The code currently discards the error returned by
ctrl.secretLister.Secrets(...).Get when retrieving iriAuthSecret; change it to
capture the error (e.g., iriAuthSecret, err :=
ctrl.secretLister.Secrets(ctrlcommon.MCONamespace).Get(ctrlcommon.InternalReleaseImageAuthSecretName)
) and if err != nil and the error is not a NotFound (use
apierrors.IsNotFound(err)) log the unexpected error with the controller logger
(e.g., ctrl.Log or the controller's logger instance) including context like
"getting InternalReleaseImageAuthSecret" and the err; keep the existing behavior
of silently continuing on NotFound to preserve upgrade compatibility.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3bcaa665-1174-478d-bf7c-2e0eae80cbf1
📒 Files selected for processing (13)
pkg/controller/bootstrap/bootstrap.gopkg/controller/common/constants.gopkg/controller/internalreleaseimage/internalreleaseimage_bootstrap.gopkg/controller/internalreleaseimage/internalreleaseimage_bootstrap_test.gopkg/controller/internalreleaseimage/internalreleaseimage_controller.gopkg/controller/internalreleaseimage/internalreleaseimage_controller_test.gopkg/controller/internalreleaseimage/internalreleaseimage_helpers_test.gopkg/controller/internalreleaseimage/internalreleaseimage_renderer.gopkg/controller/internalreleaseimage/pullsecret.gopkg/controller/internalreleaseimage/pullsecret_test.gopkg/controller/internalreleaseimage/templates/master/files/iri-registry-auth-htpasswd.yamlpkg/controller/internalreleaseimage/templates/master/files/usr-local-bin-load-registry-image-sh.yamlpkg/controller/internalreleaseimage/templates/master/units/iri-registry.service.yaml
🚧 Files skipped from review as they are similar to previous changes (6)
- pkg/controller/common/constants.go
- pkg/controller/internalreleaseimage/templates/master/files/usr-local-bin-load-registry-image-sh.yaml
- pkg/controller/internalreleaseimage/pullsecret.go
- pkg/controller/internalreleaseimage/internalreleaseimage_renderer.go
- pkg/controller/internalreleaseimage/templates/master/files/iri-registry-auth-htpasswd.yaml
- pkg/controller/internalreleaseimage/pullsecret_test.go
|
/jira refresh |
|
@rwsu: This pull request references AGENT-1449 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Add htpasswd-based authentication to the IRI registry. The installer generates credentials and provides them via a bootstrap secret. The MCO mounts the htpasswd file into the registry container and configures registry auth environment variables. The registry password is merged into the node pull secret so kubelet can authenticate when pulling the release image. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
The IRI controller merges registry auth credentials into the global pull secret after bootstrap. This triggers the template controller to re-render template MCs (00-master, etc.) with the updated pull secret, producing a different rendered MC hash than what bootstrap created. The mismatch causes the MCD DaemonSet pod to fail during bootstrap: it reads the bootstrap-rendered MC name from the node annotation, but that MC no longer exists in-cluster (replaced by the re-rendered one). The MCD falls back to reading /etc/machine-config-daemon/currentconfig, which was never written because the firstboot MCD detected "no changes" and skipped it. Both master nodes go Degraded and never recover. Fix by merging IRI auth into the pull secret during bootstrap before template MC rendering, so both bootstrap and in-cluster produce identical rendered MC hashes. Extract the pull secret merge logic into a shared MergeIRIAuthIntoPullSecret function used by both the bootstrap path and the in-cluster IRI controller. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (2)
pkg/controller/internalreleaseimage/pullsecret_test.go (1)
53-66: Consider adding a test case for empty baseDomain.The
MergeIRIAuthIntoPullSecretfunction returns an error when baseDomain is empty (seepullsecret.golines 21-23), but this behavior isn't tested. Adding a test case would ensure this validation remains intact.Suggested test case
{ name: "missing auths field returns error", pullSecret: `{"registry":"quay.io"}`, password: "testpassword", baseDomain: "example.com", expectError: true, }, + { + name: "empty baseDomain returns error", + pullSecret: basePullSecret, + password: "testpassword", + baseDomain: "", + expectError: true, + },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/internalreleaseimage/pullsecret_test.go` around lines 53 - 66, Add a unit test in pullsecret_test.go that verifies MergeIRIAuthIntoPullSecret returns an error when baseDomain is an empty string; locate the test table in the existing tests (near the cases named "invalid JSON returns error" and "missing auths field returns error") and add a new case with name like "empty baseDomain returns error", pullSecret set to a valid JSON auth object, password set to a non-empty value, baseDomain set to "", and expectError true to assert the function MergeIRIAuthIntoPullSecret enforces the non-empty baseDomain validation.pkg/controller/internalreleaseimage/internalreleaseimage_controller.go (1)
348-350: Consider distinguishing NotFound from unexpected errors when fetching auth secret.Line 349 silently discards all errors, which is intentional for upgrade compatibility (auth secret may not exist). However, this also masks unexpected errors (e.g., network issues, RBAC problems) that might warrant logging or different handling.
Suggested improvement
// Auth secret may not exist during upgrades from non-auth clusters - iriAuthSecret, _ := ctrl.secretLister.Secrets(ctrlcommon.MCONamespace).Get(ctrlcommon.InternalReleaseImageAuthSecretName) + iriAuthSecret, err := ctrl.secretLister.Secrets(ctrlcommon.MCONamespace).Get(ctrlcommon.InternalReleaseImageAuthSecretName) + if err != nil && !errors.IsNotFound(err) { + klog.V(4).Infof("Could not get IRI auth secret %s: %v", ctrlcommon.InternalReleaseImageAuthSecretName, err) + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go` around lines 348 - 350, When fetching the auth secret with ctrl.secretLister.Secrets(ctrlcommon.MCONamespace).Get(ctrlcommon.InternalReleaseImageAuthSecretName) do not ignore all errors: check the returned error and use apierrors.IsNotFound(err) to treat a missing secret as OK for upgrades, but for any other error log it (via the controller logger) and return/requeue the reconcile with the error so transient RBAC/network issues are surfaced; update the code around iriAuthSecret acquisition in internalreleaseimage_controller.go to handle these two cases explicitly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go`:
- Around line 348-350: When fetching the auth secret with
ctrl.secretLister.Secrets(ctrlcommon.MCONamespace).Get(ctrlcommon.InternalReleaseImageAuthSecretName)
do not ignore all errors: check the returned error and use
apierrors.IsNotFound(err) to treat a missing secret as OK for upgrades, but for
any other error log it (via the controller logger) and return/requeue the
reconcile with the error so transient RBAC/network issues are surfaced; update
the code around iriAuthSecret acquisition in internalreleaseimage_controller.go
to handle these two cases explicitly.
In `@pkg/controller/internalreleaseimage/pullsecret_test.go`:
- Around line 53-66: Add a unit test in pullsecret_test.go that verifies
MergeIRIAuthIntoPullSecret returns an error when baseDomain is an empty string;
locate the test table in the existing tests (near the cases named "invalid JSON
returns error" and "missing auths field returns error") and add a new case with
name like "empty baseDomain returns error", pullSecret set to a valid JSON auth
object, password set to a non-empty value, baseDomain set to "", and expectError
true to assert the function MergeIRIAuthIntoPullSecret enforces the non-empty
baseDomain validation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 347c1736-456a-4dc9-a7a5-da14b0b1b0b1
📒 Files selected for processing (13)
pkg/controller/bootstrap/bootstrap.gopkg/controller/common/constants.gopkg/controller/internalreleaseimage/internalreleaseimage_bootstrap.gopkg/controller/internalreleaseimage/internalreleaseimage_bootstrap_test.gopkg/controller/internalreleaseimage/internalreleaseimage_controller.gopkg/controller/internalreleaseimage/internalreleaseimage_controller_test.gopkg/controller/internalreleaseimage/internalreleaseimage_helpers_test.gopkg/controller/internalreleaseimage/internalreleaseimage_renderer.gopkg/controller/internalreleaseimage/pullsecret.gopkg/controller/internalreleaseimage/pullsecret_test.gopkg/controller/internalreleaseimage/templates/master/files/iri-registry-auth-htpasswd.yamlpkg/controller/internalreleaseimage/templates/master/files/usr-local-bin-load-registry-image-sh.yamlpkg/controller/internalreleaseimage/templates/master/units/iri-registry.service.yaml
🚧 Files skipped from review as they are similar to previous changes (5)
- pkg/controller/common/constants.go
- pkg/controller/internalreleaseimage/pullsecret.go
- pkg/controller/internalreleaseimage/templates/master/files/iri-registry-auth-htpasswd.yaml
- pkg/controller/internalreleaseimage/templates/master/units/iri-registry.service.yaml
- pkg/controller/internalreleaseimage/internalreleaseimage_helpers_test.go
|
/retest-required |
|
@rwsu: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
*- What I did
Add htpasswd-based authentication to the IRI registry. The installer generates credentials and provides them via a bootstrap secret. The MCO mounts the htpasswd file into the registry container and configures registry auth environment variables. The registry password is merged into the node pull secret so kubelet can authenticate when pulling the release image.
- How to verify it
- Description for the changelog
Add htpasswd authentication to the Internal Release Image (IRI) registry. The registry now requires Basic Auth credentials, with the password stored in the internal-release-image-registry-auth secret and automatically merged into the global pull secret so kubelet can authenticate when pulling images. The master MachineConfig is updated to configure the registry's htpasswd file and enable auth environment variables in the systemd unit.