OCPBUGS-63698: fix(azure): add token-minter for self-managed hosted clusters #461
base: main
|
/uncc @jsafrane |
|
/uncc @tsmetana |
bef9f49 to fd11123
fd11123 to ee1827c
|
@bryan-cox: This pull request references Jira Issue OCPBUGS-63698, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Azure Disk and File CSI drivers fail on Azure self-managed hosted clusters because the service account token at /var/run/secrets/openshift/serviceaccount/token does not exist.
Add runtime deployment hooks that conditionally inject a token-minter sidecar container for self-managed Azure clusters. The token-minter creates guest cluster service account tokens that the CSI drivers use for Azure workload identity authentication.
ARO HCP continues to use Secret Provider Class with managed identities and is not affected by this change.
Fixes: OCPBUGS-63698
Signed-off-by: Bryan Cox <brcox@redhat.com>
Commit-Message-Assisted-by: Claude (via Claude Code)
The token-minter image should use the placeholder instead of reading os.Getenv() directly. The placeholder is replaced at runtime by the DefaultReplacements() function when the operator processes the deployment. This matches the pattern used in AWS EBS static patches.
Fix copy-paste error in DefaultReplacements() where HYPERSHIFT_IMAGE
placeholder replacement was incorrectly gated on csiDriver != ""
instead of hyperShiftImage != "".
This bug prevented ${HYPERSHIFT_IMAGE} placeholders from being
replaced with the actual image value, causing token-minter containers
to have invalid image references.
Signed-off-by: Bryan Cox <brcox@redhat.com>
Commit-Message-Assisted-by: Claude (via Claude Code)
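The gating fix described in this commit can be illustrated with a minimal sketch. It is not the repository's code: `buildReplacements`, `render`, and the `${DRIVER_IMAGE}` placeholder are hypothetical stand-ins for `DefaultReplacements()` and its neighbours, assuming the replacements are placeholder/value pairs fed to a string replacer.

```go
package main

import (
	"fmt"
	"strings"
)

// buildReplacements is a hypothetical stand-in for DefaultReplacements().
// It returns placeholder/value pairs suitable for strings.NewReplacer.
func buildReplacements(csiDriver, hyperShiftImage string) []string {
	pairs := []string{}
	if csiDriver != "" {
		// ${DRIVER_IMAGE} is an illustrative placeholder name.
		pairs = append(pairs, "${DRIVER_IMAGE}", csiDriver)
	}
	// The fix: gate on the value being substituted (hyperShiftImage),
	// not on the unrelated csiDriver value.
	if hyperShiftImage != "" {
		pairs = append(pairs, "${HYPERSHIFT_IMAGE}", hyperShiftImage)
	}
	return pairs
}

// render applies the replacements to a single asset string.
func render(asset, csiDriver, hyperShiftImage string) string {
	return strings.NewReplacer(buildReplacements(csiDriver, hyperShiftImage)...).Replace(asset)
}

func main() {
	// With an empty csiDriver, the old gate would have left the
	// placeholder in place; the corrected gate replaces it.
	fmt.Println(render("image: ${HYPERSHIFT_IMAGE}", "", "quay.io/example/hypershift:latest"))
}
```

With the original gate (`csiDriver != ""`), the call in `main` would leave `${HYPERSHIFT_IMAGE}` unreplaced, which is the invalid image reference the commit message describes.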
Deployment hooks run after asset placeholder replacement, so
placeholders added by hooks never get replaced. Fix by directly
reading os.Getenv("HYPERSHIFT_IMAGE") in the hook instead of using
a placeholder string.
Also add conditional behavior: if HYPERSHIFT_IMAGE is not set, skip
adding the token-minter container. This allows the same hook to work
for both self-managed Azure (where cluster-storage-operator sets
HYPERSHIFT_IMAGE) and ARO HCP (where it doesn't).
Signed-off-by: Bryan Cox <brcox@redhat.com>
Commit-Message-Assisted-by: Claude (via Claude Code)
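A rough sketch of the conditional hook described above, under stated assumptions: `Container`, `Deployment`, and `DeploymentHook` are simplified stand-ins for the Kubernetes API types and the operator's hook signature, `withTokenMinter` is a hypothetical name, and the sidecar's arguments and volume mounts are omitted. The environment lookup is injected as a function so the behaviour is easy to exercise.

```go
package main

import (
	"fmt"
	"os"
)

// Container and Deployment are simplified stand-ins for the Kubernetes
// API types; only the fields needed for this sketch are present.
type Container struct {
	Name    string
	Image   string
	Command []string
}

type Deployment struct {
	Containers []Container
}

// DeploymentHook mutates a deployment after asset placeholders have
// already been replaced (hypothetical signature).
type DeploymentHook func(d *Deployment) error

// withTokenMinter returns a hook that appends the token-minter sidecar
// only when HYPERSHIFT_IMAGE is set. The image is read at hook time,
// because a placeholder added here would never be substituted.
func withTokenMinter(getenv func(string) string) DeploymentHook {
	return func(d *Deployment) error {
		image := getenv("HYPERSHIFT_IMAGE")
		if image == "" {
			// ARO HCP case: no image provided, so skip the sidecar.
			return nil
		}
		d.Containers = append(d.Containers, Container{
			Name:    "token-minter",
			Image:   image,
			Command: []string{"/usr/bin/control-plane-operator", "token-minter"},
		})
		return nil
	}
}

func main() {
	d := &Deployment{Containers: []Container{{Name: "csi-driver"}}}
	if err := withTokenMinter(os.Getenv)(d); err != nil {
		fmt.Println("hook failed:", err)
		return
	}
	fmt.Println("containers:", len(d.Containers))
}
```

Passing the lookup function in, rather than calling `os.Getenv` inside the hook, is a choice made for this sketch only; the commit message says the real hook reads `os.Getenv("HYPERSHIFT_IMAGE")` directly.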
a6adfc4 to 0af2834
|
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: bryan-cox
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/test all |
|
/jira refresh |
|
@bryan-cox: This pull request references Jira Issue OCPBUGS-63698, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (wduan@redhat.com), skipping review request. |
|
/retest |
|
/retest-required |
|
@bryan-cox: The following test failed, say
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@duanwei33: This PR was included in a payload test run from openshift/cluster-storage-operator#643
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/7142dd10-d6fe-11f0-8d59-3afbbfee03e5-0 |
Summary
Fixes Azure Disk and File CSI drivers on Azure self-managed hosted clusters by adding a token-minter sidecar container.
Problem
On Azure self-managed hosted clusters (HyperShift mode), Azure Disk and File CSI driver controllers fail to provision volumes with errors:
WorkloadIdentityCredential: open /var/run/secrets/openshift/serviceaccount/token: no such file or directory
failed to ensure storage account: clientFactory is nil
The CSI driver controllers run in the management cluster but need guest cluster service account tokens for Azure workload identity authentication. The token file at /var/run/secrets/openshift/serviceaccount/token does not exist because there is no mechanism to create it.
Solution
This PR adds a shared WithTokenMinter(serviceAccountName string) deployment hook function in pkg/driver/common/operator/hooks.go that both Azure Disk and File CSI driver operators use to inject a token-minter sidecar container.
The token-minter sidecar:
- Runs the /usr/bin/control-plane-operator token-minter command
- Mints tokens for the given service account in the openshift-cluster-csi-drivers namespace
- Writes the token to /var/run/secrets/openshift/serviceaccount/token in a shared emptyDir volume
- Uses the service-network-admin-kubeconfig secret to access the guest cluster
- Reads the HYPERSHIFT_IMAGE env var directly (not a placeholder) since deployment hooks run after asset replacement
Note: The bound-sa-token emptyDir volume and hosted-kubeconfig secret volume are already added by the HyperShift patch files (controller_add_hypershift_controller.yaml), so the hook only adds the token-minter container.
Platform-Specific Behavior
The hook is added to both Azure Disk and File drivers. The platform-specific behavior is controlled by cluster-storage-operator:
- Self-managed Azure: cluster-storage-operator passes the HYPERSHIFT_IMAGE env var to the CSI driver operators, enabling token-minter functionality
- ARO HCP: cluster-storage-operator does not pass HYPERSHIFT_IMAGE, as ARO HCP uses Secret Provider Class with managed identities instead
This follows the same pattern already used by AWS EBS CSI driver and Azure Cloud Controller Manager.
Changes
- WithTokenMinter(serviceAccountName string) deployment hook (lines 257-301)
- WithTokenMinter() with azure-disk-csi-driver-controller-sa (line 228)
- WithTokenMinter() with azure-file-csi-driver-controller-sa (line 187)
Testing
On an Azure self-managed hosted cluster:
Related PRs
References