fix: wait for MCP rollout before creating CatalogSource in quay e2e tests #79379

Open
harishsurf wants to merge 1 commit into openshift:main from harishsurf:fix/quay-catalogsource-mcp-ordering

@harishsurf harishsurf commented May 17, 2026

Summary

  • Reorders wait_mcp_ready to run before create_catalog_source in the enable-quay-catalogsource step
  • The ICSP and pull secret changes trigger an MCP rollout — nodes must restart CRI-O to pick up the new Konflux registry credentials
  • Previously, the CatalogSource pod was created before nodes had the updated pull secret, causing ImagePullBackOff: unauthorized for the FBC image

Root Cause

The enable-quay-catalogsource step was calling functions in this order:

  1. update_pull_secret — adds Konflux auth to global pull secret
  2. create_icsp — triggers MCP rollout (nodes need to restart CRI-O)
  3. create_catalog_source — OLM creates catalog pod immediately
  4. check_catalog_source_status — waits 600s, times out
  5. wait_mcp_ready — nodes finally have credentials, but too late

The catalog pod hits ImagePullBackOff because the node's CRI-O still has the old pull secret without Konflux auth.

Fix

Move wait_mcp_ready before create_catalog_source:

  1. update_pull_secret
  2. create_icsp
  3. wait_mcp_ready — wait for nodes to have credentials
  4. create_catalog_source — now pods can pull
  5. check_catalog_source_status
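
The reordered sequence can be sketched as below. The five function names come from the step script; their bodies are not shown in this PR, so the stubs here are placeholders (an assumption) that only record the order in which they are called. The wrapper name `install_from_custom_catalog` is hypothetical.

```shell
set -eu

# Placeholder stubs: the real functions talk to the cluster.
# These only append their own name to CALL_LOG.
CALL_LOG=""
record() { CALL_LOG="${CALL_LOG:+$CALL_LOG }$1"; }

update_pull_secret()          { record update_pull_secret; }
create_icsp()                 { record create_icsp; }
wait_mcp_ready()              { record wait_mcp_ready; }
create_catalog_source()       { record create_catalog_source; }
check_catalog_source_status() { record check_catalog_source_status; }

# Hypothetical wrapper, used only for this illustration.
install_from_custom_catalog() {
  update_pull_secret
  create_icsp
  wait_mcp_ready               # nodes pick up the new credentials first
  create_catalog_source        # the catalog pod is created only afterwards
  check_catalog_source_status
}

install_from_custom_catalog
echo "$CALL_LOG"
```

The point of the sketch is only that `wait_mcp_ready` sits between `create_icsp` and `create_catalog_source`; nothing else about the step changes.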

Evidence

From must-gather pod YAML (fbc-operator-catalog-snvkw):

state:
  waiting:
    message: 'Back-off pulling image "quay.io/redhat-user-workloads/quay-eng-tenant/stable-3-16-v4-22@sha256:75c9bb9d...":
      ErrImagePull: unauthorized: access to the requested resource is not authorized'
    reason: ImagePullBackOff
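
The evidence above was read from a must-gather. On a live cluster the same field could be read with `oc get -o jsonpath`, roughly as sketched here. The helper names are hypothetical, and the namespace in the usage line (`openshift-marketplace`) is an assumption, since the PR does not state where the catalog pod runs.

```shell
set -eu

# Print the waiting reason of the first container in a pod. The output is
# expected to be empty when the container is not in the waiting state.
pod_waiting_reason() {
  pod="$1"
  ns="$2"
  oc get pod "$pod" -n "$ns" \
    -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}'
}

# Succeed when the pod is stuck on an image pull.
is_image_pull_failure() {
  reason="$(pod_waiting_reason "$1" "$2")"
  case "$reason" in
    ImagePullBackOff|ErrImagePull) return 0 ;;
    *) return 1 ;;
  esac
}
```

Usage, with the pod name from the must-gather: `is_image_pull_failure fbc-operator-catalog-snvkw openshift-marketplace && echo "image pull failed"`.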

Affected Prow jobs:

  • periodic-ci-quay-quay-tests-master-quay-api-quay-e2e-tests-quay316-api-testing/2055341040251965440
  • periodic-ci-quay-quay-tests-master-quay-api-quay-e2e-tests-quay316-api-testing/2054988676785508352

Test plan

  • Verify the periodic quay316 API e2e test passes on the next run after merge
  • Confirm fbc-operator-catalog pod can pull the FBC image successfully
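
The second check could be automated by polling the CatalogSource until OLM reports its connection state as `READY`. This is a minimal sketch, not the script's actual `check_catalog_source_status`, whose body is not shown in the PR. The function name is hypothetical, and the CatalogSource name `fbc-operator-catalog` (inferred from the pod name) and the namespace `openshift-marketplace` in the usage line are assumptions.

```shell
set -eu

# Poll a CatalogSource until OLM reports it READY or the timeout expires.
# Arguments: name, namespace, timeout in seconds, poll interval in seconds
# (the interval must be at least 1).
wait_catalog_source_ready() {
  name="$1"; ns="$2"; timeout="$3"; interval="$4"
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    state="$(oc get catalogsource "$name" -n "$ns" \
      -o jsonpath='{.status.connectionState.lastObservedState}' 2>/dev/null || true)"
    if [ "$state" = "READY" ]; then
      echo "CatalogSource $name is READY after ${elapsed}s"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "CatalogSource $name not READY after ${timeout}s (last state: ${state:-unknown})" >&2
  return 1
}
```

Usage, matching the 600s wait mentioned in the root cause: `wait_catalog_source_ready fbc-operator-catalog openshift-marketplace 600 10`.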

🤖 Generated with Claude Code

Fix: Wait for MCP rollout before creating CatalogSource

This PR fixes the Quay CatalogSource installation process in OpenShift CI by correcting the orchestration order of operations in the enable-quay-catalogsource Prow step.

Problem

When installing the Quay operator from a custom FBC (File-Based Catalog) image, the script updates the global pull secret and creates an ImageContentSourcePolicy (ICSP) to redirect image pulls to the Konflux registry. This triggers a MachineConfigPool (MCP) rollout where worker nodes must restart CRI-O to pick up the new registry credentials. Previously, the CatalogSource pod was created before this MCP rollout completed, causing it to fail with ImagePullBackOff due to unauthorized access when attempting to pull the FBC image with outdated node credentials.

Solution

This PR reorders the execution sequence in the custom catalog path so that wait_mcp_ready runs after the ICSP is created and before the CatalogSource is created. The new sequence is:

  1. Update global pull secret (registers Konflux registry credentials)
  2. Create ImageContentSourcePolicy (redirects image pulls)
  3. Wait for MCP rollout to complete (ensures nodes have updated credentials)
  4. Create CatalogSource (pods can now successfully pull the FBC image)
  5. Check CatalogSource status

This ensures nodes have fully applied the new pull credentials before the CatalogSource pod attempts to pull images.
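
For readers unfamiliar with the MCP wait in step 3, one way such a function could be written is sketched below. The real `wait_mcp_ready` is not shown in this PR, so everything here is an assumption, including the two-phase approach and the default timeouts. The sketch relies only on `oc wait` with `--for=condition=...`, `--all`, and `--timeout`.

```shell
set -eu

# Hypothetical implementation; the real wait_mcp_ready is not shown in the PR.
# Arguments: settle timeout and rollout timeout, as durations accepted by
# `oc wait --timeout` (for example 5m or 45m).
wait_mcp_ready() {
  settle_timeout="${1:-5m}"
  rollout_timeout="${2:-45m}"

  # Best effort: give the Machine Config Operator time to start the rollout,
  # so an "Updated" status left over from before the change is not mistaken
  # for completion. A timeout in this phase is tolerated.
  oc wait machineconfigpool --all \
    --for=condition=Updating=True --timeout="$settle_timeout" || true

  # Block until every pool reports the new configuration as applied.
  oc wait machineconfigpool --all \
    --for=condition=Updated=True --timeout="$rollout_timeout"
}
```

The first phase is a design choice rather than a requirement: without it, a check made immediately after the ICSP is created might see pools that still report `Updated=True` from the previous configuration.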

Affected CI

  • Quay periodic e2e test job configuration (periodic-ci-quay-quay-tests-master-quay-api-quay-e2e-tests-quay316-api-testing)
  • File: ci-operator/step-registry/quay-tests/enable-quay-catalogsource/quay-tests-enable-quay-catalogsource-commands.sh

The enable-quay-catalogsource step was creating the CatalogSource
before nodes had the updated pull secret. The ICSP and pull secret
changes trigger an MCP rollout, and nodes can't pull from the
Konflux registry until that rollout completes. Moving wait_mcp_ready
before create_catalog_source ensures nodes have credentials before
the catalog pod attempts to pull the FBC image.

Without this fix, the catalog pod hits ImagePullBackOff with
"unauthorized: access to the requested resource is not authorized"
and the test times out.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented May 17, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: ff8e02fb-92c7-40b6-ac38-256574f2e19d

📥 Commits

Reviewing files that changed from the base of the PR and between 16e4c03 and da6ee18.

📒 Files selected for processing (1)
  • ci-operator/step-registry/quay-tests/enable-quay-catalogsource/quay-tests-enable-quay-catalogsource-commands.sh

Walkthrough

This PR reorders function calls in the Quay catalog source setup script for unreleased FBC images. The wait_mcp_ready function now executes before create_catalog_source and check_catalog_source_status to ensure the system is ready before attempting catalog source creation.

Changes

MCP Readiness Orchestration

| Layer / File(s) | Summary |
| --- | --- |
| MCP readiness checkpoint in catalog source setup (ci-operator/step-registry/quay-tests/enable-quay-catalogsource/quay-tests-enable-quay-catalogsource-commands.sh) | Function call sequence reordered so wait_mcp_ready runs before create_catalog_source and check_catalog_source_status in the unreleased FBC image installation path. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 12
✅ Passed checks (12 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: reordering operations so wait_mcp_ready executes before create_catalog_source, which is the core fix for the ImagePullBackOff issue in the quay e2e tests. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | This PR modifies a bash CI/CD orchestration script, not a Ginkgo test file. The check for "Stable and Deterministic Test Names" applies only to Ginkgo tests and is not applicable here. |
| Test Structure And Quality | ✅ Passed | PR modifies only a bash shell script that orchestrates CI operations. No Ginkgo test code is present, making the custom check not applicable. |
| MicroShift Test Compatibility | ✅ Passed | PR modifies only a shell script for CI step orchestration, not Ginkgo e2e tests. The MicroShift Test Compatibility check is not applicable as no new e2e tests are being added. |
| Single Node OpenShift (SNO) Test Compatibility | ✅ Passed | No new Ginkgo e2e tests are added in this PR. Only a shell script is modified to reorder CI step operations. SNO compatibility check does not apply. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | Change only reorders CI step operations. No deployment manifests, operator code, or scheduling constraints are modified. Check applies only to manifest and operator code changes. |
| OTE Binary Stdout Contract | ✅ Passed | Custom check for OTE Binary Stdout Contract is not applicable. PR modifies only a Bash shell script for CI orchestration, not Go test binaries or OTE code with JSON stdout contracts. |
| IPv6 And Disconnected Network Test Compatibility | ✅ Passed | This PR modifies a bash shell script for CI orchestration (not a Ginkgo e2e test). The custom check applies only to new Ginkgo e2e tests (It(), Describe(), etc.), which are not present in this PR. |



openshift-ci Bot commented May 17, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: harishsurf
Once this PR has been reviewed and has the lgtm label, please assign syed for approval. For more information see the Code Review Process.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot requested review from neisw and sosiouxme May 17, 2026 20:57
@openshift-merge-bot

[REHEARSALNOTIFIER]
@harishsurf: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| periodic-ci-quay-quay-tests-master-quay-newui-quay-e2e-tests-quay317-ocp416-newui-p1-p2 | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-stage-quay-io-quay-e2e-tests-stage-quay-io | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-quay-upgrade-quay317-ocp421-upgrade | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-ocp-418-quay-quay-e2e-tests-quay316-ocp418-unmanaged-tls | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-quay-performance-quay-e2e-tests-quay317-performance-test | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-quay-upgrade-quay316-ocp421-upgrade | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-quay-api-quay-e2e-tests-quay-ocp419-stage-api-testing | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-ocp-418-quay-quay-e2e-tests-quay316-ocp418-aws-s3-couldfront | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-quay-newui-quay-e2e-tests-quay317-ocp416-newui-p3 | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-ocp-4.22-quay-lp-interop-quay-e2e-tests-quay316-ocp422-aws-s3 | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-quay-newui-stress-quay-e2e-tests-quay312-ocp416-newui-stress | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-quay-api-quay-e2e-tests-quay315-api-testing | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-quay-newui-quay-e2e-tests-quay316-ocp416-newui-p1-p2 | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-quay-newui-quay-e2e-tests-quay314-ocp416-newui | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-quay-upgrade-quay314-ocp421-upgrade | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-ocp-419-quay-quay-e2e-tests-ceph-ocp419-quay316 | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-ocp-417-quay-quay-e2e-tests-quay313-ocp417-unmanaged-tls | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-quay-api-quay-e2e-tests-quay318-api-testing | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-ocp-418-quay-quay-e2e-tests-quay316-ocp418-aws-sts | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-quay-api-quay-e2e-tests-quay313-api-testing | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-ocp-418-quay-quay-e2e-tests-quay315-ocp418-virtual-builder | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-ocp-421-quay-quay-e2e-tests-quay317-ocp421-virtual-builder | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-quay-newui-quay-e2e-tests-quay316-ocp416-newui-p3 | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-quay-upgrade-quay315-ocp421-upgrade | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-ocp-420-quay-quay-e2e-tests-quay316-ocp420-virtual-builder | N/A | periodic | Registry content changed |

A total of 48 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs.

Prior to this PR being merged, you will need to either run and acknowledge or opt to skip these rehearsals.

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.


openshift-ci Bot commented May 17, 2026

@harishsurf: all tests passed!
