
dnm: test olmv1 deployment #2160

Open
weshayutin wants to merge 1 commit into openshift:oadp-dev from weshayutin:olmv1tests

Conversation


@weshayutin weshayutin commented Apr 10, 2026

Add OLMv1 lifecycle tests for OADP operator

Summary

  • Add a new Ginkgo test suite (tests/olmv1/) that validates OADP installs, runs, and upgrades correctly via OLM v1's ClusterExtension API on OCP 4.18+ clusters
  • Add make test-olmv1 and make test-olmv1-cleanup Makefile targets with full parameterization via OLMV1_* variables
  • Handle OLMv1-specific requirements: watchNamespace for OwnNamespace install mode, catalog selector targeting, orphaned CRD cleanup, and SelfCertified upgrade constraint policy for dev builds

Background

OLM v1 replaces the classic OLM (Subscription/ClusterServiceVersion) model with a single ClusterExtension resource. OADP uses OwnNamespace install mode, which requires OLMv1-specific configuration (spec.config.inline.watchNamespace) that has no equivalent in OLMv0. These tests verify that OADP installs correctly under OLMv1 and catch regressions in bundle compatibility.
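
As a sketch, a ClusterExtension for this install mode might look like the following. This is illustrative only: the namespace, service account, channel, and version values are placeholders rather than the suite's defaults, and exact field names can vary across OCP/operator-controller versions.

```yaml
# Illustrative ClusterExtension (olm.operatorframework.io/v1) for an
# OwnNamespace-mode operator. Values below are placeholders.
apiVersion: olm.operatorframework.io/v1
kind: ClusterExtension
metadata:
  name: oadp-operator
spec:
  namespace: openshift-adp          # installation namespace
  serviceAccount:
    name: oadp-installer            # SA with permissions to install the bundle
  source:
    sourceType: Catalog
    catalog:
      packageName: oadp-operator
      version: 1.5.0
  config:
    inline:
      # Required for OwnNamespace install mode; without this the install
      # fails with InvalidConfiguration (see Design decisions below).
      watchNamespace: openshift-adp
```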

Reference: OCP 4.21 Extensions Documentation

Test cases

| # | Test | Description |
|---|------|-------------|
| 1 | Install via ClusterExtension | Creates a ClusterExtension, waits for `Installed=True`, verifies bundle name/version |
| 2 | Controller-manager pod running | Confirms the OADP controller-manager pod reaches `Running` phase |
| 3 | CRDs installed | Verifies 6 key OADP and Velero CRDs exist |
| 4 | No deprecation warnings | Checks `Deprecated`, `PackageDeprecated`, `ChannelDeprecated`, and `BundleDeprecated` are all `False` |
| 5 | Upgrade (optional) | Patches the version with `upgradeConstraintPolicy: SelfCertified` and verifies the bundle version changes; skipped when `--upgrade-version` is not set |

Design decisions

  • Dynamic client for OLMv1 CRs -- Uses k8s.io/client-go/dynamic instead of importing operator-controller Go types, avoiding dependency conflicts.
  • watchNamespace -- OADP only supports OwnNamespace install mode. OLMv1 requires spec.config.inline.watchNamespace set to the installation namespace; without it, install fails with InvalidConfiguration.
  • Catalog selector -- When a custom catalog image is provided, the ClusterExtension uses spec.source.catalog.selector.matchLabels to target only that catalog, preventing resolution ambiguity with default Red Hat catalogs.
  • Orphaned CRD cleanup -- BeforeAll deletes pre-existing *.oadp.openshift.io and *.velero.io CRDs from prior OLMv0 installs or test runs. OLMv1 cannot adopt CRDs it did not create.
  • Fail-fast on terminal errors -- The install polling loop logs all conditions on every poll and fails immediately on terminal Progressing reasons (InvalidConfiguration, Failed) instead of waiting the full 10-minute timeout.
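
The catalog-selector decision above can be sketched as two pieces: a test ClusterCatalog carrying an identifying label, and a selector in the ClusterExtension source that pins resolution to it. The catalog name, image ref, and label key below are illustrative assumptions, not the suite's actual values.

```yaml
# Illustrative only: a labeled test catalog, plus a selector that restricts
# package resolution to it so default Red Hat catalogs cannot shadow it.
apiVersion: olm.operatorframework.io/v1
kind: ClusterCatalog
metadata:
  name: oadp-test-catalog
  labels:
    olm.operatorframework.io/metadata.name: oadp-test-catalog
spec:
  source:
    type: Image
    image:
      ref: ttl.sh/oadp-operator-catalog-example:1h   # placeholder image ref
---
# In the ClusterExtension, target only that catalog:
# spec:
#   source:
#     catalog:
#       selector:
#         matchLabels:
#           olm.operatorframework.io/metadata.name: oadp-test-catalog
```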

Usage

# Build and push operator + catalog images first
make docker-build docker-push bundle bundle-build bundle-push catalog-build catalog-push

# Run OLMv1 tests against the custom catalog
make test-olmv1 OLMV1_CATALOG_IMAGE=ttl.sh/oadp-operator-catalog-<hash>:1h

# Or test against a productized catalog already on the cluster
make test-olmv1 OLMV1_PACKAGE=redhat-oadp-operator

# Manual cleanup if needed
make test-olmv1-cleanup

Prerequisites

  • OCP 4.18+ cluster with OLMv1 enabled (operator-controller and catalogd running)
  • TechPreviewNoUpgrade feature set enabled (for watchNamespace support)
  • No pre-existing OADP OLMv0 installation (run make undeploy-olm first if needed; the test cleans up orphaned CRDs but not OLMv0 Subscriptions/CSVs)

Test plan

  • make test-olmv1 passes on a clean OCP 4.22 cluster with a custom catalog image
  • make test-olmv1 passes on back-to-back runs (cleanup + re-run is idempotent)
  • Fail-fast triggers immediately on InvalidConfiguration instead of timing out
  • make test-olmv1 with OLMV1_UPGRADE_VERSION set (requires a catalog with multiple versions)
  • make test-olmv1 with OLMV1_PACKAGE=redhat-oadp-operator against a productized catalog

Signed-off-by: Wesley Hayutin <weshayutin@gmail.com>
@weshayutin weshayutin self-assigned this Apr 10, 2026

coderabbitai bot commented Apr 10, 2026

Walkthrough

This pull request introduces OLMv1 end-to-end testing infrastructure by adding Makefile targets for executing and cleaning up tests, configurable OLMv1 parameters, and comprehensive Ginkgo test suite files covering OADP operator lifecycle management. A Kustomize image override was also updated to use a temporary repository.

Changes

Cohort / File(s) Summary
Makefile Configuration
Makefile
Added 10 OLMv1-specific variables (OLMV1_PACKAGE, OLMV1_NAMESPACE, OLMV1_CHANNEL, OLMV1_VERSION, OLMV1_UPGRADE_VERSION, OLMV1_CATALOG, OLMV1_CATALOG_IMAGE, OLMV1_SERVICE_ACCOUNT, OLMV1_FAIL_FAST, OLMV1_GINKGO_FLAGS) and two phony targets: test-olmv1 (runs Ginkgo tests with OLMv1 configuration) and test-olmv1-cleanup (deletes OLM-related cluster resources).
Kustomize Deployment Config
config/manager/kustomization.yaml
Updated controller image override from quay.io/konveyor/oadp-operator:latest to ttl.sh/oadp-operator-7e53a850:1h (temporary image repository).
OLMv1 Test Infrastructure
tests/olmv1/.gitignore, tests/olmv1/olmv1_suite_test.go, tests/olmv1/olmv1_install_test.go
Added new OLMv1 E2E test suite with Ginkgo test setup (olmv1_suite_test.go) including CLI flag parsing, Kubernetes client initialization, and helper functions for namespace/RBAC/ClusterCatalog management. Implemented lifecycle test suite (olmv1_install_test.go) validating OADP operator installation, runtime correctness, deprecation conditions, and upgrade scenarios. Added .gitignore for tmp/ directory.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes


@weshayutin weshayutin added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 10, 2026
@openshift-ci openshift-ci bot requested review from mpryc and sseago April 10, 2026 23:51

openshift-ci bot commented Apr 10, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: weshayutin
Once this PR has been reviewed and has the lgtm label, please assign shubham-pampattiwar for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@config/manager/kustomization.yaml`:
- Around line 7-8: The checked-in kustomization currently pins newName/newTag to
the ephemeral ttl.sh image (ttl.sh/oadp-operator-7e53a850:1h); replace this with
a stable, non-expiring default (e.g., the official oadp-operator image and a
permanent tag or digest) or remove the ttl.sh newName/newTag entries so the
repository manifest does not reference an expiring image; ensure any temporary
ttl.sh usage is moved into CI/deploy workflows that inject the test image
dynamically rather than committing it to kustomization.yaml.

In `@Makefile`:
- Around line 1052-1057: The cleanup target test-olmv1-cleanup unconditionally
deletes ClusterCatalog $(OLMV1_CATALOG); guard that deletion so we only remove a
catalog the tests created by checking the same creation condition or a creation
marker. Modify test-olmv1-cleanup to only run the $(OC_CLI) delete
clustercatalog $(OLMV1_CATALOG) line when OLMV1_CATALOG_IMAGE is set (or when a
persisted marker like OLMV1_CATALOG_CREATED file/env var exists), and update the
Catalog creation step (the rule that creates the catalog) to set that marker
(e.g., touch a file or export a flag) so cleanup can safely detect it before
deleting.

In `@tests/olmv1/olmv1_install_test.go`:
- Around line 107-122: The current gomega.Eventually loop only checks
pod.Status.Phase == corev1.PodRunning which can false-pass; update the check in
the Eventually closure (the block using kubeClient.CoreV1().Pods(...).List and
iterating pods.Items) to verify readiness instead: for each pod, inspect
pod.Status.Conditions for condition.Type == corev1.PodReady with Status ==
corev1.ConditionTrue (or alternatively fetch the Deployment via
kubeClient.AppsV1().Deployments(namespace).Get and assert the Deployment status
has an Available condition == True / status.AvailableReplicas > 0); apply the
same change to the other occurrence mentioned (lines 213-226) so tests assert
PodReady or Deployment Available rather than just PodRunning.
- Around line 167-181: The current code reads the ClusterExtension via
getClusterExtension and then calls
dynamicClient.Resource(clusterExtensionGVR).Update after changing catalogSpec
(using unstructuredNestedMap/unstructuredSetNestedMap), which can race
controller status updates and yield 409s; instead patch only the
spec.source.catalog fields (or wrap the update in retry.RetryOnConflict) rather
than updating the whole object: construct a minimal merge patch containing
spec.source.catalog.version and spec.source.catalog.upgradeConstraintPolicy and
call dynamicClient.Resource(clusterExtensionGVR).Patch with types.MergePatchType
(add import "k8s.io/apimachinery/pkg/types"), or if you prefer keep Update, wrap
the read/modify/write in retry.RetryOnConflict to retry on conflicts. Ensure
references to getClusterExtension, unstructuredNestedMap,
unstructuredSetNestedMap, and dynamicClient.Resource(clusterExtensionGVR) are
updated accordingly.

In `@tests/olmv1/olmv1_suite_test.go`:
- Around line 110-126: The ClusterRoleBinding name in ensureClusterAdminBinding
only uses saName so it can collide across namespaces; change the naming or
reconcile existing bindings: either make bindingName include the namespace
(e.g., bindingName := saName + "-" + ns + "-cluster-admin") so it's unique per
namespace, or when Create returns AlreadyExists call
kubeClient.RbacV1().ClusterRoleBindings().Get to load the existing
ClusterRoleBinding and update its Subjects (add or replace the ServiceAccount
subject for {Name: saName, Namespace: ns}) and then call Update to persist the
corrected subjects; implement one of these approaches inside
ensureClusterAdminBinding.
- Around line 270-295: The cleanupOrphanedCRDs function is destructive on shared
clusters; change it to be gated behind an explicit opt-in (e.g., a test flag or
env var like TEST_DELETE_ORPHAN_CRDS) or a deterministic dedicated-cluster check
before calling dynamicClient.Resource(crdGVR).Delete, and after issuing Delete
for each CRD found by cleanupOrphanedCRDs poll/wait (using
dynamicClient.Resource(crdGVR).Get in a loop with backoff and timeout) until
apierrors.IsNotFound confirms the resource is fully removed before counting it
as deleted; ensure you still handle non-NotFound errors via logging and skip
deletion when the opt-in flag is not set.
- Around line 315-323: When Create on dynamicClient.Resource(clusterCatalogGVR)
returns apierrors.IsAlreadyExists(err), fetch the existing ClusterCatalog (using
dynamicClient.Resource(clusterCatalogGVR).Get with the same name and ctx) and
validate its image field against the requested image variable; if they differ,
fail the test (gomega.Expect/return error) or update/replace the catalog to
match the requested image instead of silently reusing it. Ensure the check
references the existing object's image path (e.g., status/spec field used for
image in ClusterCatalog) and only set createdCatalog = true and log "Created
ClusterCatalog" when you actually created or successfully reconciled the
resource to the desired image.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 77bf8ce7-890a-4973-98c9-d65314a9d12c

📥 Commits

Reviewing files that changed from the base of the PR and between 7e53a85 and 2b88073.

📒 Files selected for processing (5)
  • Makefile
  • config/manager/kustomization.yaml
  • tests/olmv1/.gitignore
  • tests/olmv1/olmv1_install_test.go
  • tests/olmv1/olmv1_suite_test.go

Comment on lines +7 to +8
newName: ttl.sh/oadp-operator-7e53a850
newTag: 1h

⚠️ Potential issue | 🟠 Major

Don't commit an expiring ttl.sh image as the default manager image.

ttl.sh/…:1h stops existing after an hour, so any flow that consumes this checked-in kustomization becomes non-reproducible and will break shortly after merge. Keep temporary test images in the test/deploy workflow, not in the repository’s default manifest.


Comment on lines +1052 to +1057
.PHONY: test-olmv1-cleanup
test-olmv1-cleanup: login-required ## Cleanup resources created by OLMv1 tests.
$(OC_CLI) delete clusterextension oadp-operator --ignore-not-found=true
$(OC_CLI) delete clustercatalog $(OLMV1_CATALOG) --ignore-not-found=true
$(OC_CLI) delete clusterrolebinding $(OLMV1_SERVICE_ACCOUNT)-cluster-admin --ignore-not-found=true
$(OC_CLI) delete sa $(OLMV1_SERVICE_ACCOUNT) -n $(OLMV1_NAMESPACE) --ignore-not-found=true

⚠️ Potential issue | 🟠 Major

Guard test-olmv1-cleanup from deleting non-test ClusterCatalogs.

This target always deletes $(OLMV1_CATALOG), but the suite only creates a ClusterCatalog when OLMV1_CATALOG_IMAGE is set. If someone points OLMV1_CATALOG at an existing/shared catalog, cleanup removes a resource the test did not create. Gate Line 1055 on the same creation condition, or persist a “created catalog” marker and only delete when that marker is present.


Comment on lines +107 to +122
gomega.Eventually(func() (bool, error) {
pods, err := kubeClient.CoreV1().Pods(namespace).List(ctx, metav1.ListOptions{
LabelSelector: managerLabelSelector,
})
if err != nil {
return false, err
}
for _, pod := range pods.Items {
if pod.Status.Phase == corev1.PodRunning {
log.Printf("Controller-manager pod %s is Running", pod.Name)
return true, nil
}
log.Printf("Controller-manager pod %s phase: %s", pod.Name, pod.Status.Phase)
}
return false, nil
}, 5*time.Minute, 10*time.Second).Should(gomega.BeTrue(), "controller-manager pod should be Running")

⚠️ Potential issue | 🟠 Major

Check readiness, not only PodRunning, for controller-manager validation.

A pod can stay in Running while its containers are unready or crash-looping, so both checks can false-pass even though the operator is not usable. Assert PodReady or the Deployment’s Available condition instead.

Also applies to: 213-226


Comment on lines +167 to +181
obj, err := getClusterExtension(ctx, clusterExtensionName)
gomega.Expect(err).NotTo(gomega.HaveOccurred())

previousBundleName, previousVersion, _ := getInstalledBundle(obj)
log.Printf("Current installed bundle: name=%s version=%s", previousBundleName, previousVersion)

catalogSpec, _, _ := unstructuredNestedMap(obj.Object, "spec", "source", "catalog")
gomega.Expect(catalogSpec).NotTo(gomega.BeNil())
catalogSpec["version"] = upgradeVersion
catalogSpec["upgradeConstraintPolicy"] = "SelfCertified"
err = unstructuredSetNestedMap(obj.Object, catalogSpec, "spec", "source", "catalog")
gomega.Expect(err).NotTo(gomega.HaveOccurred())

_, err = dynamicClient.Resource(clusterExtensionGVR).Update(ctx, obj, metav1.UpdateOptions{})
gomega.Expect(err).NotTo(gomega.HaveOccurred())

⚠️ Potential issue | 🟠 Major

Patch ClusterExtension.spec instead of updating the whole object.

ClusterExtension is controller-managed, so this read/modify/write Update can race status changes and intermittently fail with a 409 conflict. Patch just the version/policy fields, or wrap the update in retry.RetryOnConflict.

Example using a merge patch
-           catalogSpec, _, _ := unstructuredNestedMap(obj.Object, "spec", "source", "catalog")
-           gomega.Expect(catalogSpec).NotTo(gomega.BeNil())
-           catalogSpec["version"] = upgradeVersion
-           catalogSpec["upgradeConstraintPolicy"] = "SelfCertified"
-           err = unstructuredSetNestedMap(obj.Object, catalogSpec, "spec", "source", "catalog")
-           gomega.Expect(err).NotTo(gomega.HaveOccurred())
-
-           _, err = dynamicClient.Resource(clusterExtensionGVR).Update(ctx, obj, metav1.UpdateOptions{})
+           patch := []byte(fmt.Sprintf(
+               `{"spec":{"source":{"catalog":{"version":%q,"upgradeConstraintPolicy":"SelfCertified"}}}}`,
+               upgradeVersion,
+           ))
+           _, err = dynamicClient.Resource(clusterExtensionGVR).Patch(
+               ctx,
+               clusterExtensionName,
+               types.MergePatchType,
+               patch,
+               metav1.PatchOptions{},
+           )
            gomega.Expect(err).NotTo(gomega.HaveOccurred())

Add:

import "k8s.io/apimachinery/pkg/types"

Comment on lines +110 to +126
func ensureClusterAdminBinding(ctx context.Context, saName, ns string) {
bindingName := saName + "-cluster-admin"
crb := &rbacv1.ClusterRoleBinding{
ObjectMeta: metav1.ObjectMeta{Name: bindingName},
RoleRef: rbacv1.RoleRef{
APIGroup: "rbac.authorization.k8s.io",
Kind: "ClusterRole",
Name: "cluster-admin",
},
Subjects: []rbacv1.Subject{
{Kind: "ServiceAccount", Name: saName, Namespace: ns},
},
}
_, err := kubeClient.RbacV1().ClusterRoleBindings().Create(ctx, crb, metav1.CreateOptions{})
if apierrors.IsAlreadyExists(err) {
return
}

⚠️ Potential issue | 🟠 Major

Make the ClusterRoleBinding name namespace-specific or reconcile existing subjects.

The binding name only uses saName. If the same service account name is reused in another namespace, Line 124 treats the old binding as reusable even though its subject still points at the previous namespace, so the new run won't grant the intended SA any permissions. Include ns in the binding name or update the existing binding on AlreadyExists.


Comment on lines +270 to +295
func cleanupOrphanedCRDs(ctx context.Context) {
crdGVR := schema.GroupVersionResource{
Group: apiextensionsv1.SchemeGroupVersion.Group,
Version: apiextensionsv1.SchemeGroupVersion.Version,
Resource: "customresourcedefinitions",
}
crdList, err := dynamicClient.Resource(crdGVR).List(ctx, metav1.ListOptions{})
if err != nil {
log.Printf("Warning: failed to list CRDs: %v", err)
return
}
var deleted int
for _, crd := range crdList.Items {
name := crd.GetName()
if strings.HasSuffix(name, ".oadp.openshift.io") || strings.HasSuffix(name, ".velero.io") {
if err := dynamicClient.Resource(crdGVR).Delete(ctx, name, metav1.DeleteOptions{}); err != nil && !apierrors.IsNotFound(err) {
log.Printf("Warning: failed to delete CRD %s: %v", name, err)
} else {
deleted++
}
}
}
if deleted > 0 {
log.Printf("Deleted %d orphaned OADP/Velero CRDs", deleted)
}
}

⚠️ Potential issue | 🟠 Major

This CRD cleanup is destructive on shared clusters.

It deletes every *.oadp.openshift.io and *.velero.io CRD cluster-wide, regardless of ownership, then continues without waiting for final deletion. On a shared test cluster that can break an existing OADP/Velero install and make this suite race terminating CRDs. Please gate this behind an explicit opt-in or dedicated-cluster check, and wait for the CRDs to be fully removed before proceeding.


Comment on lines +315 to +323
_, err := dynamicClient.Resource(clusterCatalogGVR).Create(ctx, cc, metav1.CreateOptions{})
if apierrors.IsAlreadyExists(err) {
log.Printf("ClusterCatalog %s already exists", name)
return
}
gomega.Expect(err).NotTo(gomega.HaveOccurred())
createdCatalog = true
log.Printf("Created ClusterCatalog %s with image %s", name, image)
}

⚠️ Potential issue | 🟠 Major

Validate an existing ClusterCatalog matches the requested image.

On AlreadyExists, this helper just logs and reuses the catalog. A previous run with the same catalogName but a different image will silently point the suite at the wrong catalog contents.

Possible fix
  _, err := dynamicClient.Resource(clusterCatalogGVR).Create(ctx, cc, metav1.CreateOptions{})
  if apierrors.IsAlreadyExists(err) {
-     log.Printf("ClusterCatalog %s already exists", name)
-     return
+     existing, getErr := dynamicClient.Resource(clusterCatalogGVR).Get(ctx, name, metav1.GetOptions{})
+     gomega.Expect(getErr).NotTo(gomega.HaveOccurred())
+     existingRef, _, _ := unstructured.NestedString(existing.Object, "spec", "source", "image", "ref")
+     gomega.Expect(existingRef).To(gomega.Equal(image),
+         "ClusterCatalog %s already exists with image %s, expected %s", name, existingRef, image)
+     return
  }


openshift-ci bot commented Apr 11, 2026

@weshayutin: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|-----------|--------|---------|----------|---------------|
| ci/prow/unit-test | 2b88073 | link | true | `/test unit-test` |
| ci/prow/4.23-e2e-test-aws | 2b88073 | link | false | `/test 4.23-e2e-test-aws` |


Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
