fix(ci): k8s cluster auth (cherry-pick to release-1.8) #4441
zdrapela wants to merge 10 commits into redhat-developer:release-1.8
The psql command in the create-sonataflow-db-manual Job used `&& echo ok || echo fail` which always exits 0, masking real errors like password authentication failures. The Job was marked Complete by Kubernetes even when psql failed, causing downstream jobs-service rollout to time out. Now capture psql output and only treat "already exists" as benign; all other failures (auth errors, connection refused, etc.) exit 1 so the Job correctly reports failure and the pipeline can detect it.
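The failure mode and the fix described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the Job's actual script: the function name `create_db_or_fail` is hypothetical, and it takes the command to run as its arguments so the logic can be exercised without a database.

```bash
#!/usr/bin/env bash

# Anti-pattern: in `cmd && echo ok || echo fail`, the trailing `echo fail`
# succeeds, so the whole list exits 0 even when `cmd` failed.

# Hypothetical helper: run the given command, capture its output, treat
# "already exists" as benign, and return 1 for every other failure.
create_db_or_fail() {
  local output status=0
  output="$("$@" 2>&1)" || status=$?

  if [ "$status" -eq 0 ]; then
    echo "ok: ${output}"
    return 0
  fi

  case "$output" in
    *"already exists"*)
      # The database is already there, so the Job has nothing left to do.
      echo "benign: ${output}"
      return 0
      ;;
  esac

  # Auth errors, connection refused, and anything else must fail the Job.
  echo "error: ${output}" >&2
  return 1
}

# Possible usage (host, user, and database name are assumptions):
#   create_db_or_fail psql -h "$POSTGRES_HOST" -U "$POSTGRES_USER" \
#     -c 'CREATE DATABASE sonataflow;' || exit 1
```

Because the helper returns a non-zero status for unexpected errors, a Job whose script ends with `|| exit 1` would be reported as failed by Kubernetes instead of Complete.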
Two changes to address flaky audit-log tests:
1. Increase `--tail` from 100 to 500: the 100-line window was too small and target log lines were getting pushed out by concurrent test activity (permission evaluations, catalog reads) from other spec files running in parallel workers.
2. Add 2s delay before first log fetch: gives the backend time to flush the audit log entry to pod stdout before `oc logs` is called, eliminating the race between API response and log availability.
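The effect of the two changes can be sketched in shell. This is an illustration under assumptions, not the test suite's own code: the helper name `fetch_audit_line` and its argument order are hypothetical, and the log command is passed in as arguments so the sketch runs without a cluster.

```bash
#!/usr/bin/env bash

# Hypothetical helper: wait before the first fetch, then search only the
# last <tail_lines> lines of the log for a fixed string.
# Arguments: pattern, tail_lines, initial_delay_seconds, log command...
fetch_audit_line() {
  local pattern="$1" tail_lines="$2" initial_delay="$3"
  shift 3

  # Give the backend time to flush the audit entry to pod stdout.
  sleep "$initial_delay"

  # The remaining arguments are the log command; --tail limits the window.
  "$@" --tail="$tail_lines" | grep -F -- "$pattern"
}

# Possible usage against a cluster (pod name is an assumption):
#   fetch_audit_line "$expected_entry" 500 2 oc logs "$pod_name"
```

With a small `--tail` value, an entry that was written before a burst of unrelated log lines falls outside the window and `grep` finds nothing, which is the flakiness the larger window is meant to avoid.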
This reverts commit 74cd93a.
Cherry-pick of cb44f17 to release-1.8.
Changes
K8s cluster authentication (`fix(ci): k8s cluster auth`)
- Add `common::kubectl_login()` function to authenticate to Kubernetes clusters using service account tokens (sets up `kubectl` credentials, cluster, and context)
- Update `common::oc_login()` with `common::require_vars` check and `oc whoami` verification
- Add `common::require_vars()` utility to validate required environment variables
- Add `lib/common.sh` and `lib/log.sh` (structured logging with levels, colors, and timestamps)
- Use `common::kubectl_login` in all K8s job files (AKS, EKS, GKE; both helm and operator)
- Remove `re_create_k8s_service_account_and_get_token()` from `k8s-utils.sh` (token is now provided externally)
- Remove `aws_eks_verify_cluster()` and `aws_eks_get_cluster_info()` from `aws.sh` (replaced by `common::kubectl_login` auth check)
- Remove `is_openshift()`, `detect_ocp()`, and `detect_container_platform()` from `utils.sh`

Environment variable defaults (`chore(ci): use IS_OPENSHIFT from CI`)
- Update `env_variables.sh` to inherit `IS_OPENSHIFT`, `CONTAINER_PLATFORM`, and `CONTAINER_PLATFORM_VERSION` from the CI environment instead of initializing them as empty strings
- `IS_OPENSHIFT` defaults to `true`, `CONTAINER_PLATFORM` and `CONTAINER_PLATFORM_VERSION` default to `unknown`

Conflict resolution
- `common.sh` and `log.sh` did not exist on release-1.8: created both files from the commit's version
- GKE job files differed on release-1.8 (`IS_OPENSHIFT`, `gcloud_auth`, inline base64 encoding): kept release-1.8 structure and placed `common::kubectl_login` after the existing GKE auth setup where both `K8S_CLUSTER_URL` and `K8S_CLUSTER_TOKEN` are available
- Added `source "${DIR}/lib/common.sh"` to `utils.sh`

https://redhat.atlassian.net/browse/RHDHBUGS-2863
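For orientation, the helpers and defaults named in the Changes list might look roughly like the sketch below. It is written from the description only, not copied from `lib/common.sh` or `env_variables.sh`: the function bodies, the `ci-user`, `ci-cluster`, and `ci-context` names, and the `kubectl auth can-i` check are assumptions. Only the function names, `K8S_CLUSTER_URL`, `K8S_CLUSTER_TOKEN`, and the default values come from the PR text.

```bash
#!/usr/bin/env bash

# Inherit values from the CI environment; fall back to the stated defaults.
export IS_OPENSHIFT="${IS_OPENSHIFT:-true}"
export CONTAINER_PLATFORM="${CONTAINER_PLATFORM:-unknown}"
export CONTAINER_PLATFORM_VERSION="${CONTAINER_PLATFORM_VERSION:-unknown}"

# Sketch: fail if any named variable is unset or empty (assumed semantics).
common::require_vars() {
  local name missing=0
  for name in "$@"; do
    if [ -z "${!name:-}" ]; then
      echo "missing required variable: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Sketch: configure kubectl credentials, cluster, and context from an
# externally provided service account token, then verify access.
common::kubectl_login() {
  common::require_vars K8S_CLUSTER_URL K8S_CLUSTER_TOKEN || return 1

  kubectl config set-credentials ci-user --token="${K8S_CLUSTER_TOKEN}" &&
    kubectl config set-cluster ci-cluster --server="${K8S_CLUSTER_URL}" &&
    kubectl config set-context ci-context --cluster=ci-cluster --user=ci-user &&
    kubectl config use-context ci-context &&
    kubectl auth can-i get pods > /dev/null # assumed stand-in for the auth check
}
```

Checking the required variables before touching the kubeconfig means a missing token fails fast with a clear message instead of surfacing later as an opaque authentication error.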