-
Notifications
You must be signed in to change notification settings - Fork 141
Fdp update edpm #3467
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fdp update edpm #3467
Conversation
8d17965 to
24774d6
Compare
24774d6 to
728f0fb
Compare
b645ded to
02be96f
Compare
michburk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this pr isn't split properly and also contains the changes from #3466
|
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/381a8462756e461fa40fdad368f58ea3 ✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 44m 12s |
02be96f to
c06f97a
Compare
|
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/8dae75330b79473b922c4f394be22509 ✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 38m 24s |
d4a6725 to
6410812
Compare
This pr depends on the other one. When the other one is merged, i will rebase this one and I will only have one commit here |
f09d329 to
2f1a170
Compare
|
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/a31750715d0943f7889539e20267c60d ✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 42m 49s |
2f1a170 to
8e50296
Compare
| # ============================================================================ | ||
|
|
||
| # Base directory for artifacts and temporary files | ||
| cifmw_fdp_update_container_images_basedir: "{{ cifmw_basedir | default(ansible_user_dir ~ '/ci-framework-data') }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove default. It should start using group_vars/all.yml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fdp_update_container_images not in this PR, this was managed in a different PR
|
|
||
| # Repository configuration | ||
| cifmw_fdp_update_container_images_repo_name: "custom-repo" | ||
| cifmw_fdp_update_container_images_repo_baseurl: "" # REQUIRED - must be set by user |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a nit: do we really want to have so much parameters available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fdp_update_container_images not in this PR, this was managed in a different PR
| @@ -0,0 +1,65 @@ | |||
| --- | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you sure we need that task?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, we want to download images from openshift registry. That registry was not used before by the compute node. We need to either configure it as a insecure registry or configure the certificate in the compute node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure? Current CI job workflow seems not "catching" into your exception. Don't think it is needed, but if yes, would be surprise and probably you will not use ci-framework tool for preparing CI jobs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edpm nodes are downloading modified images from openshift registry. It is needed to configure certificate or configure the registry as insecure.
This is needed because images modified are stored in openshift registry. For regular deployment, images are downloaded from external registries and it is configured the certificates needed to access those registries or it is configured as insecure.
| _cifmw_fdp_update_edpm_deployment_result.resources | length > 0 and | ||
| _cifmw_fdp_update_edpm_deployment_result.resources[0].status.conditions | | ||
| selectattr('type', 'equalto', 'Ready') | | ||
| map(attribute='status') | first | default('False') == 'True' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default is needed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is there in case the filter first return empty, so It set it to False
| @@ -0,0 +1,59 @@ | |||
| --- | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure we need that task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it is needed because our compute nodes does not have connectivity to the openshift registry in which we store the updated images. I only add a route so that compute nodes can access registry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, i commented before the reason
0c3f13a#r2592543239
8e50296 to
8f40669
Compare
| register: _cifmw_fdp_update_edpm_current_logins_result | ||
| failed_when: false | ||
|
|
||
| - name: Parse existing registry logins |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this fact and later using this fact is just: xDˣᴰ
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in a way it is easier to understand
danpawlik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please take a look what AI generates you back.
Also some part of the code was already done in fdp_update_container_images.
| state: build | ||
|
|
||
| - name: Build custom container images dict | ||
| ansible.builtin.set_fact: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I'm unsure why this change is required.
roles/fdp_update_edpm/tasks/main.yml
Outdated
| when: | ||
| - cifmw_fdp_update_edpm_auto_deploy | bool | ||
| - not cifmw_fdp_update_edpm_dry_run | bool | ||
| - _cifmw_fdp_update_edpm_updated_nodesets | default([]) | length > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed not required. We've already added the default to the default role. And it'd not be te way to import this task instead importing this role.
| - name: Patch container images in nodeset | ||
| when: | ||
| - cifmw_fdp_update_edpm_containers_enabled | bool | ||
| - _cifmw_fdp_update_edpm_updated_images is defined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed is not necessary, as this is defined in the defualts of the role, and it doesn't seems that this task is included outside the natural include_role process. So then this var is always defined.
|
|
||
| - name: Add new repo to repos list | ||
| ansible.builtin.set_fact: | ||
| _cifmw_fdp_update_edpm_merged_repos: "{{ _cifmw_fdp_update_edpm_bootstrap_repos_list + [_cifmw_fdp_update_edpm_new_repo] }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(non-blocking) question: is required this to be a fact? I don't see it used anywhere outside this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is used several times in the file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(non-blocking) suggestion: let's move into a variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it is a non-blocking issue, i can check these kind of improvementes later on in a different PR
mnietoji
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked most of the comments, i will continue tomorrow
| state: build | ||
|
|
||
| - name: Build custom container images dict | ||
| ansible.builtin.set_fact: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what is happening with current code without set_fact. I added a debug task before to show the content of the variable.
TASK [fdp_update_container_images : Debug - Show new image path internal msg=New image path (internal): {{ _cifmw_fdp_update_container_images_new_image_path_internal }}] ***
Wednesday 10 December 2025 14:38:01 +0000 (0:00:01.774) 0:00:26.053 ****
Wednesday 10 December 2025 14:38:01 +0000 (0:00:01.774) 0:00:26.052 ****
ok: [localhost] =>
msg: 'New image path (internal): image-registry.openshift-image-registry.svc:5000/openstack/fdp-update-ceilometersgcoreimage-1765377456'
TASK [fdp_update_container_images : Patch OpenStackVersion CR state=patched, api_version=core.openstack.org/v1beta1, kind=OpenStackVersion, name={{ cifmw_fdp_update_container_images_openstack_cr_name }}, namespace={{ cifmw_fdp_update_container_images_namespace }}, definition={'spec': {'customContainerImages': {'{{ image_entry.key }}': '{{ _cifmw_fdp_update_container_images_new_image_path_internal }}'}}}] ***
Wednesday 10 December 2025 14:38:01 +0000 (0:00:00.037) 0:00:26.091 ****
Wednesday 10 December 2025 14:38:01 +0000 (0:00:00.037) 0:00:26.089 ****
[WARNING]: unknown field "spec.customContainerImages.{{ image_entry.key }}"
ok: [localhost]
The problem is that Ansible doesn't evaluate {{ image_entry.key }} when used directly as a YAML dictionary key. It sends the literal string "{{ image_entry.key }}" to Kubernetes instead of the actual value like ceilometerSgcoreImage. The solution is to build the dictionary as a string template first, then convert it with from_yaml filter - this forces proper variable evaluation before creating the dictionary.
Instead of using the set_fact, i have changed to this other sintax:
customContainerImages: "{{ { image_entry.key: _cifmw_fdp_update_container_images_new_image_path_internal } }}"
| # Dry run - show changes without applying | ||
| cifmw_fdp_update_edpm_dry_run: false | ||
|
|
||
| # ============================================ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fdp_update_container_images patches images with ovn in FDP repo and uploads them to the openshift registry. It also updates openstackversion to force pods using those images to be updated.
fdp_update_edpm uses images already created by fdp_update_container_images. The goal of this role is to update images in compute nodes (they are not pods, just containers). It also update ovs packages in compute host
| @@ -0,0 +1,62 @@ | |||
| --- | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This role is updating containers running in compute nodes. Images are stored in openshift registry. I could have chosen to use any external registry to store that images, but I liked the idea of not depending on any other external resources and I do not need to keep those images, so openshift registry is nice.
But compute nodes are not connected to openshift baremetal network. One solution may have been to add a new vlan in compute nodes, configure the switch and allow compute nodes to access baremetal network, but i didnt want to have a modification on the templates only because i need it for testing.
What I am doing is to add some firewall rules in hypervisor to access openshift baremetal network from compute nodes using controlplane network
|
|
||
| - name: Fail if no NodeSets found | ||
| ansible.builtin.fail: | ||
| msg: "No OpenStackDataPlaneNodeSets found in namespace {{ cifmw_fdp_update_edpm_namespace }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it make sense:
if cifmw_fdp_update_edpm_nodeset_name == 'all' executes: oc get osdpns
if cifmw_fdp_update_edpm_nodeset_name != 'all' executes: oc get osdpns cifmw_fdp_update_edpm_nodeset_name
In both cases, if there is no nodeset created, _cifmw_fdp_update_edpm_nodesets will be empty
| # ============================================ | ||
|
|
||
| - name: Setup hypervisor firewall for registry access | ||
| ansible.builtin.include_tasks: setup_hypervisor_firewall.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, i commented before the reason
0c3f13a#r2592543239
|
|
||
| - name: Add new repo to repos list | ||
| ansible.builtin.set_fact: | ||
| _cifmw_fdp_update_edpm_merged_repos: "{{ _cifmw_fdp_update_edpm_bootstrap_repos_list + [_cifmw_fdp_update_edpm_new_repo] }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is used several times in the file
| that: | ||
| - cifmw_fdp_update_edpm_repo_baseurl is defined | ||
| - cifmw_fdp_update_edpm_repo_baseurl | length > 0 | ||
| - cifmw_fdp_update_edpm_packages is defined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
|
|
||
| - name: Initialize updated nodesets tracking list | ||
| ansible.builtin.set_fact: | ||
| _cifmw_fdp_update_edpm_updated_nodesets: [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
it is created here empty and later on it is updated for each nodeset and and is used to decide if a deployment is created or not and also for the debug message at the end
0c3f13a to
422e6e7
Compare
|
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/32e75138ecb8467586f36758556b4c4f ✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 44m 55s |
422e6e7 to
a3cf0b2
Compare
|
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/7bdf595fca3c4cf6adb0786fddcda0f5 ❌ openstack-k8s-operators-content-provider FAILURE in 7m 11s |
a3cf0b2 to
276c995
Compare
|
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/d03970aa66cb4aa2b09e0d3a4c74d2e4 ❌ openstack-k8s-operators-content-provider FAILURE in 5m 23s |
b0fbe03 to
e75e75f
Compare
Not same code, maybe name of the file is the same but it is doing it a different thing |
e75e75f to
39a340c
Compare
[fdp_update_edpm] Implement EDPM node update automation for FDP updates: - Role fdp_update_edpm: Updates EDPM nodes declaratively via Kubernetes CRs * Patches OpenStackDataPlaneNodeSet CRs with updated container images * Configures package updates via edpm_bootstrap_packages * Sets up registry authentication and CA certificates * Creates OpenStackDataPlaneDeployment to apply changes * Includes hypervisor firewall configuration for registry access - Fix hypervisor firewall configuration * Add delegate_to to execute iptables on correct hypervisor host * Previously executed on localhost instead of hypervisor - Integration in post-deployment.yml after control plane updates - Zuul CI configuration for automated testing [fdp_update_container_images] Fix to properly update OpenStackVersion CR * Add set_fact task to build customContainerImages dict correctly Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Miguel Angel Nieto Jimenez <mnietoji@redhat.com>
39a340c to
93e41a8
Compare
|
Hi, have rebased the patch and gone through alll pending comments. Testing is green again |
[multiple] Add fdp_update_edpm role for EDPM node updates
[fdp_update_edpm] Implement EDPM node update automation for FDP updates:
[fdp_update_container_images] Fix to properly update OpenStackVersion CR
Assisted-By: Claude noreply@anthropic.com
Signed-off-by: Miguel Angel Nieto Jimenez mnietoji@redhat.com