network: ConfigDrive must check Dhcp/Dns on the network, not on itself #132
Open
msinhore wants to merge 1 commit into shapeblue:main
ConfigDriveNetworkElement.getSupportedServicesByElementForNetwork
gates the population of network_data.json on the ConfigDrive
provider declaring Dhcp and Dns services for the NIC's network. On
any modern offering where ConfigDrive is bound only to UserData and
Dhcp / Dns are delivered by a different provider — Ovn, Netris, Nsx,
even VirtualRouter when the operator has split UserData out — this
predicate returns false. ConfigDriveBuilder.writeNetworkData then
honours the empty service list and writes
`network_data.json = "{}"`.
The ISO ends up with hostname and SSH keys (read from
meta_data.json) but no `links` / `networks` / `services` for
cloud-init. Older cloud-init releases used to fall back to a
DHCP-everything heuristic; cloud-init >= 23 (shipped with Ubuntu
24.04 cloud images and current Debian 12 backports) does not, so
the VM boots with whatever pre-baked netplan is in the image. For
Canonical cloud images that means no interface comes up at all,
because their netplan is rendered from the datasource at first boot,
which is exactly the path we just defeated.
The Dhcp / Dns checks should not be scoped to the ConfigDrive
element; they should be scoped to the network, because the question
is "does cloud-init need network configuration in the ISO?" not
"does ConfigDrive itself implement Dhcp on this network?". Switch
those two probes to areServicesSupportedInNetwork. UserData stays
scoped to the ConfigDrive provider — that one IS implemented here.
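The difference between the two scopes can be sketched in isolation. This is a simplified model, not the CloudStack code: the `Service` and `Provider` enums, the offering-as-map representation, and the helper names `providedBy`, `supportedInNetwork`, `servicesBefore` and `servicesAfter` are all hypothetical stand-ins for the element-scoped and network-scoped lookups described above.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Simplified model of a network offering: each service maps to one provider.
// All names here are hypothetical; they only mirror the idea in the description.
final class ServiceScopeSketch {
    enum Service { DHCP, DNS, USER_DATA }
    enum Provider { CONFIG_DRIVE, OVN }

    // Element-scoped probe: is this service delivered by this specific provider?
    static boolean providedBy(Map<Service, Provider> offering, Service s, Provider p) {
        return offering.get(s) == p;
    }

    // Network-scoped probe: is this service delivered by any provider at all?
    static boolean supportedInNetwork(Map<Service, Provider> offering, Service s) {
        return offering.containsKey(s);
    }

    // Old behaviour: every service is checked against the ConfigDrive element.
    static Set<Service> servicesBefore(Map<Service, Provider> offering) {
        Set<Service> out = EnumSet.noneOf(Service.class);
        for (Service s : Service.values()) {
            if (providedBy(offering, s, Provider.CONFIG_DRIVE)) {
                out.add(s);
            }
        }
        return out;
    }

    // New behaviour: Dhcp and Dns are checked on the network,
    // UserData stays scoped to the ConfigDrive provider.
    static Set<Service> servicesAfter(Map<Service, Provider> offering) {
        Set<Service> out = EnumSet.noneOf(Service.class);
        for (Service s : EnumSet.of(Service.DHCP, Service.DNS)) {
            if (supportedInNetwork(offering, s)) {
                out.add(s);
            }
        }
        if (providedBy(offering, Service.USER_DATA, Provider.CONFIG_DRIVE)) {
            out.add(Service.USER_DATA);
        }
        return out;
    }

    public static void main(String[] args) {
        // Offering shaped like Dhcp=Ovn / Dns=Ovn / UserData=ConfigDrive.
        Map<Service, Provider> offering = new EnumMap<>(Service.class);
        offering.put(Service.DHCP, Provider.OVN);
        offering.put(Service.DNS, Provider.OVN);
        offering.put(Service.USER_DATA, Provider.CONFIG_DRIVE);

        System.out.println(servicesBefore(offering)); // [USER_DATA]
        System.out.println(servicesAfter(offering));  // [DHCP, DNS, USER_DATA]
    }
}
```

With the element-scoped probe, Dhcp and Dns never reach the service list on a split offering, which is the condition under which the builder writes an empty network_data.json.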
Lab-verified before / after on an OVN-backed isolated network whose
offering is `Dhcp=Ovn / Dns=Ovn / UserData=ConfigDrive`:
Before:
  /openstack/latest/network_data.json = "{}"
  Ubuntu 24.04 cloud image: hostname/SSH keys applied via
  cloud-init, no interface up, VM unreachable.
After:
  /openstack/latest/network_data.json includes links[],
  networks[], services[] for the NIC.
  Ubuntu 24.04 cloud image: cloud-init renders netplan,
  DHCPv4 lease acquired on first boot, VM reachable on its
  DHCP-assigned IP.
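For anyone repeating the before / after check, a rough way to tell the two cases apart is sketched below. It assumes the general OpenStack network_data.json shape (top-level `links`, `networks`, `services` arrays); the sample payload is illustrative with made-up values, not output captured from the lab, and `hasNetworkConfig` is a hypothetical helper that does a plain string check rather than real JSON parsing.

```java
// Rough before/after classifier for a network_data.json payload.
// Hypothetical helper: a string check only, not a JSON parser.
final class NetworkDataCheck {
    // "Before" case as described above.
    static final String EMPTY = "{}";

    // Illustrative "after" payload in the OpenStack network_data.json shape.
    // MAC address and DNS address are made-up values.
    static final String POPULATED =
        "{\"links\":[{\"id\":\"eth0\",\"type\":\"phy\","
      + "\"ethernet_mac_address\":\"02:00:00:00:00:01\"}],"
      + "\"networks\":[{\"id\":\"eth0\",\"link\":\"eth0\",\"type\":\"ipv4_dhcp\"}],"
      + "\"services\":[{\"type\":\"dns\",\"address\":\"10.1.1.1\"}]}";

    // True when the payload carries both a links array and a networks array.
    static boolean hasNetworkConfig(String json) {
        String compact = json.replaceAll("\\s+", "");
        if (compact.equals("{}")) {
            return false;
        }
        return compact.contains("\"links\":[") && compact.contains("\"networks\":[");
    }

    public static void main(String[] args) {
        System.out.println(hasNetworkConfig(EMPTY));     // false
        System.out.println(hasNetworkConfig(POPULATED)); // true
    }
}
```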
Affects every CloudStack deployment whose offering routes UserData
to ConfigDrive while routing Dhcp/Dns to a separate provider —
including all OVN-backed offerings, the Netris and NSX route-mode
offerings, and any operator-built offering that decouples the
two. The fix is a one-line predicate change inside core, nothing
provider-specific.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>