@chucklever
Contributor

  • dhclient-cache
    Make instance rebooting more reliable by caching the DHCP-assigned IP address
  • fix-libaio-bullseye
    Support Debian Bullseye when checking for libaio packages
  • terraform-binary-path
    Fix the default path for the "terraform" executable
  • gen_tfvars-azure-accel-net-default
    Work around an "output yaml" quirk

chucklever and others added 4 commits December 20, 2025 09:40
Cloud VMs sometimes experience intermittent DHCP failures after reboot,
in which DHCP clients receive no response even with aggressive
retransmission. The issue has been observed primarily on Azure but
could occur on other cloud providers. It typically affects subsequent
reboots; the first boot during bringup is reliable.

Add a new Ansible role dhclient_cache that implements persistent DHCP
lease caching with automatic fallback across multiple DHCP mechanisms.
The solution detects the DHCP client in use and deploys the appropriate
hooks or scripts to save successful DHCP configuration to a persistent
cache and automatically restore it if DHCP fails after a reboot.

Supported DHCP mechanisms:
- ISC dhclient: Used by Debian, Ubuntu, older RHEL/CentOS/Fedora
  Uses dhclient enter/exit hooks in /etc/dhcp/dhclient-*-hooks.d/
- NetworkManager: Used by RHEL 8+, Fedora 30+, CentOS 8+
  Uses dispatcher scripts in /etc/NetworkManager/dispatcher.d/
- wicked: Used by SUSE Linux Enterprise Server, openSUSE
  Uses wicked extensions in /etc/wicked/extensions/
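
The mapping from DHCP mechanism to hook location listed above could be sketched roughly as follows. This is only an illustration, not the role's actual detection logic: the probe order, the use of the `nmcli`, `wicked`, and `dhclient` command names as evidence, and the function names are all assumptions.

```python
from typing import Iterable, Optional

# Hook/script locations as listed in the commit message, keyed by mechanism.
HOOK_DIRS = {
    "dhclient": "/etc/dhcp/dhclient-*-hooks.d/",
    "NetworkManager": "/etc/NetworkManager/dispatcher.d/",
    "wicked": "/etc/wicked/extensions/",
}

# Assumption: probe order and the command used as evidence for each mechanism.
PROBES = (
    ("NetworkManager", "nmcli"),
    ("wicked", "wicked"),
    ("dhclient", "dhclient"),
)


def detect_mechanism(available_commands: Iterable[str]) -> Optional[str]:
    """Return the first mechanism whose probe command is available, else None."""
    available = set(available_commands)
    for mechanism, command in PROBES:
        if command in available:
            return mechanism
    return None


def hook_dir(mechanism: str) -> str:
    """Return the directory where hooks for the given mechanism are deployed."""
    return HOOK_DIRS[mechanism]
```

Passing the available commands in as an argument keeps the sketch free of side effects; a real implementation would more likely rely on facts gathered by Ansible on the target host.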

The cache is valid for one hour and includes IP address, subnet mask,
gateway, DNS servers, and other essential network parameters. When DHCP
times out or fails, the appropriate mechanism applies the cached
configuration using native commands and updates resolv.conf with cached
DNS servers.
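
As a minimal sketch of the one-hour validity rule and the resolv.conf update described above: the record layout, field names, and helper names below are hypothetical and are not taken from the role itself.

```python
import time
from typing import Dict, List, Optional

CACHE_TTL_SECONDS = 3600  # "valid for one hour", per the description above


def make_cache_entry(ip: str, netmask: str, gateway: str,
                     dns_servers: List[str],
                     now: Optional[float] = None) -> Dict:
    """Build a cache record for a successful lease (field names are illustrative)."""
    return {
        "saved_at": time.time() if now is None else now,
        "ip_address": ip,
        "subnet_mask": netmask,
        "gateway": gateway,
        "dns_servers": list(dns_servers),
    }


def is_fresh(entry: Dict, now: Optional[float] = None) -> bool:
    """True if the entry was saved no more than one hour before `now`."""
    now = time.time() if now is None else now
    age = now - entry["saved_at"]
    return 0 <= age <= CACHE_TTL_SECONDS


def resolv_conf_lines(entry: Dict) -> List[str]:
    """Render the cached DNS servers as resolv.conf nameserver lines."""
    return [f"nameserver {server}" for server in entry["dns_servers"]]
```

A fallback path would call `is_fresh()` before applying a cached entry and skip the restore when the entry is stale.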

The role is automatically applied during terraform bringup after
wait_for_connection succeeds and activates for all terraform-based
cloud deployments. The implementation detects the distribution and DHCP
mechanism at runtime and configures the appropriate caching solution.

Generated-by: Claude AI
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

The libaio1t64 package was introduced as part of Debian's 64-bit time
transition starting in Debian 13 (Trixie). Earlier Debian releases use
the libaio1 package name instead.

The existing logic to override the package name for older releases
checked for buster and bookworm but missed bullseye (Debian 11). This
caused fstests build dependencies to fail on Debian 11 systems with
"No package matching 'libaio1t64' is available".

Add bullseye to the list of releases that use the libaio1 package name.
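
The package-name selection described above amounts to a lookup on the release codename. The sketch below is illustrative only; the actual fix lives in the Ansible role, and the function and constant names here are hypothetical.

```python
# Releases that the commit message says use the libaio1 package name.
LEGACY_LIBAIO_RELEASES = {"buster", "bullseye", "bookworm"}


def libaio_package(codename: str) -> str:
    """Pick the libaio runtime package name for a Debian release codename."""
    if codename.lower() in LEGACY_LIBAIO_RELEASES:
        return "libaio1"
    return "libaio1t64"
```

Before the fix, the equivalent of `LEGACY_LIBAIO_RELEASES` lacked `bullseye`, so Debian 11 fell through to `libaio1t64`.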

Generated-by: Claude AI
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Users need to be able to override the Infrastructure as Code tool
binary path when it's installed in non-default locations. The
hidden TERRAFORM_BINARY_PATH option made this difficult because
it wasn't visible in the configuration interface and used
hard-coded paths that don't match typical package manager
installations.

This change promotes the option to a visible menu item with
improved defaults that match common installation paths. The
Terraform default now uses /usr/bin/terraform instead of
/usr/local/bin/terraform, aligning with standard package
manager behavior. The new help text guides users on when and
how to override the path for custom installations.
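
The override behavior can be pictured as a default with an optional user-supplied value. This is a sketch under assumptions: the real option is a Kconfig string, and the function name and the treatment of blank values below are hypothetical.

```python
from typing import Optional

# New default from the commit message; the previous default was
# /usr/local/bin/terraform.
DEFAULT_TERRAFORM_PATH = "/usr/bin/terraform"


def terraform_binary(configured: Optional[str] = None) -> str:
    """Return the configured binary path, or the default if none was given."""
    if configured and configured.strip():
        return configured.strip()
    return DEFAULT_TERRAFORM_PATH
```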

Generated-by: Claude AI
Signed-off-by: Chuck Lever <cel@kernel.org>

The Ansible template terraform.tfvars.j2 fails with an undefined
variable error when generating Azure terraform configuration:

  AnsibleUndefinedVariable: 'terraform_azure_accelerated_networking_enabled' is undefined

This occurs because the Kconfig variable TERRAFORM_AZURE_ACCELERATED_NETWORKING_ENABLED
is a hidden bool (no user prompt) that uses conditional defaults based on VM size.
When a VM size like Standard_B2s is selected, the variable defaults to 'n' via
the condition "default n if TERRAFORM_AZURE_VM_SIZE_STANDARD_B2S".

Kconfig's "output yaml" mechanism only writes variables to the yaml output file
when they are explicitly set in .config or when hidden bools evaluate to 'y'.
Hidden bools that evaluate to 'n' through conditional defaults are never written
to the yaml output, leaving the Ansible variable undefined at template render time.

The kernel-builder defconfig works because it explicitly sets the variable to 'y',
which causes Kconfig to write it to the yaml. The nfsd-fstests defconfig relies
on the conditional default, which evaluates to 'n' and is never output.
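
The emission rule described above can be modeled as a small predicate. This models only the behavior as stated in this commit message, not the actual "output yaml" implementation; the function name is hypothetical.

```python
def emitted_to_yaml(value: bool, explicitly_set: bool) -> bool:
    """Model whether a hidden bool reaches the yaml output.

    Per the description above, a variable is written when it is explicitly
    set in .config or when the hidden bool evaluates to 'y'.
    """
    return explicitly_set or value


# kernel-builder defconfig: explicitly sets the variable to 'y'.
kernel_builder_emitted = emitted_to_yaml(value=True, explicitly_set=True)

# nfsd-fstests defconfig: conditional default evaluates to 'n', never set.
nfsd_fstests_emitted = emitted_to_yaml(value=False, explicitly_set=False)
```

In this model only the second case leaves the Ansible variable undefined, which matches the failure reported for the nfsd-fstests defconfig.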

Add a fallback default of 'false' in the gen_tfvars role defaults, following the
existing pattern used for AWS variables. This ensures the variable is always
defined regardless of whether Kconfig outputs it.
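
The effect of the fallback can be sketched as a merge in which role defaults are applied first and any value emitted by Kconfig overrides them. The dictionary merge below only approximates Ansible's variable precedence, and the helper name is hypothetical.

```python
from typing import Dict

VAR = "terraform_azure_accelerated_networking_enabled"

# Fallback added to the gen_tfvars role defaults, per the commit message.
ROLE_DEFAULTS: Dict[str, bool] = {VAR: False}


def effective_vars(kconfig_yaml: Dict[str, bool]) -> Dict[str, bool]:
    """Merge role defaults with Kconfig output; Kconfig values take priority."""
    merged = dict(ROLE_DEFAULTS)
    merged.update(kconfig_yaml)
    return merged
```

With the default in place, the template always sees a defined value: `False` when Kconfig omits the variable, and the Kconfig value otherwise.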

Generated-by: Claude AI
Signed-off-by: Chuck Lever <cel@kernel.org>
@chucklever chucklever merged commit 7d98cb2 into main Dec 20, 2025
20 of 22 checks passed
@chucklever chucklever deleted the cel/terraform-fixes branch December 20, 2025 15:03