Tmp/aarch64 guest os boot combined#6873
Conversation
Libvirt rejects domain XML that combines <os>/boot (boots) with per-device <disk>/boot. When prepare_os_xml() sets os_dict with boots=, remove all existing //boot nodes first via remove_all_boots(). Committer: Bolatbek Issakh <bissakh@redhat.com>
When os_attrs_boots is set, remove all //boot nodes then apply os boots. Strip boot/loadparm from cloned disk attrs so inherited per-disk boot does not remain alongside <os><boot> (libvirt unsupported configuration). Committer: Bolatbek Issakh <bissakh@redhat.com>
When os_attrs_boots is set, call remove_all_boots() before setup_attrs, and drop boot/loadparm from cdrom attr dicts so libvirt does not see per-device boot together with <os>/boot. Committer: Bolatbek Issakh <bissakh@redhat.com>
On UEFI virt guests, boot order appears on the disk device (<boot order='1'/>) instead of <os><boot dev='hd'/>. Map the legacy xpath when the host or params indicate AArch64/arm64, including arm64-mmio machine types and nested os_xpath structures. Use a regex for xpath matching, walk dict values, and log the adapted os_xpath for debugging. Committer: Bolatbek Issakh <bissakh@redhat.com>
Params often point custom_loader_path under /usr/share/edk2/ovmf/, which does not exist on AArch64 hosts. Resolve to a writable path under the Avocado data directory or /tmp, rewrite loader_dict paths to match, makedirs before copy2, and surface OSError text on failure. Fixes positive_test.customize_loader on arm64-mmio. Committer: Bolatbek Issakh <bissakh@redhat.com>
Try multiple qcow2 URLs (arch-specific first on aarch64), optional disk1_img_url_extra and longer download timeout. When os_attrs_boots is set, clear conflicting per-disk boot/loadparm after os boots. Extend serial login wait on aarch64 (default 480s, overridable via serial_login_timeout). Committer: Bolatbek Issakh <bissakh@redhat.com>
WalkthroughThis PR consolidates and enhances boot configuration handling across guest OS booting test modules. It introduces conditional clearing of existing Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
libvirt/tests/src/guest_os_booting/ovmf_firmware/ovmf_loader.py (1)
37-70: ⚡ Quick winConsider logging when all fallback alternatives fail.
When both alternative paths are unusable (line 70), the function silently returns the original path, which will fail later in
create_custom_loader. While the error will eventually be reported clearly (thanks to the improved error message on line 86), adding a warning log here would make debugging easier if this scenario occurs.📝 Optional improvement
+ if os.access(parent, os.W_OK): test.log.info( "custom_loader_path %r not usable (dir=%r); using %s instead", custom_loader_path, dest_dir, alt, ) loader_dict = _replace_path_in_obj(loader_dict, custom_loader_path, alt) return alt, loader_dict + test.log.warning( + "custom_loader_path %r not usable (dir=%r) and no writable alternative found; " + "will attempt to use original path", + custom_loader_path, + dest_dir, + ) return custom_loader_path, loader_dict🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@libvirt/tests/src/guest_os_booting/ovmf_firmware/ovmf_loader.py` around lines 37 - 70, In _resolve_custom_loader_path, when neither alternative path is usable the function currently returns the original custom_loader_path silently; add a warning log via test.log.warning (or test.log.error) just before the final return to record that all fallback alternatives failed and the original path will be used, include the custom_loader_path and dest_dir in the message so create_custom_loader failures are easier to trace.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@libvirt/tests/src/guest_os_booting/boot_order/boot_from_disk_device.py`:
- Around line 46-51: The code currently uses platform.machine() to decide URL
ordering and timeouts (affecting urls built from aarch64_variants,
disk1_img_url, ovmf_url and the serial_login_timeout default) which makes
guest-specific decisions depend on the host; replace that host-only check with
the project’s guest-arch helper (the helper used elsewhere to determine the
VM/guest architecture) and use its result to: (1) choose the URL ordering
between aarch64_variants vs ovmf_url/disk1_img_url, and (2) select the default
serial_login_timeout; apply the same replacement at the other occurrence around
the serial timeout logic (the block referenced at lines ~152-153) so guest arch
drives both URL priority and timeout.
- Around line 70-83: The current loop uses dl_timeout per URL (via
utils_misc.wait_for), making total wait = dl_timeout * len(urls); change to
enforce a single total timeout budget by computing a deadline before the loop
(e.g., end = time.time() + dl_timeout) and for each candidate URL pass the
remaining time (remaining = end - time.time()) into utils_misc.wait_for
(skip/exit if remaining <= 0); update references to dl_timeout, urls, last_err,
guest_os.test_file_download and disk1_img_path accordingly so the code breaks
once the overall timeout elapses rather than waiting dl_timeout for every URL.
In
`@libvirt/tests/src/guest_os_booting/os_configuration/without_default_os_attributes.py`:
- Around line 80-81: The call to eval() when building os_xpath is unsafe;
replace eval(params.get("os_xpath")) with
ast.literal_eval(params.get("os_xpath")) to safely parse the parameter before
passing it to _adapt_os_xpath_uefi_disk_boot; import ast at the top if not
already present and ensure params and host_arch usage remains unchanged.
In `@provider/guest_os_booting/guest_os_booting_base.py`:
- Around line 71-72: The current condition skips cleanup when os_dict contains
an empty boots list; change the check to detect the presence of the "boots" key
rather than its truthiness so empty lists still trigger removal. Update the
condition around os_dict and vmxml.remove_all_boots() (referencing the os_dict
variable and vmxml.remove_all_boots() call) to use a key-presence check (e.g.,
"'boots' in os_dict") so existing <os>/boot entries are always cleared when the
boots key is present.
---
Nitpick comments:
In `@libvirt/tests/src/guest_os_booting/ovmf_firmware/ovmf_loader.py`:
- Around line 37-70: In _resolve_custom_loader_path, when neither alternative
path is usable the function currently returns the original custom_loader_path
silently; add a warning log via test.log.warning (or test.log.error) just before
the final return to record that all fallback alternatives failed and the
original path will be used, include the custom_loader_path and dest_dir in the
message so create_custom_loader failures are easier to trace.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e33eeb37-c4a2-4c92-9361-fe919da95d62
📒 Files selected for processing (5)
libvirt/tests/src/guest_os_booting/boot_order/boot_from_cdrom_device.pylibvirt/tests/src/guest_os_booting/boot_order/boot_from_disk_device.pylibvirt/tests/src/guest_os_booting/os_configuration/without_default_os_attributes.pylibvirt/tests/src/guest_os_booting/ovmf_firmware/ovmf_loader.pyprovider/guest_os_booting/guest_os_booting_base.py
| if platform.machine() == "aarch64": | ||
| # Prefer arch-named images first: generic URL may be x86-only | ||
| # and still HTTP-200, then guest never reaches login. | ||
| urls = aarch64_variants + [disk1_img_url, ovmf_url] | ||
| else: | ||
| urls = [ovmf_url, disk1_img_url] + aarch64_variants |
There was a problem hiding this comment.
Avoid host-only architecture checks for guest-specific decisions.
This logic treats arch as host-only, so arm64 guests running on non-aarch64 hosts won’t get arm64 URL priority or the longer serial timeout.
Suggested direction
+def _guest_is_arm64_like(params, host_arch):
+ if host_arch == "aarch64":
+ return True
+ arch = (params.get("arch") or params.get("vm_arch_name") or "").lower()
+ mt = (params.get("machine_type") or "").lower()
+ shortname = (params.get("shortname") or "").lower()
+ return arch in ("aarch64", "arm64") or "aarch64" in mt or "arm64" in mt or \
+ "aarch64" in shortname or "arm64" in shortnameUse this helper for both URL ordering and default serial_login_timeout.
Also applies to: 152-153
🧰 Tools
🪛 Ruff (0.15.12)
[warning] 49-49: Consider [*aarch64_variants, disk1_img_url, ovmf_url] instead of concatenation
Replace with [*aarch64_variants, disk1_img_url, ovmf_url]
(RUF005)
[warning] 51-51: Consider [ovmf_url, disk1_img_url, *aarch64_variants] instead of concatenation
Replace with [ovmf_url, disk1_img_url, *aarch64_variants]
(RUF005)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@libvirt/tests/src/guest_os_booting/boot_order/boot_from_disk_device.py`
around lines 46 - 51, The code currently uses platform.machine() to decide URL
ordering and timeouts (affecting urls built from aarch64_variants,
disk1_img_url, ovmf_url and the serial_login_timeout default) which makes
guest-specific decisions depend on the host; replace that host-only check with
the project’s guest-arch helper (the helper used elsewhere to determine the
VM/guest architecture) and use its result to: (1) choose the URL ordering
between aarch64_variants vs ovmf_url/disk1_img_url, and (2) select the default
serial_login_timeout; apply the same replacement at the other occurrence around
the serial timeout logic (the block referenced at lines ~152-153) so guest arch
drives both URL priority and timeout.
| if utils_misc.wait_for( | ||
| lambda u=url: guest_os.test_file_download(u, disk1_img_path), | ||
| dl_timeout, | ||
| ): | ||
| ok = True | ||
| test.log.info("Downloaded disk1 image from %s", url) | ||
| break | ||
| last_err = url | ||
| if not ok: | ||
| test.fail( | ||
| "Unable to download boot image (tried %d URLs, timeout=%ss; " | ||
| "last=%r; set disk1_img_url_extra in cfg if needed)" | ||
| % (len(urls), dl_timeout, last_err) | ||
| ) |
There was a problem hiding this comment.
disk1_img_download_timeout is currently per-URL, not a total retry budget.
With multiple candidate URLs, total wait can become timeout * len(urls), which is much larger than the configured value and can significantly delay failures.
Suggested fix
+import time
...
- for url in urls:
+ deadline = time.monotonic() + dl_timeout
+ for url in urls:
if not url:
continue
+ remaining = max(0, int(deadline - time.monotonic()))
+ if remaining <= 0:
+ break
if utils_misc.wait_for(
lambda u=url: guest_os.test_file_download(u, disk1_img_path),
- dl_timeout,
+ remaining,
):
ok = True
test.log.info("Downloaded disk1 image from %s", url)
break
last_err = url🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@libvirt/tests/src/guest_os_booting/boot_order/boot_from_disk_device.py`
around lines 70 - 83, The current loop uses dl_timeout per URL (via
utils_misc.wait_for), making total wait = dl_timeout * len(urls); change to
enforce a single total timeout budget by computing a deadline before the loop
(e.g., end = time.time() + dl_timeout) and for each candidate URL pass the
remaining time (remaining = end - time.time()) into utils_misc.wait_for
(skip/exit if remaining <= 0); update references to dl_timeout, urls, last_err,
guest_os.test_file_download and disk1_img_path accordingly so the code breaks
once the overall timeout elapses rather than waiting dl_timeout for every URL.
| os_xpath = _adapt_os_xpath_uefi_disk_boot( | ||
| eval(params.get("os_xpath")), params, host_arch) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify remaining eval() usage under guest_os_booting tests
rg -nP --type=py '\beval\s*\(' libvirt/tests/src/guest_os_bootingRepository: autotest/tp-libvirt
Length of output: 9819
🏁 Script executed:
#!/bin/bash
# Check the specific file and examine lines around 78-81
cat -n libvirt/tests/src/guest_os_booting/os_configuration/without_default_os_attributes.py | head -90 | tail -30Repository: autotest/tp-libvirt
Length of output: 1274
🏁 Script executed:
#!/bin/bash
# Check full file for imports and any other eval() calls
wc -l libvirt/tests/src/guest_os_booting/os_configuration/without_default_os_attributes.py
cat -n libvirt/tests/src/guest_os_booting/os_configuration/without_default_os_attributes.py | head -40Repository: autotest/tp-libvirt
Length of output: 1551
🏁 Script executed:
#!/bin/bash
# Check for all eval() calls in the file
rg -n '\beval\s*\(' libvirt/tests/src/guest_os_booting/os_configuration/without_default_os_attributes.pyRepository: autotest/tp-libvirt
Length of output: 168
Replace eval() with ast.literal_eval() for parameter parsing.
Lines 78 and 81 use eval() to parse configuration from test parameters, which allows arbitrary code execution. Use ast.literal_eval() instead, which safely evaluates only Python literals.
Suggested fix
import platform
import re
+import ast
from virttest import virsh vm_name = guest_os.get_vm(params)
- os_dict = eval(params.get("os_dict"))
+ os_dict = ast.literal_eval(params.get("os_dict"))
host_arch = platform.machine()
os_xpath = _adapt_os_xpath_uefi_disk_boot(
- eval(params.get("os_xpath")), params, host_arch)
+ ast.literal_eval(params.get("os_xpath")), params, host_arch)🧰 Tools
🪛 Ruff (0.15.12)
[error] 81-81: Use of possibly insecure function; consider using ast.literal_eval
(S307)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@libvirt/tests/src/guest_os_booting/os_configuration/without_default_os_attributes.py`
around lines 80 - 81, The call to eval() when building os_xpath is unsafe;
replace eval(params.get("os_xpath")) with
ast.literal_eval(params.get("os_xpath")) to safely parse the parameter before
passing it to _adapt_os_xpath_uefi_disk_boot; import ast at the top if not
already present and ensure params and host_arch usage remains unchanged.
| if isinstance(os_dict, dict) and os_dict.get("boots"): | ||
| vmxml.remove_all_boots() |
There was a problem hiding this comment.
Clear existing OS boots when the boots key exists, even if it’s empty.
Using os_dict.get("boots") skips cleanup for {"boots": []}, which can leave stale <os>/boot entries and break the intended conflict-avoidance behavior.
Suggested fix
- if isinstance(os_dict, dict) and os_dict.get("boots"):
+ if isinstance(os_dict, dict) and "boots" in os_dict:
vmxml.remove_all_boots()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if isinstance(os_dict, dict) and os_dict.get("boots"): | |
| vmxml.remove_all_boots() | |
| if isinstance(os_dict, dict) and "boots" in os_dict: | |
| vmxml.remove_all_boots() |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@provider/guest_os_booting/guest_os_booting_base.py` around lines 71 - 72, The
current condition skips cleanup when os_dict contains an empty boots list;
change the check to detect the presence of the "boots" key rather than its
truthiness so empty lists still trigger removal. Update the condition around
os_dict and vmxml.remove_all_boots() (referencing the os_dict variable and
vmxml.remove_all_boots() call) to use a key-presence check (e.g., "'boots' in
os_dict") so existing <os>/boot entries are always cleared when the boots key is
present.
|
Closing this PR due to current team constraints. This is part of a broader effort to triage all in-flight work across our upstream repos. If this work is still needed, please feel free to reopen and it will be picked up. Apologies for any inconvenience. |
Summary by CodeRabbit
New Features
Improvements