Skip to content

Tmp/aarch64 guest os boot combined#6873

Open
BulaYoungR wants to merge 6 commits into
autotest:masterfrom
BulaYoungR:tmp/aarch64-guest-os-boot-combined
Open

Tmp/aarch64 guest os boot combined#6873
BulaYoungR wants to merge 6 commits into
autotest:masterfrom
BulaYoungR:tmp/aarch64-guest-os-boot-combined

Conversation

@BulaYoungR
Copy link
Copy Markdown

@BulaYoungR BulaYoungR commented May 16, 2026

Summary by CodeRabbit

  • New Features

    • Added AArch64 guest detection and UEFI boot configuration support
    • Added OVMF custom firmware loader relocation capability
  • Improvements

    • Enhanced boot configuration management with improved clearing and per-device attribute handling
    • Improved boot image download resilience with platform-aware timeouts and retry logic
    • Enhanced error diagnostics for download failures with expanded reporting on attempted URLs and timeouts

Review Change Stack

Super User added 6 commits May 16, 2026 15:26
Libvirt rejects domain XML that combines <os>/boot (boots) with
per-device <disk>/boot. When prepare_os_xml() sets os_dict with
boots=, remove all existing //boot nodes first via remove_all_boots().

Committer: Bolatbek Issakh <bissakh@redhat.com>
When os_attrs_boots is set, remove all //boot nodes then apply os boots.
Strip boot/loadparm from cloned disk attrs so inherited per-disk boot
does not remain alongside <os><boot> (libvirt unsupported configuration).

Committer: Bolatbek Issakh <bissakh@redhat.com>
When os_attrs_boots is set, call remove_all_boots() before setup_attrs,
and drop boot/loadparm from cdrom attr dicts so libvirt does not see
per-device boot together with <os>/boot.

Committer: Bolatbek Issakh <bissakh@redhat.com>
On UEFI virt guests, boot order appears on the disk device
(<boot order='1'/>) instead of <os><boot dev='hd'/>. Map the legacy
xpath when the host or params indicate AArch64/arm64, including
arm64-mmio machine types and nested os_xpath structures.

Use a regex for xpath matching, walk dict values, and log the
adapted os_xpath for debugging.

Committer: Bolatbek Issakh <bissakh@redhat.com>
Params often point custom_loader_path under /usr/share/edk2/ovmf/, which
does not exist on AArch64 hosts. Resolve to a writable path under the
Avocado data directory or /tmp, rewrite loader_dict paths to match,
makedirs before copy2, and surface OSError text on failure.

Fixes positive_test.customize_loader on arm64-mmio.

Committer: Bolatbek Issakh <bissakh@redhat.com>
Try multiple qcow2 URLs (arch-specific first on aarch64), optional
disk1_img_url_extra and longer download timeout. When os_attrs_boots
is set, clear conflicting per-disk boot/loadparm after os boots.
Extend serial login wait on aarch64 (default 480s, overridable via
serial_login_timeout).

Committer: Bolatbek Issakh <bissakh@redhat.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 16, 2026

Walkthrough

This PR consolidates and enhances boot configuration handling across guest OS booting test modules. It introduces conditional clearing of existing <os>/boot entries before applying new OS boot attributes to prevent libvirt XML validation conflicts with per-device boot settings. Architecture detection logic for AArch64/arm64 guests enables conditional XPath adaptation for UEFI scenarios. Platform-aware serial login timeouts scale based on detected host architecture. OVMF disk image downloads now employ robust retry logic with architecture-dependent URL prioritization and configurable timeouts. Custom OVMF loader path resolution provides fallback directory support when the original loader location is unavailable or unwritable, with comprehensive error reporting.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 78.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'Tmp/aarch64 guest os boot combined' is vague and uses unclear phrasing that doesn't convey what the changeset actually accomplishes. Revise the title to clearly describe the main change, such as 'Support AArch64 guest OS boot configuration with UEFI and firmware handling' or similar, avoiding the 'Tmp/' prefix and vague 'combined' terminology.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
libvirt/tests/src/guest_os_booting/ovmf_firmware/ovmf_loader.py (1)

37-70: ⚡ Quick win

Consider logging when all fallback alternatives fail.

When both alternative paths are unusable (line 70), the function silently returns the original path, which will fail later in create_custom_loader. While the error will eventually be reported clearly (thanks to the improved error message on line 86), adding a warning log here would make debugging easier if this scenario occurs.

📝 Optional improvement
+        if os.access(parent, os.W_OK):
             test.log.info(
                 "custom_loader_path %r not usable (dir=%r); using %s instead",
                 custom_loader_path,
                 dest_dir,
                 alt,
             )
             loader_dict = _replace_path_in_obj(loader_dict, custom_loader_path, alt)
             return alt, loader_dict
+    test.log.warning(
+        "custom_loader_path %r not usable (dir=%r) and no writable alternative found; "
+        "will attempt to use original path",
+        custom_loader_path,
+        dest_dir,
+    )
     return custom_loader_path, loader_dict
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@libvirt/tests/src/guest_os_booting/ovmf_firmware/ovmf_loader.py` around lines
37 - 70, In _resolve_custom_loader_path, when neither alternative path is usable
the function currently returns the original custom_loader_path silently; add a
warning log via test.log.warning (or test.log.error) just before the final
return to record that all fallback alternatives failed and the original path
will be used, include the custom_loader_path and dest_dir in the message so
create_custom_loader failures are easier to trace.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@libvirt/tests/src/guest_os_booting/boot_order/boot_from_disk_device.py`:
- Around line 46-51: The code currently uses platform.machine() to decide URL
ordering and timeouts (affecting urls built from aarch64_variants,
disk1_img_url, ovmf_url and the serial_login_timeout default) which makes
guest-specific decisions depend on the host; replace that host-only check with
the project’s guest-arch helper (the helper used elsewhere to determine the
VM/guest architecture) and use its result to: (1) choose the URL ordering
between aarch64_variants vs ovmf_url/disk1_img_url, and (2) select the default
serial_login_timeout; apply the same replacement at the other occurrence around
the serial timeout logic (the block referenced at lines ~152-153) so guest arch
drives both URL priority and timeout.
- Around line 70-83: The current loop uses dl_timeout per URL (via
utils_misc.wait_for), making total wait = dl_timeout * len(urls); change to
enforce a single total timeout budget by computing a deadline before the loop
(e.g., end = time.time() + dl_timeout) and for each candidate URL pass the
remaining time (remaining = end - time.time()) into utils_misc.wait_for
(skip/exit if remaining <= 0); update references to dl_timeout, urls, last_err,
guest_os.test_file_download and disk1_img_path accordingly so the code breaks
once the overall timeout elapses rather than waiting dl_timeout for every URL.

In
`@libvirt/tests/src/guest_os_booting/os_configuration/without_default_os_attributes.py`:
- Around line 80-81: The call to eval() when building os_xpath is unsafe;
replace eval(params.get("os_xpath")) with
ast.literal_eval(params.get("os_xpath")) to safely parse the parameter before
passing it to _adapt_os_xpath_uefi_disk_boot; import ast at the top if not
already present and ensure params and host_arch usage remains unchanged.

In `@provider/guest_os_booting/guest_os_booting_base.py`:
- Around line 71-72: The current condition skips cleanup when os_dict contains
an empty boots list; change the check to detect the presence of the "boots" key
rather than its truthiness so empty lists still trigger removal. Update the
condition around os_dict and vmxml.remove_all_boots() (referencing the os_dict
variable and vmxml.remove_all_boots() call) to use a key-presence check (e.g.,
"'boots' in os_dict") so existing <os>/boot entries are always cleared when the
boots key is present.

---

Nitpick comments:
In `@libvirt/tests/src/guest_os_booting/ovmf_firmware/ovmf_loader.py`:
- Around line 37-70: In _resolve_custom_loader_path, when neither alternative
path is usable the function currently returns the original custom_loader_path
silently; add a warning log via test.log.warning (or test.log.error) just before
the final return to record that all fallback alternatives failed and the
original path will be used, include the custom_loader_path and dest_dir in the
message so create_custom_loader failures are easier to trace.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e33eeb37-c4a2-4c92-9361-fe919da95d62

📥 Commits

Reviewing files that changed from the base of the PR and between c96ab65 and 25b7c02.

📒 Files selected for processing (5)
  • libvirt/tests/src/guest_os_booting/boot_order/boot_from_cdrom_device.py
  • libvirt/tests/src/guest_os_booting/boot_order/boot_from_disk_device.py
  • libvirt/tests/src/guest_os_booting/os_configuration/without_default_os_attributes.py
  • libvirt/tests/src/guest_os_booting/ovmf_firmware/ovmf_loader.py
  • provider/guest_os_booting/guest_os_booting_base.py

Comment on lines +46 to +51
if platform.machine() == "aarch64":
# Prefer arch-named images first: generic URL may be x86-only
# and still HTTP-200, then guest never reaches login.
urls = aarch64_variants + [disk1_img_url, ovmf_url]
else:
urls = [ovmf_url, disk1_img_url] + aarch64_variants
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid host-only architecture checks for guest-specific decisions.

This logic treats arch as host-only, so arm64 guests running on non-aarch64 hosts won’t get arm64 URL priority or the longer serial timeout.

Suggested direction
+def _guest_is_arm64_like(params, host_arch):
+    if host_arch == "aarch64":
+        return True
+    arch = (params.get("arch") or params.get("vm_arch_name") or "").lower()
+    mt = (params.get("machine_type") or "").lower()
+    shortname = (params.get("shortname") or "").lower()
+    return arch in ("aarch64", "arm64") or "aarch64" in mt or "arm64" in mt or \
+        "aarch64" in shortname or "arm64" in shortname

Use this helper for both URL ordering and default serial_login_timeout.

Also applies to: 152-153

🧰 Tools
🪛 Ruff (0.15.12)

[warning] 49-49: Consider [*aarch64_variants, disk1_img_url, ovmf_url] instead of concatenation

Replace with [*aarch64_variants, disk1_img_url, ovmf_url]

(RUF005)


[warning] 51-51: Consider [ovmf_url, disk1_img_url, *aarch64_variants] instead of concatenation

Replace with [ovmf_url, disk1_img_url, *aarch64_variants]

(RUF005)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@libvirt/tests/src/guest_os_booting/boot_order/boot_from_disk_device.py`
around lines 46 - 51, The code currently uses platform.machine() to decide URL
ordering and timeouts (affecting urls built from aarch64_variants,
disk1_img_url, ovmf_url and the serial_login_timeout default) which makes
guest-specific decisions depend on the host; replace that host-only check with
the project’s guest-arch helper (the helper used elsewhere to determine the
VM/guest architecture) and use its result to: (1) choose the URL ordering
between aarch64_variants vs ovmf_url/disk1_img_url, and (2) select the default
serial_login_timeout; apply the same replacement at the other occurrence around
the serial timeout logic (the block referenced at lines ~152-153) so guest arch
drives both URL priority and timeout.

Comment on lines +70 to +83
if utils_misc.wait_for(
lambda u=url: guest_os.test_file_download(u, disk1_img_path),
dl_timeout,
):
ok = True
test.log.info("Downloaded disk1 image from %s", url)
break
last_err = url
if not ok:
test.fail(
"Unable to download boot image (tried %d URLs, timeout=%ss; "
"last=%r; set disk1_img_url_extra in cfg if needed)"
% (len(urls), dl_timeout, last_err)
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

disk1_img_download_timeout is currently per-URL, not a total retry budget.

With multiple candidate URLs, total wait can become timeout * len(urls), which is much larger than the configured value and can significantly delay failures.

Suggested fix
+import time
...
-            for url in urls:
+            deadline = time.monotonic() + dl_timeout
+            for url in urls:
                 if not url:
                     continue
+                remaining = max(0, int(deadline - time.monotonic()))
+                if remaining <= 0:
+                    break
                 if utils_misc.wait_for(
                     lambda u=url: guest_os.test_file_download(u, disk1_img_path),
-                    dl_timeout,
+                    remaining,
                 ):
                     ok = True
                     test.log.info("Downloaded disk1 image from %s", url)
                     break
                 last_err = url
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@libvirt/tests/src/guest_os_booting/boot_order/boot_from_disk_device.py`
around lines 70 - 83, The current loop uses dl_timeout per URL (via
utils_misc.wait_for), making total wait = dl_timeout * len(urls); change to
enforce a single total timeout budget by computing a deadline before the loop
(e.g., end = time.time() + dl_timeout) and for each candidate URL pass the
remaining time (remaining = end - time.time()) into utils_misc.wait_for
(skip/exit if remaining <= 0); update references to dl_timeout, urls, last_err,
guest_os.test_file_download and disk1_img_path accordingly so the code breaks
once the overall timeout elapses rather than waiting dl_timeout for every URL.

Comment on lines +80 to +81
os_xpath = _adapt_os_xpath_uefi_disk_boot(
eval(params.get("os_xpath")), params, host_arch)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify remaining eval() usage under guest_os_booting tests
rg -nP --type=py '\beval\s*\(' libvirt/tests/src/guest_os_booting

Repository: autotest/tp-libvirt

Length of output: 9819


🏁 Script executed:

#!/bin/bash
# Check the specific file and examine lines around 78-81
cat -n libvirt/tests/src/guest_os_booting/os_configuration/without_default_os_attributes.py | head -90 | tail -30

Repository: autotest/tp-libvirt

Length of output: 1274


🏁 Script executed:

#!/bin/bash
# Check full file for imports and any other eval() calls
wc -l libvirt/tests/src/guest_os_booting/os_configuration/without_default_os_attributes.py
cat -n libvirt/tests/src/guest_os_booting/os_configuration/without_default_os_attributes.py | head -40

Repository: autotest/tp-libvirt

Length of output: 1551


🏁 Script executed:

#!/bin/bash
# Check for all eval() calls in the file
rg -n '\beval\s*\(' libvirt/tests/src/guest_os_booting/os_configuration/without_default_os_attributes.py

Repository: autotest/tp-libvirt

Length of output: 168


Replace eval() with ast.literal_eval() for parameter parsing.

Lines 78 and 81 use eval() to parse configuration from test parameters, which allows arbitrary code execution. Use ast.literal_eval() instead, which safely evaluates only Python literals.

Suggested fix
 import platform
 import re
+import ast
 
 from virttest import virsh
     vm_name = guest_os.get_vm(params)
-    os_dict = eval(params.get("os_dict"))
+    os_dict = ast.literal_eval(params.get("os_dict"))
     host_arch = platform.machine()
     os_xpath = _adapt_os_xpath_uefi_disk_boot(
-        eval(params.get("os_xpath")), params, host_arch)
+        ast.literal_eval(params.get("os_xpath")), params, host_arch)
🧰 Tools
🪛 Ruff (0.15.12)

[error] 81-81: Use of possibly insecure function; consider using ast.literal_eval

(S307)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@libvirt/tests/src/guest_os_booting/os_configuration/without_default_os_attributes.py`
around lines 80 - 81, The call to eval() when building os_xpath is unsafe;
replace eval(params.get("os_xpath")) with
ast.literal_eval(params.get("os_xpath")) to safely parse the parameter before
passing it to _adapt_os_xpath_uefi_disk_boot; import ast at the top if not
already present and ensure params and host_arch usage remains unchanged.

Comment on lines +71 to +72
if isinstance(os_dict, dict) and os_dict.get("boots"):
vmxml.remove_all_boots()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clear existing OS boots when the boots key exists, even if it’s empty.

Using os_dict.get("boots") skips cleanup for {"boots": []}, which can leave stale <os>/boot entries and break the intended conflict-avoidance behavior.

Suggested fix
-    if isinstance(os_dict, dict) and os_dict.get("boots"):
+    if isinstance(os_dict, dict) and "boots" in os_dict:
         vmxml.remove_all_boots()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if isinstance(os_dict, dict) and os_dict.get("boots"):
vmxml.remove_all_boots()
if isinstance(os_dict, dict) and "boots" in os_dict:
vmxml.remove_all_boots()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@provider/guest_os_booting/guest_os_booting_base.py` around lines 71 - 72, The
current condition skips cleanup when os_dict contains an empty boots list;
change the check to detect the presence of the "boots" key rather than its
truthiness so empty lists still trigger removal. Update the condition around
os_dict and vmxml.remove_all_boots() (referencing the os_dict variable and
vmxml.remove_all_boots() call) to use a key-presence check (e.g., "'boots' in
os_dict") so existing <os>/boot entries are always cleared when the boots key is
present.

@harvey0100
Copy link
Copy Markdown

Closing this PR due to current team constraints. This is part of a broader effort to triage all in-flight work across our upstream repos. If this work is still needed, please feel free to reopen and it will be picked up. Apologies for any inconvenience.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants