Skip to content

Bootloader regression testing#5552

Open
netmindz wants to merge 2 commits intowled:mainfrom
netmindz:bootloader-regression
Open

Bootloader regression testing#5552
netmindz wants to merge 2 commits intowled:mainfrom
netmindz:bootloader-regression

Conversation

@netmindz
Copy link
Copy Markdown
Member

@netmindz netmindz commented Apr 30, 2026

Summary by CodeRabbit

  • Tests

    • Added comprehensive end-to-end regression testing suite for OTA bootloader compatibility. Includes automated hardware validation, firmware version compatibility checks, and verification of successful upgrade paths to prevent future regressions.
  • Chores

    • Updated CI/CD pipeline workflows to optimize and standardize runtime environment configuration across all automated build and deployment actions.

Set FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true across all affected workflows
to avoid deprecation warnings ahead of the June 2026 forced migration.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 30, 2026

Walkthrough

The PR adds a global environment variable FORCE_JAVASCRIPT_ACTIONS_TO_NODE24 set to true across four GitHub Actions workflows to control Node.js runtime behavior, and introduces a new end-to-end regression test script for WLED OTA bootloader compatibility that auto-discovers release tags, builds bootloader artifacts in Docker, and validates old bootloader compatibility with new firmware images.

Changes

Cohort / File(s) Summary
Workflow Environment Configuration
.github/workflows/build.yml, .github/workflows/nightly.yml, .github/workflows/release.yml, .github/workflows/usermods.yml
Added global FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true environment variable to force JavaScript-based GitHub Actions to use Node.js 24 runtime across all four workflows.
OTA Bootloader Regression Suite
tools/ota_bootloader_regression.py
New regression test script that auto-discovers firmware releases, fetches assets from GitHub, reads configurations via git, builds ESP32 bootloader artifacts in Docker, and validates OTA bootloader compatibility by flashing test devices and analyzing serial output for successful transitions between old and new firmware versions.

Sequence Diagram

sequenceDiagram
    participant User
    participant Script as ota_bootloader_regression.py
    participant GitHub as GitHub API
    participant Git as Git Repository
    participant Docker as Docker Container
    participant Device as Hardware Device
    participant Serial as Serial Interface

    User->>Script: Execute with --tags or auto-discover
    Script->>GitHub: Fetch release tags
    GitHub-->>Script: Tag list
    
    loop For each tag
        Script->>Git: git show to read platformio.ini
        Git-->>Script: Platform config + partitions CSV
        Script->>Docker: Build bootloader artifact + partitions.bin
        Docker-->>Script: Cached bootloader binary
    end
    
    Script->>Docker: Build new v16.x IDF V4 firmware
    Docker-->>Script: New firmware image
    
    loop Per tag hardware test
        Script->>Device: Erase
        Script->>Device: Flash bootloader + partition table
        Script->>Device: Flash old app from tag
        Script->>Serial: Read output (boot handoff verification)
        Serial-->>Script: Serial logs
        Script->>Device: Flash new firmware to app0
        Script->>Device: Erase otadata
        Script->>Serial: Read output (compatibility check)
        Serial-->>Script: Serial logs
        Script->>Script: Classify result (SUCCESS/FAIL/SKIP/ERROR)
    end
    
    Script-->>User: Table of results + summary counts + exit code
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Possibly related PRs

  • PR #4592: Adds the Usermod CI workflow to .github/workflows/usermods.yml, which is the same workflow file being updated here with the environment variable configuration.

Suggested reviewers

  • willmmiles
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Bootloader regression testing' directly and clearly describes the main change: a new regression testing suite for OTA bootloader compatibility, as evidenced by the new 969-line ota_bootloader_regression.py script.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
.github/workflows/nightly.yml (1)

10-11: Quote workflow env value to ensure consistent string handling and avoid YAML boolean parsing issues.

GitHub Actions env values are always strings. Unquoted true is parsed as a YAML boolean literal rather than a string, which can cause linter warnings and unexpected behavior. Always quote env values as strings for consistency with GitHub Actions best practices.

Suggested patch
 env:
-  FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true
+  FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/nightly.yml around lines 10 - 11, The env value
FORCE_JAVASCRIPT_ACTIONS_TO_NODE24 is unquoted and will be parsed as a YAML
boolean; update the workflow to set FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
(quoting the value) so the key remains a string as expected by GitHub Actions
and avoids YAML boolean parsing/linter warnings.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tools/ota_bootloader_regression.py`:
- Around line 703-705: The serial-read calls to _read_serial inside _run_one can
raise StepError and currently live outside the per-tag try/except, causing the
whole suite to abort; wrap each _read_serial invocation (e.g., the "old fw
{tag}" and the later observe call at lines ~728-730) in the per-tag try/except
so that StepError is caught and the tag is marked with TagResult.ERROR (similar
to how erase/flash failures are handled), and ensure _classify is only called
when out is present; update the exception branch to record the error reason and
continue to the next tag rather than re-raising.
- Around line 279-312: The plan currently always selects a classic ESP32 asset
and bootloader settings; update _resolve_plan to validate the tag's
platform/board/flash settings and reject non-classic-ESP32 configurations: read
board via _resolve_value(cfg, env, "board") and platform via
platform_raw/_extract_platform_version and also check common upload/flash
settings (e.g., board_build.flash_mode, upload_speed, board_build.flash_size or
upload_flags) and if they do not match the classic esp32dev 4MB/DIO/40m profile
then raise a StepError instead of proceeding; only after those checks set
asset_name (WLED_<ver>_ESP32.bin) and populate/return the TagPlan so downstream
bootloader generation and write_flash use a known-compatible classic ESP32 path.

---

Nitpick comments:
In @.github/workflows/nightly.yml:
- Around line 10-11: The env value FORCE_JAVASCRIPT_ACTIONS_TO_NODE24 is
unquoted and will be parsed as a YAML boolean; update the workflow to set
FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true" (quoting the value) so the key
remains a string as expected by GitHub Actions and avoids YAML boolean
parsing/linter warnings.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c5622354-c174-4caa-b877-2d0b198258bb

📥 Commits

Reviewing files that changed from the base of the PR and between 88032d5 and ebf8734.

📒 Files selected for processing (5)
  • .github/workflows/build.yml
  • .github/workflows/nightly.yml
  • .github/workflows/release.yml
  • .github/workflows/usermods.yml
  • tools/ota_bootloader_regression.py

Comment on lines +279 to +312
def _resolve_plan(tag: str, env: str, repo: str) -> TagPlan:
"""Build a TagPlan from git-show of platformio.ini at the tag + GH release."""
ini_text = _git_show(tag, "platformio.ini")
cfg = _parse_ini(ini_text)

section = f"env:{env}"
if not cfg.has_section(section):
raise StepError(f"No [env:{env}] in {tag}'s platformio.ini")

platform_raw = _resolve_value(cfg, env, "platform") or ""
platform_ver = _extract_platform_version(platform_raw)

csv_value = _resolve_value(cfg, env, "board_build.partitions")
if csv_value:
csv_text = _git_show(tag, csv_value)
csv_name = Path(csv_value).name
else:
# Fall back to the espressif32 board's default partition table.
# For board=esp32dev (and most 4MB ESP32 boards) PlatformIO uses
# `default.csv` at this layout, identical to what we already
# support (otadata @ 0xe000, app0 @ 0x10000).
csv_text = (
"# Name, Type, SubType, Offset, Size, Flags\n"
"nvs, data, nvs, 0x9000, 0x5000,\n"
"otadata, data, ota, 0xe000, 0x2000,\n"
"app0, app, ota_0, 0x10000, 0x140000,\n"
"app1, app, ota_1, 0x150000,0x140000,\n"
"spiffs, data, spiffs, 0x290000,0x170000,\n"
)
csv_name = "default.csv"

# Determine asset
ver_str = tag.lstrip("v")
asset_name = f"WLED_{ver_str}_ESP32.bin"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Scope the suite to classic ESP32, or resolve board/flash settings from the tag.

Right now the plan always downloads WLED_<ver>_ESP32.bin, the synthetic bootloader build always uses board = esp32dev, and the flash step always forces dio/40m/4MB. That makes --env / --chip look generic even though non-classic-ESP32 runs will combine mismatched artifacts and can report false PASS/FAIL/SKIP results.

The low-risk fix is to reject anything except the classic ESP32 path for now. If you want real multi-board support, TagPlan needs to carry the tag’s board and upload settings and use them consistently in asset selection, bootloader generation, and write_flash.

Based on learnings: in WLED, the old-bootloader OTA risk being investigated here is specific to classic ESP32; ESP32-S2/S3/C3 used different 4.4.x bootloaders from the start.

Also applies to: 464-469, 689-696

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/ota_bootloader_regression.py` around lines 279 - 312, The plan
currently always selects a classic ESP32 asset and bootloader settings; update
_resolve_plan to validate the tag's platform/board/flash settings and reject
non-classic-ESP32 configurations: read board via _resolve_value(cfg, env,
"board") and platform via platform_raw/_extract_platform_version and also check
common upload/flash settings (e.g., board_build.flash_mode, upload_speed,
board_build.flash_size or upload_flags) and if they do not match the classic
esp32dev 4MB/DIO/40m profile then raise a StepError instead of proceeding; only
after those checks set asset_name (WLED_<ver>_ESP32.bin) and populate/return the
TagPlan so downstream bootloader generation and write_flash use a
known-compatible classic ESP32 path.

Comment on lines +703 to +705
_section(f"[{tag}] Verify old firmware boots")
out = _read_serial(port, boot_observe, f"old fw {tag}", reset_first=True)
booted, why = _classify(out)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle serial-read failures per tag instead of aborting the whole suite.

Both _read_serial() calls can raise StepError, but they sit outside the per-tag try/except blocks. One flaky USB reconnect here escapes _run_one(), hits the outer except StepError, and ends the entire run without a summary for the remaining tags. Map these to TagResult.ERROR the same way the erase/flash failures already are.

💡 Localized fix
     # Verify old firmware boots
     _section(f"[{tag}] Verify old firmware boots")
-    out = _read_serial(port, boot_observe, f"old fw {tag}", reset_first=True)
+    try:
+        out = _read_serial(port, boot_observe, f"old fw {tag}", reset_first=True)
+    except StepError as exc:
+        return TagResult(tag, Result.ERROR, f"old-fw serial read failed: {exc}")
     booted, why = _classify(out)
     if not booted:
         return TagResult(tag, Result.SKIP,
                          f"old fw did not boot (cannot test OTA): {why}")
@@
-    out = _read_serial(port, boot_observe, "new fw post-flash",
-                       reset_first=True)
+    try:
+        out = _read_serial(port, boot_observe, "new fw post-flash",
+                           reset_first=True)
+    except StepError as exc:
+        return TagResult(tag, Result.ERROR, f"new-fw serial read failed: {exc}")
     booted, why = _classify(out)

Also applies to: 728-730

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/ota_bootloader_regression.py` around lines 703 - 705, The serial-read
calls to _read_serial inside _run_one can raise StepError and currently live
outside the per-tag try/except, causing the whole suite to abort; wrap each
_read_serial invocation (e.g., the "old fw {tag}" and the later observe call at
lines ~728-730) in the per-tag try/except so that StepError is caught and the
tag is marked with TagResult.ERROR (similar to how erase/flash failures are
handled), and ensure _classify is only called when out is present; update the
exception branch to record the error reason and continue to the next tag rather
than re-raising.

@netmindz
Copy link
Copy Markdown
Member Author

regression.log

@netmindz
Copy link
Copy Markdown
Member Author

netmindz commented May 2, 2026

did you see the log @softhack007 ?

@softhack007
Copy link
Copy Markdown
Member

softhack007 commented May 3, 2026

did you see the log @softhack007 ?

I'll look into the details tomorrow 👍

@softhack007
Copy link
Copy Markdown
Member

softhack007 commented May 3, 2026

The bootloader is extracted from the SDK / build output.

Quick thought: for older versions, AC has always published a separate (combined) esp32 bootloader with each release. Afaik, this was not the plain bootloader (arduino-esp32) that's created with a platformio build.

Example: https://github.com/wled/WLED/releases/download/v0.13.3/esp32_bootloader_v4.bin in release 0.13.3

Maybe worth to also add the file to the regression test suite.

Edit: the file was called "bootloader V4" because it was the 4th release of the special bootloader from AC. So nothing to do with esp-idf V4.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants