feat(riscv): RISC-V QEMU integration test target (8/8 ✅) #25
gwpl wants to merge 7 commits into raphaelmansuy:main
Greg's AI coding buddy reporting for duty! o7

* Make rayon optional behind a `native` feature flag (default: enabled)
* Add `convert_file()` dispatcher: native builds use `edgeparse_core::convert()` (with raster table OCR), WASI builds use `convert_bytes()` (in-memory)
* Sequential file iteration when rayon is disabled (`#[cfg(not(feature = "native"))]`)
* Build for WASI: `cargo build --target wasm32-wasip1 --no-default-features`

The resulting .wasm binary is 3.1 MB and runs on wasmtime, wasmer, wasmedge, wamr: basically anything that speaks WASI Preview 1. PDF parsing works identically; you just trade parallelism for portability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Greg's AI coding buddy: because if it doesn't run in a container, does it even exist? Here's the full portable test suite.

Build Dockerfiles (reproducible compilation):
* Dockerfile.build.wasm - wasm32-wasip1 (3.1 MB binary)
* Dockerfile.build.wasix - wasm32-wasmer-wasi (WASIX toolchain)
* Dockerfile.build.riscv - riscv64gc cross-compile (dynamic + static)

WASM runtime runners (all tested, all green):
* Dockerfile.runner.wasmtime - Bytecode Alliance reference (v43)
* Dockerfile.runner.wasmer - WASIX superpowers (v7)
* Dockerfile.runner.wasmedge - CNCF cloud-native (v0.14.1)
* Dockerfile.runner.wamr - embedded champion (iwasm, built from source)
* Dockerfile.runner.wasix - WASIX backward compat on Wasmer

RISC-V runners:
* Dockerfile.runner.riscv-qemu - QEMU user-mode (proven, all tests pass)
* Dockerfile.runner.spike - official ISA reference simulator + pk
* Dockerfile.runner.libriscv - fastest sandbox (~3ns calls)
* Dockerfile.runner.rvvm - tracing JIT (experimental)
* Dockerfile.runner.ckb-vm - blockchain VM (experimental)

Management:
* wasm-test.sh - build|test|status|run|log|rmi|clean
* run-tests.sh - 7 tests per runtime (help, version, JSON/MD/TXT/HTML, errors)
* README.md - architecture, quick start, extending guide
* All Docker images prefixed with EDGEPARSE_PREFIX (default: "edgeparse")

Verified: 6 runtimes × 8 assertions = 48/48 tests passing (wasmtime, wasmer, wasmedge, wamr, wasix, riscv-qemu)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Greg's AI coding buddy: because typing Dockerfile paths is for people who don't have a Makefile.

* make wasi-build - build all WASM runtime + RISC-V Docker test images
* make wasi-test - run integration tests across all runtimes
* make wasi-status - show Docker image/container status
* make wasi-clean - remove all WASI test Docker images and artifacts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Greg's AI coding buddy: debugging RISC-V VM sandboxes so you don't have to.

* Add ca-certificates to spike/libriscv/rvvm Dockerfiles (git clone over HTTPS)
* Fix libriscv arg passing: use '--' separator (rvlinux intercepts -f/-h flags)
* Fix RVVM binary discovery: handle rvvm_x86_64 naming in newer releases
* Fix libriscv build path: upstream moved rvlinux to emulator/ subdirectory

Known limitations:
* libriscv: glibc static TLS destructors crash on file I/O (needs musl target)
* CKB-VM: upstream ckb-debugger has compile errors (ckb_schemars incompatibility)
* RVVM: userland mode support varies by version (experimental)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Greg's AI coding buddy: battling every RISC-V VM's quirks so you know exactly what works and what's still a quest-in-progress.

Spike fixes:
* Add libc6-dev-riscv64-cross for pk build (stubs-lp64.h)
* Symlink stubs-lp64d.h → stubs-lp64.h (ABI mismatch workaround)
* Symlink pk to riscv64-unknown-elf path (where spike searches)
* Status: builds, but pk has limited Linux syscall proxying: Rust binaries with TLS/complex allocators get truncated output

libriscv fixes:
* Build from emulator/ subdirectory (upstream restructured)
* Add '--' arg separator (rvlinux intercepts -f/-h flags)
* Status: help/version work, file I/O crashes on glibc TLS destructors

RVVM: documented as system-only emulator (no userland mode)

CKB-VM: upstream ckb-debugger has compile errors (kept for reference)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Adds Docker-based integration-test infrastructure to validate edgeparse across multiple WASM runtimes and RISC-V (QEMU user-mode), with a small CLI refactor to support non-native/WASI execution.
Changes:
- Introduce `tests/wasm-runtimes/wasm-test.sh` manager + `run-tests.sh` container runner to build/test across runtimes.
- Add Dockerfiles to build WASI/WASIX and RISC-V binaries and run them under Wasmtime/Wasmer/WasmEdge/WAMR and RISC-V QEMU (plus experimental RISC-V runners).
- Make `rayon` optional behind a `native` feature and add `convert_file()` dispatcher for native vs non-native builds.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 16 comments.
| File | Description |
|---|---|
| tests/wasm-runtimes/wasm-test.sh | Host-side manager to build images/artifacts and run the runtime matrix. |
| tests/wasm-runtimes/run-tests.sh | Container-side integration test runner exercising the CLI and conversions. |
| tests/wasm-runtimes/README.md | Documentation for architecture, usage, and extending the runtime matrix. |
| tests/wasm-runtimes/Dockerfile.runner.base | Shared Ubuntu base layer for WASM runtime runners. |
| tests/wasm-runtimes/Dockerfile.runner.wasmtime | Wasmtime runner image wiring the WASI artifact into tests. |
| tests/wasm-runtimes/Dockerfile.runner.wasmer | Wasmer runner image wiring the WASI artifact into tests. |
| tests/wasm-runtimes/Dockerfile.runner.wasmedge | WasmEdge runner image (pinned) wiring the WASI artifact into tests. |
| tests/wasm-runtimes/Dockerfile.runner.wamr | WAMR/iwasm runner image built from source. |
| tests/wasm-runtimes/Dockerfile.runner.wasix | WASIX-on-Wasmer runner image for compatibility testing. |
| tests/wasm-runtimes/Dockerfile.runner.riscv-qemu | RISC-V QEMU user-mode runner for the cross-compiled ELF. |
| tests/wasm-runtimes/Dockerfile.runner.spike | Spike+pk runner image (experimental). |
| tests/wasm-runtimes/Dockerfile.runner.libriscv | libriscv/rvlinux runner image (experimental). |
| tests/wasm-runtimes/Dockerfile.runner.rvvm | RVVM runner image (documented as incompatible/experimental). |
| tests/wasm-runtimes/Dockerfile.runner.ckb-vm | CKB-VM runner image (experimental). |
| tests/wasm-runtimes/Dockerfile.build.wasm | Docker build for a reproducible wasm32-wasip1 artifact. |
| tests/wasm-runtimes/Dockerfile.build.wasix | Docker build for a WASIX (wasm32-wasmer-wasi) artifact. |
| tests/wasm-runtimes/Dockerfile.build.riscv | Docker build for riscv64gc dynamic + static ELF artifacts. |
| tests/wasm-runtimes/.gitignore | Ignore extracted build artifacts under .build/. |
| crates/edgeparse-cli/src/main.rs | Gate parallelism behind native and add convert_file() dispatcher. |
| crates/edgeparse-cli/Cargo.toml | Add native feature; make rayon optional; disable core default features. |
| Makefile | Add wasi-build/test/status/clean convenience targets. |
    all)
        build_wasm
        build_wasix
        build_riscv
        build_base
        for rt in ${ALL_RUNNERS}; do
            build_runner "${rt}"
        done
        ;;

cmd_build all always builds the WASIX binary via build_wasix, but none of the runner images/tests reference ${BUILD_DIR}/edgeparse-wasix.wasm (the WASIX runner copies /test/edgeparse.wasm). This adds build time and complexity without affecting test coverage. Consider either wiring the WASIX runner to use the WASIX-built artifact, or removing build_wasix from the default all path until it’s actually used.
AI Assistant: All Copilot review comments on this PR are duplicates of those on PR #20 (same codebase). I've replied individually to each comment there with accept/pushback reasoning: #20
All accepted fixes are now committed (bc882e8) and pushed to this branch. Key changes: set -e fix, stable/experimental runtime split, test counter fix, help text update, stale artifact cleanup. See full commit.
tests/wasm-runtimes/wasm-test.sh
Outdated
    ALL_RISCV_RUNTIMES="riscv-qemu spike libriscv rvvm ckb-vm"
    ALL_RUNNERS="${ALL_WASM_RUNTIMES} ${ALL_RISCV_RUNTIMES}"
    ALL_IMAGES="${IMG_BUILD_WASM} ${IMG_BUILD_WASIX} ${IMG_BUILD_RISCV} ${IMG_BASE} ${IMG_WASMTIME} ${IMG_WASMER} ${IMG_WASMEDGE} ${IMG_WAMR} ${IMG_WASIX} ${IMG_RISCV_QEMU} ${IMG_SPIKE} ${IMG_LIBRISCV} ${IMG_RVVM} ${IMG_CKB_VM}"

ALL_RUNNERS includes spike, libriscv, rvvm, and ckb-vm, but several of these are explicitly marked as WIP/incompatible in their Dockerfiles/README. As written, ./wasm-test.sh test all will attempt them and likely return non-zero, which conflicts with the “test all”/Quick Start guidance. Consider splitting “stable” vs “experimental” targets, or marking known-incompatible runtimes as skipped in run-tests.sh so test all remains reliable.
Suggested change:

    - ALL_RISCV_RUNTIMES="riscv-qemu spike libriscv rvvm ckb-vm"
    - ALL_RUNNERS="${ALL_WASM_RUNTIMES} ${ALL_RISCV_RUNTIMES}"
    - ALL_IMAGES="${IMG_BUILD_WASM} ${IMG_BUILD_WASIX} ${IMG_BUILD_RISCV} ${IMG_BASE} ${IMG_WASMTIME} ${IMG_WASMER} ${IMG_WASMEDGE} ${IMG_WAMR} ${IMG_WASIX} ${IMG_RISCV_QEMU} ${IMG_SPIKE} ${IMG_LIBRISCV} ${IMG_RVVM} ${IMG_CKB_VM}"
    + # Stable RISC-V runtimes used for "all" operations (Quick Start / CI)
    + ALL_RISCV_RUNTIMES="riscv-qemu"
    + # Experimental / WIP RISC-V runtimes kept separate so they don't break "all"
    + EXPERIMENTAL_RISCV_RUNTIMES="spike libriscv rvvm ckb-vm"
    + ALL_RUNNERS="${ALL_WASM_RUNTIMES} ${ALL_RISCV_RUNTIMES}"
    + # Stable images corresponding to the stable runtimes above
    + ALL_IMAGES="${IMG_BUILD_WASM} ${IMG_BUILD_WASIX} ${IMG_BUILD_RISCV} ${IMG_BASE} ${IMG_WASMTIME} ${IMG_WASMER} ${IMG_WASMEDGE} ${IMG_WAMR} ${IMG_WASIX} ${IMG_RISCV_QEMU}"
    + # Experimental / WIP images (not included in ALL_IMAGES)
    + EXPERIMENTAL_IMAGES="${IMG_SPIKE} ${IMG_LIBRISCV} ${IMG_RVVM} ${IMG_CKB_VM}"

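To make the stable/experimental split concrete, here is a minimal sketch of how a target name could resolve to a runner list. The helper `runners_for` is hypothetical and not taken from `wasm-test.sh`, and the value of `ALL_WASM_RUNTIMES` is an assumption based on the five WASM runtimes named in the commit messages.

```shell
#!/usr/bin/env bash
# Sketch only: runners_for is a hypothetical helper, and the value of
# ALL_WASM_RUNTIMES is assumed from the runtimes listed in the PR.
set -eu

ALL_WASM_RUNTIMES="wasmtime wasmer wasmedge wamr wasix"
ALL_RISCV_RUNTIMES="riscv-qemu"
EXPERIMENTAL_RISCV_RUNTIMES="spike libriscv rvvm ckb-vm"
ALL_RUNNERS="${ALL_WASM_RUNTIMES} ${ALL_RISCV_RUNTIMES}"

# Print the runners selected by a target name, one per line.
runners_for() {
    local rt
    case "$1" in
        # Unquoted on purpose: word splitting turns the list into arguments.
        all)          printf '%s\n' ${ALL_RUNNERS} ;;
        experimental) printf '%s\n' ${EXPERIMENTAL_RISCV_RUNTIMES} ;;
        *)
            # A single runtime name is accepted only if it is known.
            for rt in ${ALL_RUNNERS} ${EXPERIMENTAL_RISCV_RUNTIMES}; do
                if [ "${rt}" = "$1" ]; then
                    printf '%s\n' "${rt}"
                    return 0
                fi
            done
            echo "Unknown target: $1" >&2
            return 1
            ;;
    esac
}

runners_for all
```

With this shape, `test all` stays limited to runtimes expected to pass, while a WIP runtime can still be selected by name or through an explicit `experimental` target.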
tests/wasm-runtimes/wasm-test.sh
Outdated
    log "Cleaning build artifacts and images..."

    # Stop and remove any running test containers
    docker ps -q --filter "name=${PREFIX}-test-" 2>/dev/null | xargs -r docker rm -f 2>/dev/null || true

xargs -r is GNU-specific; on macOS/BSD xargs doesn’t support -r, which would make clean fail on those hosts. If you want this script to be cross-platform, consider guarding the docker rm call by checking for non-empty container IDs instead of relying on xargs -r.
Suggested change:

    - docker ps -q --filter "name=${PREFIX}-test-" 2>/dev/null | xargs -r docker rm -f 2>/dev/null || true
    + local containers
    + containers=$(docker ps -q --filter "name=${PREFIX}-test-" 2>/dev/null || true)
    + if [ -n "${containers}" ]; then
    +     docker rm -f ${containers} 2>/dev/null || true
    + fi

    # -t edgeparse-wasi-wamr .
    # ─────────────────────────────────────────────────────────────────────────────

    FROM edgeparse-wasi-base

EDGEPARSE_PREFIX in wasm-test.sh makes the base image name configurable, but this Dockerfile hard-codes FROM edgeparse-wasi-base. With a non-default prefix, docker build will fail because the base tag differs. Consider using a build arg for the base image name (set by the manager script).
Suggested change:

    - FROM edgeparse-wasi-base
    + ARG BASE_IMAGE=edgeparse-wasi-base
    + FROM ${BASE_IMAGE}

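The Dockerfile change only helps if the manager script passes the matching build arg. Below is a minimal sketch of that wiring, assuming image tags follow the `${PREFIX}-wasi-<runtime>` pattern seen in the default names; `runner_build_cmd` is a hypothetical helper that prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Sketch only: runner_build_cmd is hypothetical, and the tag pattern
# "${PREFIX}-wasi-<runtime>" is inferred from the default image names.
set -eu

PREFIX="${EDGEPARSE_PREFIX:-edgeparse}"
IMG_BASE="${PREFIX}-wasi-base"

# Print the docker build command for one runner instead of executing it,
# so the wiring can be inspected without Docker installed.
runner_build_cmd() {
    local rt="$1"
    printf 'docker build --build-arg BASE_IMAGE=%s -f Dockerfile.runner.%s -t %s-wasi-%s .\n' \
        "${IMG_BASE}" "${rt}" "${PREFIX}" "${rt}"
}

runner_build_cmd wamr
```

Because the `ARG` keeps `edgeparse-wasi-base` as its default, a plain `docker build` without the build arg would behave as before.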
    log_test "Convert sample.pdf → JSON"
    run_cmd=$(build_run_cmd "${BINARY}" "-f json -o /test/output -q /test/fixtures/sample.pdf")
    if eval "${run_cmd}" > /test/output/json_stdout.txt 2>&1; then
        if [ -f "/test/output/sample.json" ]; then
            json_size=$(wc -c < /test/output/sample.json)
            if [ "${json_size}" -gt 10 ]; then
                log_pass "JSON output: ${json_size} bytes"

Test 3 doesn’t remove /test/output/sample.json before running the JSON conversion. If the conversion command exits 0 but fails to (re)write the file, a stale sample.json from a previous run could cause a false pass. Consider rm -f /test/output/sample.json before invoking edgeparse (as you already do for md/txt/html).
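The false-pass scenario can be reproduced without Docker. The sketch below is not the project's `run-tests.sh`; `check_fresh_output` is a hypothetical helper that deletes the expected artifact first, then accepts the run only if the command recreated it with more than 10 bytes, mirroring the size check in the snippet above.

```shell
#!/usr/bin/env bash
# Sketch only: check_fresh_output is a hypothetical helper for illustration.
set -eu

# Run a command and pass only if it produced a fresh, non-trivial output file.
check_fresh_output() {
    local out_file="$1"; shift
    rm -f "${out_file}"            # drop any artifact left by a previous run
    if ! "$@" >/dev/null 2>&1; then
        echo "FAIL: command exited non-zero"; return 1
    fi
    if [ ! -f "${out_file}" ]; then
        echo "FAIL: ${out_file} was not written"; return 1
    fi
    local size
    size=$(wc -c < "${out_file}" | tr -d ' ')
    if [ "${size}" -gt 10 ]; then
        echo "PASS: ${size} bytes"
    else
        echo "FAIL: output too small (${size} bytes)"; return 1
    fi
}

tmp=$(mktemp -d)
echo "stale output from an earlier run" > "${tmp}/sample.json"
# `true` stands in for a converter that exits 0 but writes nothing.
check_fresh_output "${tmp}/sample.json" true || echo "stale artifact was not mistaken for a pass"
rm -rf "${tmp}"
```

Without the `rm -f`, the same run would find the old `sample.json` and report a pass.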
    # -t edgeparse-wasi-wasmedge .
    # ─────────────────────────────────────────────────────────────────────────────

    FROM edgeparse-wasi-base

EDGEPARSE_PREFIX in wasm-test.sh implies the base image name is configurable, but this Dockerfile hard-codes FROM edgeparse-wasi-base. If users set a different prefix, this build will fail because the base image tag changes. Consider parameterizing the base image via a build arg and wiring it up in wasm-test.sh.
Suggested change:

    - FROM edgeparse-wasi-base
    + ARG BASE_IMAGE=edgeparse-wasi-base
    + FROM ${BASE_IMAGE}

    RUN curl https://get.wasmer.io -sSfL | sh
    ENV PATH="/root/.wasmer/bin:${PATH}"

This runner installs Wasmer via a remote install script (curl … | sh) without pinning a version or verifying the downloaded artifact. That reduces reproducibility and increases supply-chain risk for CI. Consider pinning a specific Wasmer version (e.g., via release tarball + checksum) or installing from a trusted package source.
Suggested change:

    - RUN curl https://get.wasmer.io -sSfL | sh
    - ENV PATH="/root/.wasmer/bin:${PATH}"
    + ARG WASMER_VERSION="4.2.4"
    + ARG WASMER_SHA256="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
    + RUN curl -sSfL "https://github.com/wasmerio/wasmer/releases/download/v${WASMER_VERSION}/wasmer-linux-amd64.tar.gz" -o /tmp/wasmer.tar.gz \
    +     && echo "${WASMER_SHA256} /tmp/wasmer.tar.gz" | sha256sum -c - \
    +     && tar -xzf /tmp/wasmer.tar.gz -C /usr/local --strip-components=1 \
    +     && rm /tmp/wasmer.tar.gz
    + ENV PATH="/usr/local/bin:${PATH}"

    # -t edgeparse-wasi-wasix .
    # ─────────────────────────────────────────────────────────────────────────────

    FROM edgeparse-wasi-base

EDGEPARSE_PREFIX support in the manager script changes the base image tag, but this Dockerfile uses FROM edgeparse-wasi-base unconditionally. Overriding the prefix will break builds because Docker won’t find the base image under the hard-coded name. Consider parameterizing the base image name via an ARG and passing it from wasm-test.sh.
Suggested change:

    - FROM edgeparse-wasi-base
    + ARG EDGEPARSE_BASE_IMAGE=edgeparse-wasi-base
    + FROM ${EDGEPARSE_BASE_IMAGE}

    cmd_rmi() {
        local target="${1:-all}"
        if [ "${target}" = "all" ]; then
            log "Removing all edgeparse test images..."
            for img in ${ALL_IMAGES}; do
                docker rmi -f "${img}" 2>/dev/null && dim " removed ${img}" || true
            done
        else
            local img
            img=$(image_name_for "${target}")
            docker rmi -f "${img}" 2>/dev/null && dim " removed ${img}" || warn "Image ${img} not found"
        fi

The script advertises rmi [all|<image>], but the implementation only accepts the logical runtime target names understood by image_name_for(). Passing an actual image name (or build targets like wasix-bin) will error as “Unknown target”. Either update the help/usage to reflect this, or accept raw image names in cmd_rmi when they don't match a known runtime.
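One way to take the second option is a small fallback in front of `docker rmi`. This is a sketch under assumptions: the `image_name_for` shown here is a reduced stand-in covering only three targets, it returns non-zero on an unknown name (the real function may exit instead), and `resolve_rmi_target` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch only: this image_name_for is a reduced stand-in for the real one,
# and resolve_rmi_target is a hypothetical helper.
set -eu

PREFIX="${EDGEPARSE_PREFIX:-edgeparse}"

# Map a logical target to its image tag (subset shown; mapping is assumed).
image_name_for() {
    case "$1" in
        wamr|wasmedge|wasix) echo "${PREFIX}-wasi-$1" ;;
        *) return 1 ;;
    esac
}

# Known target names are mapped to image tags; anything else is treated as
# a raw image name and passed through unchanged.
resolve_rmi_target() {
    image_name_for "$1" 2>/dev/null || echo "$1"
}

resolve_rmi_target wamr
resolve_rmi_target "${PREFIX}-wasi-base"
```

`cmd_rmi` could then call `docker rmi -f "$(resolve_rmi_target "${target}")"`, so both `rmi wamr` and `rmi <image>` would match the advertised usage.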
@copilot open a new pull request to apply changes based on the comments in this thread

AI Assistant: 🖖 Copilot raised the same set of concerns across all 6 PRs (same codebase). I've posted a detailed analysis on PR #20, accepting 9 items and explaining the reasoning behind 5 where our approach has good justifications.

TL;DR of accepted fixes (will push to all branches):

@raphaelmansuy, happy to push these fixes directly to all 6 PRs rather than opening new ones. What's your preference? 🤖 AI+👤🤝
AI Assistant: Applied accepted fixes from Copilot PR review.

* Fix `set -e` blocking failure aggregation: wrap `docker run` in if/else to capture exit code without aborting the script
* Fix test counter: track assertions vs tests separately, summary now shows "8 assertions across 7 tests" instead of misleading "8/7"
* Add missing `rm -f` for stale JSON artifact before Test 3
* Split stable vs experimental runtimes: `test all` now runs only green runtimes (6/6); use `test experimental` for WIP targets
* Update `--` separator comment: document per-runtime behavior (wasmtime=no, wasmer/wasix/libriscv/ckb-vm=yes)
* Add `wasix-build` to `image_name_for` (was missing mapping)
* Remove WASIX build from default `build all` path (unused by runners)
* Fix `xargs -r` (GNU-only) with portable container check
* Fix extra space in wasmedge `--dir` command
* Update help text with all targets including experimental

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
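The `set -e` fix in the first bullet can be sketched as follows. This is an illustration of the general if/else pattern, not the committed code: `run_and_record` is a hypothetical helper, and `true` and `sh -c 'exit 3'` stand in for the real `docker run` invocations.

```shell
#!/usr/bin/env bash
# Sketch only: run_and_record is hypothetical; the stand-in commands below
# replace the `docker run ...` calls a real manager script would make.
set -eu

passed=0
failed=0
failed_runtimes=""

# Run one runtime's test command and record the outcome. A command used as
# an `if` condition does not trigger `set -e`, so a failure is counted
# instead of aborting the script.
run_and_record() {
    local rt="$1"; shift
    local rc
    if "$@"; then
        rc=0
    else
        rc=$?                      # exit status of the failed condition
    fi
    if [ "${rc}" -eq 0 ]; then
        passed=$((passed + 1))
    else
        failed=$((failed + 1))
        failed_runtimes="${failed_runtimes} ${rt}(exit ${rc})"
    fi
}

run_and_record wasmtime true
run_and_record spike sh -c 'exit 3'
run_and_record riscv-qemu true

echo "passed=${passed} failed=${failed} failed:${failed_runtimes}"
```

A real script would typically finish by exiting non-zero when `failed` is greater than 0, so CI still sees the failure after every runtime has been attempted.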
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Summary
Greg's AI coding buddy here 🖖 — cross-compiled edgeparse to riscv64gc, fired up QEMU user-mode, and watched PDF parsing happen on an architecture that didn't exist when most of us started coding. The future is now.
Part of #19 — edgeparse runs everywhere & sandboxes securely
Test plan
- `./tests/wasm-runtimes/wasm-test.sh test riscv-qemu` → 8/8 pass

🤖 Generated with Claude Code