Commit 719de10

chore: remove tart-based test-vm targets; L4 now runs on github actions
test-vm, test-vm-run, test-vm-parallel and their VM_A/VM_B variables are deleted. L4 destructive e2e now runs on macos-14 GitHub Actions runners (vm-e2e-spike.yml) — each job gets a fresh macOS VM at no extra cost for public repos. Kept test-vm-inner / test-vm-inner-run for local use and as the commands the CI workflow invokes. Updated: Makefile, CLAUDE.md, CONTRIBUTING.md, AGENTS.md, HARNESS.md, auto-release.yml checklist.
1 parent 3816385 commit 719de10

7 files changed

Lines changed: 31 additions & 73 deletions

.github/workflows/auto-release.yml

Lines changed: 3 additions & 3 deletions
@@ -176,13 +176,13 @@ jobs:
 **not** auto-tag. Run the destructive e2e suite locally, then
 cut the release manually:
 
-- [ ] \`make test-vm-parallel\` passes (Apple Silicon + Tart required — see scripts/vm/README.md)
+- [ ] L4 CI (\`vm-e2e-spike.yml\`) is green on the latest commit on \`main\`
 - [ ] sanity-check the curl|bash smoke and cli-compat results in the most recent test.yml run on main
 - [ ] \`git tag -a ${NEW_TAG} -m "..."\` and \`git push origin ${NEW_TAG}\`
 - [ ] close this issue
 
-Skipping \`make test-vm-parallel\` is allowed (it is not a hard gate),
-but \`feat:\` changes carry more risk than \`fix:\` patches.
+L4 CI is not yet a hard merge gate, but \`feat:\` changes carry
+more risk than \`fix:\` patches — verify it before tagging.
 EOF
 )
 
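
Note on the new checklist item: "L4 CI is green on the latest commit on `main`" can be checked from a terminal. The following is a minimal sketch, not something this commit adds. It assumes the GitHub CLI (`gh`) is installed and authenticated, that `origin/main` is up to date locally, and that `vm-e2e-spike.yml` produces runs whose head branch is `main`. The function names `l4_verdict` and `check_l4_ci` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: decide whether the newest L4 run on main is green for the current tip.
# Assumes `gh` is installed and authenticated; helper names are hypothetical.

# Pure decision logic: the run must belong to the wanted commit and have succeeded.
l4_verdict() {
  local want_sha="$1" run_sha="$2" conclusion="$3"
  if [ "$run_sha" != "$want_sha" ]; then
    echo "stale"; return 1
  fi
  if [ "$conclusion" != "success" ]; then
    echo "not-green"; return 1
  fi
  echo "green"
}

# Fetch the newest run of the workflow on main and feed it to l4_verdict.
check_l4_ci() {
  local want_sha run run_sha conclusion
  want_sha="$(git rev-parse origin/main)" || return 2
  run="$(gh run list --workflow vm-e2e-spike.yml --branch main --limit 1 \
    --json headSha,conclusion \
    --jq '.[0] | "\(.headSha) \(.conclusion)"')" || return 2
  read -r run_sha conclusion <<<"$run"
  l4_verdict "$want_sha" "$run_sha" "$conclusion"
}
```

Calling `check_l4_ci` from the repository root prints `green`, `stale`, or `not-green`. A run that is still in progress has an empty conclusion and is reported as `not-green`.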

.github/workflows/vm-e2e-spike.yml

Lines changed: 2 additions & 4 deletions
@@ -6,8 +6,7 @@ on:
 workflow_dispatch:
 
 jobs:
-# Mirrors VM_A in make test-vm-parallel:
-# long-running journey tests that install packages and modify system state.
+# Group A: long-running journey tests that install packages and modify system state.
 group-a:
 runs-on: macos-14
 timeout-minutes: 60
@@ -28,8 +27,7 @@ jobs:
 -run 'TestVM_Journey_FirstTimeUser|TestVM_Journey_DryRunIsCompletelySafe|TestVM_Interactive_InstallScript' \
 ./test/e2e/...
 
-# Mirrors VM_B in make test-vm-parallel:
-# dotfiles, macOS defaults, edge cases, sync, and non-destructive e2e.
+# Group B: dotfiles, macOS defaults, edge cases, sync, and non-destructive e2e.
 group-b:
 runs-on: macos-14
 timeout-minutes: 60
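
Note on reproducing a CI group locally: with `VM_A_TESTS` gone from the Makefile, the Group A selection lives only in the workflow's `-run` pattern. The sketch below rebuilds that pattern and prints the matching `make test-vm-inner-run` command. It assumes `TEST` is passed straight to `go test -run`, as the kept Makefile target does. `join_tests` is a hypothetical helper, not part of the repo.

```shell
#!/usr/bin/env bash
# Sketch: rebuild the Group A test selection shown in vm-e2e-spike.yml.
# join_tests is a hypothetical helper; it joins its arguments with '|'.
join_tests() {
  local IFS='|'
  echo "$*"
}

GROUP_A="$(join_tests \
  TestVM_Journey_FirstTimeUser \
  TestVM_Journey_DryRunIsCompletelySafe \
  TestVM_Interactive_InstallScript)"

# Print the command instead of running it: these tests are destructive.
echo "make test-vm-inner-run TEST='${GROUP_A}'"
```

The printed command should only be run on a throwaway machine, since the selected tests install real packages.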

AGENTS.md

Lines changed: 2 additions & 2 deletions
@@ -81,8 +81,8 @@ These are loaded automatically when Claude runs in this repo.
 - `git push --force` against `main` or release tags.
 - `git commit --amend` on commits already pushed.
 - `git reset --hard` discarding uncommitted work.
-- Running `make test-vm` (or any other `test-vm-*` target) outside an ephemeral
-VM — these install real packages.
+- Running `make test-vm-inner` (or `test-vm-inner-run`) outside a throwaway
+machine — these install real packages onto the current host.
 - Anything that modifies the user's `~/.zshrc`, Homebrew install, or
 macOS `defaults`.
 

CLAUDE.md

Lines changed: 2 additions & 2 deletions
@@ -24,8 +24,8 @@ make build-release VERSION=0.25.0 # optimized + UPX
 # Test — full tier table in CONTRIBUTING.md
 make test-unit # L1 (~75s) — unit + integration + contract; pre-push hook
 make test-e2e # L3 compiled binary
-make test-vm-parallel # L4 (~14 min) — destructive e2e in 2 parallel Tart VMs; before tagging
-make test-vm # L4 serial fallback — single VM, use for debugging
+make test-vm-inner # L4 — full destructive e2e suite (runs in CI on macos-14; locally on a spare machine only)
+make test-vm-inner-run TEST=Foo # L4 — single test
 make test-coverage # coverage.out + coverage.html
 
 # Single test

CONTRIBUTING.md

Lines changed: 7 additions & 20 deletions
@@ -37,37 +37,24 @@ Tests are split across four tiers. Which one runs where:
 | **L1 Unit + Integration + Contract** | Pure-Go logic with faked `Runner` *plus* real `brew` / `git` / `npm` against temp dirs and real `httptest` servers | `make test-unit` (~75s) | Every push (pre-push hook); CI on push/PR |
 | **L2 Contract schema** | JSON schema validation against [openboot-contract](https://github.com/openbootdotdev/openboot-contract) | (runs in CI only) | CI on push/PR |
 | **L3 E2E binary** | Compiled binary driven by scripts; `-tags=e2e` | `make test-e2e` | CI on release |
-| **L4 VM e2e** | Full destructive suite (`-tags="e2e,vm"`) runs inside an ephemeral Tart VM provisioned by `scripts/vm/run.sh`. Installs real packages, modifies `~/.zshrc`, writes `defaults` — all contained to the throwaway VM. | `make test-vm-parallel` (~14 min, Apple Silicon + Tart required); `make test-vm` is the serial single-VM fallback | **Local only**: convention is to run before tagging a release. No CI gate. |
+| **L4 VM e2e** | Full destructive suite (`-tags="e2e,vm"`). Installs real packages, modifies `~/.zshrc`, writes `defaults`. Each run requires a clean macOS host (Apple Silicon). | `make test-vm-inner` (or single test: `make test-vm-inner-run TEST=Foo`) | **CI**: runs on GitHub Actions `macos-14` runner (every PR via `vm-e2e-spike.yml`). Locally only on a throwaway machine. |
 
 Rules of thumb:
 
 - **Local dev:** run nothing manually if hooks are installed. `make test-unit` on demand when you want a sanity check. Skip L2+ unless you're cutting a release.
 - **Before pushing:** `make test-unit` (the pre-push hook does this automatically). Requires `brew` / `git` / `npm` on PATH — they are queried read-only against temp dirs, no real installs.
-- **Before tagging a release (convention, not enforced):** `make test-vm-parallel` on an Apple Silicon Mac with Tart installed. See [VM E2E setup](#vm-e2e-setup) below. `auto-release.yml` opens a `release-ready` issue on `feat:` thresholds to nudge you here.
+- **Before tagging a release:** check that the L4 CI job (`vm-e2e-spike.yml`) is green on the latest commit on `main`. `auto-release.yml` opens a `release-ready` issue on `feat:` thresholds to nudge you here.
 
-## VM E2E setup
+## VM E2E
 
-Destructive tests (L4) run inside an ephemeral Tart VM. One-time setup
-on an Apple Silicon Mac:
+L4 tests run on GitHub Actions (`macos-14` runner, Apple Silicon). Each job
+gets a fresh macOS VM — no local setup required.
 
 ```bash
-brew install cirruslabs/cli/tart
-tart pull ghcr.nju.edu.cn/cirruslabs/tahoe-base:latest
-tart clone ghcr.nju.edu.cn/cirruslabs/tahoe-base:latest tahoe-base
+make test-vm-inner # full suite (use on a throwaway machine only)
+make test-vm-inner-run TEST=TestVM_Journey_FirstTimeUser # single test
 ```
 
-Then:
-
-```bash
-make test-vm-parallel # full suite in parallel (~14 min)
-make test-vm # full suite serial fallback (~30 min)
-make test-vm-run TEST=TestVM_Journey_FirstTimeUser # one test
-OPENBOOT_VM_KEEP=1 make test-vm-parallel # don't destroy VMs at exit (debug)
-```
-
-See `scripts/vm/README.md` for full environment-variable docs and
-troubleshooting.
-
 ## Git Hooks
 
 `make install-hooks` symlinks two hooks from `scripts/hooks/` into `.git/hooks/`:

Makefile

Lines changed: 7 additions & 33 deletions
@@ -1,12 +1,7 @@
 .PHONY: test-unit test-e2e test-coverage test-all \
-test-vm test-vm-run test-vm-parallel test-vm-inner test-vm-inner-run \
+test-vm-inner test-vm-inner-run \
 install-hooks uninstall-hooks
 
-# VM A: install/journey tests that touch real system state (longest-running).
-VM_A_TESTS := TestVM_Journey_FirstTimeUser|TestVM_Journey_DryRunIsCompletelySafe|TestVM_Interactive_InstallScript
-# VM B: all other VM tests — dotfiles, macOS, edge cases, smoke, real-install, sync.
-VM_B_TESTS := TestVM_Journey_Dotfiles|TestVM_Journey_MacOS|TestVM_Journey_FullSetupConfiguresEverything|TestVM_Edge_|TestSmoke_|TestE2E_
-
 BINARY_NAME=openboot
 BINARY_PATH=./$(BINARY_NAME)
 VERSION ?= dev
@@ -34,38 +29,17 @@ test-all:
 	$(MAKE) test-coverage
 
 # =============================================================================
-# Tart VM e2e — destructive tests run inside a throwaway Tart VM provisioned
-# by scripts/vm/run.sh. Files tagged `e2e,vm` run via `make test-vm-inner`;
-# files tagged `e2e && !vm` (auth, snapshot_api) run as L3 on the host.
-#
-# Requires Apple Silicon + Tart installed locally. See scripts/vm/README.md
-# for one-time setup. The relevant targets are defined immediately below
-# this header: test-vm, test-vm-run, test-vm-inner, test-vm-inner-run.
+# L4 VM e2e — destructive tests tagged `e2e,vm`. Run directly on any clean
+# macOS host (Apple Silicon). In CI this is a GitHub Actions macos-14 runner
+# (see .github/workflows/vm-e2e-spike.yml). Locally, run on a throwaway
+# machine or a Tart VM — do NOT run on your primary dev machine.
 # =============================================================================
 
-# Developer-facing: provisions a Tart VM and runs the full e2e suite inside.
-test-vm: build
-	scripts/vm/run.sh test-vm-inner
-
-# Developer-facing: runs one named test inside a Tart VM.
-test-vm-run: build
-	scripts/vm/run.sh "test-vm-inner-run TEST=$(TEST)"
-
-# Developer-facing: runs e2e in two parallel VMs — VM A (system tests) and
-# VM B (mock-server tests). Requires ~16 GB RAM and 8 cores free.
-# Exit code is non-zero if either VM fails.
-test-vm-parallel: build
-	@OPENBOOT_VM_TEST='$(VM_A_TESTS)' scripts/vm/run.sh test-vm-inner & PID_A=$$!; \
-	OPENBOOT_VM_TEST='$(VM_B_TESTS)' scripts/vm/run.sh test-vm-inner & PID_B=$$!; \
-	A_EXIT=0; B_EXIT=0; \
-	wait $$PID_A || A_EXIT=$$?; \
-	wait $$PID_B || B_EXIT=$$?; \
-	[ $$A_EXIT -eq 0 ] && [ $$B_EXIT -eq 0 ]
-
-# In-VM: invoked over SSH by run.sh — not called by developers directly.
+# Run the full L4 suite (same command CI uses).
 test-vm-inner:
 	go test -v -timeout 60m -tags="e2e,vm" ./test/e2e/...
 
+# Run a single L4 test by name: make test-vm-inner-run TEST=TestVM_Journey_FirstTimeUser
 test-vm-inner-run:
 	go test -v -timeout 45m -tags="e2e,vm" -run '$(TEST)' ./test/e2e/...
 
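
Note on the "do NOT run on your primary dev machine" warning: after this change `test-vm-inner` runs directly on whatever host invokes it, with no Tart VM in between. One way to make accidental local runs harder is a small wrapper that refuses to start unless it is on a GitHub Actions runner or the caller opts in explicitly. This is a sketch under assumptions, not part of the commit: `OPENBOOT_ALLOW_DESTRUCTIVE` and both function names are hypothetical. `GITHUB_ACTIONS=true` is set by GitHub Actions on its runners.

```shell
#!/usr/bin/env bash
# Sketch: refuse to start the destructive L4 suite on an unintended host.
# OPENBOOT_ALLOW_DESTRUCTIVE and both function names are hypothetical.

allow_destructive() {
  # GitHub Actions sets GITHUB_ACTIONS=true on its runners.
  [ "${GITHUB_ACTIONS:-}" = "true" ] && return 0
  # Explicit local opt-in, intended for a throwaway machine or VM.
  [ "${OPENBOOT_ALLOW_DESTRUCTIVE:-}" = "1" ] && return 0
  return 1
}

run_l4() {
  if ! allow_destructive; then
    echo "refusing to run L4: set OPENBOOT_ALLOW_DESTRUCTIVE=1 on a throwaway machine" >&2
    return 1
  fi
  make test-vm-inner
}
```

On a spare machine the opt-in would look like `OPENBOOT_ALLOW_DESTRUCTIVE=1 run_l4`. Without it, `run_l4` exits non-zero before `make` is invoked.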

docs/HARNESS.md

Lines changed: 8 additions & 9 deletions
@@ -48,9 +48,9 @@ Three regulation categories:
 | Behav. | L1 unit + integration + contract (faked runners *and* real brew/git/npm in temp dirs) | pre-push, CI | `make test-unit` |
 | Behav. | L2 contract schema (against openboot-contract repo) | CI | `.github/workflows/test.yml` `contract` job |
 | Behav. | L3 e2e binary | release | `make test-e2e` |
-| Behav. | L4 VM e2e (`vm`) — runs full destructive suite in a local Tart VM | local only (convention is pre-release; no CI gate) | `make test-vm-parallel` (~14 min, driver: `scripts/vm/run.sh`); `make test-vm` is the serial single-VM fallback |
+| Behav. | L4 VM e2e (`vm`) — full destructive suite on a clean macOS host | every PR | `.github/workflows/vm-e2e-spike.yml` (macos-14 runner, two parallel jobs); `make test-vm-inner` for local runs |
 | Behav. | curl\|bash smoke (install.sh + mock server) | every PR | `.github/workflows/test.yml` `curl-bash-smoke` job |
-| Behav. | Auto-release sensor — patch fast lane (`fix:`-only) auto-tags + dispatches `release.yml`; feat threshold opens a `release-ready` issue with a `make test-vm-parallel` checklist instead | push to `main` | `.github/workflows/auto-release.yml` |
+| Behav. | Auto-release sensor — patch fast lane (`fix:`-only) auto-tags + dispatches `release.yml`; feat threshold opens a `release-ready` issue (check L4 CI green, then tag manually) | push to `main` | `.github/workflows/auto-release.yml` |
 | Behav. | Release notes — Conventional Commits since previous tag, grouped by type (Features / Bug Fixes / etc) + Full Changelog link, appended to the install-instructions template | tag push or `workflow_dispatch` | `.github/workflows/release.yml` (`Write release notes` step) |
 | Behav. | Old-CLI compat (previous release × current mock server) | every PR | `.github/workflows/test.yml` `cli-compat` job |
 | Feedfwd. | Agent conventions | every AI turn | `CLAUDE.md`, `AGENTS.md` |
@@ -110,13 +110,12 @@ it survives doc rot.
 to the inline `\r\033[K` renderer when unavailable. A static rule can't
 see runtime terminal capabilities, so this stays a runtime concern. The
 fallback is covered by `TestStickyProgressFallsBackWhenScrollRegionUnsupported`.
-- **No CI gate for VM e2e.** Apple Silicon Tart VMs don't run on
-GitHub-hosted `macos-latest` runners (no nested virt, wrong arch
-guarantees), and we declined to set up a self-hosted runner. L4 is
-local-only. Running `make test-vm-parallel` before tagging is convention,
-encoded as a `release-ready` issue opened by `auto-release.yml`
-on `feat:` thresholds — not a hard gate. A human can release without
-it.
+- **L4 runs on GitHub Actions, not a self-hosted runner.** `macos-14`
+runners are Apple Silicon VMs — each job gets a fresh clean macOS
+environment, which is exactly what L4 needs. Tart is no longer required.
+The L4 workflow (`vm-e2e-spike.yml`) is not yet a hard merge gate (not in
+`required-checks.txt`); it runs on every PR. Promoting it to a required
+check is the next step once the workflow has proven stable.
 
 ## How agents should think about this file
 