rivet-dev · NathanFlurry · May 4, 2026 · May 2, 2026
diff --git a/.claude/skills/driver-test-runner/SKILL.md b/.claude/skills/driver-test-runner/SKILL.md
@@ -1,12 +1,12 @@
 ---
 name: driver-test-runner
-description: Methodically run the RivetKit driver test suite file by file, tracking progress in .agent/notes/driver-test-progress.md. Use when you need to validate the driver test suite after changes, bring up a new driver, or debug test failures systematically.
+description: Methodically run the RivetKit driver test suite file by file across the native (NAPI) and wasm runtimes, tracking progress in .agent/notes/driver-test-progress.md. Use when you need to validate the driver test suite after changes, bring up a new driver, or debug test failures systematically.
 allowed-tools: Bash, Read, Write, Edit, Grep, Glob, Agent, TaskCreate, TaskUpdate
 ---
 
 # Driver Test Suite Runner
 
-Methodically run the RivetKit driver test suite one file group at a time, tracking progress in `.agent/notes/driver-test-progress.md`.
+Methodically run the RivetKit driver test suite one file group at a time across the native (NAPI) and wasm runtimes, tracking progress in `.agent/notes/driver-test-progress.md`.
 
 ## Arguments
 
@@ -16,10 +16,27 @@ The skill accepts optional arguments:
 - **`resume`** — Pick up from where we left off (default behavior).
 - **`only <file>`** — Run only a specific test file group (e.g., `only actor-conn`).
 - **`from <file>`** — Start from a specific file group, skipping earlier ones.
+- **`runtime <native|wasm|both>`** — Which runtime(s) to run (default: `both`).
 - **`encoding <bare|cbor|json>`** — Override encoding (default: `bare`).
-- **`client <http|inline>`** — Override client type (default: `http`).
 - **`registry <static>`** — Override registry type (default: `static`).
 
+## Runtime Matrix
+
+The driver suite runs over a runtime × SQLite-backend × encoding matrix defined in `rivetkit-typescript/packages/rivetkit/tests/driver/shared-matrix.ts`. The runtime dimension has two values:
+
+- **`native`** — NAPI bindings (`@rivetkit/rivetkit-napi`). Pairs with `sqlite=local` (the in-process SQLite VFS). This is the default when no env override is set.
+- **`wasm`** — WebAssembly bindings (`@rivetkit/rivetkit-wasm`). Wasm **cannot** use local SQLite; it must pair with `sqlite=remote`, which executes SQL through the engine over the wire. Setting `RIVETKIT_DRIVER_TEST_RUNTIME=wasm` with `RIVETKIT_DRIVER_TEST_SQLITE=local` fails fast.
+
+The skill defaults to running each test file twice: once on `native/local` and once on `wasm/remote`, each at `encoding=bare`. A file is checked off only when both runtimes pass.
+
+Env overrides recognized by the test harness:
+
+- `RIVETKIT_DRIVER_TEST_RUNTIME` — comma-separated subset of `native,wasm`.
+- `RIVETKIT_DRIVER_TEST_SQLITE` — comma-separated subset of `local,remote`.
+- `RIVETKIT_DRIVER_TEST_ENCODING` — comma-separated subset of `bare,cbor,json`.
+
+When **any** of these env vars is set, the inner describe block name changes from `encoding (<encoding>)` to `runtime (<runtime>) / sqlite (<backend>) / encoding (<encoding>)`. The skill always sets the env vars, so the longer form is always what `-t` must match.
+
 ## How It Works
 
 ### 0. Anchor the reference before fixing parity bugs
@@ -31,9 +48,35 @@ If a RivetKit driver test fails because native or Rust behavior diverges from th
 3. Patch native/Rust to match the original TypeScript behavior.
 4. Rerun the same TypeScript driver test before adding any lower-level native tests.
 
+If a test passes on `native` but fails on `wasm` (or vice versa), the divergence is in the runtime adapter (`packages/rivetkit/src/registry/wasm-runtime.ts` or `napi-runtime.ts`) or in `rivetkit-core`'s wasm/native feature gates — not in user-facing actor code.
+
 Native unit tests are allowed only after the failing TypeScript driver test has reproduced the bug and after the fix is validated against that same TypeScript driver test.
 
-### 1. Ensure the engine is running
+### 1. Ensure runtime artifacts are built
+
+Both runtime adapters need their build outputs on disk before the suite can load them. A fresh checkout, a Rust edit under `packages/rivetkit-napi` / `packages/sqlite-native`, or any change under `packages/rivetkit-wasm` invalidates these.
+
+**NAPI (`@rivetkit/rivetkit-napi`)** — produces a platform-specific `.node` next to `package.json`:
+
+```bash
+ls rivetkit-typescript/packages/rivetkit-napi/*.node 2>/dev/null
+# missing? rebuild:
+pnpm --filter @rivetkit/rivetkit-napi run build:force
+```
+
+After Rust changes, always use `build:force` (per `rivetkit-typescript/CLAUDE.md`); the non-`:force` variant can skip the rebuild and leave the suite running against a stale `.node`.
+
+**Wasm (`@rivetkit/rivetkit-wasm`)** — produces `packages/rivetkit-wasm/pkg/rivetkit_wasm.{js,wasm,d.ts}`:
+
+```bash
+ls rivetkit-typescript/packages/rivetkit-wasm/pkg/rivetkit_wasm.wasm 2>/dev/null
+# missing? rebuild (uses the package-pinned wasm-pack, do not use npx):
+pnpm --filter @rivetkit/rivetkit-wasm run build
+```
+
+Skip the wasm build only if `runtime` is `native` and you're certain the wasm fixture path won't be loaded. With the default `runtime=both`, the wasm build is always required.
+
+### 2. Ensure the engine is running
 
 Before running any tests, check if the RocksDB engine is already running:
 
@@ -49,93 +92,124 @@ If it's not running, start it:
 
 Wait for health check to pass (poll every 2 seconds, up to 60 seconds).
 
-### 2. Initialize or load progress file
+### 3. Initialize or load progress file
+
+The progress file lives at `.agent/notes/driver-test-progress.md`. If it doesn't exist or `reset` was passed, create it with the template below. If it exists and `resume` applies (the default), read it and pick up from the first file with an unchecked runtime box.
 
-The progress file lives at `.agent/notes/driver-test-progress.md`. If it doesn't exist or `reset` was passed, create it with the template below. If it exists and `resume` was passed, read it and pick up from the first unchecked file.
+Each file row gets two checkboxes — one for each runtime. Check off a runtime independently as soon as it passes, and only advance to the next file when both runtimes for the current file are checked.
 
 Progress file template:
 
 ```markdown
 # Driver Test Suite Progress
 
 Started: <timestamp>
-Config: registry (static), client type (http), encoding (bare)
+Config: registry (static), encoding (bare), runtimes (native, wasm)
+
+Each row: `[native] [wasm] <file> | <suite description>`
 
 ## Fast Tests
 
-- [ ] manager-driver | Manager Driver Tests
-- [ ] actor-conn | Actor Connection Tests
-- [ ] actor-conn-state | Actor Connection State Tests
-- [ ] conn-error-serialization | Connection Error Serialization Tests
-- [ ] actor-destroy | Actor Destroy Tests
-- [ ] request-access | Request Access in Lifecycle Hooks
-- [ ] actor-handle | Actor Handle Tests
-- [ ] action-features | Action Features Tests
-- [ ] access-control | access control
-- [ ] actor-vars | Actor Variables
-- [ ] actor-metadata | Actor Metadata Tests
-- [ ] actor-onstatechange | Actor State Change Tests
-- [ ] actor-db | Actor Database
-- [ ] actor-db-raw | Actor Database Raw Tests
-- [ ] actor-workflow | Actor Workflow Tests
-- [ ] actor-error-handling | Actor Error Handling Tests
-- [ ] actor-queue | Actor Queue Tests
-- [ ] actor-kv | Actor KV Tests
-- [ ] actor-stateless | Actor Stateless Tests
-- [ ] raw-http | raw http
-- [ ] raw-http-request-properties | raw http request properties
-- [ ] raw-websocket | raw websocket
-- [ ] actor-inspector | Actor Inspector Tests
-- [ ] gateway-query-url | Gateway Query URL Tests
-- [ ] actor-db-pragma-migration | Actor Database Pragma Migration
-- [ ] actor-state-zod-coercion | Actor State Zod Coercion
-- [ ] actor-conn-status | Connection Status Changes
-- [ ] gateway-routing | Gateway Routing
-- [ ] lifecycle-hooks | Lifecycle Hooks
+- [ ] [ ] manager-driver | Manager Driver Tests
+- [ ] [ ] actor-conn | Actor Connection Tests
+- [ ] [ ] actor-conn-state | Actor Connection State Tests
+- [ ] [ ] conn-error-serialization | Connection Error Serialization Tests
+- [ ] [ ] actor-destroy | Actor Destroy Tests
+- [ ] [ ] request-access | Request Access in Lifecycle Hooks
+- [ ] [ ] actor-handle | Actor Handle Tests
+- [ ] [ ] action-features | Action Features Tests
+- [ ] [ ] access-control | access control
+- [ ] [ ] actor-vars | Actor Variables
+- [ ] [ ] actor-metadata | Actor Metadata Tests
+- [ ] [ ] actor-onstatechange | Actor State Change Tests
+- [ ] [ ] actor-db | Actor Database
+- [ ] [ ] actor-db-raw | Actor Database Raw Tests
+- [ ] [ ] actor-db-init-order | Actor Db Init Order
+- [ ] [ ] actor-workflow | Actor Workflow Tests
+- [ ] [ ] actor-error-handling | Actor Error Handling Tests
+- [ ] [ ] actor-queue | Actor Queue Tests
+- [ ] [ ] actor-kv | Actor KV Tests
+- [ ] [ ] actor-stateless | Actor Stateless Tests
+- [ ] [ ] raw-http | raw http
+- [ ] [ ] raw-http-request-properties | raw http request properties
+- [ ] [ ] raw-websocket | raw websocket
+- [ ] [ ] actor-inspector | Actor Inspector Tests
+- [ ] [ ] gateway-query-url | Gateway Query URL Tests
+- [ ] [ ] actor-db-pragma-migration | Actor Database Pragma Migration
+- [ ] [ ] actor-state-zod-coercion | Actor State Zod Coercion
+- [ ] [ ] actor-conn-status | Connection Status Changes
+- [ ] [ ] gateway-routing | Gateway Routing
+- [ ] [ ] lifecycle-hooks | Lifecycle Hooks
+- [ ] [ ] serverless-handler | Serverless Handler Tests
 
 ## Slow Tests
 
-- [ ] actor-state | Actor State Tests
-- [ ] actor-schedule | Actor Schedule Tests
-- [ ] actor-sleep | Actor Sleep Tests
-- [ ] actor-sleep-db | Actor Sleep Database Tests
-- [ ] actor-lifecycle | Actor Lifecycle Tests
-- [ ] actor-conn-hibernation | Actor Connection Hibernation Tests
-- [ ] actor-run | Actor Run Tests
-- [ ] hibernatable-websocket-protocol | hibernatable websocket protocol
-- [ ] actor-db-stress | Actor Database Stress Tests
+- [ ] [ ] actor-state | Actor State Tests
+- [ ] [ ] actor-save-state | Actor Save State Tests
+- [ ] [ ] actor-schedule | Actor Schedule Tests
+- [ ] [ ] actor-sleep | Actor Sleep Tests
+- [ ] [ ] actor-sleep-db | Actor Sleep Database Tests
+- [ ] [ ] actor-lifecycle | Actor Lifecycle Tests
+- [ ] [ ] actor-conn-hibernation | Actor Connection Hibernation Tests
+- [ ] [ ] actor-run | Actor Run Tests
+- [ ] [ ] hibernatable-websocket-protocol | hibernatable websocket protocol
+- [ ] [ ] actor-db-stress | Actor Database Stress Tests
 
 ## Excluded
 
-- [ ] actor-agent-os | Actor agentOS Tests (skip unless explicitly requested)
+- [ ] [ ] actor-agent-os | Actor agentOS Tests (skip unless explicitly requested)
 
 ## Log
 ```
 
-### 3. Run tests file by file
+### 4. Run tests file by file
+
+For each row that still has an unchecked runtime box, in order, run the runtimes selected by the `runtime` arg (default `both`). For each runtime:
 
-For each unchecked file in order:
+**a) Pick the runtime/sqlite pair:**
 
-**a) Build the filter command:**
+| Runtime | SQLite backend |
+|---------|----------------|
+| native  | local          |
+| wasm    | remote         |
 
-Each suite now lives in its own file under `rivetkit-typescript/packages/rivetkit/tests/driver/<file>.test.ts`. The describe block nesting is:
+**b) Build the filter command:**
+
+Each suite lives in its own file under `rivetkit-typescript/packages/rivetkit/tests/driver/<file>.test.ts`. With env overrides set, the describe block nesting is:
 
 ```
-<Outer Suite> > static registry > encoding (<encoding>) > <Suite Description>
+<Outer Suite> > static registry > runtime (<runtime>) / sqlite (<backend>) / encoding (<encoding>) > <Suite Description>
 ```
 
-There is no longer a `Driver Tests` or `client type (http)` layer.
+Base command (native):
 
-Base command:
+```bash
+cd rivetkit-typescript/packages/rivetkit && \
+  RIVETKIT_DRIVER_TEST_RUNTIME=native \
+  RIVETKIT_DRIVER_TEST_SQLITE=local \
+  RIVETKIT_DRIVER_TEST_ENCODING=bare \
+  pnpm test tests/driver/<FILE>.test.ts \
+    -t "static registry.*runtime \\(native\\) / sqlite \\(local\\) / encoding \\(bare\\).*<SUITE_DESCRIPTION>" \
+    > /tmp/driver-test-current.log 2>&1
+echo "EXIT: $?"
+```
+
+Base command (wasm):
 
 ```bash
-cd rivetkit-typescript/packages/rivetkit && pnpm test tests/driver/<FILE>.test.ts -t "static registry.*encoding \\(bare\\).*<SUITE_DESCRIPTION>" > /tmp/driver-test-current.log 2>&1
+cd rivetkit-typescript/packages/rivetkit && \
+  RIVETKIT_DRIVER_TEST_RUNTIME=wasm \
+  RIVETKIT_DRIVER_TEST_SQLITE=remote \
+  RIVETKIT_DRIVER_TEST_ENCODING=bare \
+  pnpm test tests/driver/<FILE>.test.ts \
+    -t "static registry.*runtime \\(wasm\\) / sqlite \\(remote\\) / encoding \\(bare\\).*<SUITE_DESCRIPTION>" \
+    > /tmp/driver-test-current.log 2>&1
 echo "EXIT: $?"
 ```
 
-Replace `<FILE>` with the file name stem (part before the `|` in the progress file) and `<SUITE_DESCRIPTION>` with the suite description (part after the `|`). Escape parentheses in the description if present.
+Replace `<FILE>` with the file name stem (the name between the checkboxes and the `|` in the progress file) and `<SUITE_DESCRIPTION>` with the suite description (the text after the `|`). Escape any parentheses in the description. Forward slashes inside the describe path do not need to be escaped.
 
-**Important:** The suite description in the `-t` filter must match the `describe(...)` text in the test file exactly. Some mappings:
+**Important:** The suite description in the `-t` filter must match the inner `describe(...)` text in the test file exactly. Some mappings:
 
 | File | Suite Description Text |
 |------|----------------------|
@@ -153,6 +227,7 @@ Replace `<FILE>` with the file name stem (part before the `|` in the progress fi
 | actor-onstatechange | Actor State Change Tests |
 | actor-db | Actor Database |
 | actor-db-raw | Actor Database Raw Tests |
+| actor-db-init-order | Actor Db Init Order |
 | actor-workflow | Actor Workflow Tests |
 | actor-error-handling | Actor Error Handling Tests |
 | actor-queue | Actor Queue Tests |
@@ -168,7 +243,9 @@ Replace `<FILE>` with the file name stem (part before the `|` in the progress fi
 | actor-conn-status | Connection Status Changes |
 | gateway-routing | Gateway Routing |
 | lifecycle-hooks | Lifecycle Hooks |
+| serverless-handler | Serverless Handler Tests |
 | actor-state | Actor State Tests |
+| actor-save-state | Actor Save State Tests |
 | actor-schedule | Actor Schedule Tests |
 | actor-sleep | Actor Sleep Tests |
 | actor-sleep-db | Actor Sleep Database Tests |
@@ -179,70 +256,77 @@ Replace `<FILE>` with the file name stem (part before the `|` in the progress fi
 | actor-db-stress | Actor Database Stress Tests |
 | actor-agent-os | Actor agentOS Tests |
 
-**b) Pipe output to file and analyze:**
-
-Always pipe test output to `/tmp/driver-test-current.log` so you can grep it afterward:
-
-```bash
-cd rivetkit-typescript/packages/rivetkit && pnpm test tests/driver/<FILE>.test.ts -t "static registry.*encoding \\(bare\\).*<SUITE>" > /tmp/driver-test-current.log 2>&1
-echo "EXIT: $?"
-```
+**c) Pipe output to file and analyze:**
 
-Then analyze:
+Always pipe test output to `/tmp/driver-test-current.log` so you can grep it afterward. Then analyze:
 
 ```bash
 grep -E "Tests|FAIL|PASS|Error|✓|✗|×" /tmp/driver-test-current.log | tail -30
 ```
 
-**c) If all tests pass:** Check off the file in the progress file and append to the log section:
+**d) If all tests pass for that runtime:** Check off only that runtime's box in the progress file and append to the log:
 
 ```
-- <timestamp> <file>: PASS (<N> tests, <duration>)
+- <timestamp> <file> [<runtime>]: PASS (<N> tests, <duration>)
 ```
 
-**d) If tests fail:**
+If both runtime boxes are now checked, the file is fully done; advance to the next file.
 
-1. Do NOT move to the next file.
-2. Narrow down to the first failing test using a more specific `-t` filter.
+**e) If tests fail:**
+
+1. Do NOT move to the next runtime or file.
+2. Narrow down to the first failing test using a more specific `-t` filter (keep the same env vars).
 3. Read the error output to understand the failure.
-4. Append to the log section:
+4. Append to the log:
 
 ```
-- <timestamp> <file>: FAIL - <brief description of failure>
+- <timestamp> <file> [<runtime>]: FAIL - <brief description of failure>
 ```
 
 5. Report the failure to the user with:
-   - Which test file group failed
+   - Which test file group failed and on which runtime
    - Which specific test(s) failed
    - The error message
+   - Whether the failure is runtime-specific (e.g. fails on `wasm` but passes on `native`)
    - Suggested next steps
 
-### 4. Narrowing scope on failure
+### 5. Narrowing scope on failure
 
-If a file group fails, narrow to individual tests:
+If a file group fails, narrow to individual tests while keeping the same runtime env vars:
 
 ```bash
-cd rivetkit-typescript/packages/rivetkit && pnpm test tests/driver/<FILE>.test.ts -t "static registry.*encoding \\(bare\\).*<SUITE>.*<PARTIAL_TEST_NAME>" > /tmp/driver-test-narrow.log 2>&1
+cd rivetkit-typescript/packages/rivetkit && \
+  RIVETKIT_DRIVER_TEST_RUNTIME=<runtime> \
+  RIVETKIT_DRIVER_TEST_SQLITE=<backend> \
+  RIVETKIT_DRIVER_TEST_ENCODING=bare \
+  pnpm test tests/driver/<FILE>.test.ts \
+    -t "static registry.*runtime \\(<runtime>\\) / sqlite \\(<backend>\\) / encoding \\(bare\\).*<SUITE>.*<PARTIAL_TEST_NAME>" \
+    > /tmp/driver-test-narrow.log 2>&1
 ```
 
-### 5. Completion
+If the bug only appears on one runtime, that's a strong signal — focus the diff hunt on the corresponding runtime adapter (`napi-runtime.ts` / `wasm-runtime.ts`) and any wasm-feature-gated code in `rivetkit-core` and `rivetkit-typescript/packages/rivetkit-wasm`.
+
+### 6. Completion
 
-When all files are checked, append to the log:
+When all rows are fully checked (both runtime boxes), append to the log:
 
 ```
 - <timestamp> ALL TESTS COMPLETE
 ```
 
 Report summary:
-- Total files passing
-- Total files failing (with names)
+- Total files passing per runtime
+- Total files failing per runtime (with names)
+- Files where one runtime passes and the other fails (parity gaps)
 - Total duration
 
 ## Rules
 
 1. **One file at a time.** Never run the full suite. The whole point is methodical, scoped testing.
-2. **Fix before advancing.** Do not skip a failing file to test the next one (unless the user says to skip).
-3. **Always pipe to file.** Never rely on inline terminal output for test results. Always write to `/tmp/driver-test-current.log` and grep afterward.
-4. **Track everything.** Every run gets logged in the progress file.
-5. **Use `actor-db-stress` encoding config.** The stress tests run once with `bare` encoding, not per-encoding. They are outside the encoding loop in mod.ts.
-6. **Respect timeouts.** Set a 600-second timeout for slow tests (sleep, lifecycle, stress). Use 120 seconds for fast tests.
+2. **Both runtimes per file before advancing** (when `runtime=both`). Run native, then wasm, on the same file. Check off each independently as it passes, but do not advance to the next file until both are checked.
+3. **Fix before advancing.** Do not skip a failing runtime/file to test the next one (unless the user says to skip).
+4. **Always pipe to file.** Never rely on inline terminal output for test results. Always write to `/tmp/driver-test-current.log` and grep afterward.
+5. **Track everything.** Every run gets logged in the progress file with its runtime tag.
+6. **Always set the env vars.** Even when running a single runtime, set `RIVETKIT_DRIVER_TEST_RUNTIME`, `RIVETKIT_DRIVER_TEST_SQLITE`, and `RIVETKIT_DRIVER_TEST_ENCODING`. The describe path depends on having any of them set.
+7. **Never pair `wasm` with `local` SQLite.** The harness throws on this combination. If reproducing a bug on wasm seems to require local SQLite, treat that as a bug in the matrix rather than forcing the pairing as a workaround.
+8. **Respect timeouts.** Set a 600-second timeout for slow tests (sleep, lifecycle, stress). Use 120 seconds for fast tests. Wasm runs may be slower than native — extend timeouts proportionally if you see consistent timeouts on wasm only.
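
As a rough illustration of rules 6 and 7, the runtime-to-backend pairing and the `-t` filter can be derived in one place instead of being typed by hand for each run. The sketch below is not part of the skill or the RivetKit harness: the function names `sqlite_backend_for`, `build_filter`, and `run_driver_file` are hypothetical, and `run_driver_file` assumes it is invoked from the repository root.

```shell
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical, not part of the skill.

# Map a runtime to the only SQLite backend it may pair with
# (native -> local, wasm -> remote). Any other value is rejected.
sqlite_backend_for() {
  case "$1" in
    native) echo "local" ;;
    wasm)   echo "remote" ;;
    *)      echo "unknown runtime: $1" >&2; return 1 ;;
  esac
}

# Build the -t regex for one runtime/encoding/suite combination.
# Parentheses in the suite description are escaped; slashes are left as-is.
build_filter() {
  local runtime="$1" encoding="$2" suite="$3" backend escaped
  backend="$(sqlite_backend_for "$runtime")" || return 1
  escaped="$(printf '%s' "$suite" | sed -e 's/[()]/\\&/g')"
  printf 'static registry.*runtime \\(%s\\) / sqlite \\(%s\\) / encoding \\(%s\\).*%s\n' \
    "$runtime" "$backend" "$encoding" "$escaped"
}

# Run one file group on one runtime with all three env vars set,
# writing output to the log file the skill greps afterward.
run_driver_file() {
  local file="$1" suite="$2" runtime="$3" encoding="${4:-bare}" backend filter
  backend="$(sqlite_backend_for "$runtime")" || return 1
  filter="$(build_filter "$runtime" "$encoding" "$suite")" || return 1
  (
    cd rivetkit-typescript/packages/rivetkit &&
    RIVETKIT_DRIVER_TEST_RUNTIME="$runtime" \
    RIVETKIT_DRIVER_TEST_SQLITE="$backend" \
    RIVETKIT_DRIVER_TEST_ENCODING="$encoding" \
    pnpm test "tests/driver/${file}.test.ts" -t "$filter" \
      > /tmp/driver-test-current.log 2>&1
  )
}
```

A call such as `run_driver_file actor-kv "Actor KV Tests" wasm; echo "EXIT: $?"` would then mirror the wasm base command. Because the backend is derived from the runtime, the `wasm`/`local` pairing cannot be requested by accident.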