Replace OpenRouter HTTP backend with opencode by jurgenwerk · Pull Request #4653 · cardstack/boxel

jurgenwerk · 2026-05-05T10:34:39Z

Replaces the direct-HTTP `OpenRouterFactoryAgent` with an opencode-driven `OpencodeFactoryAgent` so `--agent openrouter` runs benefit from the same native fs / Bash / Glob / Grep tools the Claude path already uses. Both backends now go through native tools, so the five MCP wrappers that existed purely to compensate for the prior fs-less OpenRouter path are retired.

What's in

`OpencodeFactoryAgent` (`src/factory-agent/opencode.ts`)

Spawns opencode via `createOpencodeServer` from `@opencode-ai/sdk`, lazy-imported via dynamic `import()` because the SDK is ESM-only and our test runner is CommonJS via ts-node.
Two auth modes:
- Direct API key: `--openrouter-api-key <key>` (or env `OPENROUTER_API_KEY`) → opencode is configured with a custom OpenAI-compatible provider via `@ai-sdk/openai-compatible`, key in the Authorization header.
- Proxy — no key → spin up a tiny localhost relay HTTP server in-process that translates OpenAI-style requests into the realm-server `_request-forward` shape (`{ url, method, requestBody }`) and posts via JWT-authed `BoxelCLIClient.authedServerFetch`. Burns boxel tokens — same behavior as the prior proxy mode.
In-process HTTP MCP server (`@modelcontextprotocol/sdk` Streamable HTTP transport) exposes the surviving 7 factory tools (5 validators + `signal_done` + `request_clarification`) to the opencode subprocess.
Path scoping via opencode's built-in `permission.external_directory: 'deny'` + workspace `cwd` — replaces the `buildWorkspaceScopedCanUseTool` callback we use on the Claude side. Same effect, no plugin file required.
DONE / CLARIFICATION signals: tool symbols don't survive JSON-RPC, so the MCP server tags them `factory:done` / `factory:clarification` and the agent's signal-capture hook matches on the tag.
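The tagging approach can be illustrated with a minimal sketch. Only the two tag strings come from the description above; `tagSignal`, `matchSignal`, and the payload shape are hypothetical names chosen for illustration, not the PR's actual code.

```typescript
// Hypothetical sketch: only the two tag strings come from the PR description.
const DONE_TAG = 'factory:done';
const CLARIFICATION_TAG = 'factory:clarification';

type FactorySignal = 'done' | 'clarification';

// A symbol-valued property is silently dropped by JSON serialization,
// which is why a symbol marker cannot cross a JSON-RPC boundary.
const DONE_SYMBOL = Symbol('done');
const viaSymbol: object = JSON.parse(JSON.stringify({ signal: DONE_SYMBOL }));

// A plain string tag round-trips unchanged.
function tagSignal(signal: FactorySignal): { tag: string } {
  return { tag: signal === 'done' ? DONE_TAG : CLARIFICATION_TAG };
}

// Receiving side: recognise a signal purely by its tag.
function matchSignal(payload: unknown): FactorySignal | undefined {
  if (typeof payload !== 'object' || payload === null) return undefined;
  const tag = (payload as { tag?: unknown }).tag;
  if (tag === DONE_TAG) return 'done';
  if (tag === CLARIFICATION_TAG) return 'clarification';
  return undefined;
}
```

Matching on a string keeps the capture hook independent of object identity, so it works the same whether the MCP server runs in-process or behind HTTP.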

CLI + wiring

New `--openrouter-api-key <key>` flag in `factory-entrypoint.ts`. Falls back to env `OPENROUTER_API_KEY`, then to proxy mode if both are missing.
Plumbed through `FactoryEntrypointOptions` → `IssueLoopConfig` → `CreateLoopAgentConfig`.
`--agent openrouter` now dispatches to `OpencodeFactoryAgent`. Run label is `openrouter (model=…, mode=direct|proxy)`.
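The fallback order (flag, then env, then proxy) can be sketched as below. `resolveAuthMode` and `runLabel` are hypothetical helper names, and treating an empty string as "missing" is an assumption; only the flag name, the env variable, and the label format come from the description.

```typescript
// Hypothetical helpers; only the precedence and the label format are from the PR.
type AuthMode = { mode: 'direct'; apiKey: string } | { mode: 'proxy' };

function resolveAuthMode(
  flagValue: string | undefined,
  env: Record<string, string | undefined>,
): AuthMode {
  // Assumption: blank values count as missing.
  const apiKey = flagValue?.trim() || env.OPENROUTER_API_KEY?.trim();
  return apiKey ? { mode: 'direct', apiKey } : { mode: 'proxy' };
}

function runLabel(model: string, auth: AuthMode): string {
  return `openrouter (model=${model}, mode=${auth.mode})`;
}
```

Passing `env` in explicitly (rather than reading `process.env` inside the helper) keeps the precedence easy to unit-test.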

Dependencies

`@opencode-ai/sdk` 1.14.34 (pinned; opencode publishes hourly).
`opencode-ai` 1.14.34 — tiny stub with esbuild-style per-platform `optionalDependencies`, so `pnpm install` resolves the matching binary into `node_modules/.bin/opencode`. No manual `npm i -g` step.
`@modelcontextprotocol/sdk` ^1.29.0 (was already transitive via the Claude SDK; promoted to a direct dep).
`pnpm-workspace.yaml`: opencode + each platform variant added to `minimumReleaseAgeExclude` (the global 24h filter rejects every release otherwise).
root `package.json`: `opencode-ai` added to `onlyBuiltDependencies` so its postinstall (which symlinks the platform binary) is allowed to run.

Deletions

`src/factory-agent/openrouter.ts` — direct-HTTP class retired.
`read_file` / `write_file` / `search_realm` / `fetch_transpiled_module` / `run_command` builders in `factory-tool-builder.ts` — replaced by native fs / `boxel read-transpiled` / `boxel search` (via `Bash`) / `boxel run-command` (via `Bash`).
`CLAUDE_FILTERED_FACTORY_TOOLS` filter in `claude-code.ts` — no longer needed since neither backend wants those tools.
`tests/factory-agent-schema-boundary.test.ts` — the Zod-vs-JSON-Schema boundary it asserted no longer applies (the OpenRouter side now goes through MCP rather than raw HTTP).

Skill updates

`.agents/skills/software-factory-bootstrap/SKILL.md` and `.agents/skills/software-factory-operations/SKILL.md`: dropped the `(Claude backend)` / `(OpenRouter backend)` dichotomy throughout. Single description: native `Read` / `Write` / `Edit` / `Bash` for workspace files, `Bash` + `boxel read-transpiled` / `boxel search` for realm reads.

Tests

`factory-tool-builder.test.ts`: dropped tests for retired tools; expanded the regression-list to assert all 5 OpenRouter-only + 5 structured-update tools are absent from `buildFactoryTools()`.
`factory-agent-claude-code.test.ts`: rewrote "filters factory tools that have native or boxel CLI alternatives" to assert the surviving filter (registry-sourced shadow tools still excluded; native-fs-replaced tools no longer asserted because they don't exist any more).
137 targeted unit tests pass (factory-tool-builder, factory-agent-claude-code, factory-prompt-loader, factory-context-builder, issue-loop). Lint + types clean.

Honest caveats — needs your verification

I cannot end-to-end verify the opencode subprocess or the relay server without an OpenRouter API key + a live realm server. The shapes type-check and unit tests pass, but the first `pnpm factory:go --agent openrouter` is the real test. Likely failure modes I can think of:

opencode SDK's `session.prompt` may expect a slightly different `model` shape than I used (`{providerID, modelID}`).
The relay server's response Content-Type passthrough may not be exactly what the `@ai-sdk/openai-compatible` consumer expects (it expects JSON for chat/completions; I forward whatever the proxy returns).
The MCP HTTP transport (`StreamableHTTPServerTransport`) may require a specific path or session handshake I haven't accounted for.

Each of those is a quick fix once observed empirically.

Test plan

`pnpm factory:go --agent claude --brief-url ... --target-realm-url ...` still completes end-to-end (no regression on the Claude path).
`pnpm factory:go --agent openrouter --openrouter-api-key sk-or-... --brief-url ... --target-realm-url ... --debug` runs to `outcome=all_issues_done`.
`pnpm factory:go --agent openrouter --brief-url ... --target-realm-url ... --debug` (no key, proxy mode) runs to `outcome=all_issues_done`.
Path scoping verified: agent attempt to write outside `workspaceDir` is denied (look for "external directory" in opencode logs).

🤖 Generated with Claude Code

Foundation for replacing OpenRouterFactoryAgent with an opencode-driven agent. Lays in the deps and a typed skeleton so the design is reviewable in isolation; the full runtime (subprocess + relay server + MCP wrapper + signal capture) is deferred to a follow-up so this commit does not break the working `--agent openrouter` HTTP path.

What's in:

- `@opencode-ai/sdk` and `opencode-ai` (1.14.34) added as devDependencies. `opencode-ai` is a tiny stub with per-platform optionalDependencies (esbuild-style) so `pnpm install` resolves just the matching binary into `node_modules/.bin/opencode`. No manual `npm i -g` step.
- `@modelcontextprotocol/sdk` added as a direct dep; the future MCP server wrapping `FactoryTool[]` will use it.
- pnpm-workspace.yaml: opencode publishes ~hourly, so the `minimumReleaseAge: 1440` filter rejects every release. Added opencode + each platform variant to `minimumReleaseAgeExclude`.
- root package.json: `opencode-ai` added to `onlyBuiltDependencies` so its postinstall (which symlinks the platform binary) is allowed to run.
- `src/factory-agent/opencode.ts`: typed skeleton + design notes. Documents the target architecture (subprocess, dual auth, MCP for factory tools, `permission.external_directory: 'deny'` for path scoping). `run()` throws so a misconfigured wiring can't accidentally route here.

What's pending (CS-11034 follow-up):

- Relay HTTP server for proxy auth mode.
- In-process / subprocess MCP server wrapping FactoryTool[].
- Event-stream consumption + DONE / CLARIFICATION signal capture.
- Wiring change in factory-issue-loop-wiring.ts.
- --openrouter-api-key CLI flag.
- Deletion of OpenRouterFactoryAgent + the 5 OpenRouter-only tools + the CLAUDE_FILTERED_FACTORY_TOOLS filter, once opencode is verified end-to-end.

Lint + types clean. No behavior change for any existing run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-05-05T10:57:49Z

Host Test Results

1 files ±0, 1 suites ±0, 1h 39m 14s ⏱️ -6m 37s
2 519 tests -115: 2 503 ✅ -116, 15 💤 ±0, 1 ❌ +1
2 538 runs -115: 2 522 ✅ -116, 15 💤 ±0, 1 ❌ +1

Results for commit 4384ad3. ± Comparison against earlier commit abd61cb.

For more details on these errors, see this check.

Realm Server Test Results

1 files ±0, 1 suites ±0, 17m 44s ⏱️ -58s
1 267 tests ±0: 1 267 ✅ ±0, 0 💤 ±0, 0 ❌ ±0
1 345 runs ±0: 1 345 ✅ ±0, 0 💤 ±0, 0 ❌ ±0

Results for commit 4384ad3. ± Comparison against earlier commit abd61cb.

Drops the direct-HTTP `OpenRouterFactoryAgent` for an opencode-backed `OpencodeFactoryAgent` so `--agent openrouter` runs benefit from the same native fs / Bash / Glob / Grep tools the Claude path already uses. Both backends now go through native tools, so the five MCP wrappers that existed purely to compensate for the prior fs-less OpenRouter path can be retired.

What's in
=========

`OpencodeFactoryAgent` (`src/factory-agent/opencode.ts`):

- Spawns `opencode` via `createOpencodeServer` from `@opencode-ai/sdk`
- Two auth modes:
  1. `--openrouter-api-key <key>` (or env `OPENROUTER_API_KEY`) → opencode is configured with a direct OpenRouter provider via `@ai-sdk/openai-compatible`, key in the Authorization header.
  2. No key → spin up a tiny localhost relay HTTP server in-process that translates OpenAI-style requests into the realm-server `_request-forward` shape (`{ url, method, requestBody }`) and posts via JWT-authed `BoxelCLIClient.authedServerFetch`. Burns boxel tokens, same as the prior proxy mode.
- In-process HTTP MCP server (`@modelcontextprotocol/sdk` Streamable HTTP transport) exposes the surviving 7 factory tools (5 validators + `signal_done` + `request_clarification`) to the opencode subprocess.
- Path scoping via opencode's built-in `permission.external_directory: 'deny'` + workspace `cwd`; replaces the `buildWorkspaceScopedCanUseTool` callback shape on the Claude side.
- DONE / CLARIFICATION signals: tool symbols don't survive JSON-RPC, so the MCP server tags them `factory:done` / `factory:clarification` and the agent's signal-capture hook matches on the tag.
- Lazy-imported via dynamic `import()` because the SDK is ESM-only and the test runner is CommonJS via ts-node.

CLI flag: `--openrouter-api-key <key>` plumbed through `FactoryEntrypointOptions` → `IssueLoopConfig` → `CreateLoopAgentConfig`. Falls back to env `OPENROUTER_API_KEY` when absent, then to proxy mode when both are missing.

Wiring: `--agent openrouter` now dispatches to `OpencodeFactoryAgent`. Label is `openrouter (model=…, mode=direct|proxy)`. Requires `workspaceDir` (errors if missing; opencode mounts it as `cwd`).

Deletions
=========

- `src/factory-agent/openrouter.ts`: direct-HTTP class retired.
- `read_file`, `write_file`, `search_realm`, `fetch_transpiled_module`, `run_command` builders in `factory-tool-builder.ts`: replaced by native fs / `boxel read-transpiled` / `boxel search` (via Bash) / `boxel run-command` (via Bash).
- `CLAUDE_FILTERED_FACTORY_TOOLS` filter in `claude-code.ts`: no longer needed since neither backend wants the retired tools.
- `tests/factory-agent-schema-boundary.test.ts`: the Zod-vs-JSON-Schema boundary it asserted no longer applies (the OpenRouter side now goes through MCP rather than raw HTTP).

Skill updates
=============

`.agents/skills/software-factory-bootstrap/SKILL.md` and `.agents/skills/software-factory-operations/SKILL.md`: dropped the `(Claude backend)` / `(OpenRouter backend)` dichotomy throughout. Single description: native `Read` / `Write` / `Edit` / `Bash` for workspace files, `Bash` + `boxel read-transpiled` / `boxel search` for realm reads.

Tests
=====

- `factory-tool-builder.test.ts`: dropped tests for retired tools; expanded the regression-list to assert all 5 OpenRouter-only + 5 structured-update tools are absent.
- `factory-agent-claude-code.test.ts`: rewrote "filters factory tools that have native or boxel CLI alternatives" to assert the surviving filter (registry-sourced shadow tools still excluded; native-fs-replaced tools no longer asserted because they don't exist any more).
- 137 targeted unit tests pass (factory-tool-builder, factory-agent-claude-code, factory-prompt-loader, factory-context-builder, issue-loop). Lint + types clean.

Honest caveats: needs your verification
=======================================

I cannot end-to-end verify the opencode subprocess or the relay server without an OpenRouter API key + a live realm server. The shapes type-check and the unit tests pass, but the first run with `pnpm factory:go --agent openrouter ...` is the real test. The likely failure modes I can think of:

- opencode SDK's `session.prompt` may expect a slightly different `model` shape than I used.
- The relay server's response Content-Type passthrough may not be exactly what the AI SDK expects (it expects JSON for chat/completions; I forward whatever the proxy returns).
- The MCP HTTP transport may require a specific path or session handshake I haven't accounted for.

Each of those is a quick fix once observed empirically. Run with `--debug` and share output if anything misbehaves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…factory-replace-openrouter-backend-with-opencode

# Conflicts:
# packages/software-factory/.agents/skills/software-factory-bootstrap/SKILL.md
# packages/software-factory/.agents/skills/software-factory-operations/SKILL.md
# packages/software-factory/src/factory-tool-builder.ts
# packages/software-factory/tests/factory-tool-builder.test.ts

…ncode relay

Replaces the in-process HTTP relay that software-factory's opencode agent spun up in passthrough mode with a dedicated realm-server endpoint that accepts a verbatim OpenAI chat-completions body. Same JWT + credit-strategy + streaming pipeline as `_request-forward`, just without the OpenAI → `{url, method, requestBody}` re-shape.

- Extracted the shared `pendingCostPromises` barrier, cost-deduction scheduler, and SSE streaming handler into `lib/proxy-forward.ts`; both `_request-forward` and the new endpoint use it so per-user cost ordering stays consistent.
- New `POST /_openrouter/chat/completions` handler pins the upstream to OPENROUTER_CHAT_URL server-side, looks up the destination config from the existing `proxy_endpoints` whitelist, and forwards verbatim. Streaming is driven by `stream: true` in the body.
- opencode now points its OpenAI-compatible provider's `baseURL` at `<realmServerUrl>/_openrouter` and stamps the realm-server JWT (fetched once via the new `BoxelCLIClient.getServerToken`) into the static Authorization header. The 7-day JWT TTL means a single ticket run is in no danger of outlasting it.
- Removes `startProxyRelayServer`, `buildRelayProviderConfig`, and the unused `OPENROUTER_CHAT_URL` constant from software-factory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
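A rough sketch of the provider wiring this commit describes. The field names (`npm`, `options.baseURL`, `options.headers`, `models`) follow my understanding of opencode's custom-provider config and are an assumption, as are the `Bearer` prefix and the helper name `buildPassthroughProvider`; none of them are taken from the PR's code.

```typescript
// Assumed config shape; check against the opencode version in use.
interface ProviderConfig {
  npm: string;
  options: { baseURL: string; headers: Record<string, string> };
  models: Record<string, object>;
}

// Hypothetical helper: point the OpenAI-compatible provider at the realm server.
function buildPassthroughProvider(
  realmServerUrl: string,
  serverJwt: string,
  modelId: string,
): ProviderConfig {
  const base = realmServerUrl.replace(/\/+$/, ''); // avoid a double slash
  return {
    npm: '@ai-sdk/openai-compatible',
    options: {
      baseURL: `${base}/_openrouter`,
      // Assumption: the JWT is sent as a Bearer token.
      headers: { Authorization: `Bearer ${serverJwt}` },
    },
    models: { [modelId]: {} },
  };
}
```

An OpenAI-compatible client typically appends `/chat/completions` to `baseURL`, which would line up with the `POST /_openrouter/chat/completions` handler named above.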

`client.session.prompt` is documented as blocking until the model + tool loop completes, but in opencode SDK 1.14.34 the HTTP response isn't reliably flushed once the loop exits — the server-side session goes idle (`session.idle` is published, snapshot cleanup runs) but our await keeps hanging indefinitely, never reaching the teardown in `finally`. Subscribe to the per-directory event bus before creating the session, fire the prompt without awaiting it directly, and drive completion off the first `session.idle` event matching our sessionId. Also break on `session.error` so an upstream auth/length/abort failure doesn't keep us stuck. The prompt's return value was unused (DONE / CLARIFICATION signals come back through the MCP server), so dropping the await on it costs nothing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous attempt to drive completion off `client.event.subscribe` also turned out unreliable in opencode 1.14.34: the SSE stream established mid-session and silently missed the eventual `session.idle` event published when the loop exited (the realm-server log clearly showed multiple successful 200 responses to `/_openrouter/chat/completions` followed by opencode emitting `session.idle publishing` — but our parent never saw the event and hung indefinitely). Switch to polling `client.session.status` every 750ms instead. SessionStatus is a discriminated `idle | retry | busy` union, so the only edge to handle is the post-create-but-pre-prompt window where status is still `idle`. The polling helper waits for the first non-idle observation before treating a subsequent `idle` as "the loop finished", with a 30-minute upper bound so a hung session can't trap the factory loop forever. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
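The "ignore idle until the session has been seen active" rule can be sketched as a small state machine driven by a polling loop. This is an illustration under assumptions: status is simplified to a plain string rather than the SDK's discriminated union, and `observe` / `waitForIdle` / `getStatus` are hypothetical names.

```typescript
type SessionStatus = 'idle' | 'retry' | 'busy';

interface PollState {
  seenActive: boolean; // have we observed a non-idle status yet?
  done: boolean;       // idle observed after activity
}

// Pure transition: an early `idle` (post-create, pre-prompt) is ignored.
function observe(state: PollState, status: SessionStatus): PollState {
  if (status !== 'idle') return { seenActive: true, done: false };
  return { seenActive: state.seenActive, done: state.seenActive };
}

// Polling driver; `getStatus` stands in for the SDK status call.
async function waitForIdle(
  getStatus: () => Promise<SessionStatus>,
  intervalMs = 750,
  timeoutMs = 30 * 60 * 1000,
): Promise<'idle' | 'timeout'> {
  const deadline = Date.now() + timeoutMs;
  let state: PollState = { seenActive: false, done: false };
  while (Date.now() < deadline) {
    state = observe(state, await getStatus());
    if (state.done) return 'idle';
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return 'timeout';
}
```

Keeping the transition pure means the edge case can be tested without timers or a live opencode process.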

…in opencode poll

The previous polling pass caught the right-shaped HTTP loop running but the response body was always `{}`, so we never saw the session transition. Two issues, both apparent only against a live opencode 1.14.34:

1. `/session/status` requires the same `directory` query that `session.create` was called with. Without it the response is unconditionally empty regardless of session state.
2. Empirically the endpoint returns *currently busy* sessions only: when a session goes idle, its entry disappears from the map instead of staying with `type: 'idle'`.

Pass the workspace directory through to the status call, and treat "session disappeared after first being seen busy" as completion in addition to an explicit `type: 'idle'` reading.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

opencode internally normalizes the `directory` query through its own realpath before storing the session. On macOS this rewrites `/var/folders/...` (the path Node hands us via tmpdir) to `/private/var/folders/...` — they're the same directory but distinct strings, and opencode's `/session/status?directory=...` filter is a straight string match. Result: the status endpoint returned `{}` for every poll because we were asking about `/var/...` while opencode had filed the session under `/private/var/...`. Pre-resolve `workspaceDir` once with `realpathSync` and use the canonical form for both `session.create` and `session.status`. The "session disappears from the status map after observed busy" branch added in the previous commit is still useful as a belt-and-suspenders completion signal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
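A minimal sketch of the canonicalization step, using Node's `realpathSync`. The helper name `canonicalWorkspaceDir` is hypothetical; the point is only that the path is resolved once and the same string is reused for every call that opencode string-matches on.

```typescript
import { realpathSync } from 'node:fs';
import { tmpdir } from 'node:os';

// Hypothetical helper: resolve symlinks once, reuse the result everywhere.
function canonicalWorkspaceDir(dir: string): string {
  return realpathSync(dir);
}

// On macOS, tmpdir() typically sits under /var/folders/..., which is a
// symlink into /private/var/folders/...; elsewhere the two may be equal.
const requested = tmpdir();
const canonical = canonicalWorkspaceDir(requested);
```

Resolving an already-canonical path returns it unchanged, so calling the helper defensively in more than one place is harmless.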

…ability

The prior completion signals all turned out unusable in opencode 1.14.34: the `session.prompt` HTTP response hangs after the loop exits, the `/session/status` map returns `{}` regardless of session state (live probing on a busy session confirmed this even with the canonical directory query), and `client.event.subscribe` was unreliable in earlier attempts.

The only signal that's both present and reliable in this version is `session.list[id].time.updated`: opencode bumps it on every `message.part.delta` and step transition, so we can watch it through a 5-second stability window. When `time.updated` hasn't moved for 5s, the model + tool loop is idle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
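The stability-window idea can be sketched as two pure functions that a polling loop would call on each tick. The names `step` and `isQuiet` and the state shape are hypothetical; only the "no change in `time.updated` for N ms means idle" rule comes from the commit.

```typescript
interface StabilityState {
  lastUpdated: number;  // last observed `time.updated` value
  lastChangeAt: number; // local clock time when that value last changed
}

// Record an observation; the window restarts whenever the value moves.
function step(
  state: StabilityState | undefined,
  updated: number,
  now: number,
): StabilityState {
  if (!state || updated !== state.lastUpdated) {
    return { lastUpdated: updated, lastChangeAt: now };
  }
  return state;
}

// True once the value has been unchanged for at least `windowMs`.
function isQuiet(state: StabilityState, now: number, windowMs: number): boolean {
  return now - state.lastChangeAt >= windowMs;
}
```

The window length is the only tunable here, so it can be widened or narrowed without touching the transition logic.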

Previous commit added a 5s stability window on `time.updated` polling to detect opencode loop completion. That penalized the happy path where the model calls `signal_done` (or `request_clarification`) — the captured signal was already available, but we kept polling for 5s anyway. Race the captured-signal promise against `waitForSessionIdle`. The MCP server resolves a one-shot promise the moment it sees a `factory:done` / `factory:clarification` tag come back, so the normal flow returns with zero added latency. The polling stays as a fallback for the (rare) case where the model exits the loop without emitting either signal — and even there the stability window drops from 5s to 2s now that polling is only the fallback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
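The race described here can be sketched with `Promise.race` and a one-shot deferred. `createDeferred`, `awaitCompletion`, and the `Completion` type are hypothetical names; the sketch also does not cancel the losing poller, which a real implementation would probably want to do.

```typescript
// One-shot promise the MCP side could resolve when a signal tag arrives.
function createDeferred<T>(): { promise: Promise<T>; resolve: (value: T) => void } {
  let resolve!: (value: T) => void;
  const promise = new Promise<T>((r) => {
    resolve = r;
  });
  return { promise, resolve };
}

type Completion = { via: 'signal'; tag: string } | { via: 'idle' };

// Whichever settles first wins: the captured signal or the idle fallback.
function awaitCompletion(
  signal: Promise<string>,
  waitForSessionIdle: () => Promise<void>,
): Promise<Completion> {
  return Promise.race([
    signal.then((tag): Completion => ({ via: 'signal', tag })),
    waitForSessionIdle().then((): Completion => ({ via: 'idle' })),
  ]);
}
```

Because the signal branch resolves as soon as the tag is captured, the happy path never waits out the stability window.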

…tions

Two issues observed in a real run:

1. The 2s `time.updated` stability window falsely detected the session as idle while the model was actively streaming. Empirically `time.updated` only ticks at step boundaries (not on every `message.part.delta`), and opus can sit 30+ seconds between steps. Bump the window to 60s. The polling is only the fallback for when the model exits without `signal_done` / `request_clarification`; the captured-signal race short-circuits this on the happy path, so the wider window doesn't add latency to normal runs.
2. `opencode.close()` returns synchronously but doesn't wait for the spawned subprocess to actually exit. The next iteration's `createOpencodeServer` then hits EADDRINUSE on opencode's fixed port 4096 and the whole factory:go run dies. Add a `waitForPortFree(4096, 5000ms)` poll after `opencode.close()` so the next iteration can bind cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

`opencode.close()` from the SDK only sends SIGTERM via `proc.kill()`, and the precompiled opencode 1.14.34 binary apparently ignores it, so the spawned subprocess keeps running and continues holding the fixed port 4096 long after we ask it to close. Iteration 2 of factory:go (and every iteration after) then hits EADDRINUSE because the SDK has no force-kill path we can call into.

Replace the blind "wait for port to free" loop with a wait-then-escalate strategy: 1s graceful window for SIGTERM to land, then look up the listening PID via `lsof` and `process.kill(pid, 'SIGKILL')`, then a short post-kill wait so the kernel releases the port before the next iteration spawns its own opencode. The lsof path is best-effort: if it can't run we fall through and let the next iteration surface a clearer EADDRINUSE error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
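The escalation step could look roughly like the sketch below. The helper names are hypothetical and the exact `lsof` arguments are my choice (`-t` for PID-only output, `-sTCP:LISTEN` to match listeners only), not necessarily what the commit uses. The graceful 1s wait and the post-kill wait are omitted.

```typescript
import { execFileSync } from 'node:child_process';

// Parse `lsof -t` output: one PID per line.
function parseLsofPids(stdout: string): number[] {
  return stdout
    .split('\n')
    .map((line) => Number(line.trim()))
    .filter((pid) => Number.isInteger(pid) && pid > 0);
}

// Best-effort: SIGKILL whatever is still listening on `port`.
function forceKillListener(port: number): number[] {
  let stdout: string;
  try {
    stdout = execFileSync('lsof', ['-t', `-iTCP:${port}`, '-sTCP:LISTEN'], {
      encoding: 'utf8',
    });
  } catch {
    return []; // lsof unavailable, or nothing is listening
  }
  const pids = parseLsofPids(stdout);
  for (const pid of pids) {
    try {
      process.kill(pid, 'SIGKILL');
    } catch {
      // process already exited between lookup and kill
    }
  }
  return pids;
}
```

Returning an empty list on any `lsof` failure matches the best-effort stance: the caller simply proceeds and lets a later bind error speak for itself.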

…t POST

Two issues blocking real opencode runs:

1. Opus was treating "tools for reading and writing the workspace mirror" as descriptive language and never invoked the actual fs tools; iterations spent minutes generating reasoning text and produced zero files. Replace the vague paragraph with an explicit tool inventory (`Write`, `Read`, `Edit`, `Glob`, `Grep`, `Bash`, plus the factory-specific tools) and an explicit instruction to call tools rather than describe what would be written.
2. opencode 1.14.34 occasionally rejects the very first `/session/{id}/message` POST on a freshly-spawned subprocess with `TypeError: fetch failed`, which kills every iteration after the first. Wrap the call in a 500ms retry so the flake is hidden.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Iter 1 succeeded (`Agent returned 1 tool call(s)`, validation passed, SIGKILL-on-teardown clean). Iter 2's opencode subprocess came up but both `session.prompt` attempts immediately failed with `TypeError: fetch failed`, indicating the subprocess died right after the "server listening" line. The polling helper's `client.session.list` call then also threw `fetch failed`, propagated past the agent.run() boundary, and crashed the entire factory:go process — losing all progress from iter 1. Wrap `session.list` in try/catch inside `waitForSessionIdle`, count consecutive failures, and return cleanly after 5 in a row (~3.75s). That lets the agent surface "0 tool calls" for the failed iteration while the outer issue loop keeps going to iter 3. Doesn't fix the underlying opencode flakiness — that's a separate chase. But it stops a single bad iteration from killing the whole run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Spawning a fresh opencode subprocess per agent.run() was the actual root cause of the iter-2+ `TypeError: fetch failed` cascade. opencode 1.14.34 is shaped to be a long-lived server with many short-lived sessions; rapid restarts (close → SIGTERM → SIGKILL → respawn → first prompt) hit failure modes around the SQLite-backed session store and the `/session/{id}/message` handler.

Refactor:

- `OpencodeFactoryAgent` now holds the MCP server, opencode subprocess, and SDK client as instance state. `ensureStarted()` spawns them lazily on first `run()` and is idempotent on subsequent calls. `run()` only creates a new session, fires the prompt, waits for completion, and clears its per-run hooks; no teardown.
- The MCP server's `onToolCall` / `onSignal` callbacks now forward into a swappable `currentHooks` pointer that `run()` swaps in / out around each session, so a single long-lived MCP server can serve many sequential runs.
- Add optional `close(): Promise<void>` to the `LoopAgent` interface. `factory-issue-loop-wiring` calls it in a `finally` after `runIssueLoop` returns (or throws), so the opencode subprocess and MCP server are torn down exactly once per factory:go run instead of N times.
- Drop the per-run `session.prompt` retry; the flake it papered over was almost entirely caused by the rapid-restart pattern this refactor eliminates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
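The two patterns in this refactor, an idempotent lazy start and a swappable hooks pointer, can be sketched as follows. `HookRouter` and `LongLivedAgent` are hypothetical stand-ins, not the PR's classes; the injected `start` function represents spawning the MCP server and opencode subprocess.

```typescript
interface RunHooks {
  onToolCall(name: string): void;
  onSignal(tag: string): void;
}

// The long-lived server forwards into whichever hooks are current.
class HookRouter {
  private current: RunHooks | undefined;
  attach(hooks: RunHooks): void { this.current = hooks; }
  detach(): void { this.current = undefined; }
  toolCall(name: string): void { this.current?.onToolCall(name); }
  signal(tag: string): void { this.current?.onSignal(tag); }
}

class LongLivedAgent {
  readonly router = new HookRouter();
  private started: Promise<void> | undefined;
  private readonly start: () => Promise<void>;

  constructor(start: () => Promise<void>) {
    this.start = start;
  }

  // Idempotent: the first call starts, later calls reuse the same promise.
  ensureStarted(): Promise<void> {
    if (!this.started) this.started = this.start();
    return this.started;
  }

  // Per-run work only swaps hooks in and out; nothing is torn down.
  async run(hooks: RunHooks, body: () => Promise<void>): Promise<void> {
    await this.ensureStarted();
    this.router.attach(hooks);
    try {
      await body();
    } finally {
      this.router.detach();
    }
  }
}
```

Caching the start promise (rather than a boolean flag) also covers the case where two callers race on the first `ensureStarted()`.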

…delta `proxySSE` accumulated bytes into `buffer` and then called a helper `extractSSELines(buffer)` to split off complete lines. The helper reassigned its *local* `buffer` parameter as it consumed lines, so the caller's `buffer` in `proxySSE` never got trimmed. On every new network read we re-extracted and re-dispatched every previously seen line. For the AI Bot path through `_request-forward` this was latent but not catastrophic. For opencode driving the new `/_openrouter/chat/completions` passthrough it was fatal: the model saw each text delta repeated N times and concatenated them, so assistant text came out as `"II'll processI'll process this b..."` and tool-call argument JSON became `{"command{"command{"command": "ls...`. Every tool invocation rejected with `Invalid input ... JSON parsing failed`, so the model never managed to call Write or any other native tool — explaining the full day of "model thinks for minutes but produces zero files" runs. Inline the line-splitting in `proxySSE` so the trailing incomplete fragment is kept in `buffer` and complete lines are dispatched exactly once. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
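The bug class is easy to reproduce in isolation: reassigning a string parameter inside a helper never changes the caller's variable. The commit fixes it by inlining the split; the sketch below shows an alternative shape, where the helper returns the unconsumed remainder explicitly. `splitSSELines` and `createLineDispatcher` are hypothetical names.

```typescript
// Split off complete lines; return the trailing partial line separately.
function splitSSELines(buffer: string): { lines: string[]; rest: string } {
  const parts = buffer.split('\n');
  const rest = parts.pop() ?? ''; // last piece is incomplete (or empty)
  return { lines: parts.map((line) => line.replace(/\r$/, '')), rest };
}

// Feed network chunks in; each complete line is dispatched exactly once.
function createLineDispatcher(onLine: (line: string) => void): (chunk: string) => void {
  let buffer = '';
  return (chunk: string): void => {
    buffer += chunk;
    const { lines, rest } = splitSSELines(buffer);
    buffer = rest; // the caller's buffer must be trimmed here
    lines.forEach((line) => onLine(line));
  };
}
```

Without the `buffer = rest` assignment, every new chunk would re-dispatch all earlier lines, which is the duplication pattern described above.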

Factory:go ran silent for minutes between "Inner iteration N/8" and the next visible log line, even when the realm-server log showed the model actively making chat completions. Adds three sources of visible progress so users don't fly blind:

- Log every factory MCP tool call as it lands (run_lint, run_tests, signal_done, ...) with a short arg summary.
- Subscribe to opencode's per-directory event stream for *logging only*; surfaces native opencode tool invocations (Read / Write / Bash / Edit) and the eventual `session.idle` / `session.error` events. Best-effort: any SSE failure is swallowed; completion detection still uses `time.updated` polling, this stream isn't on the critical path.
- Heartbeat log every 15s from the polling loop showing elapsed time and whether the session is still actively updating or idle pending the stability window.

Also log the session id when a new session is created so users can grep the opencode log file by that id if they need deeper detail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

opencode bundles a default tool kit (read, write, edit, bash, glob, grep, plus webfetch, task, todowrite, skill, question, invalid). Every tool definition is included in every chat completion, so the extras cost tokens on every model call, which adds up dramatically over a multi-step session. They also give the model more ways to "stall" on unhelpful actions (writing TODOs about what to do instead of doing it, dispatching subtasks, etc.).

Pass an explicit `tools` map to `session.prompt` enabling only the six we actually use (`read` / `write` / `edit` / `bash` / `glob` / `grep`) and disabling the rest. Factory MCP tools are unaffected: they ride the MCP transport, not this whitelist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
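A small sketch of how such a map could be built. The tool names are the ones listed in the commit message; the `Record<string, boolean>` shape and the helper name `buildToolsMap` are assumptions about what `session.prompt` accepts, so verify against the SDK types before relying on it.

```typescript
// Tool names taken from the commit message above.
const ENABLED_TOOLS = ['read', 'write', 'edit', 'bash', 'glob', 'grep'] as const;
const DISABLED_TOOLS = ['webfetch', 'task', 'todowrite', 'skill', 'question', 'invalid'] as const;

// Assumed shape: tool name -> enabled flag.
function buildToolsMap(): Record<string, boolean> {
  const tools: Record<string, boolean> = {};
  for (const name of ENABLED_TOOLS) tools[name] = true;
  for (const name of DISABLED_TOOLS) tools[name] = false;
  return tools;
}
```

Listing the disabled tools explicitly (instead of only naming the enabled ones) makes the intent visible if opencode's default for unlisted tools is "on".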

…hat we send

PR #4652 (`retire-structured-update-tools`) updated `system.md` and the two SKILL files to drop references to removed wrappers, but left the four ticket prompt templates and the seed-issue description still telling the model to call `write_file`, `read_file`, `search_realm`, `update_issue`, `create_knowledge`. The opencode model would then try to invoke tools that don't exist, fall back to `Edit` or `Bash`, and in test-18 just gave up and called `signal_done` after creating zero files.

- `prompts/bootstrap-implement.md` / `ticket-implement.md` / `ticket-iterate.md` / `ticket-test.md`: replace dead tool refs with the real native opencode tools (`Write`, `Read`, `Edit`, `Glob`, `Bash`) and the surviving factory MCP tools (`signal_done`, validators). Add an explicit "calling `signal_done` without writing the required files is a failure" line so opus stops bailing early.
- `src/factory-seed.ts`: include the full `brief.content` in the seed issue description (was only embedding `brief.contentSummary`, a one-line blurb, way too thin to drive a bootstrap from). Replace the stale "mark this issue done via `update_issue`" footer with explicit Write + signal_done instructions.
- `src/factory-agent/opencode.ts`: when `--debug`, log the full system prompt, user prompt, enabled native opencode tools, and enabled factory MCP tools right before sending each `session.prompt` so we can see exactly what the model gets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…gent context `mapCardToSchedulableIssue` was extracting only `status`, `priority`, `blockedBy`, `order`, `summary`, `issueType` from the realm card — silently dropping `description` and `acceptanceCriteria`. The scheduler-issued objects then flowed all the way into the agent prompt as `{{issue.description}}`, which rendered to an empty string. That meant every iteration the model got "## Current Issue\n\nID: ...\nSummary: Process brief and create project artifacts\n\nDescription:\n\n## What to Create\n..." — i.e. the brief content the seed creator went to the trouble of embedding in the issue description was being thrown away before the model ever saw it. The model bootstrap-completed with zero artifacts because it had no idea what the brief was about. Pass `description` and `acceptanceCriteria` through the mapper so they reach the agent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The opencode SDK's `createOpencodeServer` spawns the binary with no `cwd` option set on the child process — the subprocess inherits the parent's cwd. The model's native fs tools (`Read` / `Write` / `Edit`) then resolve relative paths against THAT inherited cwd, not the workspace. Result: when the model called `Read("Projects/sticky-note.json")` the permission log showed it trying `/Users/jurgen/development/boxel/packages/software-factory/Projects/sticky-note.json` (the directory `pnpm factory:go` was invoked from) rather than `/private/var/folders/.../boxel-factory-workspaces/<realm>/Projects/sticky-note.json` (the actual workspace). Reads always failed (those files don't exist under the source tree), the model never managed to inspect existing state cleanly, and never went on to call `Write` for the artifacts. Pre-resolve the workspace's canonical realpath once and `process.chdir` into it across the `createOpencodeServer` call, so the subprocess forks with the right cwd. Restore the parent's cwd in `finally` — once the child has forked, the parent's cwd doesn't matter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
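The chdir-around-spawn pattern can be sketched as a small wrapper. `withCwd` is a hypothetical helper; the `spawn` callback stands in for the `createOpencodeServer` call. Note that `process.chdir` is process-global, so this assumes nothing else in the parent depends on the cwd while the callback is pending.

```typescript
import { realpathSync } from 'node:fs';
import { tmpdir } from 'node:os';

// Run `spawn` with the process cwd temporarily set to `dir`,
// restoring the previous cwd whether it resolves or throws.
async function withCwd<T>(dir: string, spawn: () => Promise<T>): Promise<T> {
  const previous = process.cwd();
  process.chdir(dir);
  try {
    return await spawn();
  } finally {
    process.chdir(previous);
  }
}

// Example: resolve the canonical workspace path first, then spawn inside it.
const workspace = realpathSync(tmpdir());
let cwdSeenByChild = '';
const spawned = withCwd(workspace, async () => {
  cwdSeenByChild = process.cwd(); // a child forked here would inherit this
});
```

A child process captures its cwd at fork time, so restoring the parent's cwd afterwards does not affect it.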

…ation

The test-21 run finally produced cards but two of them showed "Card Error: Expected array for field value tags" in Boxel because the model wrote `tags: "a, b, c"` (a comma-separated string) where the schema declared `tags` as a `containsMany StringField` (a real array). Same root cause for any other guess-the-shape failure: the model never called `get_card_schema`, despite the prompt saying it should, because the instruction was buried in the skill file.

- `prompts/bootstrap-implement.md`: promote schema fetching to a mandatory **Step 0** at the top of the Instructions block, with the three required calls written out verbatim and an explicit warning that `containsMany` fields must be JSON arrays in `attributes`. The rest of the steps come after, framed as "now create the artifacts in this order so relationship targets exist when referenced."
- `.agents/skills/boxel-file-structure/SKILL.md`: add a "containsMany Attributes" section right next to the existing "linksToMany Relationships" section, with a `["a", "b", "c"]` example and the exact error string the wrong shape produces. Add a matching row to the Common Mistakes table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…:go is slow

Captures the architecture-level differences between the previous direct OpenAI tool-use loop and the current opencode SDK runtime, four hypotheses for the observed slowdown (model emitting fewer tool_calls per step, per-step prompt overhead, opencode bookkeeping, model thinking time), and what we'd need to instrument to verify them.

Note: token counts and step counts in the doc are explicitly flagged as estimates, not measurements.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds opt-in observability for the realm-server's openrouter passthrough so we can measure the per-step prompt overhead and tool-call distribution that drives the factory's wall-clock cost.

When `FACTORY_INSTRUMENT_PATH` is set, every chat-completion request through `/_openrouter/chat/completions` writes a JSONL record with:

- request: model, system_chars, tools_count + tools_chars, messages_count + messages_chars, total input chars, rough token estimates, parallel_tool_calls value, tool_choice
- response: tool_calls count, tool call names, assistant text size, finish_reason, provider usage tokens, TTFB, duration

`pnpm factory:stats <jsonl>` (in software-factory) summarises the log: model identity, distribution of tool_calls per assistant response, per-step prompt overhead, ground-truth usage tokens, and wall-clock per request.

Designed to answer the four hypotheses in OPENCODE_PERFORMANCE.md (H1 tool-call batching, H2 prompt overhead, H3 wall-clock, H4 model identity). Off by default; no behaviour change unless `FACTORY_INSTRUMENT_PATH` is set.

The streaming hook is plumbed through `handleStreamingRequest` as an optional `StreamingInstrumentation` parameter so the existing `_request-forward` caller is unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
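As a rough illustration of the kind of summary `factory:stats` is described as producing, the sketch below builds a histogram of tool calls per assistant response from JSONL text. The record shape and the field name `tool_calls_count` are assumptions; the commit message only says the response record includes a tool_calls count.

```typescript
// Hypothetical record shape; only the field used by this sketch is modelled.
interface InstrumentRecord {
  response: { tool_calls_count: number };
}

// Count how many responses emitted 0, 1, 2, ... tool calls.
export function toolCallHistogram(jsonl: string): Map<number, number> {
  const histogram = new Map<number, number>();
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // tolerate blank lines and a trailing newline
    const record = JSON.parse(line) as InstrumentRecord;
    const n = record.response.tool_calls_count;
    histogram.set(n, (histogram.get(n) ?? 0) + 1);
  }
  return histogram;
}
```

A distribution concentrated at one call per response would point toward the tool-call batching hypothesis (H1); that interpretation is a reading of the commit message, not a measured result.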

…prompt dump

Two unrelated observability fixes for the opencode-backed agent.

1. fetch-error unwrapping. opencode 1.14.34 surfaces every network failure as undici's `TypeError("fetch failed")`, with the real cause (ECONNREFUSED, UND_ERR_HEADERS_TIMEOUT, AbortError, etc.) buried in `err.cause`. The previous `String(err)` threw all of that away, so a `session.prompt rejected: TypeError: fetch failed` warning was indistinguishable between "subprocess crashed", "upstream timed out", and "we cancelled the request". Adds `describeFetchError()` that walks up to four levels of cause chain and reports the codes, plus `probeOpencode()` that hits the subprocess's `/app` endpoint with a short timeout to report alive/dead at the moment of failure. Both `session.prompt` and `session.list` catch sites now use them. A startup info line points at the live opencode log directory so the operator knows where to tail when warnings fire.

2. drop --debug prompt dump. The `--- system prompt (N chars) ---` block printed the entire merged system prompt (~10K+ tokens worth of skills) on every iteration. With concurrent loggers writing to stdout, the multi-line message racing on writes produced garbled output where the `factory-agent-opencode` prefix got chewed mid-word. Removed; the existing `Agent backend: ...` line is enough.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
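The cause-chain walk can be sketched as follows. This is an illustrative reimplementation under assumptions: the real `describeFetchError()` may format its output differently, and the ` <- ` separator and `name(code)` layout are invented here.

```typescript
type ErrorWithExtras = Error & { code?: unknown; cause?: unknown };

// Walk `err.cause` up to `maxDepth` levels and report each error's name,
// optional string `code`, and message, outermost first.
export function describeFetchError(err: unknown, maxDepth = 4): string {
  const parts: string[] = [];
  let current: unknown = err;
  for (let depth = 0; depth < maxDepth && current != null; depth++) {
    if (!(current instanceof Error)) {
      parts.push(String(current)); // non-Error cause: report it and stop
      break;
    }
    const { name, message, code, cause } = current as ErrorWithExtras;
    parts.push(
      typeof code === "string"
        ? `${name}(${code}): ${message}`
        : `${name}: ${message}`,
    );
    current = cause;
  }
  return parts.join(" <- ");
}

// Simulate the undici shape: a generic TypeError wrapping the real cause.
const inner = Object.assign(new Error("connect ECONNREFUSED 127.0.0.1:4096"), {
  code: "ECONNREFUSED",
});
const outer = Object.assign(new TypeError("fetch failed"), { cause: inner });

console.log(describeFetchError(outer));
// TypeError: fetch failed <- Error(ECONNREFUSED): connect ECONNREFUSED 127.0.0.1:4096
```

The depth cap keeps a self-referential cause chain from looping forever.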

Both tools (`run_evaluate` and `run_instantiate`) route through the realm-server's prerender sandbox, which loads modules from the realm filesystem. Before this change, an agent that wrote a .gts via `Write` and then immediately called `run_evaluate({ path })` hit a 404 because the realm hadn't seen the file yet: the orchestrator only synced between iterations.

Fix: `ToolBuilderConfig` gains an optional `syncWorkspace` callback. The issue-loop wiring passes the same `syncWorkspaceToRealm` function the orchestrator uses for post-signal_done validation. `run_evaluate` and `run_instantiate` now call `syncWorkspace()` first; on failure they return a typed error result without attempting the realm call.

Cost: ~500ms-2s on first call after writes, near-zero on subsequent calls since boxel-cli's sync is mtime-aware. The orchestrator's post-signal_done sync is now a no-op when nothing changed.

`run_parse`, `run_lint`, and `run_tests` already read directly from the workspace, so they're unaffected.

The software-factory-operations skill's "Self-Validation" section is updated to clarify which tools sync (`run_evaluate`, `run_instantiate`) and which don't (`run_lint`, `run_parse`, `run_tests`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
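The sync-before-run behaviour can be sketched as a wrapper around a realm-backed tool. The names `withSyncFirst` and `ToolResult` are hypothetical and the real `ToolBuilderConfig` wiring is not shown in this PR page; the sketch only mirrors the described contract: sync first, and on sync failure return a typed error without making the realm call.

```typescript
type ToolResult =
  | { ok: true; value: unknown }
  | { ok: false; error: string };

// Hypothetical wrapper: run the optional sync step before the tool body.
export function withSyncFirst<A>(
  run: (args: A) => Promise<ToolResult>,
  syncWorkspace?: () => Promise<void>,
): (args: A) => Promise<ToolResult> {
  return async (args: A): Promise<ToolResult> => {
    if (syncWorkspace) {
      try {
        await syncWorkspace();
      } catch (err) {
        // Typed error result; the realm call is never attempted.
        const reason = err instanceof Error ? err.message : String(err);
        return { ok: false, error: `workspace sync failed: ${reason}` };
      }
    }
    return run(args);
  };
}
```

Keeping `syncWorkspace` optional means tools that read directly from the workspace can be built with the same wrapper and simply omit the callback.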

updateProjectStatus() writes the new "completed" status to the workspace mirror but doesn't push to the realm. The previous flow called updateProjectStatus right before runIssueLoop returned, with no sync after — so the realm-side Project card kept its bootstrap "active" status forever and the catalog UI showed the project as ACTIVE even after every issue had been marked done. Adds a syncWorkspace() call immediately after the updateProjectStatus('completed') in issue-loop.ts. Failure logs but doesn't block — same tolerance the rest of the file already uses for sync hiccups. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…-backend-with-opencode # Conflicts: # packages/software-factory/.agents/skills/software-factory-operations/SKILL.md # packages/software-factory/src/factory-agent/openrouter.ts # packages/software-factory/src/factory-issue-loop-wiring.ts # packages/software-factory/src/factory-tool-builder.ts # packages/software-factory/tests/factory-agent-schema-boundary.test.ts

Investigation note from while we were diagnosing factory:go wall-time. Hypotheses are now reflected in commits + skill copy; the standalone markdown is no longer needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Removes the `FACTORY_INSTRUMENT_PATH`-gated chat-completions JSONL logger and the `factory:stats` CLI that were added to investigate factory:go wall-time. The investigation is closed (see prior commits for the fixes) and the scaffolding has no remaining consumer.

- packages/realm-server/lib/proxy-instrument.ts (deleted)
- packages/realm-server/lib/proxy-forward.ts (drop `StreamingInstrumentation` parameter + `onSSEData`/`onDone` hooks; restores the pre-instrumentation shape of `handleStreamingRequest`)
- packages/realm-server/handlers/handle-openrouter-passthrough.ts (drop the analyze/wire/write block in both streaming and non-streaming paths)
- packages/software-factory/scripts/factory-stats.ts (deleted)
- packages/software-factory/package.json (drop `factory:stats` script)

If we need to measure again, the change is recoverable from history.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Removes "we used to ship X / now we ship Y" / "after CS-XXXXX retired Y" / "the previous Z" wording from comments and JSDoc. Comments now describe current behaviour only: no migration story, no roll-call of deleted tools.

Also fixes a stale `FactoryTool.source` JSDoc that still referenced retired tool names (read_file, search_realm, "structured update tools", boxel-sync, etc.) and an inaccurate claim that OpenRouter "gets every tool".

Touched files (comments only, no behaviour change):

- packages/software-factory/src/factory-tool-builder.ts
- packages/software-factory/src/factory-tool-executor.ts
- packages/software-factory/src/factory-tool-registry.ts
- packages/software-factory/src/factory-agent/claude-code.ts
- packages/software-factory/src/factory-agent/opencode.ts
- packages/software-factory/src/parse-execution.ts
- packages/software-factory/.agents/skills/software-factory-operations/SKILL.md
- packages/realm-server/handlers/handle-openrouter-passthrough.ts

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Normal upgrade policy bumps to a stable that's already several days old, well past the 24h minimumReleaseAge threshold — opencode doesn't actually need to bypass it. The supply-chain protection has more value applied to a fast-moving dep than excluding it does. If we ever need to pin to a <24h-old opencode for a hot fix, we can override at that point. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The "containsMany goes in attributes as a JSON array, not a comma-separated string" lesson was taught in three places after we saw the agent emit comma-strings on early factory:go runs. With mandatory `get_card_schema` introspection now in the flow, the schema itself reports `array` types; the explicit prose is largely scar tissue. One mention in the boxel-file-structure Common Mistakes table is enough; drop the dedicated section + bootstrap-prompt warning.

- packages/software-factory/.agents/skills/boxel-file-structure/SKILL.md: drop the dedicated `## containsMany Attributes (CRITICAL)` section
- packages/software-factory/prompts/bootstrap-implement.md: drop the trailing comma-string warning on the schema-fetch step

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR migrates the software-factory “openrouter” backend from a direct OpenRouter HTTP implementation to an opencode-driven agent, so both Claude and OpenRouter runs use native filesystem/shell tooling. It also retires legacy factory tool wrappers that only existed to compensate for the prior fs-less OpenRouter path, and adds a realm-server OpenAI-compatible passthrough endpoint to support opencode in proxy mode.

Changes:

- Replace `OpenRouterFactoryAgent` with `OpencodeFactoryAgent` and plumb `--openrouter-api-key` through CLI/wiring.
- Remove OpenRouter-only fs/search/command wrapper tools and update prompts/skills/tests to rely on native Read/Write/Edit/Bash (+ boxel CLI).
- Add realm-server `/_openrouter/chat/completions` passthrough + shared proxy-forward streaming/cost-tracking helpers.

Reviewed changes

Copilot reviewed 37 out of 38 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| pnpm-lock.yaml | Adds opencode + MCP SDK dependency resolutions. |
| packages/software-factory/tests/index.ts | Removes schema-boundary test from test suite. |
| packages/software-factory/tests/factory-tool-builder.test.ts | Updates tool list assertions; removes tests for retired tools. |
| packages/software-factory/tests/factory-agent-schema-boundary.test.ts | Deletes obsolete OpenRouter-vs-Claude schema-boundary integration test. |
| packages/software-factory/tests/factory-agent-claude-code.test.ts | Adjusts filtering expectations now that retired tools no longer exist. |
| packages/software-factory/src/parse-execution.ts | Minor comment update about monorepo layout assumption. |
| packages/software-factory/src/issue-scheduler.ts | Includes additional Issue fields when mapping cards for scheduling. |
| packages/software-factory/src/issue-loop.ts | Syncs workspace after marking project completed so realm reflects status. |
| packages/software-factory/src/factory-tool-schema-adapter.ts | Updates adapter rationale now that OpenRouter runs through opencode. |
| packages/software-factory/src/factory-tool-registry.ts | Simplifies registry documentation; emphasizes `realm-create` only. |
| packages/software-factory/src/factory-tool-executor.ts | Removes outdated commentary about retired registry tool categories. |
| packages/software-factory/src/factory-tool-builder.ts | Retires OpenRouter-only wrapper tools; updates validator descriptions; keeps core validator/signal tools. |
| packages/software-factory/src/factory-seed.ts | Updates bootstrap issue description/instructions to reflect native Write + signal_done flow. |
| packages/software-factory/src/factory-issue-loop-wiring.ts | Switches openrouter provider to `OpencodeFactoryAgent`; adds close() teardown; plumbs API key/workspaceDir requirements. |
| packages/software-factory/src/factory-entrypoint.ts | Adds `--openrouter-api-key` flag and passes through options. |
| packages/software-factory/src/factory-agent/types.ts | Removes `OPENROUTER_CHAT_URL` constant; adds optional `LoopAgent.close()`. |
| packages/software-factory/src/factory-agent/openrouter.ts | Removes legacy direct-HTTP OpenRouter agent implementation. |
| packages/software-factory/src/factory-agent/opencode.ts | Introduces opencode-backed LoopAgent, MCP server bridge, and direct/passthrough auth modes. |
| packages/software-factory/src/factory-agent/index.ts | Exports `OpencodeFactoryAgent` instead of `OpenRouterFactoryAgent`. |
| packages/software-factory/src/factory-agent/claude-code.ts | Removes filtering for now-retired tools; keeps filtering of registry-sourced tools. |
| packages/software-factory/prompts/ticket-test.md | Updates instructions to use native `Write` + factory `signal_done`. |
| packages/software-factory/prompts/ticket-iterate.md | Updates workflow guidance to use native tools and boxel CLI via Bash. |
| packages/software-factory/prompts/ticket-implement.md | Updates workflow steps to use native `Write`/`Edit` and factory `signal_done`. |
| packages/software-factory/prompts/system.md | Rewrites system prompt tool documentation around opencode/native tooling. |
| packages/software-factory/prompts/bootstrap-implement.md | Updates bootstrap flow to require schema fetch + native file writes. |
| packages/software-factory/package.json | Adds direct dev deps for MCP SDK + opencode SDK/binary package. |
| packages/software-factory/.agents/skills/software-factory-operations/SKILL.md | Removes backend split; documents unified native tooling + boxel CLI usage. |
| packages/software-factory/.agents/skills/software-factory-bootstrap/SKILL.md | Removes OpenRouter-vs-Claude write tool split; standardizes on native `Write`. |
| packages/software-factory/.agents/skills/boxel-file-structure/SKILL.md | Adds `containsMany` encoding guidance note. |
| packages/realm-server/tests/openrouter-passthrough-test.ts | Adds tests for new `/_openrouter/chat/completions` passthrough (including streaming + credits). |
| packages/realm-server/tests/index.ts | Registers new passthrough test file. |
| packages/realm-server/routes.ts | Adds POST route for `/_openrouter/chat/completions` with JWT auth. |
| packages/realm-server/lib/proxy-forward.ts | Extracts shared streaming proxy + per-user cost-deduction ordering helpers. |
| packages/realm-server/handlers/handle-request-forward.ts | Refactors to reuse shared proxy-forward helpers. |
| packages/realm-server/handlers/handle-openrouter-passthrough.ts | Implements OpenAI-compatible OpenRouter passthrough endpoint with JWT + credits. |
| packages/boxel-cli/src/lib/boxel-cli-client.ts | Adds `getServerToken()` accessor for downstream clients needing the raw realm-server JWT. |
| package.json | Allows `opencode-ai` postinstall by adding it to `onlyBuiltDependencies`. |

Files not reviewed (1)

pnpm-lock.yaml: Language not supported

Comments suppressed due to low confidence (1)

packages/software-factory/src/factory-tool-builder.ts:373

The run_evaluate tool description still instructs agents to use fetch_transpiled_module, but that tool has been retired in this PR. Update the description to point to the supported path (e.g. Bash + boxel read-transpiled ...).

      'an EvalResult card) automatically after signal_done, so calling ' +
      'this is optional. When a failure reports a line/column, those ' +
      'numbers refer to the transpiled module — use `fetch_transpiled_module` ' +
      'to locate the offending source construct, then fix the .gts source ' +
      '(never copy transpiled patterns back into source). Auth: realm ' +

jurgenwerk · 2026-05-07T12:05:20Z

+const FACTORY_MCP_TOOL_NAMES = new Set([
+  'run_tests',
+  'run_lint',
+  'run_evaluate',
+  'run_parse',


Good catch — get_card_schema was missing from FACTORY_MCP_TOOL_NAMES, so the OpenCode-side filter was dropping it before opencode saw the catalog. Every prompt that told the agent to call it was hitting a non-existent tool on --agent openrouter (the Sticky Note runs that worked were the model getting away with guessing the schemas).

Fixed in 19a7c56 — added get_card_schema to the allowlist; updated the "7 factory tools" comments to "8".
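A sketch of the allowlist filter this reply describes. Only the first four names appear in the quoted diff; the remaining four are inferred from the PR description (5 validators + `signal_done` + `request_clarification`) plus the `get_card_schema` fix, and the function name `filterFactoryTools` is hypothetical.

```typescript
// The first four names are from the quoted diff; the rest are inferred.
const FACTORY_MCP_TOOL_NAMES = new Set<string>([
  "run_tests",
  "run_lint",
  "run_evaluate",
  "run_parse",
  "run_instantiate", // inferred: the fifth validator
  "signal_done", // inferred from the PR description
  "request_clarification", // inferred from the PR description
  "get_card_schema", // the entry this fix adds
]);

// Hypothetical filter: only allowlisted tools reach the MCP catalog, so a
// name missing from the set is invisible to the model.
export function filterFactoryTools<T extends { name: string }>(tools: T[]): T[] {
  return tools.filter((tool) => FACTORY_MCP_TOOL_NAMES.has(tool.name));
}
```

An allowlist fails closed: forgetting an entry silently hides a tool rather than exposing an unexpected one, which is how the `get_card_schema` gap went unnoticed.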

jurgenwerk · 2026-05-07T12:06:29Z

+      // `opencode.close()` only sends SIGTERM, which the 1.14.34
+      // binary ignores. waitForPortFree escalates to SIGKILL.
+      await waitForPortFree(4096, 1000);
+    }


Right, the SDK defaults to 4096 today and we never override it, but that coupling shouldn't be load-bearing. Fixed in 19a7c56: parse the actual port out of `this.opencode.url` via a small `parseOpencodePort` helper and only escalate on that specific port. If parsing fails (malformed URL or non-numeric port, which would happen if the SDK changes defaults or a future caller passes `port: 0`), we skip the SIGKILL escalation entirely rather than risk killing whatever unrelated process happens to be on 4096.
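One way such a helper could look, as a sketch only: the body below is an assumption about `parseOpencodePort`, built on the standard `URL` class, and returns `undefined` whenever the port cannot be trusted.

```typescript
// Illustrative implementation; the helper in the PR may differ in detail.
export function parseOpencodePort(url: string): number | undefined {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return undefined; // malformed URL
  }
  if (parsed.port === "") return undefined; // no explicit port in the URL
  const port = Number(parsed.port);
  // Reject 0 and anything outside the valid TCP range.
  return Number.isInteger(port) && port > 0 && port <= 65535 ? port : undefined;
}

const port = parseOpencodePort("http://127.0.0.1:4096");
if (port !== undefined) {
  console.log(`would escalate on port ${port} only`);
}
```

Returning `undefined` instead of falling back to a default keeps the caller from ever acting on a guessed port.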

jurgenwerk · 2026-05-07T12:06:30Z

+  ctx.status = 200;
+  ctx.res.flushHeaders();


Confirmed — flushHeaders() was committing the 200 status before the upstream-OK check could change it, so error responses appeared as 200 to the client even when ctxt.status = externalResponse.status had run.

Fixed in 19a7c56: dropped flushHeaders() from setupSSEHeaders (added a comment explaining why) and reordered handleStreamingRequest so the : connected keep-alive write — the implicit flush — happens after the OK check. The error path now sets ctxt.status while headers are still mutable, and the first ctxt.res.write commits whatever status was assigned.
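The ordering issue can be modelled without a real HTTP stack. In the sketch below, `FakeResponse` stands in for a response object whose status is committed by the first write; all names are hypothetical and this is not the `proxy-forward.ts` code.

```typescript
interface StreamResponse {
  status: number;
  write(chunk: string): void;
}

// Test double: the first write "commits" whatever status is set at that time,
// mimicking how headers become immutable once bytes go out on the wire.
class FakeResponse implements StreamResponse {
  status = 404;
  committedStatus: number | undefined;
  chunks: string[] = [];
  write(chunk: string): void {
    if (this.committedStatus === undefined) this.committedStatus = this.status;
    this.chunks.push(chunk);
  }
}

// Check the upstream result first, then perform the first write.
export function beginStream(
  res: StreamResponse,
  upstream: { ok: boolean; status: number },
): boolean {
  if (!upstream.ok) {
    res.status = upstream.status; // headers are still mutable here
    res.write(JSON.stringify({ error: `upstream responded ${upstream.status}` }));
    return false;
  }
  res.status = 200;
  res.write(": connected\n\n"); // SSE comment line used as keep-alive
  return true;
}
```

With an eager flush before the upstream check, the committed status would already be 200 by the time the error branch ran, which is the bug the reply describes.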

jurgenwerk · 2026-05-07T12:06:32Z

+      'those numbers refer to the transpiled module — use ' +
+      '`fetch_transpiled_module` to locate the offending source construct, ' +
+      'then fix the .gts source (never copy transpiled patterns back into ' +
+      'source). Auth: realm server token.',


Fixed in 19a7c56 — replaced the fetch_transpiled_module reference with Bash + boxel read-transpiled <path> --realm <url>, which is the path that survived this PR. Also fixed the same stale reference in the run_evaluate description (the suppressed low-confidence comment near factory-tool-builder.ts:373 was the same issue).

Four fixes from Copilot's review:

1. opencode agent: expose `get_card_schema` over MCP. The OpenCode path filters MCP tools by `FACTORY_MCP_TOOL_NAMES`, and `get_card_schema` was missing, so every prompt that told the agent to call it was hitting a non-existent tool on `--agent openrouter`. Added it; updated the "7 factory tools" comments to "8".
2. opencode agent: stop hardcoding port 4096 in close()'s SIGKILL escalation. Parse the port from the SDK-returned URL via a small `parseOpencodePort` helper and fall back to no escalation when parsing fails. Avoids killing whatever unrelated process happens to be on 4096 if the SDK changed defaults or a future caller passes `port: 0`.
3. proxy-forward: fix SSE pre-flush bug. `setupSSEHeaders` was calling `flushHeaders()` after setting status 200, so any later `ctx.status = upstream.status` on upstream failure was unobservable on the wire; clients always saw 200. Drop the pre-flush; defer the `: connected` keep-alive write until after the upstream-OK check so the first wire write happens with the correct status.
4. factory-tool-builder: drop stale `fetch_transpiled_module` references in run_evaluate / run_instantiate descriptions. The tool was retired in this PR; both descriptions now point to `Bash` + `boxel read-transpiled` instead.

Also pruned a parallel "we used to filter X tools too" comment in factory-agent-claude-code.test.ts that lingered from the migration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three failing tests on shard 1/3:

- factory-prompt-loader > FilePromptLoader > loads and interpolates a template (test 27)
- factory-prompt-loader > assembleSystemPrompt > includes role and tool-use rules (test 31)

  Both assert `result.includes('workspace mirror of')`. Our system.md said "local workspace mirroring the target realm" with a newline between "workspace" and "mirror". Reflowed the line so the literal phrase "workspace mirror of" appears unbroken in the rendered prompt.

- factory-entrypoint > parseFactoryEntrypointArgs accepts required inputs (test 52)

  The Copilot-review fix added `openRouterApiKey` to the parsed options shape, but the deepEqual expected object wasn't updated. Adds `openRouterApiKey: undefined` to match.

Test 69 (factory:go --debug integration → "Atomic upload failed: 204 No Content") fails on `main` too: pre-existing issue downstream of CS-11003's atomic-endpoint changes, not introduced here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The recent ?waitForIndex=true opt-in adds a query string to the /_atomic POST URL when the factory entrypoint syncs the seed batch. The integration mock's exact-string match against '/hassan/personal/_atomic' no longer matches, so the request fell through to the generic POST handler, which returned 204 No Content. The sync client expects 201, surfaced "Atomic upload failed: 204 No Content", and the factory:go subprocess exited 1 — failing every shard 1 run of pnpm test:node since 9444b82. Switch to a startsWith check so the handler accepts both the plain URL and the ?waitForIndex=true form, matching the pattern used by the other /hassan/personal/* handlers in this mock. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
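The matching change can be illustrated with a tiny predicate. The function name `isAtomicRequest` is hypothetical; the path constant comes from the commit message, and the mock's actual handler structure is not shown here.

```typescript
const ATOMIC_PATH = "/hassan/personal/_atomic";

// Old behaviour: exact match, which misses URLs carrying a query string.
export function isAtomicRequestExact(method: string, url: string): boolean {
  return method === "POST" && url === ATOMIC_PATH;
}

// New behaviour: prefix match, which accepts the plain URL and the
// ?waitForIndex=true form. It also accepts any longer path that merely
// starts with the same characters, which is acceptable in a test mock.
export function isAtomicRequest(method: string, url: string): boolean {
  return method === "POST" && url.startsWith(ATOMIC_PATH);
}

const withQuery = `${ATOMIC_PATH}?waitForIndex=true`;
console.log(isAtomicRequestExact("POST", withQuery)); // false
console.log(isAtomicRequest("POST", withQuery)); // true
```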

The integration mock and a few sibling tests hardcoded 'hassan' as the Matrix user / realm owner across user IDs, account-data paths, and realm URLs. Swap to the neutral 'testuser' so the fixture doesn't read like it belongs to a real teammate. No behavior change — the username is just the test fixture. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…-factory-replace-openrouter-backend-with-opencode

jurgenwerk changed the title ~~CS-11034: replace OpenRouter backend with opencode (WIP — foundation only)~~ CS-11034: replace OpenRouter backend with opencode May 5, 2026

jurgenwerk changed the base branch from main to retire-structured-update-tools May 5, 2026 12:32

jurgenwerk changed the title ~~CS-11034: replace OpenRouter backend with opencode~~ CS-11034: replace OpenRouter HTTP backend with opencode (native fs for both backends) May 5, 2026

jurgenwerk and others added 24 commits May 6, 2026 11:55

jurgenwerk changed the base branch from retire-structured-update-tools to main May 7, 2026 11:11

jurgenwerk and others added 5 commits May 7, 2026 13:19

software-factory: drop OPENCODE_PERFORMANCE.md

263f50a

Investigation note from while we were diagnosing factory:go wall-time. Hypotheses are now reflected in commits + skill copy; the standalone markdown is no longer needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jurgenwerk requested a review from Copilot May 7, 2026 11:47

Copilot started reviewing on behalf of jurgenwerk May 7, 2026 11:52 View session

Copilot AI reviewed May 7, 2026

jurgenwerk force-pushed the cs-11034-software-factory-replace-openrouter-backend-with-opencode branch from 6f6900e to 1fe459e Compare May 7, 2026 11:57

jurgenwerk and others added 2 commits May 7, 2026 14:03

jurgenwerk changed the title ~~CS-11034: replace OpenRouter HTTP backend with opencode (native fs for both backends)~~ Replace OpenRouter HTTP backend with opencode May 7, 2026

habdelra and others added 3 commits May 7, 2026 08:33

Merge branch 'fix-sf-atomic-mock-waitforindex' into cs-11034-software…

4384ad3

…-factory-replace-openrouter-backend-with-opencode

Replace OpenRouter HTTP backend with opencode #4653
jurgenwerk wants to merge 38 commits into `main` from `cs-11034-software-factory-replace-openrouter-backend-with-opencode`

3 participants
