cardstack · jurgenwerk · May 11, 2026
diff --git a/...ftware-factory/.agents/skills/boxel-development/references/dev-qunit-testing.md b/...ftware-factory/.agents/skills/boxel-development/references/dev-qunit-testing.md
@@ -134,7 +134,7 @@ Add `data-test-*` attributes to card templates for stable test selectors:
 
 When tests fail, the orchestrator feeds test failure details back to the agent. For more detail:
 
-- **TestRun cards** live in the target realm's `Validations/` folder with a `test_` prefix (e.g., `Validations/test_issue-slug-1.json`). To find all test runs, search by the TestRun card type in the target realm. Each TestRun has a `sequenceNumber` that increases with each iteration. Use `read_file` on a specific TestRun for full details.
+- **TestRun cards** live in the target realm's `Validations/` folder with a `test_` prefix (e.g., `Validations/test_issue-slug-1.json`). To find all test runs, run `Glob` over `Validations/test_*.json` or shell out via `Bash` to `boxel search --realm <url>` filtered on the TestRun card type. Each TestRun has a `sequenceNumber` that increases with each iteration. Use native `Read` on a specific TestRun for full details — paths are workspace-relative.
 
 ## Rules
 

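Reviewer note: the updated bullet says each TestRun carries a `sequenceNumber` that increases per iteration, so the most recent run is the one with the highest value. A minimal sketch of that selection follows. It assumes the files matched by `Validations/test_*.json` have already been read and parsed, and that `sequenceNumber` sits under `data.attributes`; the diff does not confirm that location. `TestRunFile` and `latestTestRun` are hypothetical names, not part of the factory code.

```typescript
// Assumption: a TestRun card stores `sequenceNumber` under `data.attributes`.
interface TestRunDoc {
  data: { attributes: { sequenceNumber: number; [key: string]: unknown } };
}

// Hypothetical pairing of a workspace-relative path with its parsed JSON.
interface TestRunFile {
  path: string; // e.g. Validations/test_issue-slug-1.json
  doc: TestRunDoc;
}

// Return the TestRun with the highest sequenceNumber, or undefined if none.
function latestTestRun(files: TestRunFile[]): TestRunFile | undefined {
  let latest: TestRunFile | undefined;
  for (const file of files) {
    const seq = file.doc.data.attributes.sequenceNumber;
    if (!latest || seq > latest.doc.data.attributes.sequenceNumber) {
      latest = file;
    }
  }
  return latest;
}
```

An agent would then `Read` only the returned path rather than every TestRun in the folder.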
diff --git a/...oftware-factory/.agents/skills/boxel-development/references/dev-realm-search.md b/...oftware-factory/.agents/skills/boxel-development/references/dev-realm-search.md
@@ -1,6 +1,12 @@
 # Realm Search Query Reference
 
-How to use the `search_realm` tool to query cards in a realm. The query object follows the Boxel realm search API format.
+How to construct queries for the Boxel realm search index. Run them from `Bash` against your **target realm** with:
+
+```
+boxel search --realm <target-realm-url> --query '<json-query>'
+```
+
+The query JSON below is what goes into `--query`. Do not query other realms (base, software-factory, experiments, catalog) — the skills you've loaded are authoritative for patterns; cross-realm exploration burns tokens without helping.
 
 ## Basic Structure
 
@@ -216,27 +222,18 @@ Descending order:
 
 ## Discovering Available Fields
 
-You can only filter/sort on fields that exist on the card type. To find which fields a card type has:
+You can only filter/sort on fields that exist on the card type. To find which fields a card type has, call the `get_card_schema` factory tool:
 
-1. Use `run_command` to fetch the JSON schema for a card type:
-
-```json
-{
-  "command": "@cardstack/boxel-host/commands/get-card-type-schema/default",
-  "commandInput": {
-    "codeRef": {
-      "module": "http://localhost:4201/software-factory/darkfactory",
-      "name": "Issue"
-    }
-  }
-}
+```
+get_card_schema({
+  module: 'http://localhost:4201/software-factory/darkfactory',
+  name: 'Issue'
+})
 ```
 
-2. The result contains `attributes.properties` listing all searchable fields (e.g., `status`, `summary`, `priority`).
-
-3. Use those field names in your `eq`, `contains`, `range`, or `sort` with the matching `on` type.
+The result contains `schema.attributes.properties` listing all searchable fields (e.g., `status`, `summary`, `priority`) plus their types and any enum values. Use those field names in your `eq`, `contains`, `range`, or `sort` with the matching `on` type.
 
-The card tools (`update_project`, `update_issue`, `create_knowledge`, `create_catalog_spec`) also have dynamic JSON schemas in their parameters that list available fields.
+`get_card_schema` is also how you learn the shape for writing tracker (Project / Issue / KnowledgeArticle) and Spec card JSON files — call it before writing the JSON so what you write matches the live `CardDef`.
 
 ### Inheritance
 

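Reviewer note: the new intro has the agent pass the query JSON through `--query`. A minimal sketch of assembling that argument list is below. It assumes the query is a `filter` object combining `on` (the card type) with `eq`; the exact layout is defined in the "Basic Structure" section, which this hunk does not show. `buildSearchArgs` and `CardTypeRef` are hypothetical names, and the realm URL in the usage note is a placeholder.

```typescript
// Card type reference, mirroring the module/name pair used by get_card_schema.
interface CardTypeRef {
  module: string;
  name: string;
}

// Build the argument list for `boxel search --realm <url> --query '<json>'`.
// Assumption: the query is `{ filter: { on, eq } }`.
function buildSearchArgs(
  realmUrl: string,
  on: CardTypeRef,
  eq: Record<string, unknown>,
): string[] {
  const query = { filter: { on, eq } };
  // JSON.stringify keeps the query on one line, so it travels as a single argument.
  return ['search', '--realm', realmUrl, '--query', JSON.stringify(query)];
}
```

For example, `buildSearchArgs('http://localhost:4201/my-target/', { module: 'http://localhost:4201/software-factory/darkfactory', name: 'Issue' }, { status: 'open' })` yields five arguments. When typing the command in `Bash` by hand, wrap the JSON in single quotes as the skill shows; when spawning the process with an argument array, no extra quoting is needed.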
diff --git a/.../software-factory/.agents/skills/boxel-development/references/dev-spec-usage.md b/.../software-factory/.agents/skills/boxel-development/references/dev-spec-usage.md
@@ -1,24 +1,52 @@
 # Catalog Spec Card Instances
 
-For each top-level card definition, create a Catalog Spec card instance in the target realm's `Spec/` folder using the `create_catalog_spec` tool. This makes the card discoverable in the Boxel catalog.
+For each top-level card definition, write a Catalog Spec card instance in the target realm's `Spec/` folder. This makes the card discoverable in the Boxel catalog.
 
-The `create_catalog_spec` tool has the authoritative JSON schema for Spec card fields — use its parameter definitions to know which attributes and relationships are available. The tool auto-constructs the document with the correct `adoptsFrom` (`https://cardstack.com/base/spec#Spec`).
+Specs adopt from `https://cardstack.com/base/spec#Spec` — that module lives in the base realm, not your target realm. Fetch the authoritative schema by calling the `get_card_schema` factory tool:
 
-## Usage
+```
+get_card_schema({ module: 'https://cardstack.com/base/spec', name: 'Spec' })
+```
 
-Use the `create_catalog_spec` tool to create a Spec card. The tool's parameters define the available fields dynamically from the card definition — consult the tool schema for the exact field names and types.
+The result gives you the exact `attributes` and `relationships` shape. Write the JSON file with native `Write` (paths are workspace-relative, e.g. `Spec/sticky-note.json`); `boxel sync` pushes it to the realm between iterations.
+
+## Required Shape
+
+```json
+{
+  "data": {
+    "type": "card",
+    "attributes": {
+      "specType": "card",
+      "ref": { "module": "../sticky-note", "name": "StickyNote" },
+      "readMe": "...",
+      "cardInfo": { "name": "Sticky Note", "summary": "..." }
+    },
+    "relationships": {
+      "linkedExamples.0": { "links": { "self": "../StickyNote/welcome-note" } }
+    },
+    "meta": {
+      "adoptsFrom": {
+        "module": "https://cardstack.com/base/spec",
+        "name": "Spec"
+      }
+    }
+  }
+}
+```
 
 Key concepts:
 
 - `ref` — a CodeRef pointing to the card definition (module path + exported class name). The module path is relative from the Spec card to the `.gts` file (e.g., `../sticky-note` from `Spec/sticky-note.json`).
 - `specType` — `"card"` for CardDef, `"field"` for FieldDef, `"component"` for standalone components.
-- `linkedExamples` — a relationship pointing to sample card instances. Create at least one sample instance and link it here.
+- `linkedExamples` — a `linksToMany` relationship pointing to sample card instances. Use dotted keys (`linkedExamples.0`, `linkedExamples.1`, …) — the array form is rejected by the indexer. Create at least one sample instance and link it here.
+- **Do NOT call `run_instantiate` on the Spec file itself.** Spec's module lives in the base realm; the prerender enforces same-origin module loads and the call always fails. To validate Specs, call `run_instantiate` WITHOUT a `path`; it discovers Specs in the target realm and exercises their `linkedExamples` against the card classes you wrote.
 
 ## Sample Card Instances
 
 Create at least one sample instance with realistic data for each top-level card. Sample instances serve as both catalog examples and test fixtures.
 
-Place sample instances in a folder named after the card type (e.g., `StickyNote/welcome-note.json`). Use `write_file` to create them. The `linkedExamples` relationship in the Spec card points to these using a relative path (e.g., `../StickyNote/welcome-note`).
+Place sample instances in a folder named after the card type (e.g., `StickyNote/welcome-note.json`) and write them with native `Write`. The `linkedExamples` relationship in the Spec card points to these using a relative path without the `.json` suffix (e.g., `../StickyNote/welcome-note`).
 
 ---
 

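Reviewer note: the dotted-key rule for `linkedExamples` is easy to get wrong when writing the JSON by hand. A minimal sketch that produces the "Required Shape" from a plain list of example paths is below. `buildSpecDoc` and its option names are hypothetical; the field names come from the JSON in this diff, and the live schema from `get_card_schema` remains the authority.

```typescript
interface SpecRef {
  module: string; // relative from the Spec file, e.g. ../sticky-note
  name: string; // exported class name, e.g. StickyNote
}

interface SpecOptions {
  ref: SpecRef;
  specType: 'card' | 'field' | 'component';
  name: string;
  summary: string;
  readMe: string;
  exampleLinks: string[]; // relative paths to sample instances
}

// Build a Spec document using dotted relationship keys (linkedExamples.0, ...).
function buildSpecDoc(opts: SpecOptions) {
  const relationships: Record<string, { links: { self: string } }> = {};
  opts.exampleLinks.forEach((link, index) => {
    // Links are written without the .json suffix.
    const self = link.replace(/\.json$/, '');
    relationships[`linkedExamples.${index}`] = { links: { self } };
  });
  return {
    data: {
      type: 'card',
      attributes: {
        specType: opts.specType,
        ref: opts.ref,
        readMe: opts.readMe,
        cardInfo: { name: opts.name, summary: opts.summary },
      },
      relationships,
      meta: {
        adoptsFrom: { module: 'https://cardstack.com/base/spec', name: 'Spec' },
      },
    },
  };
}
```

Serializing the result with `JSON.stringify(doc, null, 2)` gives content that can be written to `Spec/sticky-note.json`.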
diff --git a/packages/software-factory/src/factory-agent/claude-code.ts b/packages/software-factory/src/factory-agent/claude-code.ts
@@ -66,11 +66,10 @@ const MAX_TOOL_USE_TURNS = 50;
 
 /**
  * Built-in Claude Code tools the factory exposes to the model on the
- * Claude backend. These replace the custom `read_file` / `write_file`
- * factory tools — they operate on the SDK query's `cwd` (the factory
- * workspace), so the model uses native semantics for fs work and we
- * keep MCP focused on operations that genuinely need realm runtime
- * access (search_realm, validators, structured updates, signals).
+ * Claude backend. They operate on the SDK query's `cwd` (the factory
+ * workspace), so the model handles workspace files natively while MCP
+ * stays focused on realm-runtime operations (`get_card_schema`, the
+ * validators) and on control signals.
  */
 const NATIVE_FS_TOOLS = ['Read', 'Write', 'Edit', 'Bash', 'Glob', 'Grep'];
 
@@ -298,14 +297,12 @@ export class ClaudeCodeFactoryAgent implements LoopAgent {
     // Two tool surfaces are visible to the model on the Claude backend:
     //   1. Native Claude Code tools (Read / Write / Edit / Bash / Glob /
     //      Grep) — anchored to the factory workspace via the SDK query's
-    //      `cwd`. These replace the factory's old `read_file` /
-    //      `write_file` shims; the model works on the local mirror of the
-    //      target realm directly.
+    //      `cwd`. The model works on the local mirror of the target realm
+    //      directly; `boxel sync` pushes between iterations.
     //   2. Factory tools exposed via an in-process MCP server, prefixed
-    //      with `mcp__<server>__`. Used for everything that needs realm
-    //      runtime access (search, validators, host commands, structured
-    //      updates) and for control signals (signal_done /
-    //      request_clarification).
+    //      with `mcp__<server>__`. Used for realm-runtime operations
+    //      (`get_card_schema`, the five validators) and for control
+    //      signals (`signal_done`, `request_clarification`).
     //
     // The shared prompt template / skills reference factory operations by
     // their plain names (e.g. `signal_done`). Append a short rename map

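Reviewer note: the comment above describes appending a rename map from plain factory tool names to their `mcp__<server>__` forms. The hunk does not show that code, so the following is only a sketch of what such a map could look like. `buildRenameMap` is a hypothetical helper and `factory` in the usage note is a placeholder server name, not the one the agent actually registers.

```typescript
// Map each plain tool name to its MCP-prefixed name, one entry per line.
// The prefix format `mcp__<server>__<tool>` is taken from the comment in the diff.
function buildRenameMap(serverName: string, toolNames: string[]): string {
  return toolNames
    .map((tool) => `${tool} -> mcp__${serverName}__${tool}`)
    .join('\n');
}
```

Calling `buildRenameMap('factory', ['signal_done', 'request_clarification'])` returns two lines, one per tool, which could be appended to the prompt text.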
diff --git a/packages/software-factory/src/factory-agent/index.ts b/packages/software-factory/src/factory-agent/index.ts
@@ -16,4 +16,4 @@ export * from './types';
 export { OpencodeFactoryAgent } from './opencode';
 export type { OpencodeAgentConfig } from './opencode';
 export { ClaudeCodeFactoryAgent } from './claude-code';
-export { MockFactoryAgent, MockLoopAgent } from './mocks';
+export { MockLoopAgent } from './mocks';
diff --git a/packages/software-factory/src/factory-agent/mocks.ts b/packages/software-factory/src/factory-agent/mocks.ts
@@ -1,51 +1,14 @@
 /**
  * Mock agent implementations for testing.
  *
- * These are deterministic agents that return pre-scripted responses,
- * used by unit tests and smoke tests to verify orchestration logic
- * without calling a real LLM.
+ * Deterministic agents that return pre-scripted responses, used by unit
+ * tests and smoke tests to verify orchestration logic without calling
+ * a real LLM.
  */
 
-import type { AgentAction, AgentContext, FactoryAgent } from './types';
-import type { LoopAgent, AgentRunResult } from './types';
+import type { AgentContext, LoopAgent, AgentRunResult } from './types';
 import type { FactoryTool } from '../factory-tool-builder';
 
-// ---------------------------------------------------------------------------
-// MockFactoryAgent — deterministic FactoryAgent for declarative model tests
-// ---------------------------------------------------------------------------
-
-export class MockFactoryAgent implements FactoryAgent {
-  private responses: AgentAction[][];
-  private callIndex = 0;
-
-  /** All AgentContext inputs received, in order. */
-  readonly receivedContexts: AgentContext[] = [];
-
-  constructor(responses: AgentAction[][]) {
-    this.responses = responses;
-  }
-
-  async plan(context: AgentContext): Promise<AgentAction[]> {
-    this.receivedContexts.push(context);
-
-    if (this.callIndex >= this.responses.length) {
-      throw new Error(
-        `MockFactoryAgent exhausted: called ${this.callIndex + 1} times ` +
-          `but only ${this.responses.length} response(s) were configured`,
-      );
-    }
-
-    let response = this.responses[this.callIndex];
-    this.callIndex++;
-    return response;
-  }
-
-  /** Number of times plan() has been called. */
-  get callCount(): number {
-    return this.callIndex;
-  }
-}
-
 // ---------------------------------------------------------------------------
 // MockLoopAgent — deterministic LoopAgent for tool-use model tests
 // ---------------------------------------------------------------------------

diff --git a/packages/software-factory/src/factory-agent/types.ts b/packages/software-factory/src/factory-agent/types.ts
@@ -1,9 +1,10 @@
 /**
  * Shared types, interfaces, and constants for the factory agent system.
  *
- * This module contains all the data types used across the declarative agent
- * (factory-agent.ts), the tool-use agent (factory-agent-tool-use.ts), and
- * their consumers (loop, context builder, prompt loader, etc.).
+ * The runtime agents (`ClaudeCodeFactoryAgent` in `claude-code.ts`,
+ * `OpencodeFactoryAgent` in `opencode.ts`) implement the `LoopAgent`
+ * interface declared here; orchestration consumers (issue loop, context
+ * builder, prompt loader) share the data types declared below.
  */
 
 // ---------------------------------------------------------------------------
@@ -17,7 +18,7 @@
  * Pinned to `claude-opus-4-7` rather than the unversioned `claude-opus-4`
  * alias. The alias route exhibited a deterministic mid-stream truncation
  * on large tool-call arguments (`finish_reason: null`, `completion=1`)
- * that broke every `write_file` for full `.gts` card definitions. Opus
+ * that broke every native `Write` for full `.gts` card definitions. Opus
  * 4.7 on the pinned route returned clean `finish_reason: tool_calls`
  * responses with completions up to ~4.7K tokens in a single turn, and
  * ran an end-to-end factory loop to `outcome=all_issues_done` with no
@@ -54,14 +55,6 @@ export const VALID_ACTION_TYPES = [
 
 export const VALID_REALMS = ['target', 'test'] as const;
 
-// Action types that require path + content
-export const FILE_ACTION_TYPES: ReadonlySet<string> = new Set([
-  'create_file',
-  'update_file',
-  'create_test',
-  'update_test',
-]);
-
 // ---------------------------------------------------------------------------
 // Types
 // ---------------------------------------------------------------------------
@@ -93,8 +86,9 @@ export interface ClaudeCodeAgentConfig {
    * query's `cwd` so the model's native Read / Write / Edit / Bash / Glob /
    * Grep tools operate against the factory workspace by default — paths like
    * `sticky-note.gts` resolve inside the workspace, with no surprise hits
-   * against the user's filesystem. Realm I/O still goes through factory
-   * MCP tools (search_realm, run_command, validators, …).
+   * against the user's filesystem. Realm-runtime operations go through the
+   * factory MCP tools (get_card_schema, run_lint / run_parse / run_evaluate
+   * / run_instantiate / run_tests, signal_done, request_clarification).
    */
   workspaceDir?: string;
 }
@@ -259,14 +253,6 @@ export interface AgentAction {
   toolArgs?: Record<string, unknown>;
 }
 
-// ---------------------------------------------------------------------------
-// FactoryAgent interface (declarative model)
-// ---------------------------------------------------------------------------
-
-export interface FactoryAgent {
-  plan(context: AgentContext): Promise<AgentAction[]>;
-}
-
 // ---------------------------------------------------------------------------
 // Message types (for LLM communication)
 // ---------------------------------------------------------------------------