feat(task-runner): Task Runner MCP for orchestrating Claude Code agents #193
…agents

New MCP for managing tasks and spawning Claude Code agents:
- BEADS_INIT: Initialize .beads directory for task tracking
- BEADS_STATUS: Check if beads is initialized
- TASK_LIST, TASK_CREATE, TASK_UPDATE, TASK_DELETE: CRUD for tasks
- SKILL_LIST, SKILL_READ: Read agent skills from filesystem
- AGENT_SPAWN: Spawn Claude Code with safety constraints
- AGENT_STATUS, AGENT_STOP: Monitor and control running agents
- MEMORY_READ, MEMORY_WRITE: Agent memory persistence

Features:
- Session persistence to .beads/sessions.json
- Stale session cleanup on startup
- Tool call and message extraction from stream-json output
- Automatic git commit on task completion
- Memory files for knowledge persistence (MEMORY.md, memory/*.md)

Also updates local-fs with object storage tools for plugin integration.
13 issues found across 41 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="task-runner/server/tools/loop.ts">
<violation number="1" location="task-runner/server/tools/loop.ts:240">
P2: Git commit results are not checked. If `git add` or `git commit` fails (e.g., nothing to commit, missing git config), the failure is silently ignored. Consider checking exit codes and logging warnings on failure.</violation>
</file>
<file name="task-runner/server/tools/agent.ts">
<violation number="1" location="task-runner/server/tools/agent.ts:297">
P2: Sequential reading of stdout then stderr can cause a deadlock if the subprocess writes enough to stderr to fill the pipe buffer while stdout is being processed. Consider reading both streams concurrently using `Promise.all` or merging them into a single stream.</violation>
</file>
<file name="task-runner/server/tools/skills.ts">
<violation number="1" location="task-runner/server/tools/skills.ts:152">
P2: `SKILL_APPLY` declares a `customization` input but never uses it, so `prefix`/`extraContext` are silently ignored. Either remove the option or apply it when creating tasks so callers get the behavior the API promises.</violation>
</file>
<file name="local-fs/package.json">
<violation number="1" location="local-fs/package.json:9">
P2: Bin entry points to a TypeScript source file, which won’t run under Node when installed via npm. Point the bin to the compiled JS output (or provide a JS wrapper) so `local-fs-serve` works without a custom loader.</violation>
</file>
<file name="local-fs/server/http.ts">
<violation number="1" location="local-fs/server/http.ts:310">
P2: Build the displayed MCP URL using an encoded path (query-string form) so paths with spaces or special characters produce a valid URL.</violation>
</file>
<file name="task-runner/server/prompts/safe-agent.ts">
<violation number="1" location="task-runner/server/prompts/safe-agent.ts:93">
P2: The completion protocol omits the mandatory `git push` step required by AGENTS.md, so agents could mark tasks complete without pushing changes.</violation>
</file>
<file name="task-runner/server/sessions.ts">
<violation number="1" location="task-runner/server/sessions.ts:207">
P2: appendOutput rewrites the entire log file on every chunk, which is O(n) per write and can lose data under concurrent appends. Use an append operation instead of read/modify/write.</violation>
</file>
<file name="task-runner/server/tools/memory.ts">
<violation number="1" location="task-runner/server/tools/memory.ts:154">
P1: Validate the `date` parameter before using it in a path. As written, any string is accepted and joined into a file path, which allows path traversal to read files outside `memory/`.</violation>
<violation number="2" location="task-runner/server/tools/memory.ts:196">
P2: `type: "all"` should read all daily logs, but the loop slices to the last 7 files. This contradicts the tool description and silently drops older memory files.</violation>
</file>
<file name="local-fs/server/serve.ts">
<violation number="1" location="local-fs/server/serve.ts:158">
P2: `import.meta.dirname` isn’t a Bun runtime property, so this falls back to `process.cwd()/server` and resolves `http.ts` from the wrong directory when run via `bunx`. Use Bun’s `import.meta.dir` (or derive from `import.meta.url`) so the HTTP server script path is correct.</violation>
</file>
<file name="task-runner/server/tools/workspace.ts">
<violation number="1" location="task-runner/server/tools/workspace.ts:57">
P2: `Bun.file(...).exists()` does not reliably detect directories, so valid workspace directories (and `.beads` folders) can be rejected or reported missing. Use a directory-aware check such as `fs.promises.stat`/`lstat` with `.isDirectory()` instead of `Bun.file().exists()` for directory paths.</violation>
<violation number="2" location="task-runner/server/tools/workspace.ts:125">
P1: `workspaceTools` is exported as an empty array, so `WORKSPACE_SET`/`WORKSPACE_GET` are never registered in the tool list. Add the workspace tool creators so the tools are available at runtime.</violation>
</file>
<file name="local-fs/server/tools.ts">
<violation number="1" location="local-fs/server/tools.ts:1334">
P2: The new BEADS/TASK handlers use Bun-specific APIs (Bun.write/Bun.file) even though local-fs is documented to run via `npx` (Node). In Node the `Bun` global is undefined, so these handlers will fail at runtime. Use `fs/promises` (or the existing storage abstraction) for read/write to keep Node compatibility.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
task-runner/server/tools/memory.ts
Outdated
+ type: z.enum(["daily", "longterm", "recent", "all"]).optional().default("recent").describe(
+ "'daily': Today's log, 'longterm': MEMORY.md, 'recent': Today + yesterday + MEMORY.md, 'all': Everything"
+ ),
+ date: z.string().optional().describe("Specific date for daily log (YYYY-MM-DD)"),
P1: Validate the date parameter before using it in a path. As written, any string is accepted and joined into a file path, which allows path traversal to read files outside memory/.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At task-runner/server/tools/memory.ts, line 154:
<comment>Validate the `date` parameter before using it in a path. As written, any string is accepted and joined into a file path, which allows path traversal to read files outside `memory/`.</comment>
<file context>
@@ -0,0 +1,310 @@
+ type: z.enum(["daily", "longterm", "recent", "all"]).optional().default("recent").describe(
+ "'daily': Today's log, 'longterm': MEMORY.md, 'recent': Today + yesterday + MEMORY.md, 'all': Everything"
+ ),
+ date: z.string().optional().describe("Specific date for daily log (YYYY-MM-DD)"),
+ }),
+ execute: async ({ context }) => {
</file context>
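One way to close the traversal hole is to validate the format first and then confirm that the resolved path still sits inside the memory directory. The sketch below is illustrative only: `resolveDailyLogPath` and `DATE_PATTERN` are hypothetical names that do not appear in this PR, and it assumes daily logs are stored as `<memoryDir>/<YYYY-MM-DD>.md`.

```typescript
import { resolve, sep } from "node:path";

// Hypothetical names; not taken from the PR.
const DATE_PATTERN = /^\d{4}-\d{2}-\d{2}$/;

function resolveDailyLogPath(memoryDir: string, date: string): string {
  // Reject anything that is not exactly YYYY-MM-DD before building a path.
  if (!DATE_PATTERN.test(date)) {
    throw new Error(`Invalid date "${date}": expected YYYY-MM-DD`);
  }
  const root = resolve(memoryDir);
  const candidate = resolve(root, `${date}.md`);
  // Defense in depth: the resolved path must still live under memoryDir.
  if (!candidate.startsWith(root + sep)) {
    throw new Error("Resolved path escapes the memory directory");
  }
  return candidate;
}
```

The same format check could also be expressed in the zod schema (for example with `z.string().regex(...)`), so invalid input is rejected before `execute` runs.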
+// Note: WORKSPACE_SET and WORKSPACE_GET are NOT exposed to agents.
+// The workspace is passed directly to tools like AGENT_SPAWN.
+// The tool creators are already exported above for debugging/admin use.
+export const workspaceTools: ((env: Env) => ReturnType<typeof createWorkspaceSetTool>)[] = [];
P1: workspaceTools is exported as an empty array, so WORKSPACE_SET/WORKSPACE_GET are never registered in the tool list. Add the workspace tool creators so the tools are available at runtime.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At task-runner/server/tools/workspace.ts, line 125:
<comment>`workspaceTools` is exported as an empty array, so `WORKSPACE_SET`/`WORKSPACE_GET` are never registered in the tool list. Add the workspace tool creators so the tools are available at runtime.</comment>
<file context>
@@ -0,0 +1,125 @@
+// Note: WORKSPACE_SET and WORKSPACE_GET are NOT exposed to agents.
+// The workspace is passed directly to tools like AGENT_SPAWN.
+// The tool creators are already exported above for debugging/admin use.
+export const workspaceTools: ((env: Env) => ReturnType<typeof createWorkspaceSetTool>)[] = [];
</file context>
+
+ // Auto-commit after successful completion
+ if (completed) {
+ await runCommand("git", ["add", "-A"], cwd);
P2: Git commit results are not checked. If git add or git commit fails (e.g., nothing to commit, missing git config), the failure is silently ignored. Consider checking exit codes and logging warnings on failure.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At task-runner/server/tools/loop.ts, line 240:
<comment>Git commit results are not checked. If `git add` or `git commit` fails (e.g., nothing to commit, missing git config), the failure is silently ignored. Consider checking exit codes and logging warnings on failure.</comment>
<file context>
@@ -0,0 +1,523 @@
+
+ // Auto-commit after successful completion
+ if (completed) {
+ await runCommand("git", ["add", "-A"], cwd);
+ await runCommand(
+ "git",
</file context>
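A possible shape for the exit-code check is sketched below. The return type of `runCommand` is not visible in this hunk, so the `CommandResult` interface (`exitCode`, `stderr`) is an assumption, and `autoCommit` and `Runner` are hypothetical names used only for illustration.

```typescript
// Assumed result shape; the real runCommand may return something different.
interface CommandResult {
  exitCode: number;
  stderr: string;
}

type Runner = (cmd: string, args: string[], cwd: string) => Promise<CommandResult>;

// Runs `git add -A` then `git commit`, stopping at the first failing step.
// Returns true only when both steps exit with code 0.
async function autoCommit(run: Runner, cwd: string, message: string): Promise<boolean> {
  const steps: string[][] = [
    ["add", "-A"],
    ["commit", "-m", message],
  ];
  for (const args of steps) {
    const result = await run("git", args, cwd);
    if (result.exitCode !== 0) {
      // Log a warning instead of ignoring the failure silently.
      console.warn(`git ${args[0]} failed (exit ${result.exitCode}): ${result.stderr.trim()}`);
      return false;
    }
  }
  return true;
}
```

Passing the runner in as a parameter keeps the sketch testable without a real repository; in the loop it would be called with the existing `runCommand`.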
+ }
+
+ // Capture stderr
+ const stderrReader = proc.stderr.getReader();
P2: Sequential reading of stdout then stderr can cause a deadlock if the subprocess writes enough to stderr to fill the pipe buffer while stdout is being processed. Consider reading both streams concurrently using Promise.all or merging them into a single stream.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At task-runner/server/tools/agent.ts, line 297:
<comment>Sequential reading of stdout then stderr can cause a deadlock if the subprocess writes enough to stderr to fill the pipe buffer while stdout is being processed. Consider reading both streams concurrently using `Promise.all` or merging them into a single stream.</comment>
<file context>
@@ -0,0 +1,577 @@
+ }
+
+ // Capture stderr
+ const stderrReader = proc.stderr.getReader();
+ while (true) {
+ const { done, value } = await stderrReader.read();
</file context>
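The concurrent read the comment suggests could look roughly like this. It is a sketch that assumes `proc.stdout` and `proc.stderr` are web `ReadableStream<Uint8Array>` instances (consistent with the `getReader()` call in the hunk); `readAll` and `collectOutput` are hypothetical helper names.

```typescript
// Drain a single stream into a string.
async function readAll(stream: ReadableStream<Uint8Array>): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let text = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact across chunk boundaries.
    text += decoder.decode(value, { stream: true });
  }
  return text + decoder.decode(); // flush any buffered bytes
}

// Read both pipes at the same time so neither can fill up while the other is drained.
async function collectOutput(proc: {
  stdout: ReadableStream<Uint8Array>;
  stderr: ReadableStream<Uint8Array>;
}): Promise<{ stdout: string; stderr: string }> {
  const [stdout, stderr] = await Promise.all([readAll(proc.stdout), readAll(proc.stderr)]);
  return { stdout, stderr };
}
```

The actual agent code processes stdout incrementally as stream-json, so in practice the stdout branch would keep its line-by-line handling and only the stderr drain would move into a parallel promise.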
+ "Apply a skill to the current workspace. Creates Beads tasks for each user story in the skill.",
+ inputSchema: z.object({
+ skillId: z.string().describe("Skill ID to apply"),
+ customization: z
P2: SKILL_APPLY declares a customization input but never uses it, so prefix/extraContext are silently ignored. Either remove the option or apply it when creating tasks so callers get the behavior the API promises.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At task-runner/server/tools/skills.ts, line 152:
<comment>`SKILL_APPLY` declares a `customization` input but never uses it, so `prefix`/`extraContext` are silently ignored. Either remove the option or apply it when creating tasks so callers get the behavior the API promises.</comment>
<file context>
@@ -0,0 +1,208 @@
+ "Apply a skill to the current workspace. Creates Beads tasks for each user story in the skill.",
+ inputSchema: z.object({
+ skillId: z.string().describe("Skill ID to apply"),
+ customization: z
+ .object({
+ prefix: z.string().optional().describe("Custom task ID prefix"),
</file context>
+ } catch {
+ // File doesn't exist yet
+ }
+ await Bun.write(logPath, existing + chunk);
P2: appendOutput rewrites the entire log file on every chunk, which is O(n) per write and can lose data under concurrent appends. Use an append operation instead of read/modify/write.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At task-runner/server/sessions.ts, line 207:
<comment>appendOutput rewrites the entire log file on every chunk, which is O(n) per write and can lose data under concurrent appends. Use an append operation instead of read/modify/write.</comment>
<file context>
@@ -0,0 +1,436 @@
+ } catch {
+ // File doesn't exist yet
+ }
+ await Bun.write(logPath, existing + chunk);
+}
+
</file context>
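A minimal sketch of the append-based alternative, using `appendFile` from `node:fs/promises` (which Bun also implements). The function keeps the `appendOutput` name from the review comment, but its signature here is an assumption, as is the decision to create the parent directory on demand.

```typescript
import { appendFile, mkdir } from "node:fs/promises";
import { dirname } from "node:path";

// Assumed signature; the real appendOutput may take a session ID instead of a path.
async function appendOutput(logPath: string, chunk: string): Promise<void> {
  // Make sure the log directory exists; a no-op when it already does.
  await mkdir(dirname(logPath), { recursive: true });
  // appendFile creates the file if needed and writes only the new chunk,
  // so the existing contents are never read back or rewritten.
  await appendFile(logPath, chunk, "utf8");
}
```

Because there is no read step, a second call cannot overwrite the file with a stale copy of its contents, and the cost per write no longer grows with the log size.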
task-runner/server/tools/memory.ts
Outdated
+ .filter(f => f.endsWith(".md"))
+ .sort()
+ .reverse(); // Most recent first
+ for (const file of files.slice(0, 7)) { // Last 7 days max
P2: type: "all" should read all daily logs, but the loop slices to the last 7 files. This contradicts the tool description and silently drops older memory files.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At task-runner/server/tools/memory.ts, line 196:
<comment>`type: "all"` should read all daily logs, but the loop slices to the last 7 files. This contradicts the tool description and silently drops older memory files.</comment>
<file context>
@@ -0,0 +1,310 @@
+ .filter(f => f.endsWith(".md"))
+ .sort()
+ .reverse(); // Most recent first
+ for (const file of files.slice(0, 7)) { // Last 7 days max
+ const dailyPath = join(memoryDir, file);
+ results.push({
</file context>
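One way to make the cap explicit rather than hard-coded is to treat the limit as optional, so that `type: "all"` can pass no limit at all. `selectDailyLogs` is a hypothetical helper, and the sketch assumes file names sort chronologically (as `YYYY-MM-DD.md` names do).

```typescript
// Returns daily log file names, most recent first.
// When limit is omitted, every matching file is returned.
function selectDailyLogs(files: string[], limit?: number): string[] {
  const sorted = files
    .filter((f) => f.endsWith(".md"))
    .sort()
    .reverse();
  return limit === undefined ? sorted : sorted.slice(0, limit);
}
```

Under this sketch the `"all"` branch would call `selectDailyLogs(files)`, while any branch that really wants a seven-day window would call `selectDailyLogs(files, 7)`.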
+`);
+
+// Get the directory of this script to find http.ts
+const scriptDir = import.meta.dirname || resolve(process.cwd(), "server");
P2: import.meta.dirname isn’t a Bun runtime property, so this falls back to process.cwd()/server and resolves http.ts from the wrong directory when run via bunx. Use Bun’s import.meta.dir (or derive from import.meta.url) so the HTTP server script path is correct.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At local-fs/server/serve.ts, line 158:
<comment>`import.meta.dirname` isn’t a Bun runtime property, so this falls back to `process.cwd()/server` and resolves `http.ts` from the wrong directory when run via `bunx`. Use Bun’s `import.meta.dir` (or derive from `import.meta.url`) so the HTTP server script path is correct.</comment>
<file context>
@@ -0,0 +1,215 @@
+`);
+
+// Get the directory of this script to find http.ts
+const scriptDir = import.meta.dirname || resolve(process.cwd(), "server");
+const httpScript = resolve(scriptDir, "http.ts");
+
</file context>
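Deriving the directory from `import.meta.url` avoids depending on any runtime-specific property. The sketch below takes the module URL as a parameter so it can be exercised outside an ES module; `resolveSibling` is a hypothetical helper name.

```typescript
import { dirname, resolve } from "node:path";
import { fileURLToPath } from "node:url";

// Resolve a file that sits next to the module identified by moduleUrl.
function resolveSibling(moduleUrl: string, fileName: string): string {
  const scriptDir = dirname(fileURLToPath(moduleUrl));
  return resolve(scriptDir, fileName);
}
```

In `serve.ts` the call would be `resolveSibling(import.meta.url, "http.ts")`, which does not depend on the current working directory.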
+
+ // Check if directory exists
+ const file = Bun.file(directory);
+ const stat = await file.exists();
P2: Bun.file(...).exists() does not reliably detect directories, so valid workspace directories (and .beads folders) can be rejected or reported missing. Use a directory-aware check such as fs.promises.stat/lstat with .isDirectory() instead of Bun.file().exists() for directory paths.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At task-runner/server/tools/workspace.ts, line 57:
<comment>`Bun.file(...).exists()` does not reliably detect directories, so valid workspace directories (and `.beads` folders) can be rejected or reported missing. Use a directory-aware check such as `fs.promises.stat`/`lstat` with `.isDirectory()` instead of `Bun.file().exists()` for directory paths.</comment>
<file context>
@@ -0,0 +1,125 @@
+
+ // Check if directory exists
+ const file = Bun.file(directory);
+ const stat = await file.exists();
+
+ if (!stat) {
</file context>
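A directory-aware check along the lines the comment recommends might look like the following. `isDirectory` is a hypothetical helper; treating every `stat` error as "not a directory" is a simplification that a real implementation may want to refine (for example, by reporting permission errors separately).

```typescript
import { stat } from "node:fs/promises";

// True only when the path exists and is a directory.
async function isDirectory(path: string): Promise<boolean> {
  try {
    return (await stat(path)).isDirectory();
  } catch {
    // Missing path or inaccessible path: report it as not usable.
    return false;
  }
}
```

The same helper would cover both the workspace root and the `.beads` folder check.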
- Add TASK_SET_PLAN tool for AI agents to save generated plans
- Add TASK_APPROVE_PLAN tool for user approval workflow
- Add quality gates detection and management tools
- Add memory system for cross-iteration agent learning
- Remove BEADS_INIT and TASK_* tools from local-fs (moved to task-runner)
- Clean up local-fs to focus on pure file operations only

- SESSION_LAND now returns success=false when required gates fail
- Strengthened prompt language: agent MUST fix ALL gate failures
- Clarified that pre-existing errors are the agent's responsibility
- Updated completion protocol to require gate verification before completion
- Agent cannot mark task complete until allGatesPassed=true

Add two new tools to enable command execution capabilities:
- EXEC: General command execution with background mode support
- DENO_TASK: Convenience wrapper for running deno tasks
Both tools support background mode for long-running processes like dev servers.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Update agent prompts to clarify that:
- Agents should only fix errors in files they modified
- Pre-existing codebase errors are not their responsibility
- They should warn about pre-existing issues but continue with their task
This prevents agents from going off-task trying to fix all lint/check errors in the codebase when the errors existed before their changes.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2 issues found across 2 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="local-fs/server/tools.ts">
<violation number="1" location="local-fs/server/tools.ts:1312">
P0: The new EXEC tool allows arbitrary command execution from MCP clients without any authorization or allowlist, which effectively exposes the host to remote code execution. This is a critical security risk for a filesystem MCP server.</violation>
<violation number="2" location="local-fs/server/tools.ts:1353">
P2: Splitting the command by spaces breaks quoted arguments and commands with spaces, so background EXEC will execute the wrong command for inputs like `node "my script.js"`. Spawn the full command string instead.</violation>
</file>
+ // ============================================================
+
+ // EXEC - Execute a command in the workspace directory
+ server.registerTool(
P0: The new EXEC tool allows arbitrary command execution from MCP clients without any authorization or allowlist, which effectively exposes the host to remote code execution. This is a critical security risk for a filesystem MCP server.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At local-fs/server/tools.ts, line 1312:
<comment>The new EXEC tool allows arbitrary command execution from MCP clients without any authorization or allowlist, which effectively exposes the host to remote code execution. This is a critical security risk for a filesystem MCP server.</comment>
<file context>
@@ -1304,9 +1304,233 @@ export function registerTools(server: McpServer, storage: LocalFileStorage) {
+ // ============================================================
+
+ // EXEC - Execute a command in the workspace directory
+ server.registerTool(
+ "EXEC",
+ {
</file context>
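An allowlist is one of the mitigations the comment mentions. The sketch below shows the general idea under stated assumptions: `ALLOWED_EXECUTABLES`, `ExecRequest`, and `validateExecRequest` are hypothetical, the listed programs are placeholders, and the request shape (executable plus argv array) differs from the single `command` string the tool currently accepts.

```typescript
// Placeholder allowlist; the real set would depend on what the workspace needs.
const ALLOWED_EXECUTABLES: ReadonlySet<string> = new Set(["deno", "bun", "npm", "git"]);

interface ExecRequest {
  executable: string; // program name only, no path and no shell syntax
  args: string[]; // passed as an argv array, never interpolated into a shell string
}

// Throws unless the requested program is explicitly allowed.
function validateExecRequest(req: ExecRequest): ExecRequest {
  if (!ALLOWED_EXECUTABLES.has(req.executable)) {
    throw new Error(`Executable "${req.executable}" is not on the allowlist`);
  }
  return req;
}
```

A validated request could then be handed to `spawn(req.executable, req.args, { shell: false })` from `node:child_process`. This narrows the attack surface but does not remove it, since allowed programs such as `npm` can still run arbitrary scripts; authorization of the calling client remains a separate concern.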
+ const [cmd, ...cmdArgs] = args.command.split(" ");
+ const child = spawn(cmd, cmdArgs, {
+ cwd: storage.root,
+ detached: true,
+ stdio: "ignore",
+ shell: true,
+ });
P2: Splitting the command by spaces breaks quoted arguments and commands with spaces, so background EXEC will execute the wrong command for inputs like node "my script.js". Spawn the full command string instead.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At local-fs/server/tools.ts, line 1353:
<comment>Splitting the command by spaces breaks quoted arguments and commands with spaces, so background EXEC will execute the wrong command for inputs like `node "my script.js"`. Spawn the full command string instead.</comment>
<file context>
@@ -1304,9 +1304,233 @@ export function registerTools(server: McpServer, storage: LocalFileStorage) {
+
+ if (args.background) {
+ // For background processes, spawn detached and return immediately
+ const [cmd, ...cmdArgs] = args.command.split(" ");
+ const child = spawn(cmd, cmdArgs, {
+ cwd: storage.root,
</file context>
- const [cmd, ...cmdArgs] = args.command.split(" ");
- const child = spawn(cmd, cmdArgs, {
- cwd: storage.root,
- detached: true,
- stdio: "ignore",
- shell: true,
- });
+ const child = spawn(args.command, {
+ cwd: storage.root,
+ detached: true,
+ stdio: "ignore",
+ shell: true,
+ });
AGENT_SPAWN now accepts a siteContext parameter that provides:
- isDeco: whether this is a Deco site
- serverUrl: dev server URL if running
- pages: list of available page paths
- decoImports: Deco imports from deno.json
- siteType: type of site
- guidelines: site-specific guidelines
When siteContext.isDeco is true, the agent prompt includes:
- Reference to skills/decocms-landing-pages patterns
- Deco site conventions (page configs, sections, colors)
- Available pages and dev server status
This enables site-builder to provide context when spawning agents.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add QualityGatesBaseline interface to track verified state
- Add QUALITY_GATES_VERIFY tool to run gates and establish baseline
- Add QUALITY_GATES_ACKNOWLEDGE tool to acknowledge pre-existing failures
- Add QUALITY_GATES_BASELINE_GET tool to check current baseline status
- Update agent prompt to differentiate between:
  - All gates passing: agent maintains this state
  - Acknowledged failures: agent ignores pre-existing issues
When failures are acknowledged, agents focus on their assigned task without trying to solve the world by fixing unrelated issues.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="task-runner/server/tools/agent.ts">
<violation number="1" location="task-runner/server/tools/agent.ts:260">
P3: The newly added pre-existing failure instructions conflict with the existing completion protocol (which still says to fix all gate failures, even pre-existing ones). This contradictory guidance can cause agents to ignore the baseline acknowledgement and re-fix known failures.</violation>
</file>
+**IMPORTANT: The following gates were ALREADY FAILING before your task started:**
+${failingGateNames.map((name) => `- ⚠️ ${name} (PRE-EXISTING FAILURE - DO NOT FIX)`).join("\n")}
+
+**Do NOT attempt to fix these pre-existing failures.** The user has acknowledged them.
P3: The newly added pre-existing failure instructions conflict with the existing completion protocol (which still says to fix all gate failures, even pre-existing ones). This contradictory guidance can cause agents to ignore the baseline acknowledgement and re-fix known failures.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At task-runner/server/tools/agent.ts, line 260:
<comment>The newly added pre-existing failure instructions conflict with the existing completion protocol (which still says to fix all gate failures, even pre-existing ones). This contradictory guidance can cause agents to ignore the baseline acknowledgement and re-fix known failures.</comment>
<file context>
@@ -213,26 +233,62 @@ You must verify EACH criterion before completing. Check them off mentally as you
+**IMPORTANT: The following gates were ALREADY FAILING before your task started:**
+${failingGateNames.map((name) => `- ⚠️ ${name} (PRE-EXISTING FAILURE - DO NOT FIX)`).join("\n")}
+
+**Do NOT attempt to fix these pre-existing failures.** The user has acknowledged them.
+Focus ONLY on your assigned task. Do not try to "solve the world."
+
</file context>
Summary
New MCP for managing tasks and spawning Claude Code agents with Beads-based persistence.
Features
Task Management
- BEADS_INIT: Initialize .beads directory for task tracking
- BEADS_STATUS: Check if beads is initialized
- TASK_LIST, TASK_CREATE, TASK_UPDATE, TASK_DELETE: CRUD operations
Agent Control
- AGENT_SPAWN: Spawn Claude Code with safety constraints (no rm -rf, auto-commit)
- AGENT_STATUS: Get detailed status with tool calls and messages
- AGENT_STOP: Stop a running agent
Skills & Memory
- SKILL_LIST, SKILL_READ: Read agent skills from filesystem
- MEMORY_READ, MEMORY_WRITE: Agent memory persistence to MEMORY.md
Architecture
- Session persistence to .beads/sessions.json
- Tool call and message extraction from stream-json output
Also Includes
Updates to local-fs MCP with object storage tools for plugin integration.
Test Plan
- bun run dev in task-runner directory
Summary by cubic
Adds a new Task Runner MCP to orchestrate Claude Code agents with Beads-backed task tracking, plus a Mesh-compatible local filesystem MCP. It enables safe agent execution, task CRUD and planning, memory persistence, enforced quality gates with baseline support, optional Deco site context for site-builder, and mounting local folders via stdio or HTTP.
New Features
Migration
Written for commit d7e3479. Summary will update on new commits.