feat: add local code-node bridge for external-agent dev by vnv-varun · Pull Request #2767 · inkeep/agents

vnv-varun · 2026-03-19T00:45:11Z

Summary

Add a local inkeep code-node bridge for external-agent development so coding tasks can run against a checked-out repo during local dev.

Changes

add a new inkeep code-node CLI command that starts a local A2A-compatible bridge
execute incoming tasks inside a configured workspace with workspace guards, blocking send, streaming, cancelation, changed-file reporting, and optional verification output
allow per-request codeNode.runnerArgs overrides so different callers can use different local runner invocations
add a reference localCodeNodeAgent and document how to point externalAgent() at the local bridge
harden the new code-node-server test suite so it does not depend on ambient global.fetch state from other test files

Why

WS6 Phase 1 is the local execution layer for parallel local dev. This makes it possible to point an external agent at a local repo checkout and get back task output plus file and verification metadata, without waiting for the durable executor work.

Test Plan

pnpm format
pnpm --filter @inkeep/agents-cli test
INKEEP_AGENTS_MANAGE_DATABASE_URL=postgresql://localhost:5432/test INKEEP_AGENTS_RUN_DATABASE_URL=postgresql://localhost:5433/test pnpm --filter @inkeep/agents-api test
manual live-server smoke for blocking send, streaming, cancelation, changed files, verification output, and workspace guard behavior
INKEEP_AGENTS_MANAGE_DATABASE_URL=postgresql://localhost:5432/test INKEEP_AGENTS_RUN_DATABASE_URL=postgresql://localhost:5433/test pnpm check

Notes

This PR is the agents side of the WS6 local-dev flow. The /ship integration lives in team-skills as a separate companion change.

The final monorepo pnpm check was attempted multiple times, but this worktree hit unrelated flaky failures outside this diff in packages/agents-core and agents-manage-ui.

vercel · 2026-03-19T00:45:12Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agents-api	Ready	Preview, Comment	Mar 19, 2026 0:49am
agents-docs	Ready	Preview, Comment	Mar 19, 2026 0:49am
agents-manage-ui	Ready	Preview, Comment	Mar 19, 2026 0:49am

changeset-bot · 2026-03-19T00:45:15Z

🦋 Changeset detected

Latest commit: 58c8005

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 10 packages

Name	Type
@inkeep/agents-cli	Patch
@inkeep/agents-api	Patch
@inkeep/agents-manage-ui	Patch
@inkeep/agents-core	Patch
@inkeep/agents-email	Patch
@inkeep/agents-mcp	Patch
@inkeep/agents-sdk	Patch
@inkeep/agents-work-apps	Patch
@inkeep/ai-sdk-provider	Patch
@inkeep/create-agents	Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

pullfrog

Medium-priority — solid feature addition with a few issues worth addressing before merge.

The code-node bridge is well-structured: clean A2A compliance, good test coverage (blocking, streaming, cancel, workspace guard), and proper shutdown semantics. The main concerns are a default port collision with OTLP, a zsh fallback that will fail on most CI/server environments, unbounded growth of the taskRecords map, and per-request runnerArgs overrides accepting arbitrary argument injection from callers.

^{｜ Fix all ➔ ｜ Fix 👍s ➔ ｜ View workflow run ｜ pullfrog.com ｜ 𝕏}

pullfrog · 2026-03-19T00:48:22Z

agents-cli/src/index.ts

+  .description('Start a local A2A coding-agent node for local development')
+  .requiredOption('--workspace <path>', 'Workspace root the node may access')
+  .option('--host <host>', 'Host to bind the local server to', '127.0.0.1')
+  .option('--port <port>', 'Port to run the local server on', '4318')


Port 4318 is the well-known OTLP/HTTP port. The SDK docs (agents-docs/content/get-started/traces.mdx) and packages/agents-sdk/src/telemetry-provider.ts both reference localhost:4318 as the OTEL collector endpoint. Running inkeep code-node on the same machine as a local SigNoz/OTEL collector will produce a silent port conflict.

Consider a different default (e.g., 4320 or 14318) to avoid the collision.

Suggested change

.option('--port <port>', 'Port to run the local server on', '4318')

.option('--port <port>', 'Port to run the local server on', '4320')

pullfrog · 2026-03-19T00:48:22Z

agents-cli/src/utils/code-node-server.ts

+    verification = {
+      command: options.verificationCommand,
+      ...(await runCommand(
+        process.env.SHELL || 'zsh',


zsh as a fallback is fragile — most CI containers, minimal Docker images, and many Linux distributions don't ship zsh. POSIX sh (or /bin/sh) is universally available.

Suggested change

process.env.SHELL || 'zsh',

process.env.SHELL || '/bin/sh',

pullfrog · 2026-03-19T00:48:22Z

agents-cli/src/utils/code-node-server.ts

+  const beforeStatus = snapshotGitStatus(workspace);
+  const rawPrompt = getPromptText(params.message);
+  const prompt = buildPrompt(rawPrompt, workspace, mode);
+  const runnerArgs = metadata.runnerArgs?.length ? metadata.runnerArgs : options.runnerArgs;


Per-request runnerArgs overrides from untrusted callers let anyone inject arbitrary arguments into the spawned process. Since the runnerCommand is controlled by the server operator, the impact is limited — but a caller could still pass args that change behavior in unexpected ways (e.g., --dangerously-skip-permissions for Claude Code). If this is intentional, a brief code comment explaining the trust model would help future readers; otherwise, consider dropping per-request arg overrides or gating them behind an opt-in flag.

pullfrog · 2026-03-19T00:48:22Z

agents-cli/src/utils/code-node-server.ts

+export async function startCodeNodeServer(
+  options: CodeNodeServerOptions
+): Promise<StartedCodeNodeServer> {
+  const taskRecords = new Map<string, TaskRecord>();


taskRecords is never pruned — completed tasks accumulate indefinitely. For a long-running dev server this is a slow memory leak. Consider evicting terminal-state tasks after a TTL (e.g., 5 minutes) or capping the map size.

pullfrog · 2026-03-19T00:48:22Z

agents-cli/src/utils/code-node-server.ts

+  let stdout = '';
+  let stderr = '';
+  const timeout = setTimeout(() => {
+    record.canceled = true;
+    terminateChild(child);
+  }, timeoutMs);


stdout/stderr are accumulated without any size bound. A pathological runner could produce gigabytes of output, exhausting process memory. Consider capping the accumulated buffer size (e.g., to MAX_OUTPUT_CHARS) and discarding excess.

pullfrog · 2026-03-19T00:48:22Z

agents-cli/src/utils/code-node-server.ts

+    throw new Error('Write mode requested, but this node was started without write access');
+  }
+
+  const timeoutMs = metadata.timeoutMs ?? options.defaultTimeoutMs;


metadata.timeoutMs comes from the untrusted request. A caller can set timeoutMs: Number.MAX_SAFE_INTEGER to effectively disable the timeout. Consider clamping to a server-side maximum (e.g., Math.min(metadata.timeoutMs, options.defaultTimeoutMs * 10)).

pullfrog · 2026-03-19T00:48:22Z

agents-cli/src/utils/code-node-server.ts

+      )
+    : root;
+
+  if (resolved !== root && !resolved.startsWith(`${root}${path.sep}`)) {


The startsWith check uses path.resolve which follows symlinks. A symlink inside the workspace root that points outside it would pass this check. Since this is a local-dev tool the risk is low, but worth noting for the trust model.

pullfrog · 2026-03-19T00:48:22Z

agents-docs/content/typescript-sdk/external-agents.mdx

+  id: "local-code-node",
+  name: "Local Code Node",
+  description: "Delegates coding tasks into a local workspace during development",
+  baseUrl: "http://127.0.0.1:4318",


If the default port changes (see port-conflict comment), update this to match.

pullfrog · 2026-03-19T00:48:22Z

test-agents/agents/local-code-node-agent.ts

+  id: 'local-code-node',
+  name: 'Local Code Node',
+  description: 'Delegates coding tasks into a locally running code-node bridge',
+  baseUrl: 'http://127.0.0.1:4318',


Same as above — keep this in sync if the default port changes.

pullfrog · 2026-03-19T00:48:56Z

TL;DR — Adds an experimental inkeep code-node CLI command that starts a local A2A-compliant HTTP server, bridging Inkeep agents to a local CLI runner (defaulting to claude) for delegating coding tasks against a checked-out repository. The server handles task lifecycle, git change detection, optional verification commands, and reports structured results back through the A2A protocol.

Key changes

code-node-server.ts — New ~690-line A2A HTTP server that spawns a configurable CLI runner, tracks git diffs before/after execution, runs optional verification commands, and exposes message/send, message/stream, tasks/get, and tasks/cancel JSON-RPC methods alongside an Agent Card endpoint.
code-node.ts command + CLI registration — Wires the server into the CLI as inkeep code-node with options for --workspace, --runner-command, --runner-arg, --verification-command, --write, and more.
code-node-server.test.ts — Comprehensive test suite covering blocking/non-blocking sends, SSE streaming, workspace security validation, task cancellation, git change detection, and verification output capture.
local-code-node-agent.ts — Reference test agent demonstrating how to register the local bridge as an external agent via externalAgent() from the SDK.
external-agents.mdx + README.md — Documentation for the new command and its integration pattern.

_{Summary ｜ 10 files ｜ 3 commits ｜ base: main ← varun/ws6-phase1-claude-code-node}

`code-node-server.ts` — A2A bridge server

Before: No way to delegate coding tasks from Inkeep agents to a local CLI tool.
After: A full A2A-compliant HTTP server spawns a configurable runner (e.g. claude -p {prompt}), captures stdout/stderr, diffs git status before and after execution, optionally runs a verification command, and returns structured artifacts with changedFiles, exit codes, and verification output.

The server supports three execution paths:

Method	Behavior
`message/send` (blocking)	Awaits runner completion, returns final task
`message/send` (non-blocking)	Returns immediately in `working` state, poll via `tasks/get`
`message/stream`	Streams `status-update` and `artifact-update` SSE events

Security is enforced by resolveWorkspace(), which validates that any per-request workspace override resolves within the configured root — preventing directory traversal. The ExecutionMode type (read / write) gates whether the runner is allowed to modify files, controlled by the --write flag or per-request metadata.

How does per-request metadata work?

Callers can pass a codeNode key in message or params metadata to override workspace, timeoutMs, mode, and runnerArgs per request. The getMetadata() function merges params-level and message-level metadata, with message-level taking precedence.

agents-cli/src/utils/code-node-server.ts · agents-cli/src/commands/code-node.ts · agents-cli/src/index.ts

`code-node-server.test.ts` — test suite

Before: No test coverage for code-node functionality.
After: Five tests validate the full A2A lifecycle using mocked child processes and real temp directories.

Tests cover: blocking message/send with artifact validation, write-mode git diff detection with verification output, workspace-escape rejection, non-blocking send with tasks/cancel + child process kill, and SSE message/stream event shape. Child process mocking uses an EventEmitter-based spawn mock with configurable MockSpawnBehavior and queued spawnSync responses for git status.

agents-cli/src/__tests__/code-node-server.test.ts · agents-cli/src/__tests__/cli.test.ts

Reference agent + documentation

Before: No example of how to use a local code-node bridge with the agent framework.
After: A localCodeNodeAgent test agent shows the externalAgent() → subAgent() → agent() wiring, and docs explain the CLI flags and SDK registration.

The test agent defines three layers: an externalAgent pointing at http://127.0.0.1:4318, a coordinator subAgent that routes coding tasks to it, and a top-level agent that uses the coordinator as its default. Documentation is added to both the external-agents MDX page and the root README.

test-agents/agents/local-code-node-agent.ts · agents-docs/content/typescript-sdk/external-agents.mdx · README.md

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ pullfrog.com ｜ 𝕏}

claude

PR Review Summary

(7) Total Issues | Risk: Medium

🟠⚠️ Major (4) 🟠⚠️

Inline Comments:

🟠 Major: code-node-server.ts:473 Task records Map grows unbounded without cleanup
🟠 Major: code-node-server.ts:85-105 Path traversal via symlinks can bypass workspace guard
🟠 Major: code-node-server.ts:241-248 Request body reading has no size limit
🟠 Major: code-node-server.test.ts:188-444 Missing test coverage for write mode enforcement and failure paths

🟡 Minor (3) 🟡

Inline Comments:

🟡 Minor: code-node-server.ts:353 User-supplied timeoutMs lacks validation bounds
🟡 Minor: code-node-server.ts:370-382 stdout/stderr buffers grow unbounded during execution

🟡 1) code-node-server.ts:598-618 Missing server-side error logging

Issue: Errors during task execution (both streaming and non-blocking modes) are not logged server-side. Errors are only sent to clients or update task state silently.
Why: For incident debugging, operators need server-side visibility into failures. If client connections are lost, failures become invisible.
Fix: Add console.error('Task failed:', taskId, error) in error handlers at lines 598-607 and 616-618.
Refs: code-node-server.ts:598-618

💭 Consider (4) 💭

💭 1) index.ts:38-58 Command registration position
Issue: code-node is registered at the top of the command list, before lifecycle commands like add, init, push.
Fix: Consider moving near the dev command (~line 203) to group development utilities together.

💭 2) code-node.ts:55-68 Output format lacks emoji prefix
Issue: Other server commands like dev use emoji prefixes (e.g., 🚀). code-node uses plain chalk output.
Fix: Add emoji for visual consistency: console.log(chalk.green('🚀 Local code node listening at...')).

💭 3) code-node.ts:70-80 Shutdown handler lacks feedback
Issue: The dev command logs "🛑 Stopping dashboard server..." during shutdown. code-node exits silently.
Fix: Add shutdown message before await started.close().

💭 4) external-agents.mdx:56 Missing CLI options in docs
Issue: The --timeout-ms flag (default 2 min) is not documented. Users with long-running tasks will hit timeouts without knowing the option exists.
Fix: Add note: "Use --timeout-ms for tasks longer than 2 minutes. See inkeep code-node --help for all options."

💡 APPROVE WITH SUGGESTIONS

Summary: This is a well-structured implementation of a local A2A bridge with good test coverage for the happy paths. However, there are several resource management and security concerns that should be addressed:

Memory leak from unbounded task record storage needs cleanup logic
Path traversal via symlinks could bypass the workspace security boundary
Unbounded request body could enable DoS
Missing failure path tests for write-mode enforcement and runner failures

The fixes are straightforward and the PR is close to ready. The architectural approach is sound — the concerns are implementation details rather than design issues.

Discarded (5)

Location	Issue	Reason Discarded
`code-node-server.ts:357`	runnerArgs override from request	Intentional per PR description ("Allow per-request code-node runner args")
`code-node-server.ts:412-422`	Verification shares timeout with main task	Acceptable for local dev; separate timeout would over-complicate
`code-node-server.ts:542-557`	Race condition between cancel and completion	Minor race with safe fallback; acceptable for local dev
`index.ts:42`	Host 127.0.0.1 vs localhost	Functionally equivalent; 127.0.0.1 is more explicit
`index.ts:43`	Port 4318 conflicts with OTEL	Valid concern but 4318 is common for local dev servers

Reviewers (5)

Reviewer	Returned	Main Findings	Consider	Inline Comments	Discarded
`pr-review-standards`	5	0	0	4	1
`pr-review-sre`	8	1	0	2	5
`pr-review-tests`	5	0	0	1	4
`pr-review-consistency`	6	0	3	0	3
`pr-review-docs`	2	0	1	0	1
Total	26	1	4	7	14

claude · 2026-03-19T00:49:35Z

agents-cli/src/utils/code-node-server.ts

+export async function startCodeNodeServer(
+  options: CodeNodeServerOptions
+): Promise<StartedCodeNodeServer> {
+  const taskRecords = new Map<string, TaskRecord>();


🟠 MAJOR: Task records Map grows unbounded

Issue: The taskRecords Map stores all tasks indefinitely with no cleanup. Each message/send or message/stream creates a new task record that is never removed, even after completion.

Why: In long-running deployments or under sustained load, this causes memory growth proportional to task count. Each record contains the full Task object with artifacts, history, and metadata.

Fix: Add TTL-based cleanup for completed tasks. Options:

Periodic interval that removes terminal-state tasks older than a TTL (e.g., 1 hour)

LRU cache with max size limit

Clean up immediately on task completion with optional retention for debugging

// Option 1: Periodic cleanup const TASK_RETENTION_MS = 60 * 60 * 1000; // 1 hour setInterval(() => { const now = Date.now(); for (const [id, record] of taskRecords) { if (isTerminalState(record.task.status.state)) { const timestamp = new Date(record.task.status.timestamp).getTime(); if (now - timestamp > TASK_RETENTION_MS) { taskRecords.delete(id); } } } }, 60_000).unref();

Refs:

Node.js memory management best practices

claude · 2026-03-19T00:49:43Z

agents-cli/src/utils/code-node-server.ts

+function resolveWorkspace(workspaceRoot: string, requestedWorkspace?: string): string {
+  const root = path.resolve(workspaceRoot);
+  const resolved = requestedWorkspace
+    ? path.resolve(
+        path.isAbsolute(requestedWorkspace)
+          ? requestedWorkspace
+          : path.join(root, requestedWorkspace)
+      )
+    : root;
+
+  if (resolved !== root && !resolved.startsWith(`${root}${path.sep}`)) {
+    throw new Error(`Workspace must be within ${root}`);
+  }
+
+  const stats = statSync(resolved, { throwIfNoEntry: false });
+  if (!stats || !stats.isDirectory()) {
+    throw new Error(`Workspace does not exist or is not a directory: ${resolved}`);
+  }
+
+  return resolved;
+}


🟠 MAJOR: Path traversal via symlinks can bypass workspace guard

Issue: The resolveWorkspace function uses path.resolve() which does not resolve symlinks. An attacker could create a symlink inside the workspace root that points outside it, then request that symlinked path as the workspace.

Why: The string-based containment check on line 95 would pass because the resolved path starts with the root, but the actual filesystem target could be outside the allowed workspace. This bypasses the security boundary intended to restrict file access.

Fix: Use fs.realpathSync() to resolve symlinks before performing the path containment check:

import { realpathSync } from 'node:fs'; function resolveWorkspace(workspaceRoot: string, requestedWorkspace?: string): string { const root = realpathSync(path.resolve(workspaceRoot)); const resolved = requestedWorkspace ? path.resolve( path.isAbsolute(requestedWorkspace) ? requestedWorkspace : path.join(root, requestedWorkspace) ) : root; // Resolve symlinks in the requested path too let realResolved: string; try { realResolved = realpathSync(resolved); } catch { throw new Error(`Workspace does not exist: ${resolved}`); } if (realResolved !== root && !realResolved.startsWith(`${root}${path.sep}`)) { throw new Error(`Workspace must be within ${root}`); } // ... rest of function }

Refs:

OWASP Path Traversal

claude · 2026-03-19T00:49:50Z

agents-cli/src/utils/code-node-server.ts

+async function readBody(request: IncomingMessage): Promise<string> {
+  return await new Promise((resolve, reject) => {
+    const chunks: Buffer[] = [];
+    request.on('data', (chunk) => chunks.push(Buffer.from(chunk)));
+    request.on('end', () => resolve(Buffer.concat(chunks).toString('utf8')));
+    request.on('error', reject);
+  });
+}


🟠 MAJOR: Request body reading has no size limit

Issue: The readBody function accumulates all request data into memory without a size limit. A malicious or buggy client could send an arbitrarily large payload, causing memory exhaustion before JSON parsing even attempts.

Why: While the server binds to 127.0.0.1 by default, it could be misconfigured to bind publicly. Even locally, a runaway script could DoS the server.

Fix: Add a maximum body size check and abort if exceeded:

const MAX_BODY_SIZE = 10 * 1024 * 1024; // 10MB async function readBody(request: IncomingMessage): Promise<string> { return await new Promise((resolve, reject) => { const chunks: Buffer[] = []; let totalSize = 0; request.on('data', (chunk) => { totalSize += chunk.length; if (totalSize > MAX_BODY_SIZE) { request.destroy(new Error('Request body too large')); reject(new Error('Request body too large')); return; } chunks.push(Buffer.from(chunk)); }); request.on('end', () => resolve(Buffer.concat(chunks).toString('utf8'))); request.on('error', reject); }); }

Then handle the error in the request handler to return 413 Payload Too Large.

claude · 2026-03-19T00:49:55Z

agents-cli/src/utils/code-node-server.ts

+    throw new Error('Write mode requested, but this node was started without write access');
+  }
+
+  const timeoutMs = metadata.timeoutMs ?? options.defaultTimeoutMs;


🟡 Minor: User-supplied timeoutMs lacks validation bounds

Issue: The timeoutMs value from request metadata is used directly without validation. A client could pass timeoutMs: 0 (immediate cancellation), negative values (undefined setTimeout behavior), or extremely large values (disabling timeout protection).

Why: Could lead to DoS via hung processes or unexpected immediate task cancellation.

Fix:

Suggested change

const timeoutMs = metadata.timeoutMs ?? options.defaultTimeoutMs;

const timeoutMs = Math.max(1000, Math.min(metadata.timeoutMs ?? options.defaultTimeoutMs, 86400000));

This bounds timeout to 1 second minimum, 24 hours maximum.

claude · 2026-03-19T00:50:00Z

agents-cli/src/utils/code-node-server.ts

+  let stdout = '';
+  let stderr = '';
+  const timeout = setTimeout(() => {
+    record.canceled = true;
+    terminateChild(child);
+  }, timeoutMs);
+
+  child.stdout.on('data', (chunk) => {
+    stdout += chunk.toString();
+  });
+  child.stderr.on('data', (chunk) => {
+    stderr += chunk.toString();
+  });


🟡 Minor: stdout/stderr buffers grow unbounded during execution

Issue: The stdout and stderr variables accumulate all output from the child process without size limit. A misbehaving runner could produce gigabytes of output, leading to memory exhaustion.

Why: While output is truncated to MAX_OUTPUT_CHARS before being sent in the artifact, the full output is buffered in memory first.

Fix: Track buffer size and stop accumulating once a threshold is reached:

const MAX_BUFFER_SIZE = MAX_OUTPUT_CHARS * 2; // 40KB child.stdout.on('data', (chunk) => { if (stdout.length < MAX_BUFFER_SIZE) { stdout += chunk.toString().slice(0, MAX_BUFFER_SIZE - stdout.length); } }); child.stderr.on('data', (chunk) => { if (stderr.length < MAX_BUFFER_SIZE) { stderr += chunk.toString().slice(0, MAX_BUFFER_SIZE - stderr.length); } });

claude · 2026-03-19T00:50:09Z

agents-cli/src/__tests__/code-node-server.test.ts

+describe('code node server', () => {
+  it('serves an agent card and handles blocking message/send', async () => {
+    const workspace = await makeWorkspace();
+    mockChildProcessState.gitStatuses.push('', '');
+    mockChildProcessState.spawnBehaviors.push({ stdout: 'Runner finished' });
+
+    const started = await startCodeNodeServer({
+      host: '127.0.0.1',
+      port: 0,
+      workspaceRoot: workspace,
+      runnerCommand: 'node',
+      runnerArgs: ['-e', 'process.stdout.write(process.argv[1])', '{prompt}'],
+      allowWrite: false,
+      defaultTimeoutMs: 5_000,
+      name: 'Test Code Node',
+      description: 'Test bridge',
+      version: 'test',
+    });
+    startedServers.push(started);
+
+    const agentCard = JSON.parse(
+      (await requestText(`${started.baseUrl}/.well-known/agent.json`)).body
+    );
+    expect(agentCard.url).toBe(`${started.baseUrl}/a2a`);
+    expect(agentCard.skills[0].id).toBe('local-code-execution');
+
+    const result = await postJson(`${started.baseUrl}/a2a`, {
+      jsonrpc: '2.0',
+      id: 1,
+      method: 'message/send',
+      params: {
+        message: {
+          kind: 'message',
+          messageId: 'msg-1',
+          role: 'user',
+          contextId: 'ctx-1',
+          parts: [{ kind: 'text', text: 'Say hello' }],
+          metadata: {
+            codeNode: {
+              runnerArgs: ['-e', 'process.stdout.write(process.argv[1])', 'override {prompt}'],
+            },
+          },
+        },
+        configuration: {
+          blocking: true,
+          acceptedOutputModes: ['text'],
+        },
+      },
+    });
+
+    expect(result.result.status.state).toBe('completed');
+    expect(result.result.artifacts[0].parts[0].text).toContain('Runner finished');
+    expect(mockChildProcessState.spawnCalls[0].command).toBe('node');
+    expect(mockChildProcessState.spawnCalls[0].args[2]).toContain('override');
+    expect(mockChildProcessState.spawnCalls[0].args[2]).toContain('Task:\n\nSay hello');
+  });
+
+  it('includes changed files and verification output in a successful write run', async () => {
+    const workspace = await makeWorkspace();
+    mockChildProcessState.gitStatuses.push('', '?? generated.txt\n');
+    mockChildProcessState.spawnBehaviors.push({ stdout: 'done' }, { stdout: 'lint ok' });
+
+    const started = await startCodeNodeServer({
+      host: '127.0.0.1',
+      port: 0,
+      workspaceRoot: workspace,
+      runnerCommand: 'node',
+      runnerArgs: ['-e', "process.stdout.write('done')"],
+      verificationCommand: 'pnpm lint',
+      allowWrite: true,
+      defaultTimeoutMs: 5_000,
+      name: 'Writer Node',
+      description: 'Writes files',
+      version: 'test',
+    });
+    startedServers.push(started);
+
+    const result = await postJson(`${started.baseUrl}/a2a`, {
+      jsonrpc: '2.0',
+      id: 2,
+      method: 'message/send',
+      params: {
+        message: {
+          kind: 'message',
+          messageId: 'msg-2',
+          role: 'user',
+          contextId: 'ctx-2',
+          parts: [{ kind: 'text', text: 'Create a file' }],
+          metadata: { codeNode: { mode: 'write' } },
+        },
+        configuration: {
+          blocking: true,
+          acceptedOutputModes: ['text'],
+        },
+      },
+    });
+
+    const dataPart = result.result.artifacts[0].parts.find((part: any) => part.kind === 'data');
+    expect(dataPart.data.changedFiles).toEqual(['generated.txt']);
+    expect(dataPart.data.verification.command).toBe('pnpm lint');
+    expect(dataPart.data.verification.stdout).toContain('lint ok');
+    expect(mockChildProcessState.spawnCalls[1].args).toEqual(['-lc', 'pnpm lint']);
+  });
+
+  it('rejects workspaces outside the configured root', async () => {
+    const workspace = await makeWorkspace();
+    const outside = await makeWorkspace();
+    mockChildProcessState.gitStatuses.push('', '');
+
+    const started = await startCodeNodeServer({
+      host: '127.0.0.1',
+      port: 0,
+      workspaceRoot: workspace,
+      runnerCommand: 'node',
+      runnerArgs: ['-e', "process.stdout.write('ok')"],
+      allowWrite: false,
+      defaultTimeoutMs: 5_000,
+      name: 'Guarded Node',
+      description: 'Guards workspace',
+      version: 'test',
+    });
+    startedServers.push(started);
+
+    const result = await postJson(`${started.baseUrl}/a2a`, {
+      jsonrpc: '2.0',
+      id: 3,
+      method: 'message/send',
+      params: {
+        message: {
+          kind: 'message',
+          messageId: 'msg-3',
+          role: 'user',
+          contextId: 'ctx-3',
+          parts: [{ kind: 'text', text: 'Outside root' }],
+          metadata: { codeNode: { workspace: outside } },
+        },
+        configuration: {
+          blocking: true,
+          acceptedOutputModes: ['text'],
+        },
+      },
+    });
+
+    expect(result.result.status.state).toBe('failed');
+    expect(result.result.status.message.parts[0].text).toContain('Workspace must be within');
+    expect(mockChildProcessState.spawnCalls).toHaveLength(0);
+  });
+
+  it('supports non-blocking send plus cancel', async () => {
+    const workspace = await makeWorkspace();
+    mockChildProcessState.gitStatuses.push('', '');
+    mockChildProcessState.spawnBehaviors.push({ autoClose: false });
+
+    const started = await startCodeNodeServer({
+      host: '127.0.0.1',
+      port: 0,
+      workspaceRoot: workspace,
+      runnerCommand: 'node',
+      runnerArgs: ['-e', "setTimeout(() => process.stdout.write('done'), 60000)"],
+      allowWrite: false,
+      defaultTimeoutMs: 60_000,
+      name: 'Cancelable Node',
+      description: 'Cancelable run',
+      version: 'test',
+    });
+    startedServers.push(started);
+
+    const startResult = await postJson(`${started.baseUrl}/a2a`, {
+      jsonrpc: '2.0',
+      id: 4,
+      method: 'message/send',
+      params: {
+        message: {
+          kind: 'message',
+          messageId: 'msg-4',
+          role: 'user',
+          contextId: 'ctx-4',
+          parts: [{ kind: 'text', text: 'Long task' }],
+        },
+        configuration: {
+          blocking: false,
+          acceptedOutputModes: ['text'],
+        },
+      },
+    });
+
+    expect(startResult.result.status.state).toBe('working');
+    const taskId = startResult.result.id;
+
+    const cancelResult = await postJson(`${started.baseUrl}/a2a`, {
+      jsonrpc: '2.0',
+      id: 5,
+      method: 'tasks/cancel',
+      params: { id: taskId },
+    });
+
+    expect(cancelResult.result.status.state).toBe('canceled');
+    expect(mockChildProcessState.spawnCalls[0].child.kill).toHaveBeenCalledWith('SIGTERM');
+
+    await new Promise((resolve) => setTimeout(resolve, 0));
+    const taskResult = await postJson(`${started.baseUrl}/a2a`, {
+      jsonrpc: '2.0',
+      id: 6,
+      method: 'tasks/get',
+      params: { id: taskId },
+    });
+
+    expect(taskResult.result.status.state).toBe('canceled');
+  });
+
+  it('streams status and artifact events for message/stream', async () => {
+    const workspace = await makeWorkspace();
+    mockChildProcessState.gitStatuses.push('', '');
+    mockChildProcessState.spawnBehaviors.push({ stdout: 'streamed result' });
+
+    const started = await startCodeNodeServer({
+      host: '127.0.0.1',
+      port: 0,
+      workspaceRoot: workspace,
+      runnerCommand: 'node',
+      runnerArgs: ['-e', "process.stdout.write('streamed result')"],
+      allowWrite: false,
+      defaultTimeoutMs: 5_000,
+      name: 'Streaming Node',
+      description: 'Streams progress',
+      version: 'test',
+    });
+    startedServers.push(started);
+
+    const response = await requestText(`${started.baseUrl}/a2a`, {
+      method: 'POST',
+      headers: { 'Content-Type': 'application/json' },
+      body: JSON.stringify({
+        jsonrpc: '2.0',
+        id: 7,
+        method: 'message/stream',
+        params: {
+          message: {
+            kind: 'message',
+            messageId: 'msg-7',
+            role: 'user',
+            contextId: 'ctx-7',
+            parts: [{ kind: 'text', text: 'Stream this' }],
+          },
+          configuration: {
+            acceptedOutputModes: ['text'],
+          },
+        },
+      }),
+    });
+
+    expect(response.headers['content-type']).toContain('text/event-stream');
+    expect(response.body).toContain('"kind":"status-update"');
+    expect(response.body).toContain('"kind":"artifact-update"');
+    expect(response.body).toContain('"state":"completed"');
+  });
+});


🟠 MAJOR: Missing test coverage for critical failure paths

Issue: Several important code paths lack test coverage:

Write mode enforcement (security-critical): code-node-server.ts:349-351 rejects write mode when allowWrite: false, but no test verifies this.

Runner failure handling: code-node-server.ts:458-463 transitions to Failed state on non-zero exit, but all tests exercise success paths only.

Verification command failure: Tests only cover verification success, not what happens when lint/test commands fail.

Why: Without coverage, regressions could allow unauthorized file modifications or incorrectly report failed executions as completed.

Fix: Add these test cases:

it('rejects write mode when allowWrite is false', async () => { const started = await startCodeNodeServer({ // ... allowWrite: false }); const result = await postJson(`${started.baseUrl}/a2a`, { // ... metadata: { codeNode: { mode: 'write' } } }); expect(result.result.status.state).toBe('failed'); expect(result.result.status.message.parts[0].text).toContain('Write mode requested'); }); it('reports failed state when runner exits non-zero', async () => { mockChildProcessState.spawnBehaviors.push({ exitCode: 1, stderr: 'error' }); // ... assert status.state === 'failed' });

vnv-varun added 3 commits March 18, 2026 00:50

Add local code-node bridge for external-agent dev

7f66ad5

Allow per-request code-node runner args

b5d1fe2

Fix code-node test isolation

58c8005

vercel bot deployed to Preview – agents-api March 19, 2026 00:47 View deployment

vercel bot deployed to Preview – agents-docs March 19, 2026 00:47 View deployment

pullfrog bot reviewed Mar 19, 2026

View reviewed changes

vercel bot deployed to Preview – agents-manage-ui March 19, 2026 00:49 View deployment

claude bot reviewed Mar 19, 2026

View reviewed changes

github-actions bot deleted a comment from claude bot Mar 19, 2026

	.option('--port <port>', 'Port to run the local server on', '4318')
	.option('--port <port>', 'Port to run the local server on', '4320')

	const timeoutMs = metadata.timeoutMs ?? options.defaultTimeoutMs;
	const timeoutMs = Math.max(1000, Math.min(metadata.timeoutMs ?? options.defaultTimeoutMs, 86400000));

Conversation

vnv-varun commented Mar 19, 2026

Summary

Changes

Why

Test Plan

Notes

Uh oh!

vercel bot commented Mar 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

changeset-bot bot commented Mar 19, 2026

🦋 Changeset detected

Uh oh!

pullfrog bot left a comment

Choose a reason for hiding this comment

Uh oh!

pullfrog bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog bot commented Mar 19, 2026

Key changes

code-node-server.ts — A2A bridge server

code-node-server.test.ts — test suite

Reference agent + documentation

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

PR Review Summary

🟠⚠️ Major (4) 🟠⚠️

🟡 Minor (3) 🟡

💭 Consider (4) 💭

💡 APPROVE WITH SUGGESTIONS

Uh oh!

claude bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot Mar 19, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

vercel bot commented Mar 19, 2026 •

edited

Loading

`code-node-server.ts` — A2A bridge server

`code-node-server.test.ts` — test suite