Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
47 changes: 46 additions & 1 deletion src/product/generation/master-workflow-renderer.ts
Original file line number Diff line number Diff line change
Expand Up @@ -338,11 +338,25 @@ function renderChildRunStep(child: ChildWorkflowPlan): string[] {
];
}

function childWorkflowSource(child: ChildWorkflowPlan, spec: NormalizedWorkflowSpec): string {
export function childWorkflowSource(child: ChildWorkflowPlan, spec: NormalizedWorkflowSpec): string {
const artifactsDir = child.signoffArtifactPath.replace(/\/signoff\.md$/, '');
const validationCommand = child.validationCommands[0] ?? 'npm run typecheck';
const targetScope = child.targetFiles.length > 0 ? child.targetFiles.join(' ') : 'NO_TARGET_FILES_DECLARED';
const marker = child.signoffMarker;
// Injected into every review/fix task. The master executor runs all child
// slices in one shared checkout, so by the time a later child is reviewed
// the worktree already contains earlier siblings' dirty files. Reviewers
// were assigning BLOCKED and fix-loops were writing BLOCKED_NO_COMMIT.md
// purely because `git status` showed sibling files outside this slice's
// declared scope — a false block that stalls the whole master plan.
const sharedWorktreeScopeRule =
`Shared-worktree scope rule: this child runs in the SAME checkout as sibling slices. `
+ `${artifactsDir}/scope-baseline.txt is the dirty set captured before this child started. `
+ `Files in that baseline are pre-existing sibling/parent changes — they are NOT this child's `
+ `responsibility, NOT scope violations, and must NOT be reverted, cleaned, or counted against `
+ `this slice. Judge scope, findings, and BLOCKED only on the delta this child introduces on top `
+ `of the baseline (current 'git status --porcelain' minus scope-baseline.txt). Do not BLOCK or `
+ `write BLOCKED_NO_COMMIT.md solely because unrelated sibling files are dirty.`;

return `${[
"import { workflow } from '@agent-relay/sdk/workflows';",
Expand Down Expand Up @@ -370,6 +384,27 @@ function childWorkflowSource(child: ChildWorkflowPlan, spec: NormalizedWorkflowS
`mkdir -p ${shellQuote(artifactsDir)}`,
`printf '%s\\n' ${shellQuote(spec.description)} > ${shellQuote(`${artifactsDir}/normalized-spec.txt`)}`,
`printf '%s\\n' ${shellQuote(targetScope)} > ${shellQuote(`${artifactsDir}/target-files.txt`)}`,
// Snapshot the worktree's dirty set BEFORE this child touches anything.
// The master executor runs every child in the SAME checkout, so by the
// time a later child starts, earlier siblings' changes are already
// dirty here. Files listed in scope-baseline.txt are pre-existing
// sibling/parent state this child does not own — the scope gate must
// judge this child only on the delta it introduces, never on this
// baseline. Without it, every child after the first false-blocks on
// sibling contamination.
//
// Scope/limitation: this fully fixes the dominant failure — sequential
// accumulation, where siblings completed before this child started
// (the observed update-last-week stall: 50 dirty entries, all from
// already-finished siblings). It does NOT fully eliminate the race for
// siblings running CONCURRENTLY within the same wave (master is
// .maxConcurrency(4)): a file a sibling dirties AFTER this snapshot is
// not in the baseline and could still be misattributed. A snapshot
// can't close that window; the durable fix is per-child git-worktree
// isolation, deferred as a separate, larger change (see PR
// "Alternative considered"). Baseline subtraction is the minimal
// correct fix for the failure that actually occurs in practice.
`git status --porcelain > ${shellQuote(`${artifactsDir}/scope-baseline.txt`)} 2>/dev/null || : > ${shellQuote(`${artifactsDir}/scope-baseline.txt`)}`,
Comment on lines +387 to +407
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Baseline snapshot is not stable when sibling children run concurrently.

Line 395 captures scope-baseline.txt only once at child start, but Line 206 allows concurrent child runs in the same checkout. Sibling dirt introduced after this snapshot is still outside the baseline and can be misclassified as this child’s scope delta, which reintroduces false BLOCKED outcomes.

Consider either serializing child runs in shared worktree mode or moving to per-child isolated worktrees before relying on baseline subtraction for gating.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/product/generation/master-workflow-renderer.ts` around lines 387 - 395,
The baseline snapshot command writing to scope-baseline.txt (uses artifactsDir
and the git status command) is taken while sibling children may still mutate the
shared checkout (master executor runs concurrent children), so later sibling
changes can be misattributed to the wrong child; fix by either (A) serializing
child execution in the master workflow renderer so the snapshot is taken with no
concurrent siblings (turn off/guard the concurrency path around the child-run
dispatch referenced near the concurrent-run logic) or (B) switch to per-child
isolated worktrees before taking the baseline (create a per-child git
worktree/checkout or temporary clone for the child, then run the `git status
--porcelain > ${artifactsDir}/scope-baseline.txt` inside that isolated
worktree), and ensure all subsequent scope-delta calculations use that isolated
workspace; update the code paths that reference artifactsDir/scope-baseline.txt
and the child-dispatch/concurrency logic accordingly.

'echo RICKY_CHILD_CONTEXT_READY',
].join('\n'))},`,
' captureOutput: true,',
Expand All @@ -381,6 +416,7 @@ function childWorkflowSource(child: ChildWorkflowPlan, spec: NormalizedWorkflowS
` task: ${templateLiteral([
`Plan this child slice: ${child.title}.`,
`Target files: ${targetScope}.`,
`This child runs in a SHARED worktree alongside sibling child slices. ${artifactsDir}/scope-baseline.txt lists files already dirty before this child started — those belong to sibling/parent slices, are out of this child's ownership, and must NOT be reverted or counted as scope violations. Define this child's changed-file scope as the delta it introduces on top of that baseline.`,
'State non-goals, ownership, validation, source changes, code changes, tests, git diff evidence, and PR URL or explicit result status.',
`Write ${artifactsDir}/lead-plan.md ending with RICKY_CHILD_LEAD_PLAN_READY.`,
].join('\n'))},`,
Expand All @@ -392,6 +428,7 @@ function childWorkflowSource(child: ChildWorkflowPlan, spec: NormalizedWorkflowS
` task: ${templateLiteral([
`Implement the bounded child slice: ${child.title}.`,
`Edit only declared targets when possible: ${targetScope}.`,
`Shared worktree: files in ${artifactsDir}/scope-baseline.txt are pre-existing sibling/parent changes. Do not edit, revert, stage, or clean them — leave them exactly as found.`,
'Produce deterministic evidence, tests, and a concise result or PR URL when applicable.',
].join('\n'))},`,
' verification: { type: "exit_code", value: "0" },',
Expand All @@ -408,6 +445,7 @@ function childWorkflowSource(child: ChildWorkflowPlan, spec: NormalizedWorkflowS
' dependsOn: ["initial-soft-validation"],',
` task: ${templateLiteral([
'Fresh-eyes review this child slice against the lead plan, actual files, diff, and validation output.',
sharedWorktreeScopeRule,
'Use verdict: FINDINGS | NO_ISSUES_FOUND | BLOCKED and include fix_required plus test_required for each finding.',
`Write ${artifactsDir}/review-claude.md ending with RICKY_CHILD_CLAUDE_REVIEW_READY.`,
].join('\n'))},`,
Expand All @@ -420,6 +458,7 @@ function childWorkflowSource(child: ChildWorkflowPlan, spec: NormalizedWorkflowS
'Run the Claude 80-to-100 review-fix loop for this child slice.',
`Read ${artifactsDir}/review-claude.md and the initial validation output.`,
'Fix every valid finding within the declared scope; add or update tests/proofs for testable findings.',
sharedWorktreeScopeRule,
`If blocked, write ${artifactsDir}/BLOCKED_NO_COMMIT.md with exact evidence.`,
`Write ${artifactsDir}/fix-loop-report.md ending with RICKY_CHILD_FIX_LOOP_READY.`,
].join('\n'))},`,
Expand All @@ -437,6 +476,7 @@ function childWorkflowSource(child: ChildWorkflowPlan, spec: NormalizedWorkflowS
' dependsOn: ["post-fix-validation"],',
` task: ${templateLiteral([
'Re-review the fixed child state from scratch.',
sharedWorktreeScopeRule,
'Use verdict: FINDINGS | NO_ISSUES_FOUND | BLOCKED and include fix_required plus test_required for each finding.',
`Write ${artifactsDir}/final-review-claude.md ending with RICKY_CHILD_CLAUDE_FINAL_REVIEW_READY.`,
].join('\n'))},`,
Expand All @@ -448,6 +488,7 @@ function childWorkflowSource(child: ChildWorkflowPlan, spec: NormalizedWorkflowS
` task: ${templateLiteral([
`Read ${artifactsDir}/final-review-claude.md.`,
'If it says NO_ISSUES_FOUND, record that no fix was needed. Otherwise fix every valid finding and harden tests/proofs.',
sharedWorktreeScopeRule,
`If blocked, write ${artifactsDir}/BLOCKED_NO_COMMIT.md with exact evidence.`,
`Write ${artifactsDir}/claude-final-fix.md ending with RICKY_CHILD_CLAUDE_FINAL_FIX_READY.`,
].join('\n'))},`,
Expand All @@ -458,6 +499,7 @@ function childWorkflowSource(child: ChildWorkflowPlan, spec: NormalizedWorkflowS
' dependsOn: ["final-fix-claude"],',
` task: ${templateLiteral([
'Second-pass fresh-eyes review after the Claude loop. Read the actual files, diff, review artifacts, and validation evidence.',
sharedWorktreeScopeRule,
'Use verdict: FINDINGS | NO_ISSUES_FOUND | BLOCKED and include fix_required plus test_required for each finding.',
`Write ${artifactsDir}/review-codex.md ending with RICKY_CHILD_CODEX_REVIEW_READY.`,
].join('\n'))},`,
Expand All @@ -469,6 +511,7 @@ function childWorkflowSource(child: ChildWorkflowPlan, spec: NormalizedWorkflowS
` task: ${templateLiteral([
`Read ${artifactsDir}/review-codex.md.`,
'Fix every valid Codex finding and add or update tests/proofs for testable findings.',
sharedWorktreeScopeRule,
`If blocked, write ${artifactsDir}/BLOCKED_NO_COMMIT.md with exact evidence.`,
`Write ${artifactsDir}/codex-fix-loop-report.md ending with RICKY_CHILD_CODEX_FIX_LOOP_READY.`,
].join('\n'))},`,
Expand All @@ -486,6 +529,7 @@ function childWorkflowSource(child: ChildWorkflowPlan, spec: NormalizedWorkflowS
' dependsOn: ["post-codex-fix-validation"],',
` task: ${templateLiteral([
'Final Codex fresh-eyes review after Codex fixes.',
sharedWorktreeScopeRule,
'Use verdict: FINDINGS | NO_ISSUES_FOUND | BLOCKED and include fix_required plus test_required for each finding.',
`Write ${artifactsDir}/final-review-codex.md ending with RICKY_CHILD_CODEX_FINAL_REVIEW_READY.`,
].join('\n'))},`,
Expand All @@ -497,6 +541,7 @@ function childWorkflowSource(child: ChildWorkflowPlan, spec: NormalizedWorkflowS
` task: ${templateLiteral([
`Read ${artifactsDir}/final-review-codex.md.`,
'If it says NO_ISSUES_FOUND, record that no fix was needed. Otherwise fix every valid finding and harden tests/proofs.',
sharedWorktreeScopeRule,
`If blocked, write ${artifactsDir}/BLOCKED_NO_COMMIT.md with exact evidence.`,
`Write ${artifactsDir}/codex-final-fix.md ending with RICKY_CHILD_CODEX_FINAL_FIX_READY.`,
].join('\n'))},`,
Expand Down
102 changes: 102 additions & 0 deletions src/product/generation/pipeline.test.ts
Original file line number Diff line number Diff line change
@@ -1,8 +1,59 @@
import { describe, expect, it } from 'vitest';
import ts from 'typescript';

import { intake } from '../spec-intake/index.js';
import type { NormalizedWorkflowSpec, RawSpecPayload } from '../spec-intake/types.js';
import { generate, validateGeneratedArtifact } from './pipeline.js';
import { childWorkflowSource } from './master-workflow-renderer.js';

interface StepConfig {
command?: string;
task?: string;
}

/**
* Parse a generated child workflow's TypeScript source and return each
* `.step("<id>", { ... })` config's `command` / `task` string values keyed
* by step id. AST-based (per AGENTS.md "Source-Text Analysis: Use
* Grammar-Aware Parsers, Not Regex") so assertions check the contract is
* attached to the right step rather than that text appears anywhere in the
* rendered blob.
*/
function extractStepConfigs(source: string): Map<string, StepConfig> {
const sourceFile = ts.createSourceFile('child.ts', source, ts.ScriptTarget.Latest, false, ts.ScriptKind.TS);
const steps = new Map<string, StepConfig>();
const literalText = (node: ts.Expression): string | undefined => {
if (ts.isStringLiteralLike(node)) return node.text;
if (ts.isNoSubstitutionTemplateLiteral(node)) return node.text;
if (ts.isTemplateExpression(node)) {
return [node.head.text, ...node.templateSpans.map((s) => s.literal.text)].join('');
}
return undefined;
};
const visit = (node: ts.Node): void => {
if (
ts.isCallExpression(node)
&& ts.isPropertyAccessExpression(node.expression)
&& node.expression.name.text === 'step'
&& node.arguments.length >= 2
&& ts.isStringLiteralLike(node.arguments[0])
&& ts.isObjectLiteralExpression(node.arguments[1])
) {
const id = node.arguments[0].text;
const cfg: StepConfig = {};
for (const prop of node.arguments[1].properties) {
if (!ts.isPropertyAssignment(prop) || !prop.name) continue;
const key = ts.isIdentifier(prop.name) || ts.isStringLiteralLike(prop.name) ? prop.name.text : undefined;
if (key === 'command') cfg.command = literalText(prop.initializer);
if (key === 'task') cfg.task = literalText(prop.initializer);
}
steps.set(id, cfg);
}
ts.forEachChild(node, visit);
};
visit(sourceFile);
return steps;
}

const RECEIVED_AT = '2026-04-26T00:00:00.000Z';

Expand Down Expand Up @@ -61,6 +112,57 @@ describe('workflow generation pipeline', () => {
expect(rendered.content).toContain('.run({ cwd: process.cwd() })');
});

// Regression: the master executor runs every child slice in the SAME
// checkout (.run({ cwd: process.cwd() })), so later children see earlier
// siblings' dirty files. Reviewers were assigning BLOCKED and fix-loops
// were writing BLOCKED_NO_COMMIT.md purely because `git status` showed
// out-of-scope sibling files — a false block that stalled the whole
// master plan for hours. Each child must snapshot the pre-existing dirty
// set and judge scope only on its own delta.
it('makes master child slices baseline-aware so shared-worktree sibling dirt is not a false BLOCK', () => {
const fixtureSpec = spec({
description:
'Implement nested runner, runtime policy, telemetry, evals, and insights as smaller workflows run by a master executor.',
constraints: ['Use independent child workflows with deterministic 80-to-100 validation.'],
acceptanceGates: ['npm test'],
});
const result = generate({ spec: fixtureSpec, artifactPath: 'workflows/generated/runtime-master.ts' });
expect(result.masterExecutionPlan).toBeDefined();

const child = result.masterExecutionPlan!.children[0];
const childSrc = childWorkflowSource(child, fixtureSpec);

// Parse the generated child workflow with the TypeScript AST and extract
// each `.step("<id>", { ... })` config object so assertions verify the
// contract is attached to the right steps — not that a literal string
// appears anywhere in the blob (AGENTS.md: parser-based, not substring).
const stepConfigs = extractStepConfigs(childSrc);

const prepare = stepConfigs.get('prepare-context');
expect(prepare, 'prepare-context step exists').toBeDefined();
expect(prepare!.command, 'prepare-context snapshots the pre-child dirty set')
.toMatch(/git status --porcelain >\s*'[^']*\/scope-baseline\.txt'/);

// Every review/fix stage that assigns BLOCKED or writes
// BLOCKED_NO_COMMIT.md must carry the shared-worktree scope rule.
const scopedStages = [
'review-claude', 'fix-loop',
'final-review-claude', 'final-fix-claude',
'review-codex', 'fix-loop-codex',
'final-review-codex', 'final-fix-codex',
];
for (const stage of scopedStages) {
const cfg = stepConfigs.get(stage);
expect(cfg, `${stage} step exists`).toBeDefined();
expect(cfg!.task, `${stage} task carries the shared-worktree scope rule`)
.toContain('Shared-worktree scope rule');
expect(cfg!.task, `${stage} forbids blocking on sibling dirt`)
.toContain('Do not BLOCK or write BLOCKED_NO_COMMIT.md solely because unrelated sibling files are dirty');
expect(cfg!.task, `${stage} defines scope as the delta over the baseline`)
.toContain("current 'git status --porcelain' minus scope-baseline.txt");
}
});

// Regression: master-rendered final-hard-validation used to hardcode
// `npm test`, which walks the entire repo's test suite from the cwd.
// For monorepo specs that scope work to a few `packages/<pkg>/` files,
Expand Down
Loading