Merged
10 changes: 5 additions & 5 deletions prpm.json
@@ -1,6 +1,6 @@
{
"name": "agent-workforce-skills",
- "version": "1.0.5",
+ "version": "1.0.6",
"description": "Skills for multi-agent coordination - swarm patterns, workflow building, relay usage, and headless orchestration",
"author": "khaliqgant",
"organization": "agent-relay",
@@ -10,7 +10,7 @@
"packages": [
{
"name": "choosing-swarm-patterns",
- "version": "1.1.2",
+ "version": "1.1.3",
"description": "Use when coordinating multiple AI agents with Agent Relay's workflow engine and need to pick the right orchestration pattern - covers the 10 core patterns (fan-out, pipeline, hub-spoke, consensus, mesh, handoff, cascade, dag, debate, hierarchical) plus 14 specialized ones, with decision framework and accurate SDK/YAML examples.",
"format": "claude",
"subtype": "skill",
@@ -28,7 +28,7 @@
},
{
"name": "writing-agent-relay-workflows",
- "version": "1.6.6",
+ "version": "1.6.7",
"description": "Use when building multi-agent workflows with the relay broker-sdk - covers conversation-shape vs pipeline-shape coordination, WorkflowBuilder API, DAG step dependencies, agent definitions, output chaining via {{steps.X.output}}, verification gates, evidence-based completion, channels, swarm patterns, chat-native coordination recipes (Q/A, broadcast-ack, peer review, standup, hand-off), error handling, event listeners, step sizing, lead+workers team pattern, and parallel wave planning",
"format": "claude",
"subtype": "skill",
@@ -107,7 +107,7 @@
},
{
"name": "relay-80-100-workflow",
- "version": "1.0.4",
+ "version": "1.0.5",
"description": "Use when writing agent-relay workflows that must fully validate features end-to-end before merging - covers the 80-to-100 pattern with PGlite in-memory Postgres testing, mock sandbox patterns, test-fix-rerun loops, verify gates, and full lifecycle from implementation through passing tests to commit",
"format": "claude",
"subtype": "skill",
@@ -166,7 +166,7 @@
"id": "agent-relay-starter",
"name": "Agent Relay Starter",
"description": "Essential skills for building multi-agent systems with Agent Relay - swarm pattern selection, workflow authoring, and trail debugging",
- "version": "1.0.4",
+ "version": "1.0.5",
"category": "development",
"tags": [
"multi-agent",
23 changes: 23 additions & 0 deletions skills/choosing-swarm-patterns/SKILL.md
@@ -107,6 +107,29 @@ Topology is still resolved per-pattern once selected; the "Triggering roles" col
| `competitive` | — (declared explicitly) | independent parallel implementations + judge |
| `review-loop` | `implement*` + 2+ `reviewer*` | implementer ↔ reviewers |

## Structured Squad Review Loop

For serious implementation work, especially workflow generation or product-contract changes, prefer a composite **squad-review-loop** recipe over a plain single implementer plus final reviewer. This is a workflow authoring recipe built from existing patterns, not a separate SDK enum unless the local runner has added one.

Use this when the fastest reliable path is small teams of 2-3 agents working in parallel with live feedback:

1. Split the work into bounded implementation squads. Each squad owns a non-overlapping file or subsystem scope.
2. Give each squad an implementer plus a shadow/review partner. The shadow follows the implementer in real time, checks alignment with the spec, and posts concise feedback before the work drifts.
3. Require the implementer to self-reflect before external review: compare the final diff against the spec, AGENTS.md / CLAUDE.md, recent local conventions, tests, and declared non-goals.
4. Run an independent self-review/fresh-eyes agent that reads the actual files and recent repo context, not just the chat transcript.
5. Send that review back to the implementer for one repair round.
6. After squads converge, have a final two-agent review team, usually one Claude reviewer and one Codex reviewer, review the result independently. They then compare notes, merge findings, and produce one final verdict.
7. Spawn fresh fix agents for final-review findings. Those fix agents self-reflect, then the final reviewers re-check the post-fix state until the spec is fully satisfied or a blocker is documented.

Pattern selection for this recipe:
- Use `supervisor` or `hub-spoke` when a lead needs to coordinate live squads.
- Use `review-loop` when the main risk is code quality and feedback iteration.
- Use `reflection` when critic feedback should loop directly back to producers.
- Use `verifier` when completion evidence matters more than design debate.
- Use `competitive` only when independent alternative implementations are useful; otherwise split by ownership scope.
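
The selection rules above can be sketched as a small lookup. This is a minimal illustration, not part of the Agent Relay SDK: the `RecipeNeeds` fields and the `suggestPatterns` helper are hypothetical names, and only the pattern strings come from the list above. It returns every pattern whose condition holds rather than ranking them, since the text does not define a priority order.

```typescript
// Hypothetical helper: field and function names are illustrative assumptions.
// Only the pattern names come from the recipe text.
type RecipePattern =
  | "supervisor"
  | "hub-spoke"
  | "review-loop"
  | "reflection"
  | "verifier"
  | "competitive";

interface RecipeNeeds {
  leadCoordinatesLiveSquads: boolean;       // a lead must coordinate live squads
  mainRiskIsCodeQuality: boolean;           // feedback iteration is the main concern
  criticFeedsBackToProducers: boolean;      // critic feedback loops directly to producers
  completionEvidenceMatters: boolean;       // evidence matters more than design debate
  wantsAlternativeImplementations: boolean; // independent alternatives are useful
}

// Returns all candidate patterns whose condition is met, in list order.
function suggestPatterns(needs: RecipeNeeds): RecipePattern[] {
  const candidates: RecipePattern[] = [];
  if (needs.leadCoordinatesLiveSquads) candidates.push("supervisor", "hub-spoke");
  if (needs.mainRiskIsCodeQuality) candidates.push("review-loop");
  if (needs.criticFeedsBackToProducers) candidates.push("reflection");
  if (needs.completionEvidenceMatters) candidates.push("verifier");
  // "competitive" is opt-in only; otherwise split by ownership scope.
  if (needs.wantsAlternativeImplementations) candidates.push("competitive");
  return candidates;
}

const qualityFocused: RecipePattern[] = suggestPatterns({
  leadCoordinatesLiveSquads: false,
  mainRiskIsCodeQuality: true,
  criticFeedsBackToProducers: false,
  completionEvidenceMatters: true,
  wantsAlternativeImplementations: false,
});
// qualityFocused is ["review-loop", "verifier"]
```

An empty result means none of the listed conditions apply, in which case the lighter single-implementer shape may be enough.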

Keep squads small. Two or three agents per squad is usually the useful limit: implementer, shadow/reviewer, and optionally test/validation owner. More agents belong in separate squads or in the final review team.
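
A squad plan can be checked before any agent is spawned. The sketch below assumes a hypothetical plain-data `Squad` shape (it is not an SDK type) and flags squads outside the 2-3 agent range, squads missing an implementer or shadow reviewer, and paths claimed by more than one squad. It compares paths by exact string match only, so overlapping directory prefixes would need an extra check.

```typescript
// Hypothetical plan shape for illustration; not an Agent Relay SDK type.
type SquadRole = "implementer" | "shadow-reviewer" | "validation-owner";

interface SquadMember {
  name: string;
  role: SquadRole;
}

interface Squad {
  name: string;
  members: SquadMember[];
  ownedPaths: string[]; // files or subsystems this squad owns
}

// Returns a list of human-readable problems; an empty list means the plan passes.
function validateSquads(squads: Squad[]): string[] {
  const problems: string[] = [];
  const ownerByPath = new Map<string, string>();

  for (const squad of squads) {
    const size = squad.members.length;
    if (size < 2 || size > 3) {
      problems.push(`${squad.name}: expected 2-3 agents, got ${size}`);
    }
    if (!squad.members.some((m) => m.role === "implementer")) {
      problems.push(`${squad.name}: missing implementer`);
    }
    if (!squad.members.some((m) => m.role === "shadow-reviewer")) {
      problems.push(`${squad.name}: missing shadow reviewer`);
    }
    for (const path of squad.ownedPaths) {
      const previousOwner = ownerByPath.get(path);
      if (previousOwner !== undefined && previousOwner !== squad.name) {
        problems.push(`${path}: owned by both ${previousOwner} and ${squad.name}`);
      } else {
        ownerByPath.set(path, squad.name);
      }
    }
  }
  return problems;
}
```

Running a check like this in a deterministic step keeps ownership conflicts out of the parallel phase, where they are more expensive to untangle.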

## Pattern Details

All examples below use real API shapes (`WorkflowBuilder` / YAML), verified against `packages/sdk/src/workflows/builder.ts` and `packages/sdk/src/workflows/types.ts`.
15 changes: 15 additions & 0 deletions skills/relay-80-100-workflow/SKILL.md
@@ -53,6 +53,21 @@ For large rollouts, treat implementation agents as advisory producers and put a

This shape prevents "agent transport failed" from masquerading as "the product failed." The product still has to pass the same gates; the difference is that the workflow can reach the gates and repair them.

## Squad Review Before Final Acceptance

For high-stakes implementation workflows, validation should include a human-like review structure, not only command gates. Use small implementation squads and make review state durable:

1. Split independent scopes across squads of 2-3 agents. Each squad has an implementer, a shadow reviewer, and optionally a validation/test owner.
2. The shadow reviewer follows the implementer while work is happening and flags spec drift early.
3. Before external review, the implementer writes a self-reflection artifact under `.workflow-artifacts/<task>/` covering spec coverage, changed files, tests/proofs, repo-rule alignment, and known risks.
4. A fresh self-review agent reads the actual files, AGENTS.md / CLAUDE.md, recent related work, and local conventions. It writes findings to disk.
5. The implementer repairs valid findings, then the deterministic gates rerun with their output captured.
6. After all squads converge, run two independent final reviewers, typically Claude and Codex. They compare notes and write one merged final review artifact.
7. Spawn fresh fix agents for final-review findings. Those agents self-reflect, then the final reviewers re-check the post-fix state.
8. Commit or PR creation is allowed only after final deterministic acceptance and post-fix review are green. Otherwise write a `BLOCKED_NO_COMMIT` artifact with exact evidence.

This keeps "100%" tied to both executable evidence and independent review over the final state.
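
The final acceptance rule in step 8 can be expressed as a single decision function. This is a sketch under stated assumptions: the `AcceptanceState` shape, the `decideAcceptance` name, and the line-oriented artifact body are all hypothetical; only the `BLOCKED_NO_COMMIT` artifact name and the two required green conditions come from the text above.

```typescript
// Hypothetical types for illustration; the artifact body format is an assumption.
interface AcceptanceState {
  finalGatesGreen: boolean;    // final deterministic acceptance passed
  postFixReviewGreen: boolean; // final reviewers approved the post-fix state
  evidence: string[];          // exact evidence lines, e.g. captured gate output
}

type AcceptanceDecision =
  | { action: "commit" }
  | { action: "blocked"; artifactName: "BLOCKED_NO_COMMIT"; body: string };

// Commit only when both conditions are green; otherwise describe the blocker.
function decideAcceptance(state: AcceptanceState): AcceptanceDecision {
  if (state.finalGatesGreen && state.postFixReviewGreen) {
    return { action: "commit" };
  }
  const reasons: string[] = [];
  if (!state.finalGatesGreen) {
    reasons.push("final deterministic acceptance is not green");
  }
  if (!state.postFixReviewGreen) {
    reasons.push("post-fix review is not green");
  }
  const lines = reasons
    .map((r) => `reason: ${r}`)
    .concat(state.evidence.map((e) => `evidence: ${e}`));
  return { action: "blocked", artifactName: "BLOCKED_NO_COMMIT", body: lines.join("\n") };
}
```

Keeping the decision in one pure function means the commit step and the blocker artifact can never disagree about why a run stopped.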

## The Test-Fix-Rerun Pattern

Every testable feature in a workflow should follow this four-step pattern:
26 changes: 25 additions & 1 deletion skills/writing-agent-relay-workflows/SKILL.md
@@ -198,6 +198,30 @@ See [Common Patterns → Interactive Team](#interactive-team-lead--workers-on-sh

---

## Default For Serious Implementation: Shadowed Squad Review Loop

When a workflow is expected to produce production-quality code, generated workflows, runtime behavior, or shared execution contracts, use a structured squad-review-loop unless the task is clearly small enough for the lighter shape.

The default unit is a **2-3 agent squad**:
- implementer: owns a tight file/subsystem scope and writes the change
- shadow reviewer: follows the implementer in real time, checks drift against the spec, and leaves feedback early
- optional validation owner: owns tests, dry-run proof, or fixture coverage when that is a separate deliverable

Encode the loop explicitly:

1. Deterministically read the spec, AGENTS.md / CLAUDE.md, workflow standards, recent local docs, and declared file targets.
2. The lead splits the work into bounded squads with non-overlapping ownership.
3. Squads run in parallel. The shadow reads actual files and channel updates, then posts feedback while the implementer is still active.
4. Each implementer writes a self-reflection artifact before external review. It must answer: what changed, what spec items are satisfied, what tests/proofs ran, what risks remain, and how the work follows repo rules.
5. A fresh self-review agent reads the post-implementation files, recent local conventions, AGENTS.md / CLAUDE.md, and related rules. It should not rely on the implementer's summary.
6. The implementer gets that feedback and performs a repair pass.
7. Deterministic gates run with captured output. Red output goes to a repair owner, then the same gate reruns.
8. A final review team of two agents, normally Claude and Codex, reviews independently. They then compare notes and write one merged final review artifact.
9. Fresh fix agents address final-review findings, self-reflect, and hand back to the final reviewers.
10. Final signoff only happens after post-fix review and final deterministic gates prove the spec is complete, or a blocker artifact explains why it cannot be completed.
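
The loop above is easier to audit when its ordering is written down as explicit dependencies. The sketch below models it as plain data and groups steps into parallel waves; it does not use the `WorkflowBuilder` API. The step ids, the two-squad split, and the `planWaves` helper are hypothetical, chosen only to mirror the ten steps.

```typescript
// Plain-data model of the loop; step ids and the two-squad split are assumptions.
interface LoopStep {
  id: string;
  dependsOn: string[];
}

const squadReviewLoop: LoopStep[] = [
  { id: "read-context", dependsOn: [] },
  { id: "split-squads", dependsOn: ["read-context"] },
  { id: "squad-a-implement", dependsOn: ["split-squads"] }, // shadow reviews live
  { id: "squad-b-implement", dependsOn: ["split-squads"] },
  { id: "squad-a-self-reflect", dependsOn: ["squad-a-implement"] },
  { id: "squad-b-self-reflect", dependsOn: ["squad-b-implement"] },
  { id: "squad-a-fresh-review", dependsOn: ["squad-a-self-reflect"] },
  { id: "squad-b-fresh-review", dependsOn: ["squad-b-self-reflect"] },
  { id: "squad-a-repair", dependsOn: ["squad-a-fresh-review"] },
  { id: "squad-b-repair", dependsOn: ["squad-b-fresh-review"] },
  { id: "gates", dependsOn: ["squad-a-repair", "squad-b-repair"] },
  { id: "final-review-claude", dependsOn: ["gates"] },
  { id: "final-review-codex", dependsOn: ["gates"] },
  { id: "merge-final-review", dependsOn: ["final-review-claude", "final-review-codex"] },
  { id: "fix-findings", dependsOn: ["merge-final-review"] },
  { id: "post-fix-review", dependsOn: ["fix-findings"] },
  { id: "final-gates", dependsOn: ["fix-findings"] },
  { id: "signoff", dependsOn: ["post-fix-review", "final-gates"] },
];

// Groups steps into waves: every step in a wave depends only on earlier waves.
function planWaves(steps: LoopStep[]): string[][] {
  const done = new Set<string>();
  const waves: string[][] = [];
  let remaining = steps.slice();
  while (remaining.length > 0) {
    const ready = remaining.filter((s) => s.dependsOn.every((d) => done.has(d)));
    if (ready.length === 0) {
      const stuck = remaining.map((s) => s.id).join(", ");
      throw new Error(`cycle or unknown dependency among: ${stuck}`);
    }
    waves.push(ready.map((s) => s.id));
    for (const s of ready) done.add(s.id);
    remaining = remaining.filter((s) => !done.has(s.id));
  }
  return waves;
}

const waves = planWaves(squadReviewLoop);
// waves[2] is ["squad-a-implement", "squad-b-implement"]
// waves[7] is ["final-review-claude", "final-review-codex"]
```

Each review stage stays a separate node here, so a planner cannot silently merge self-reflection, fresh review, and final dual review into one step.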

For small doc/spec workflows, a lead + author + distinct reviewer is enough. For serious implementation workflows, do not collapse implementer self-reflection, shadow review, independent review, final dual review, and repair into one vague "review" step.
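
The self-reflection artifact from step 4 can be given a fixed shape so that a deterministic gate rejects incomplete ones. This is a minimal sketch: the `SelfReflection` field names and the `missingAnswers` helper are hypothetical, derived from the five questions the step lists. An empty `remainingRisks` list is treated as a valid answer here, which is an assumption.

```typescript
// Hypothetical artifact shape; field names mirror the five required answers.
interface SelfReflection {
  whatChanged: string;
  specItemsSatisfied: string[];
  testsOrProofsRun: string[];
  remainingRisks: string[]; // may be empty if the implementer sees none (assumption)
  repoRuleAlignment: string;
}

// Returns the names of unanswered questions; empty means the artifact is complete.
function missingAnswers(reflection: SelfReflection): string[] {
  const missing: string[] = [];
  if (reflection.whatChanged.trim() === "") missing.push("whatChanged");
  if (reflection.specItemsSatisfied.length === 0) missing.push("specItemsSatisfied");
  if (reflection.testsOrProofsRun.length === 0) missing.push("testsOrProofsRun");
  if (reflection.repoRuleAlignment.trim() === "") missing.push("repoRuleAlignment");
  return missing;
}
```

A gate built on a check like this can block the fresh self-review agent from starting until the implementer has answered every question.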

**Critical TypeScript rules:**
1. Check the project's `package.json` for `"type": "module"` — if ESM, use `import` and top-level `await`. If CJS, use `require()` and wrap in `async function main()`.
2. `agent-relay run <file.ts>` executes the file as a standalone subprocess — it does NOT inspect exports. The file MUST call `.run()`.
@@ -1559,7 +1583,7 @@ When you set `.pattern('supervisor')` (or `hub-spoke`, `fan-out`), the runner au
| Agents receiving noisy cross-channel messages during focused work | Use `relay.mute({ agent, channel })` to silence non-primary channels without leaving them |
| Hardcoding all channels at spawn time | Use `agent.subscribe()` / `agent.unsubscribe()` for dynamic channel membership post-spawn |
| Using `preset: 'worker'` for Codex in *interactive team* patterns when coordination is needed | Codex interactive mode works fine with PTY channel injection. Drop the preset for interactive team patterns (keep it for one-shot DAG workers where clean stdout matters) |
- | Separate reviewer agent from lead in interactive team | Merge lead + reviewer into one interactive Claude agent — reviews between rounds, fewer agents |
+ | Unnecessary separate reviewer agent in a small interactive team | For low-risk work, merge lead + reviewer into one interactive Claude agent; for serious implementation or Ricky-style workflows, keep reviewer/shadow/final review roles distinct |
| Not printing PR URL after `createGitHubStep({ action: 'createPR' })` | Capture `html_url` with `output: { mode: 'data', format: 'json', path: 'html_url' }` and echo or write it in a final deterministic step |
| Workflow ending without worktree + PR for cross-repo changes | Add `setup-worktree` at start and `push-and-pr` + `cleanup-worktree` at end |
