agentic-atdd

ATDD pipeline for Claude Code and Codex. You start with a user story, you end with a PR sitting on main waiting for your review. Everything in between runs without you.

The skills are small. You can read any one of them in under a minute, fork it, swap it out. No monolith, no framework lock-in.

What it does

You give the pipeline a story (or import a GitHub issue you already wrote). It:

Interviews you on the business rules until they're concrete enough to test.
Generates Gherkin scenarios tagged by test level (@use-case, @e2e, @ui) and by branch (@nominal, @violation, @auth, @technical, @limit).
Stops. You read the scenarios. You say OK or REGENERATE <reason>.
Pushes the spec to GitHub Issues — one parent, one user story, one sub-issue per scenario. Idempotent, so you can rerun without duplicates.
Creates an integration branch (atdd/<slug>/integration) off main.
For each scenario sub-issue: writes the failing test, runs review-fidelity, auto-corrects up to twice, then writes the minimal implementation, runs review-architecture and review-intent in parallel, auto-corrects up to twice, opens a draft PR targeting the integration branch.
Marks the PR ready, watches CI, watches bot reviewers (CodeRabbit, codex, github-actions, whatever you've wired). When the bots go quiet, if there's actionable feedback it invokes apply-pr-feedback, pushes, and loops. When CI is green and nothing is outstanding, it squash-merges into the integration branch.
Repeats for every scenario.
Opens one final PR integration → main and stops there. That PR is yours to merge.

Two human gates: the spec, and the final PR. The rest is hands-off.

Install

Claude Code (recommended)

/plugin marketplace add mpiton/agentic-atdd
/plugin install atdd-pipeline@agentic-atdd

The plugin's skills and slash commands get namespaced under atdd-pipeline:*. No clone, no symlink, no shell script. Updates flow through /plugin update.

Manual install (Codex CLI, or Claude Code without the marketplace)

git clone https://github.com/mpiton/agentic-atdd ~/.claude/plugins/atdd-pipeline
~/.claude/plugins/atdd-pipeline/scripts/install.sh

The script symlinks every skill folder into ~/.claude/skills/ and (when present) ~/.codex/skills/. Same SKILL.md files on both harnesses, single source of truth. Restart the CLI to refresh the skill index.

Per-repo setup is a one-time interview:

/setup-atdd-pipeline

Writes .atdd-pipeline.json at the repo root. Auto-merge is on by default, the idle window for bot review is 5 minutes, apply-pr-feedback is capped at 3 iterations. Tunable.

Skills

Spec phase

impact-map — capture a story, an actor, a goal, business rules numbered R-NN.
from-issue — import an existing GitHub issue. The issue you already wrote stays; the pipeline back-fills context.md from its body.
spec-generate — Gherkin scenarios from the context, one feature file per rule.
spec-review — read-only audit of the scenarios across four axes (branches, coherence, gaps, triangulation).

Sync phase

to-issues-atdd — idempotent gh CLI sync of the scenario hierarchy. Also provisions the integration branch.

Execute phase

red-cycle — generate the failing test, run review-fidelity, auto-correct ×2 then escalate.
green-cycle — minimal implementation, review-architecture and review-intent in parallel, auto-correct ×2 then escalate. Opens the draft PR against the integration branch.
review-fidelity — does the test mirror the Gherkin? Structure, semantics, intent.
review-architecture — placement, naming, conventions, domain responsibilities.
review-intent — does the code do only what the test asks? No hidden side effects, no speculative branches.
apply-pr-feedback — fetch every actionable review comment (bot or human) on a PR, apply only the changes asked for, commit and push. Bundled so the plugin has no external skill dependency.
pr-auto-merge — watches CI plus bot reviewers, runs apply-pr-feedback on actionable feedback, squash-merges into the integration branch. Refuses to operate on PRs whose base is main (the final PR is always your call).

Orchestrator

atdd-run — end-to-end driver. Chains every phase, stops at the two human gates.
setup-atdd-pipeline — per-repo configuration.

Slash commands

Command	What it does
`/setup-atdd-pipeline`	One-time per-repo config.
`/impact-map`	Capture a new story.
`/from-issue <N>`	Import an existing GitHub issue.
`/spec-generate`	Gherkin scenarios from the context.
`/spec-review`	Read-only review of the scenarios.
`/to-issues-atdd`	Sync to GitHub Issues + create the integration branch.
`/red <issue>`	Failing test for one scenario.
`/green <issue>`	Minimal implementation + open the draft PR.
`/auto-merge <pr>`	Watch + apply-pr-feedback + squash-merge a sub-PR.
`/apply-pr-feedback [pr]`	Apply every actionable review comment on a PR, commit, push.
`/atdd-run <us-slug>`	Run the whole thing end to end.
`/review-fidelity` · `/review-architecture` · `/review-intent`	Standalone reviewer runs.

How to use it

Read docs/USAGE.md for the three entry paths (existing GitHub issue, fresh feature with a PRD already in hand, greenfield with nothing). The visual version of the same three paths lives at docs/workflows.html — open it in a browser.

Codex portability

Codex CLI auto-discovers skills from ~/.codex/skills/<name>/SKILL.md using the same format Claude Code uses. The installer symlinks each plugin skill into both ~/.claude/skills/ and ~/.codex/skills/, so one edit propagates to both harnesses. scripts/sync-codex.sh is a deprecated alias that delegates to install.sh.

Parallel reviewers (the default on Claude Code via the Task tool) fall back to sequential execution under Codex automatically. You can force sequential anywhere with --sequential on atdd-run.

Example

examples/cart-checkout/ is a full worked artifact set: context.md, two .feature files, a review.md carrying a REGENERATE verdict, and the issues.json you get after sync. Good place to look at what the pipeline produces before you run it on something real.

Reviewer fixtures

tests/fixtures/ holds pass/fail golden cases for each reviewer skill. Edit a reviewer prompt, replay the fixtures, catch regressions. See tests/README.md for the contract.

Design principles

Small composable skills. You should be able to delete any one of them and replace it with your own version in an afternoon.
Two hard human gates. Post-spec and the final PR. Nothing else asks for your input.
Bounded auto-correction. Reviewer loops cap at 2 retries. The auto-merge apply-pr-feedback loop caps at 3. Past the cap, the pipeline drops an escalation comment on the issue or PR and moves on.
Integration branch is mandatory. Sub-PRs target atdd/<slug>/integration, never main. pr-auto-merge refuses to merge PRs whose base is the trunk.
GitHub Issues is the database. The spec, the work breakdown, the escalations — all of it lives there. The pipeline reads its own state from gh rather than from a sidecar file.

License

MIT. See LICENSE.

Inspiration

The pipeline structure follows the multi-agent ATDD pattern: impact mapping, then Gherkin scenarios per business rule, then issues, then a RED/GREEN cycle with parallel reviewers. The skill-shape philosophy (small, hackable, no monolith) comes from mattpocock/skills. Several skills compose well with mattpocock's grill-with-docs, to-prd, and improve-codebase-architecture.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
.claude-plugin		.claude-plugin
.github		.github
docs		docs
examples		examples
scripts		scripts
skills		skills
tests		tests
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

agentic-atdd

What it does

Install

Claude Code (recommended)

Manual install (Codex CLI, or Claude Code without the marketplace)

Skills

Spec phase

Sync phase

Execute phase

Orchestrator

Slash commands

How to use it

Codex portability

Example

Reviewer fixtures

Design principles

License

Inspiration

About

Uh oh!

Releases 1

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

agentic-atdd

What it does

Install

Claude Code (recommended)

Manual install (Codex CLI, or Claude Code without the marketplace)

Skills

Spec phase

Sync phase

Execute phase

Orchestrator

Slash commands

How to use it

Codex portability

Example

Reviewer fixtures

Design principles

License

Inspiration

About

Topics

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages