bdfinst · bdfinst · Jun 1, 2026 · Jun 1, 2026 · Jun 1, 2026 · Jun 1, 2026
diff --git a/.adr-dir b/.adr-dir
@@ -0,0 +1 @@
+docs/adr
diff --git a/docs/adr/0001-record-architecture-decisions.md b/docs/adr/0001-record-architecture-decisions.md
@@ -0,0 +1,19 @@
+# 1. Record architecture decisions
+
+Date: 2026-06-01
+
+## Status
+
+Accepted
+
+## Context
+
+We need to record the architectural decisions made on this project.
+
+## Decision
+
+We will use Architecture Decision Records, as [described by Michael Nygard](http://thinkrelevance.com/blog/2011/11/15/documenting-architecture-decisions).
+
+## Consequences
+
+See Michael Nygard's article, linked above. For a lightweight ADR toolset, see Nat Pryce's [adr-tools](https://github.com/npryce/adr-tools).
diff --git a/...0002-use-sentinel-file-and-argument-shape-heuristic-for-codegraph-nudge-hook.md b/...0002-use-sentinel-file-and-argument-shape-heuristic-for-codegraph-nudge-hook.md
@@ -0,0 +1,65 @@
+# 2. Use sentinel file and argument-shape heuristic for CodeGraph nudge hook
+
+Date: 2026-06-01
+
+## Status
+
+Accepted
+
+Tooling documented by [3. Document ADR tooling workflow as a skill](0003-document-adr-tooling-workflow-as-a-skill.md)
+
+## Context
+
+We added a `PreToolUse` hook (`codegraph-nudge.sh`) that fires on every `Read`, `Grep`, and `Glob` tool call in projects with a CodeGraph index (`.codegraph/` in cwd). When the call looks like exploration, the hook should recommend the indexed `codegraph_*` MCP tools instead. Two implementation questions had non-obvious answers:
+
+1. **How to detect "the agent has already used codegraph this turn"** — so the hook stays silent on the follow-up `Read` that confirms a specific detail (the pattern our own CLAUDE.md guidance recommends). Two designs were considered:
+   - **Parse the transcript at PreToolUse time.** Walk `tail -n N` of the JSONL transcript backward to the last `"type":"user"` marker and grep for `mcp__codegraph__*` in the window. Stateless; no extra hook needed.
+   - **Sentinel file written by a companion PostToolUse hook.** A small `codegraph-turn-mark.sh` fires after any `mcp__codegraph__.*` tool completes and writes `{transcript_id, turn_counter}` to `${CLAUDE_PROJECT_DIR}/.claude/codegraph-turn-state.json`. The nudge hook reads the sentinel and compares.
+
+2. **How to classify a tool call as "multi-file exploration"** — without duplicating the tool's own file enumeration on every invocation. Two designs were considered:
+   - **Expand the glob / walk the directory** to count matches. Accurate but slow (bash `**` needs `shopt -s globstar`, not portable; deep `**` patterns can be expensive), and the cost is paid on every Read/Grep/Glob.
+   - **Argument-shape heuristic.** Read-always-single; `Grep` is multi unless `tool_input.path` is a regular file (`test -f`); `Glob` is multi when `tool_input.pattern` contains a glob metacharacter. Coarse but constant-time and zero filesystem walk.
+
+Plan-review (round 1) flagged both questions as blockers:
+
+- **Design & Architecture Critic** rejected transcript-walk: couples the hook to JSONL schema, depends on PostToolUse flush timing, parses MBs of transcript on every Read/Grep/Glob in long sessions, and contradicts the spec which already named `codegraph-turn-state.json`.
+- **Design & Architecture Critic** rejected filesystem expansion: `shopt -s globstar` portability, glob-expansion cost, and over-coupling to the live filesystem when the hook is meant to be advisory.
+
+Security review later flagged that even with the bounded sentinel approach, the nudge hook's transcript `grep -c "type":"user"` would scan the full transcript on every Read/Grep/Glob — unbounded growth over a long session.
+
+## Decision
+
+Adopt both refinements:
+
+1. **Sentinel file written by a PostToolUse mark hook.** `codegraph-turn-mark.sh` is registered on matcher `mcp__codegraph__.*`. It writes `{transcript_id, turn_counter}` atomically (mktemp + `mv -f`) to `${CLAUDE_PROJECT_DIR}/.claude/codegraph-turn-state.json`. The nudge hook reads that file, recomputes `transcript_id` (basename of `transcript_path` minus extension) and `turn_counter` (`grep -c '"type":"user"'` over `tail -c 1048576` of the transcript), and suppresses the warning when both match.
+2. **Argument-shape heuristic for breadth classification.** `Read` is always silent. `Grep` is multi unless `tool_input.path` exists and is a regular file. `Glob` is multi when `tool_input.pattern` contains `*`, `?`, or `[`. No `find`, no globstar, no expansion.
+3. **Cap the transcript scan at 1 MB** (`tail -c 1048576`) on both reader and writer. The turn counter only needs to detect monotonic change within a session, not reflect the absolute all-time count — a fixed-size tail is sufficient and both sides see the same count.
+
+Fail-open posture throughout: any internal error in either hook exits 0 (the hook is a nudge, never a gate). Careful-mode escalation (`exit 2`) is layered on top via the existing `careful-state.json` mechanism, matching `destructive-guard.sh`.
+
+## Consequences
+
+**Easier:**
+
+- Constant-time hook overhead per call (~36–49 ms median wall-clock on the hot path, dominated by process startup, not hook logic). Scales independently of session length and project size.
+- Adding the same nudge to another plugin (e.g., the writing-team mirror tracked in `bdfinst/agentic-writing-team#36`) reuses the same mechanism without re-deciding.
+- The sentinel is one small JSON file; the schema is documented in `docs/codegraph-nudge.md` and pinned by `tests/hooks/codegraph_nudge.bats`.
+
+**Harder:**
+
+- The mark hook must fire reliably on every `mcp__codegraph__.*` PostToolUse — if Claude Code changes that matcher convention, the suppression silently regresses to "always warn." Fail-open posture absorbs this without breaking workflows.
+- Argument-shape heuristic has known false positives (a `Grep` against a directory that happens to contain a single file still warns). Accepted because the warning is advisory, not blocking — false positives nudge the agent toward `codegraph_*`, which is the point.
+- Two scripts to maintain instead of one. Mitigated by their shared single-page `docs/codegraph-nudge.md` and a single bats file covering both.
+
+**Risks:**
+
+- **Sentinel timing.** PostToolUse must complete before the next PreToolUse fires. Claude Code's hook ordering guarantees this in practice; if it ever drifts, the user sees one spurious warning per affected turn — never a broken tool call.
+- **Transcript schema drift.** If the JSONL schema changes (`"type":"user"` ever stops being the per-user-message marker), the turn counter freezes and the suppression breaks closed (always warns). Detection: bats fixtures in `tests/hooks/fixtures/transcripts/` mirror the current schema; if they need regenerating, that's the signal to update both hooks together.
+
+## References
+
+- Spec: `docs/specs/codegraph-integration.md`
+- Plan: `plans/codegraph-integration.md`
+- Hook docs: `plugins/agentic-dev-team/docs/codegraph-nudge.md`
+- PR: <https://github.com/bdfinst/agentic-dev-team/pull/36>
+- Companion (writing-team): <https://github.com/bdfinst/agentic-writing-team/issues/36>
diff --git a/docs/adr/0003-document-adr-tooling-workflow-as-a-skill.md b/docs/adr/0003-document-adr-tooling-workflow-as-a-skill.md
@@ -0,0 +1,43 @@
+# 3. Document ADR tooling workflow as a skill
+
+Date: 2026-06-01
+
+## Status
+
+Accepted
+
+Documents tooling used by [2. Use sentinel file and argument-shape heuristic for CodeGraph nudge hook](0002-use-sentinel-file-and-argument-shape-heuristic-for-codegraph-nudge-hook.md)
+
+## Context
+
+We started using [npryce/adr-tools](https://github.com/npryce/adr-tools) to manage Architecture Decision Records under `docs/adr/` (project convention; `.adr-dir` at the project root points adr-tools at it). ADR-0002 was the first decision recorded with the tool, and writing it surfaced a non-obvious failure mode: `adr new` opens `$VISUAL`/`$EDITOR` (defaulting to `vi`), which hangs in non-interactive shells — the file is created but empty, and the command exits non-zero. Without documented guidance, every future ADR run will hit the same wall and either:
+
+- Re-discover the workaround (lost time), or
+- Skip `adr-tools` entirely and write ADRs by hand (lost link bookkeeping, lost TOC generation, lost supersede semantics — the whole reason to use the tool).
+
+There is already an `adr-author` agent in the plugin, but it covers the *decision framework* and prose authoring, not the CLI mechanics. The two concerns are separable: deciding whether an ADR is warranted vs. driving the tool correctly. Treating them as one would either bloat the agent or leave the mechanics undocumented.
+
+## Decision
+
+Add an `adr-tools` skill (`plugins/agentic-dev-team/skills/adr-tools/SKILL.md`) that documents the CLI mechanics: pre-flight check, the editor caveat with the `EDITOR=true VISUAL=true` workaround, the create / supersede / link / list / generate-toc workflows, and anti-patterns (don't renumber, don't squash decisions, don't write decisions in future tense).
+
+Keep `adr-author` as the decision-framework + prose-authoring agent. The skill delegates to the agent for "should we ADR this?" questions and the agent delegates to the skill for "now run the commands."
+
+Register the skill in `plugins/agentic-dev-team/CLAUDE.md`'s skills count (31 → 32).
+
+## Consequences
+
+**Easier:**
+
+- Any agent or human running `adr new` in a Claude shell gets the editor workaround on first read instead of after losing a turn to the hang.
+- The supersede and link mechanics (bidirectional link insertion, automatic Status updates) are documented in one place — agents stop hand-editing Status sections, which keeps the relationship graph consistent.
+- `adr generate toc` is named as the required follow-up after every create/supersede, which prevents the index from drifting.
+
+**Harder:**
+
+- Two surfaces to keep aligned (the skill and the `adr-author` agent). Mitigated by an explicit "When to defer to adr-author" section at the bottom of the skill.
+- The skill assumes the `docs/adr/` convention (this project's choice) and reads `.adr-dir` for the configured path. Other projects using `doc/adr/` (the adr-tools default) or `docs/adrs/` need to ensure `.adr-dir` matches; the skill's pre-flight calls this out but does not auto-detect.
+
+**Risks:**
+
+- If `adr-tools` ever changes its template path or command surface (last release was v3.0.0), the skill's command examples drift. Mitigated by the skill linking to the upstream README rather than copying it.
diff --git a/docs/adr/README.md b/docs/adr/README.md
@@ -0,0 +1,5 @@
+# Architecture Decision Records
+
+* [1. Record architecture decisions](0001-record-architecture-decisions.md)
+* [2. Use sentinel file and argument-shape heuristic for CodeGraph nudge hook](0002-use-sentinel-file-and-argument-shape-heuristic-for-codegraph-nudge-hook.md)
+* [3. Document ADR tooling workflow as a skill](0003-document-adr-tooling-workflow-as-a-skill.md)
diff --git a/docs/agentic-dev-team-reference.md b/docs/agentic-dev-team-reference.md
@@ -21,6 +21,7 @@ Team agents are persona-driven roles that implement, design, coordinate, and com
 The central dispatcher. Classifies every incoming request, selects and loads only the agents needed for the current phase, manages the three-phase workflow (Research → Plan → Implement), enforces the 40% context ceiling, and runs the inline review loop during implementation. Spans all phases; all other agents are loaded on demand.
 
 **Key responsibilities:**
+
 - Task routing and model assignment for all review agents
 - Parallel plan review (four critic personas) before human gate
 - Inline review loop: spec compliance → quality review → auto-fix (up to 5 iterations)
@@ -85,6 +86,7 @@ Performs threat modeling, assesses attack surface, and provides secure design gu
 Owns pipelines, deployment strategy, observability, and reliability planning. Provides self-service infrastructure capabilities to development teams. Advocates for safe, observable deployments and SLO-driven reliability.
 
 **Key responsibilities:**
+
 - Pipeline design and maintenance (build, test, deploy)
 - Deployment strategies: blue-green, canary, rolling, feature flags
 - Observability: metrics, logs, traces
@@ -250,6 +252,7 @@ All user-invocable workflows. Executed under Orchestrator direction unless other
 | `/browse` | Browser-based QA via Playwright: navigate, screenshot, click, fill forms |
 | `/benchmark` | Capture runtime performance metrics and compare against baselines |
 | `/competitive-analysis` | Compare plugin against other tools to find gaps and weaknesses |
+| `/init-dev-team` | Install plugin prerequisites (jq, python3, mutation tools); state-aware CodeGraph offer; JS bootstrap via `js-project-init` when `package.json` is missing |
 
 ## Safety Commands