Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
16 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .adr-dir
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
docs/adr
19 changes: 19 additions & 0 deletions docs/adr/0001-record-architecture-decisions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# 1. Record architecture decisions

Date: 2026-06-01

## Status

Accepted

## Context

We need to record the architectural decisions made on this project.

## Decision

We will use Architecture Decision Records, as [described by Michael Nygard](http://thinkrelevance.com/blog/2011/11/15/documenting-architecture-decisions).

## Consequences

See Michael Nygard's article, linked above. For a lightweight ADR toolset, see Nat Pryce's [adr-tools](https://github.com/npryce/adr-tools).
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
# 2. Use sentinel file and argument-shape heuristic for CodeGraph nudge hook

Date: 2026-06-01

## Status

Accepted

Tooling documented by [3. Document ADR tooling workflow as a skill](0003-document-adr-tooling-workflow-as-a-skill.md)

## Context

We added a `PreToolUse` hook (`codegraph-nudge.sh`) that fires on every `Read`, `Grep`, and `Glob` tool call in projects with a CodeGraph index (`.codegraph/` in cwd). When the call looks like exploration, the hook should recommend the indexed `codegraph_*` MCP tools instead. Two implementation questions had non-obvious answers:

1. **How to detect "the agent has already used codegraph this turn"** — so the hook stays silent on the follow-up `Read` that confirms a specific detail (the pattern our own CLAUDE.md guidance recommends). Two designs were considered:
- **Parse the transcript at PreToolUse time.** Walk `tail -n N` of the JSONL transcript backward to the last `"type":"user"` marker and grep for `mcp__codegraph__*` in the window. Stateless; no extra hook needed.
- **Sentinel file written by a companion PostToolUse hook.** A small `codegraph-turn-mark.sh` fires after any `mcp__codegraph__.*` tool completes and writes `{transcript_id, turn_counter}` to `${CLAUDE_PROJECT_DIR}/.claude/codegraph-turn-state.json`. The nudge hook reads the sentinel and compares.

2. **How to classify a tool call as "multi-file exploration"** — without duplicating the tool's own file enumeration on every invocation. Two designs were considered:
- **Expand the glob / walk the directory** to count matches. Accurate but slow (bash `**` needs `shopt -s globstar`, not portable; deep `**` patterns can be expensive), and the cost is paid on every Read/Grep/Glob.
- **Argument-shape heuristic.** Read-always-single; `Grep` is multi unless `tool_input.path` is a regular file (`test -f`); `Glob` is multi when `tool_input.pattern` contains a glob metacharacter. Coarse but constant-time and zero filesystem walk.

Plan-review (round 1) flagged both questions as blockers:

- **Design & Architecture Critic** rejected transcript-walk: couples the hook to JSONL schema, depends on PostToolUse flush timing, parses MBs of transcript on every Read/Grep/Glob in long sessions, and contradicts the spec which already named `codegraph-turn-state.json`.
- **Design & Architecture Critic** rejected filesystem expansion: `shopt -s globstar` portability, glob-expansion cost, and over-coupling to the live filesystem when the hook is meant to be advisory.

Security review later flagged that even with the bounded sentinel approach, the nudge hook's transcript `grep -c "type":"user"` would scan the full transcript on every Read/Grep/Glob — unbounded growth over a long session.

## Decision

Adopt both refinements:

1. **Sentinel file written by a PostToolUse mark hook.** `codegraph-turn-mark.sh` is registered on matcher `mcp__codegraph__.*`. It writes `{transcript_id, turn_counter}` atomically (mktemp + `mv -f`) to `${CLAUDE_PROJECT_DIR}/.claude/codegraph-turn-state.json`. The nudge hook reads that file, recomputes `transcript_id` (basename of `transcript_path` minus extension) and `turn_counter` (`grep -c '"type":"user"'` over `tail -c 1048576` of the transcript), and suppresses the warning when both match.
2. **Argument-shape heuristic for breadth classification.** `Read` is always silent. `Grep` is multi unless `tool_input.path` exists and is a regular file. `Glob` is multi when `tool_input.pattern` contains `*`, `?`, or `[`. No `find`, no globstar, no expansion.
3. **Cap the transcript scan at 1 MB** (`tail -c 1048576`) on both reader and writer. The turn counter only needs to detect monotonic change within a session, not reflect the absolute all-time count — a fixed-size tail is sufficient and both sides see the same count.

Fail-open posture throughout: any internal error in either hook exits 0 (the hook is a nudge, never a gate). Careful-mode escalation (`exit 2`) is layered on top via the existing `careful-state.json` mechanism, matching `destructive-guard.sh`.

## Consequences

**Easier:**

- Constant-time hook overhead per call (~36–49 ms median wall-clock on the hot path, dominated by process startup, not hook logic). Scales independently of session length and project size.
- Adding the same nudge to another plugin (e.g., the writing-team mirror tracked in `bdfinst/agentic-writing-team#36`) reuses the same mechanism without re-deciding.
- The sentinel is one small JSON file; the schema is documented in `docs/codegraph-nudge.md` and pinned by `tests/hooks/codegraph_nudge.bats`.

**Harder:**

- The mark hook must fire reliably on every `mcp__codegraph__.*` PostToolUse — if Claude Code changes that matcher convention, the suppression silently regresses to "always warn." Fail-open posture absorbs this without breaking workflows.
- Argument-shape heuristic has known false positives (a `Grep` against a directory that happens to contain a single file still warns). Accepted because the warning is advisory, not blocking — false positives nudge the agent toward `codegraph_*`, which is the point.
- Two scripts to maintain instead of one. Mitigated by their shared single-page `docs/codegraph-nudge.md` and a single bats file covering both.

**Risks:**

- **Sentinel timing.** PostToolUse must complete before the next PreToolUse fires. Claude Code's hook ordering guarantees this in practice; if it ever drifts, the user sees one spurious warning per affected turn — never a broken tool call.
- **Transcript schema drift.** If the JSONL schema changes (`"type":"user"` ever stops being the per-user-message marker), the turn counter freezes and the suppression breaks closed (always warns). Detection: bats fixtures in `tests/hooks/fixtures/transcripts/` mirror the current schema; if they need regenerating, that's the signal to update both hooks together.

## References

- Spec: `docs/specs/codegraph-integration.md`
- Plan: `plans/codegraph-integration.md`
- Hook docs: `plugins/agentic-dev-team/docs/codegraph-nudge.md`
- PR: <https://github.com/bdfinst/agentic-dev-team/pull/36>
- Companion (writing-team): <https://github.com/bdfinst/agentic-writing-team/issues/36>
43 changes: 43 additions & 0 deletions docs/adr/0003-document-adr-tooling-workflow-as-a-skill.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
# 3. Document ADR tooling workflow as a skill

Date: 2026-06-01

## Status

Accepted

Documents tooling used by [2. Use sentinel file and argument-shape heuristic for CodeGraph nudge hook](0002-use-sentinel-file-and-argument-shape-heuristic-for-codegraph-nudge-hook.md)

## Context

We started using [npryce/adr-tools](https://github.com/npryce/adr-tools) to manage Architecture Decision Records under `docs/adr/` (project convention; `.adr-dir` at the project root points adr-tools at it). ADR-0002 was the first decision recorded with the tool, and writing it surfaced a non-obvious failure mode: `adr new` opens `$VISUAL`/`$EDITOR` (defaulting to `vi`), which hangs in non-interactive shells — the file is created but empty, and the command exits non-zero. Without documented guidance, every future ADR run will hit the same wall and either:

- Re-discover the workaround (lost time), or
- Skip `adr-tools` entirely and write ADRs by hand (lost link bookkeeping, lost TOC generation, lost supersede semantics — the whole reason to use the tool).

There is already an `adr-author` agent in the plugin, but it covers the *decision framework* and prose authoring, not the CLI mechanics. The two concerns are separable: deciding whether an ADR is warranted vs. driving the tool correctly. Treating them as one would either bloat the agent or leave the mechanics undocumented.

## Decision

Add an `adr-tools` skill (`plugins/agentic-dev-team/skills/adr-tools/SKILL.md`) that documents the CLI mechanics: pre-flight check, the editor caveat with the `EDITOR=true VISUAL=true` workaround, the create / supersede / link / list / generate-toc workflows, and anti-patterns (don't renumber, don't squash decisions, don't write decisions in future tense).

Keep `adr-author` as the decision-framework + prose-authoring agent. The skill delegates to the agent for "should we ADR this?" questions and the agent delegates to the skill for "now run the commands."

Register the skill in `plugins/agentic-dev-team/CLAUDE.md`'s skills count (31 → 32).

## Consequences

**Easier:**

- Any agent or human running `adr new` in a Claude shell gets the editor workaround on first read instead of after losing a turn to the hang.
- The supersede and link mechanics (bidirectional link insertion, automatic Status updates) are documented in one place — agents stop hand-editing Status sections, which keeps the relationship graph consistent.
- `adr generate toc` is named as the required follow-up after every create/supersede, which prevents the index from drifting.

**Harder:**

- Two surfaces to keep aligned (the skill and the `adr-author` agent). Mitigated by an explicit "When to defer to adr-author" section at the bottom of the skill.
- The skill assumes the `docs/adr/` convention (this project's choice) and reads `.adr-dir` for the configured path. Other projects using `doc/adr/` (the adr-tools default) or `docs/adrs/` need to ensure `.adr-dir` matches; the skill's pre-flight calls this out but does not auto-detect.

**Risks:**

- If `adr-tools` ever changes its template path or command surface (last release was v3.0.0), the skill's command examples drift. Mitigated by the skill linking to the upstream README rather than copying it.
5 changes: 5 additions & 0 deletions docs/adr/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
# Architecture Decision Records

* [1. Record architecture decisions](0001-record-architecture-decisions.md)
* [2. Use sentinel file and argument-shape heuristic for CodeGraph nudge hook](0002-use-sentinel-file-and-argument-shape-heuristic-for-codegraph-nudge-hook.md)
* [3. Document ADR tooling workflow as a skill](0003-document-adr-tooling-workflow-as-a-skill.md)
3 changes: 3 additions & 0 deletions docs/agentic-dev-team-reference.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ Team agents are persona-driven roles that implement, design, coordinate, and com
The central dispatcher. Classifies every incoming request, selects and loads only the agents needed for the current phase, manages the three-phase workflow (Research → Plan → Implement), enforces the 40% context ceiling, and runs the inline review loop during implementation. Spans all phases; all other agents are loaded on demand.

**Key responsibilities:**

- Task routing and model assignment for all review agents
- Parallel plan review (four critic personas) before human gate
- Inline review loop: spec compliance → quality review → auto-fix (up to 5 iterations)
Expand Down Expand Up @@ -85,6 +86,7 @@ Performs threat modeling, assesses attack surface, and provides secure design gu
Owns pipelines, deployment strategy, observability, and reliability planning. Provides self-service infrastructure capabilities to development teams. Advocates for safe, observable deployments and SLO-driven reliability.

**Key responsibilities:**

- Pipeline design and maintenance (build, test, deploy)
- Deployment strategies: blue-green, canary, rolling, feature flags
- Observability: metrics, logs, traces
Expand Down Expand Up @@ -250,6 +252,7 @@ All user-invocable workflows. Executed under Orchestrator direction unless other
| `/browse` | Browser-based QA via Playwright: navigate, screenshot, click, fill forms |
| `/benchmark` | Capture runtime performance metrics and compare against baselines |
| `/competitive-analysis` | Compare plugin against other tools to find gaps and weaknesses |
| `/init-dev-team` | Install plugin prerequisites (jq, python3, mutation tools); state-aware CodeGraph offer; JS bootstrap via `js-project-init` when `package.json` is missing |

## Safety Commands

Expand Down
Loading
Loading