
feat: Shepherd hooks for subagent orchestration #27

@dean0x

Summary

Implement Claude Code hooks to monitor and orchestrate subagent completion. The "shepherd" pattern uses SubagentStop hooks to validate subagent work, enforce quality gates, and suggest follow-up actions.

Motivation

DevFlow spawns many subagents (Coder, Explore, Review-*, Synthesize). Currently there's no automatic validation that:

  • Coder agents actually made changes
  • Review agents provided thorough analysis (not rubber-stamp)
  • Explore agents found actionable results
  • Synthesize didn't lose critical findings

Hooks fire deterministically on every subagent completion, whereas skills depend on AI judgment and commands depend on the user invoking them.

Technical Approach

SubagentStop Hook Mechanics

  • Input: Minimal: session_id, transcript_path, cwd, stop_hook_active, hook_event_name
  • Output: {"decision": "approve"|"block", "reason": "...", "systemMessage": "..."}
  • Subagent identification: Must parse transcript file (JSONL) to identify which agent completed
  • Two types: command (fast, deterministic) or prompt (LLM-evaluated, context-aware)
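
The mechanics above can be sketched as a minimal command-type hook. This is an illustration under assumptions, not the final shepherd: the function names `decide` and `run_hook` are hypothetical, the input and output fields are the ones listed in this issue, and failing open on unreadable input is a design choice rather than documented behavior.

```python
import json


def decide(event: dict) -> dict:
    """Return a decision for one SubagentStop event (hypothetical helper)."""
    # Loop guard: if a stop hook is already active, never block again.
    if event.get("stop_hook_active"):
        return {"decision": "approve", "reason": "stop hook already active"}
    # Without a transcript there is nothing to validate, so let it through.
    if not event.get("transcript_path"):
        return {"decision": "approve", "reason": "no transcript to inspect"}
    # Real rules would parse the transcript here; this sketch approves.
    return {"decision": "approve", "reason": "no rule matched"}


def run_hook(raw: str) -> str:
    """Turn the raw hook input text into the JSON text to print."""
    try:
        event = json.loads(raw)
    except json.JSONDecodeError:
        event = None
    if not isinstance(event, dict):
        # Assumption: fail open so a malformed payload cannot wedge a session.
        return json.dumps({"decision": "approve", "reason": "unparseable hook input"})
    return json.dumps(decide(event))


# As a script, the entry point would be roughly:
#     import sys; print(run_hook(sys.stdin.read()))
```

Keeping `run_hook` free of direct I/O makes the decision logic testable without spawning Claude Code.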

Proposed Shepherd Patterns

Pattern 1: Post-Coder Review Gate

Block Coder completion when changes are security-sensitive, or large and untested, until they have been reviewed:

{
  "type": "prompt",
  "prompt": "If Coder agent completed code changes touching auth/crypto/keys OR >50 lines without tests: block and require SecurityReview"
}
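
For comparison, the same gate could be approximated as a deterministic rule for a command-type hook. The sketch below reads the prompt as "sensitive path, OR more than 50 changed lines with no tests", which is one interpretation of its wording. The function name and its inputs (`changed_paths`, `lines_changed`, `has_tests`) are hypothetical, and how they would be derived from the transcript is left open.

```python
import re
from typing import Iterable

# Substring match on the keywords from the prompt. This is deliberately crude
# and will also flag names such as "author.py".
SENSITIVE = re.compile(r"auth|crypto|keys", re.IGNORECASE)


def needs_security_review(changed_paths: Iterable[str],
                          lines_changed: int,
                          has_tests: bool) -> bool:
    """Return True when the Coder result should be held for SecurityReview."""
    touches_sensitive = any(SENSITIVE.search(path) for path in changed_paths)
    large_untested = lines_changed > 50 and not has_tests
    return touches_sensitive or large_untested
```

A rule like this trades the prompt hook's judgment for speed and predictability, and would need tuning to keep false positives tolerable.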

Pattern 2: Review Quality Gate

Prevent shallow "looks good" reviews:

{
  "type": "prompt",
  "prompt": "If Review agent completed: verify it analyzed actual files, provided file:line references, checked multiple concerns. Block if rubber-stamp."
}

Pattern 3: Explore Completeness

Ensure exploration returned useful results:

{
  "type": "prompt",
  "prompt": "If Explore agent completed: verify it found relevant files/patterns with concrete paths. Block if inconclusive."
}

Pattern 4: Chain Trigger (Non-blocking)

Suggest logical next steps without blocking:

{
  "type": "prompt",
  "prompt": "Analyze completed subagent. Approve but suggest next step in systemMessage: Coder→review, Explore→plan, Review→fix"
}
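
A rules-based version of the chain trigger might look like the following. It is a sketch only: `NEXT_STEP` and `suggest_next` are hypothetical names, the mapping is taken from the prompt above, and treating `Review-*` agents as one family by splitting on the hyphen is an assumption about agent naming.

```python
# Mapping from the prompt above: Coder→review, Explore→plan, Review→fix.
NEXT_STEP = {"Coder": "review", "Explore": "plan", "Review": "fix"}


def suggest_next(agent_type: str) -> dict:
    """Always approve; attach a suggestion when the agent family is known."""
    # "Review-security" and similar variants collapse to the "Review" family.
    family = agent_type.split("-", 1)[0]
    result = {"decision": "approve", "reason": "chain trigger is non-blocking"}
    step = NEXT_STEP.get(family)
    if step is not None:
        result["systemMessage"] = f"{agent_type} finished; suggested next step: {step}"
    return result
```

Agents without a mapping, such as Synthesize, are approved with no suggestion attached.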

Implementation Options

Approach           Speed        Intelligence    Maintenance
type: "prompt"     Slow (API)   High            Low
type: "command"    Fast         Rules-based     Higher
Hybrid             Medium       Best of both    Medium

Proposed File Structure

.claude/
├── settings.json              # Hook configuration
└── hooks/
    ├── shepherd.py            # Main shepherd logic
    └── transcript-parser.py   # Utility for reading JSONL transcripts

Alternatively, ship the hooks through the DevFlow installer into ~/.devflow/hooks/.
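
As a rough idea of what the transcript utility could contain, here is a sketch that reads a JSONL transcript and guesses the agent type. It assumes the per-message fields listed under References. Identifying the agent by searching message content for a known name is a hypothetical heuristic, not a documented mechanism, and all function names are illustrative.

```python
import json
from pathlib import Path
from typing import Iterable, Iterator, Optional

# Agent families named in this issue.
KNOWN_AGENTS = ("Coder", "Explore", "Review", "Synthesize")


def read_transcript(path) -> Iterator[dict]:
    """Yield one dict per valid JSONL line, skipping blank or malformed lines."""
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # tolerate partial writes rather than failing the hook
            if isinstance(record, dict):
                yield record


def content_text(record: dict) -> str:
    """Flatten a record's content field to text, whatever its shape."""
    content = record.get("content", "")
    return content if isinstance(content, str) else json.dumps(content)


def guess_agent(records: Iterable[dict]) -> Optional[str]:
    """Heuristic: return the first known agent name mentioned in any message."""
    for record in records:
        text = content_text(record)
        for name in KNOWN_AGENTS:
            if name in text:
                return name
    return None
```

The name search is case-sensitive and can misfire when one agent's prompt mentions another, so a more reliable marker would be preferable if the transcript provides one.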

Acceptance Criteria

  • Shepherd hook script that parses transcript to identify subagent type
  • Configurable rules per subagent type (Coder, Review, Explore, Synthesize)
  • Both blocking (quality gate) and non-blocking (suggest next) modes
  • stop_hook_active check to prevent infinite loops
  • Documentation in CLAUDE.md for hook patterns
  • Optional: prompt-based hooks for intelligent evaluation
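
One possible shape for the configurable rules and the two modes is sketched below. The `Rule` dataclass, the `RULES` table, and the two example checks are all hypothetical. They illustrate how blocking and suggest-only rules could share one evaluation path while honoring stop_hook_active.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Rule:
    name: str
    check: Callable[[str], bool]  # returns True when the output passes
    blocking: bool                # True = quality gate, False = suggestion only
    message: str


# Example rules only; real checks would be tuned per agent type.
RULES: Dict[str, List[Rule]] = {
    "Review": [
        Rule("has-file-line-refs",
             lambda text: re.search(r"\S+\.\w+:\d+", text) is not None,
             True, "Review cites no file:line references"),
    ],
    "Explore": [
        Rule("has-concrete-paths", lambda text: "/" in text,
             False, "Explore output lists no concrete paths"),
    ],
}


def evaluate(agent_type: str, text: str, stop_hook_active: bool = False) -> dict:
    """Apply the rules for one agent type to its output text."""
    notes = []
    for rule in RULES.get(agent_type, []):
        if rule.check(text):
            continue
        # Blocking rules are downgraded to notes once a stop hook is active,
        # which prevents an infinite block loop.
        if rule.blocking and not stop_hook_active:
            return {"decision": "block", "reason": rule.message}
        notes.append(rule.message)
    result = {"decision": "approve", "reason": "no rule blocked completion"}
    if notes:
        result["systemMessage"] = "; ".join(notes)
    return result
```

For instance, `evaluate("Review", "Looks good to me")` blocks, while the same call with `stop_hook_active=True` approves and reports the failed check in systemMessage.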

Open Questions

  1. Ship hooks via devflow init or keep as opt-in configuration?
  2. Default to command (fast) or prompt (smart) hooks?
  3. How aggressive should blocking be? (strict vs suggestive)
  4. Should hooks write to .docs/ for audit trail?

References

  • Claude Code Hooks Guide
  • SubagentStop receives: session_id, transcript_path, cwd, stop_hook_active, hook_event_name
  • Transcript format: JSONL with type, role, content, id, timestamp per message
