feat: environment-aware model routing with PreToolUse hook enforcement #39
Merged
Spec at docs/specs/environment-aware-model-routing.md and approved plan at plans/environment-aware-model-routing.md. Addresses issue #37 — corporate proxies with restricted model allowlists and Anthropic snapshot deprecation. Two passes of plan-review personas (Acceptance, Design, UX, Strategic) — pass 2 final outcome 3/4 approve with Design blockers resolved (PreToolUse matcher verification gate, SessionStart hook for banner). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Verified matcher: "Agent" via production plugin precedent and docs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Single source of truth for tier → snapshot resolution. Replaces what's currently scattered across agent frontmatter and CLAUDE.md prose. Refs #37 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per-user override cache and append-only bump log generated by the resolver. Explicit entries (in addition to the existing .claude/metrics/*.log glob) prevent rename-time drift and document intent for the team. Refs #37 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
hooks/lib/model-resolve.sh reads knowledge/model-routing.json and prints the resolved snapshot for haiku|sonnet|opus on stdout. Test-only env-var seams (MODEL_ROUTING_JSON, MODEL_OVERRIDES_JSON, MODEL_BUMP_LOG) keep the helper bats-isolatable. Override/cascade/error paths deferred to Steps 4-6. Refs #37 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolver now covers Steps 4-7 in one cohesive helper:
- Single-hop override + JSONL bump log (exactly one event per invocation)
- Multi-hop alias cascade up to _MAX_HOPS=3
- Cycle detection with AC5a stderr template
- AC5 exhaustion template when chain terminates at an unresolvable tier
- AC5b missing routing.json (exit 4) and AC5c malformed overrides (exit 5)
- --dump-map flag for /model-routing-check
24/24 bats tests pass.
Refs #37
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
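To make the alias cascade concrete, here is a minimal Python sketch of the resolution logic this commit describes. It is not the shipped helper, which is a bash + jq script. The function name resolve, the flat tier-to-snapshot mapping, and the claude-haiku-example ID are assumptions for illustration; only the tier_aliases idea, the three-hop cap, and cycle detection come from the PR.

```python
MAX_HOPS = 3  # mirrors _MAX_HOPS=3 from the commit message


class ResolveError(Exception):
    """Raised when a tier cannot be resolved (cycle, too many hops, unknown tier)."""


def resolve(tier, routing, aliases):
    """Return (snapshot, chain) after following at most MAX_HOPS aliases.

    routing: assumed flat mapping of tier -> snapshot ID.
    aliases: assumed mapping of tier -> replacement tier, i.e. the
             "tier_aliases" object of the override file.
    """
    chain = [tier]
    current = tier
    while current in aliases:
        # Refuse to follow a fourth hop.
        if len(chain) - 1 >= MAX_HOPS:
            raise ResolveError(
                "alias chain exceeds %d hops: %s" % (MAX_HOPS, " -> ".join(chain))
            )
        current = aliases[current]
        # Revisiting any tier already in the chain means the aliases loop.
        if current in chain:
            raise ResolveError("alias cycle: " + " -> ".join(chain + [current]))
        chain.append(current)
    if current not in routing:
        raise ResolveError("unresolvable tier: " + current)
    return routing[current], chain


# Hypothetical routing map; the haiku ID is a placeholder, not a real snapshot.
ROUTING = {
    "haiku": "claude-haiku-example",
    "sonnet": "claude-sonnet-4-6",
    "opus": "claude-opus-4-6",
}
```

With aliases {"haiku": "sonnet"}, resolve("haiku", ROUTING, ...) returns the sonnet snapshot and a two-element chain, which corresponds to the single-hop bump case that would produce one log event.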
hooks/agent-model-resolve.sh is the enforcement surface for R1: it reads
PreToolUse-shaped JSON on stdin, shells out to hooks/lib/model-resolve.sh,
and emits one of:
- bump: hookSpecificOutput.updatedInput rewrites tool_input.model
- pass-through: exactly {} (no change)
- refusal: hookSpecificOutput.permissionDecision=deny with the
resolver's stderr as the reason
Registered in settings.json under PreToolUse with matcher="Agent" — the
LLM cannot bypass it. Fail-open posture on malformed stdin or unexpected
resolver exit codes so a buggy hook never blocks legitimate dispatch.
13/13 bats tests pass. AC16, AC17, AC18 fully covered.
Refs #37
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
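The three hook outcomes described in this commit can be sketched as a small Python function. This is an illustration under stated assumptions, not the shipped bash hook: hook_response and render are hypothetical names, and the hookEventName and permissionDecisionReason fields are assumed from the general shape of PreToolUse hook output rather than taken from this PR. The sketch also copies the full tool_input into updatedInput, on the assumption that unchanged fields should be preserved.

```python
import json


def hook_response(tool_input, resolved_model=None, refusal=None):
    """Build the JSON object a PreToolUse hook would print on stdout."""
    if refusal is not None:
        # Refusal: deny the dispatch and pass the resolver's stderr as the reason.
        return {
            "hookSpecificOutput": {
                "hookEventName": "PreToolUse",        # assumed field
                "permissionDecision": "deny",
                "permissionDecisionReason": refusal,  # assumed field name
            }
        }
    if resolved_model is None or resolved_model == tool_input.get("model"):
        # Pass-through: exactly {} so the dispatch proceeds unchanged.
        return {}
    # Bump: rewrite tool_input.model while keeping every other field.
    return {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",            # assumed field
            "updatedInput": dict(tool_input, model=resolved_model),
        }
    }


def render(response):
    """Serialize compactly so the pass-through case is the literal string {}."""
    return json.dumps(response, separators=(",", ":"))
```

A fail-open wrapper would call render({}) whenever stdin cannot be parsed or the resolver exits with an unexpected code, which matches the posture the commit describes.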
Read-only diagnostic that prints (a) the effective tier → snapshot map, (b) any override file contents, (c) the last N=10 bump events (raise MODEL_BUMP_TAIL to see more), and (d) probe applicability for the current ANTHROPIC_BASE_URL. AC10 (side-effect-free), AC11 (surfaces bumps), AC11a (tail cap), AC11b (probe-applicability line). 16/16 bats tests pass. Refs #37 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
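As a rough illustration of the tail cap in (c), the Python sketch below reads the last N events from a JSONL bump log, with N taken from MODEL_BUMP_TAIL and defaulting to 10. The function name tail_bumps and the fallback to 10 on an unparsable value are assumptions; the real diagnostic is a shell command and may behave differently at the edges.

```python
import json
import os


def tail_bumps(log_path, env=None):
    """Return the last N bump events, N = MODEL_BUMP_TAIL (default 10)."""
    env = os.environ if env is None else env
    try:
        n = int(env.get("MODEL_BUMP_TAIL", "10"))
    except ValueError:
        n = 10  # assumed fallback for a non-numeric value
    try:
        with open(log_path, encoding="utf-8") as fh:
            lines = [ln for ln in fh if ln.strip()]
    except FileNotFoundError:
        return []  # a missing log simply means no bumps yet
    if n <= 0:
        return []
    # Reading only; nothing is written, in keeping with a side-effect-free check.
    return [json.loads(ln) for ln in lines[-n:]]
```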
hooks/lib/model-probe.sh:
- Reads y/N from stdin. Decline writes nothing (AC7).
- On accept: probes $ANTHROPIC_BASE_URL/v1/models (5s timeout).
- ok-all → 'All model tiers available; no overrides needed.' (AC7a)
- missing → writes overrides + literal user message (AC7b)
- non-Anthropic host → 'Probe skipped:' + docs/model-routing.md ref (AC8)
- timeout / 5xx / malformed JSON → three differentiated messages (AC9)
commands/init-dev-team.md gains a Step 4.5 with the verbatim prompt text.
tests/hooks/fake-bin/curl shim deterministically replays each fixture based on MODEL_PROBE_FAKE_MODE.
15/15 bats tests pass.
Refs #37
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
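The probe outcomes listed in this commit amount to a small decision table, sketched below in Python. Several details are assumptions rather than facts from the PR: the function name classify_probe, the outcome labels other than ok-all and missing, the host test used to decide what counts as an Anthropic host, and the {"data": [{"id": ...}]} response shape. Non-5xx error statuses are not modelled separately here.

```python
import json
from urllib.parse import urlparse


def classify_probe(base_url, wanted, status=None, body="", timed_out=False):
    """Return (outcome, missing_tiers) for one probe attempt.

    wanted: mapping of tier -> snapshot ID expected to be available.
    """
    host = urlparse(base_url).hostname or ""
    # Assumed host test; the real script's rule is not shown in the PR.
    if not (host == "api.anthropic.com" or host.endswith(".anthropic.com")):
        return ("skipped", [])  # decided before any HTTP call is made
    if timed_out:
        return ("timeout", [])
    if status is not None and 500 <= status <= 599:
        return ("server-error", [])
    try:
        available = {model["id"] for model in json.loads(body)["data"]}
    except (ValueError, KeyError, TypeError):
        return ("malformed", [])
    missing = sorted(t for t, snap in wanted.items() if snap not in available)
    return ("missing", missing) if missing else ("ok-all", [])
```

In this sketch only the missing outcome would lead to an overrides file being written; every other branch leaves the filesystem untouched.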
…e hook
Replaces the static 'Model Routing Table' in agents/orchestrator.md with a 'Resolution Procedure' section that points at the enforcement surface:
- hooks/agent-model-resolve.sh (PreToolUse hook, matcher=Agent)
- hooks/lib/model-resolve.sh (resolver helper)
- knowledge/model-routing.json (single source of truth)
- .claude/model-overrides.json (per-user, gitignored)
'Tier guidance (informational)' subsection preserves the rationale-per-tier bullet list so new-agent authors have a guide for which tier to declare.
Also sweeps the wider doc surface to remove 'Orchestrator Model Routing Table' references that now contradict hook-as-authority:
- CLAUDE.md: static table → paragraph pointer; new /model-routing-check row in Slash Commands Registry
- docs/agent-architecture.md: rewritten Model Routing subsection
- docs/skills.md, prompts/quality-reviewer.md, commands/code-review.md, commands/review-agent.md, commands/agent-remove.md, knowledge/agent-registry.md, skills/agent-skill-authoring/references/templates.md: one-line reference fixes pointing at the Resolution Procedure
11/11 bats tests pass. AC2 holds across orchestrator.md and CLAUDE.md. ADR + docs/model-routing.md cross-references are placeholders pending Steps 19 + 20.
Refs #37
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Step 16 sweep:
- skills/performance-metrics/SKILL.md:79: claude-opus-4-6 → 'opus' tier alias
- templates/agents/agent-template.md:32: rewrite comment to point at knowledge/model-routing.json + the PreToolUse hook instead of listing snapshot IDs inline
tests/repo/no_pinned_snapshots_test.bats enforces AC2: no pinned snapshot IDs in plugin source outside the three approved files (knowledge/model-routing.json, docs/model-routing.md, templates/agents/agent-template.md). Spec/plan/eval-fixture files are out of scope.
Refs #37
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
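The AC2 check can be pictured as a regex scan with an allowlist. The Python sketch below is an illustration only: the real enforcement is a bats test, pinned_violations is a hypothetical name, and the in-memory path-to-text mapping stands in for walking the plugin source tree. The pattern is the one quoted in the PR's quality gate, and the approved paths are the three named in this commit.

```python
import re

# Pattern quoted in the PR's quality gate (git grep -nE ...).
PINNED = re.compile(r"claude-(haiku|sonnet|opus)-[0-9]")

# The three approved files named in the commit message.
APPROVED = {
    "knowledge/model-routing.json",
    "docs/model-routing.md",
    "templates/agents/agent-template.md",
}


def pinned_violations(files):
    """Return (path, line_number) for each pinned snapshot ID outside APPROVED.

    files: mapping of plugin-relative path -> file text.
    """
    hits = []
    for path, text in sorted(files.items()):
        if path in APPROVED:
            continue
        for lineno, line in enumerate(text.splitlines(), 1):
            if PINNED.search(line):
                hits.append((path, lineno))
    return hits
```

An empty result corresponds to the passing state: tier aliases such as opus do not match the pattern, while a literal snapshot ID does.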
hooks/overrides-banner.sh prints the literal line: 'Note: model routing overrides active — run /model-routing-check to review.' to stderr when .claude/model-overrides.json exists at session start. Silent on clean installs; fail-open on malformed stdin. Registered in settings.json under SessionStart. Markdown command bodies cannot deterministically emit terminal output, so the SessionStart hook is the enforcement surface for AC19. 4/4 bats tests pass. Refs #37 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
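The banner behaviour is simple enough to sketch in a few lines of Python. This is an illustration, not the shipped shell hook; overrides_banner and its project_dir parameter are hypothetical, while the banner text and the .claude/model-overrides.json path are quoted from the commit.

```python
import os
import sys

# Literal banner line from the commit message.
BANNER = "Note: model routing overrides active — run /model-routing-check to review."


def overrides_banner(project_dir, stream=None):
    """Print the banner to stderr if an overrides file exists; stay silent otherwise."""
    stream = sys.stderr if stream is None else stream
    overrides = os.path.join(project_dir, ".claude", "model-overrides.json")
    if os.path.exists(overrides):
        print(BANNER, file=stream)
        return True
    return False  # clean install: no output at all
```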
Defines a consistent blues-and-grays Mermaid theme (light fills, navy
text, blue borders) via a reusable %%{init}%% directive. Applies it to
the one existing diagram in code-review-process.md and ships a new
mermaid-diagramming skill with palette reference, typed examples, and
procedure for adding themed diagrams to markdown files.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Opt-in perf gate at MODEL_RESOLVE_PERF=1. Asserts 1000 sequential invocations complete under 50s wall-clock (50ms p99 ceiling per invocation), matching the spec target. Apple Silicon measurement: ~14ms per invocation, dominated by bash + jq cold-start.
Optimisation: when no overrides file exists (the dominant case), skip the alias machinery and resolve in a single jq invocation. Cuts elapsed_ms for the 1000-invocation run from 16.2s to 13.8s.
Spec AC15 updated to clarify the 50ms p99 target. The previous '5s wall-clock ceiling' wording was the aspirational 10× headroom, not a realistic threshold for shell+jq on macOS.
Refs #37
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
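The arithmetic behind the gate is worth spelling out: 50s across 1000 invocations is 50ms each, and the measured 13.8s works out to 13.8ms each. The Python sketch below checks those figures; perf_gate is a hypothetical name and the real gate is a bats test. Note that a total wall-clock bound constrains the mean per-invocation time, so a strict p99 check would need per-invocation samples.

```python
def perf_gate(total_seconds, invocations=1000, ceiling_ms=50.0):
    """Return (mean_ms, passed) for a run of sequential invocations."""
    mean_ms = total_seconds * 1000.0 / invocations
    # "Under" the ceiling is read as a strict inequality in this sketch.
    return mean_ms, mean_ms < ceiling_ms
```

Both figures quoted in the commit (16.2s before the optimisation, 13.8s after) pass this check with room to spare.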
ADR 0004 records two decisions:
1. Pre-dispatch resolution, not runtime model_not_available retry: the harness owns the dispatch surface and the plugin cannot reach it.
2. PreToolUse hook enforcement, not orchestrator instruction: markdown instructions can be silently skipped by the LLM under context pressure.
Plus a stub docs/model-routing.md to land the ADR cross-reference and the orchestrator.md ADR pointer (was a 00NN placeholder).
5/5 bats tests pass.
Refs #37
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
docs/model-routing.md covers:
- Contract (tier aliases, resolution inputs, exit-code taxonomy)
- When the fallback fires (silent bump, refused dispatch, probe write)
- Interpreting the override file (schema, sentinel values, alias chain)
- Adding a new tier (5-step procedure)
- Troubleshooting: Bedrock / Vertex / corporate proxy
- Hand-writing the override file
- Environment variables (user-facing vs. test-only seams)
Links to ADR 0004. 12/12 bats tests pass.
Refs #37
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
All 21 steps complete. 237/237 bats tests pass. R1 enforcement is empirically proven via the PreToolUse hook on the Agent matcher. Refs #37 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Three error-severity fixes addressing residual orchestrator-routing-table
references that contradicted the hook-as-authority model:
- commands/code-review.md:31 — Constraint 3 still claimed the orchestrator
routing table is authoritative
- prompts/quality-reviewer.md:39 — 'Pass each agent its model from the
routing table'
- docs/agent_info.md:25 — 'Model assignment is controlled by the
Orchestrator's routing table'
Plus:
- commands/harness-audit.md:52 — pointer to the renamed section
- commands/init-dev-team.md:461 — probe invocation now uses
${CLAUDE_PLUGIN_ROOT}/hooks/lib/model-probe.sh. The previous
repo-layout path 'plugins/agentic-dev-team/hooks/...' only resolved
from the plugin source tree, which would have broken the probe step
for every installed user.
237/237 bats tests still pass.
Refs #37
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Closes the remaining doc-review findings from PR #39:
- docs/skills.md: add /init-dev-team to Workflow Commands and /model-routing-check to Utility Commands. Restores the 2-hop discoverability path from CLAUDE.md.
- docs/diagrams/architecture-overview.svg: 'Model Routing Table' label replaced with 'Model Tier Resolution (PreToolUse hook)'. Two-line label so the box stays readable.
- docs/diagrams/review-dispatch.svg: orchestrator subtitle 'Model Routing' → 'Agent Dispatch' (the orchestrator dispatches; the hook routes).
Plus two Mermaid diagrams in docs/model-routing.md:
- Architecture at a glance: flowchart showing the caller layer, harness, plugin enforcement surface, routing state, and diagnostics with edges showing the read/write relationships.
- Dispatch flow: sequenceDiagram covering the three branches (pass-through, bump rewrite, deny) with alt/else blocks.
Both Mermaid blocks validated via @mermaid-js/mermaid-cli mmdc. Uses the project's blue-gray theme directive per the mermaid-diagramming skill.
237/237 bats tests still pass.
Refs #37
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Addresses the remaining doc-review items called out in the PR body as known-out-of-scope:
Plus two new Mermaid diagrams in
Both diagrams use the project's blue-gray theme (per the 237/237 bats still pass.
bdfinst added a commit that referenced this pull request on Jun 1, 2026
Removes plans and specs for features that have shipped:
- codegraph-integration (implemented)
- environment-aware-model-routing (implemented, merged in PR #39)
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Closes #37.
Summary
Environment-aware model tier resolution for the agentic-dev-team plugin. The same code works with a personal Anthropic API key, behind a corporate proxy with a restricted model allowlist, or on Bedrock/Vertex deployments, with zero environment-specific config in the repo.
- knowledge/model-routing.json ships tier→snapshot defaults; every dispatch flows through it.
- The PreToolUse hook on the Agent matcher rewrites tool_input.model or refuses dispatch via permissionDecision="deny". The LLM cannot bypass it.
- .claude/model-overrides.json is populated by an opt-in /init-dev-team probe or hand-written; it never leaks into commits.
- /model-routing-check shows effective state; the SessionStart banner surfaces silent bumps.
What ships
- plugins/agentic-dev-team/knowledge/model-routing.json
- plugins/agentic-dev-team/hooks/lib/model-resolve.sh
- plugins/agentic-dev-team/hooks/lib/model-probe.sh
- plugins/agentic-dev-team/hooks/agent-model-resolve.sh (PreToolUse, matcher: "Agent")
- plugins/agentic-dev-team/hooks/overrides-banner.sh (SessionStart)
- plugins/agentic-dev-team/commands/model-routing-check.md
- /init-dev-team Step 4.5
- docs/adr/0004-pre-dispatch-model-resolution.md
- plugins/agentic-dev-team/docs/model-routing.md
Process
Two full Specs → Plan → Build cycles:
- docs/specs/environment-aware-model-routing.md (~140 Gherkin lines, 24 acceptance criteria across AC1–AC19)
- plans/environment-aware-model-routing.md (21 TDD steps; two passes of four plan-review personas: Acceptance, Design, UX, Strategic)
Quality Gate
- MODEL_RESOLVE_PERF=1 bats tests/hooks/model_resolve_perf_tests.bats passes: 13.8ms/invocation against 50ms p99 target
- git grep -nE 'claude-(haiku|sonnet|opus)-[0-9]' in plugin source returns matches only in the three approved files
- […] --arg interpolation throughout, fail-open posture, bounded SSRF surface)
- […] /init-dev-team (used ${CLAUDE_PLUGIN_ROOT} instead of dev-repo-relative path)
- […] agent-architecture.md, code-review.md, agent-remove.md, plus minor stale refs
Test Plan
- /version and any sub-agent dispatch behave identically to pre-change (zero-config baseline)
- […] .claude/model-overrides.json with {"tier_aliases":{"haiku":"sonnet"}}; next sub-agent tagged model: haiku dispatches with claude-sonnet-4-6 and a JSONL line lands in .claude/metrics/model-routing.log
- /model-routing-check prints the four sections cleanly with override present and bump log populated
- /init-dev-team shows the probe prompt verbatim; answering "n" (or empty) writes nothing
- […] ANTHROPIC_BASE_URL, accepting the probe emits "Probe skipped" without making an HTTP call
Known out-of-scope
Captured in the spec's §Out of Scope. Notably: runtime model_not_available retry (the harness owns that surface), multi-region Anthropic endpoint auto-detection, per-agent override files, telemetry beyond the bump log. Architecture-overview.svg still shows "Model Routing Table" (a visual asset, queued for a separate cleanup).
🤖 Generated with Claude Code