Rewrite /deepen-plan with context-managed map-reduce (v3) by Drewx-Design · Pull Request #178 · EveryInc/compound-engineering-plugin

Drewx-Design · 2026-02-12T19:15:42Z

Summary

Replaces v1 /deepen-plan with a phased file-based map-reduce architecture (the same pattern as the review command v2 in #157, "Rewrite workflows:review with context-managed map-reduce architecture")
Sub-agents write full analysis JSON to .deepen/ on disk and return only ~100 token summaries to parent
Parent context stays under ~12k tokens regardless of agent count (13-26 agents), vs unbounded in v1
Adds 5 new phases: Plan Manifest, Validation, Judge, Preservation Check, Compound Insights
Cross-platform safe: project-relative .deepen/ instead of /tmp/
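
To make the map step concrete, here is a minimal sketch of what a sub-agent's "write to disk, return a short summary" contract could look like. The helper name `writeAgentResult`, the `recommendations` and `text` fields, and the constant `MAX_SUMMARY_CHARS` are assumptions for illustration, not the PR's actual code; the 200-character cap mirrors the return cap described in a later commit of this PR, and a temp directory stands in for the project-relative `.deepen/` folder.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Assumed cap on what goes back to the parent (hypothetical constant name).
const MAX_SUMMARY_CHARS = 200;

// Hypothetical helper: persist the full analysis, hand back only a short summary.
function writeAgentResult(outDir, agentName, analysis) {
  fs.mkdirSync(outDir, { recursive: true });
  const file = path.join(outDir, `${agentName}.json`);
  fs.writeFileSync(file, JSON.stringify(analysis, null, 2));
  const count = analysis.recommendations.length;
  // Only this one-line string would enter the parent context.
  const summary = `${agentName}: ${count} recommendation(s) written to ${path.basename(file)}`;
  return summary.slice(0, MAX_SUMMARY_CHARS);
}

// Usage: a temp directory stands in for .deepen/ so the sketch runs anywhere.
const outDir = fs.mkdtempSync(path.join(os.tmpdir(), "deepen-"));
const summary = writeAgentResult(outDir, "security-sentinel", {
  agent: "security-sentinel",
  tools_used: ["Read"],
  recommendations: [{ section_id: 1, text: "Validate webhook signatures" }],
});
console.log(summary);
```

The parent only ever sees `summary`; anything that needs the full analysis reads the JSON file back from disk.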

Architecture Changes

| Phase | v1 (current) | v3 (this PR) |
|---|---|---|
| Analyze | Inline parsing in parent | Dedicated agent writes structured manifest JSON |
| Discover | `find` + `head` (bash, Windows-broken) | Glob/Read native tools (cross-platform) |
| Research | Agents return full output to parent | Agents write JSON to `.deepen/`, return 1 sentence |
| Validate | None | Schema validation + hallucination flagging |
| Judge | None (dedup in parent) | Dedicated judge with source attribution priority |
| Enhance | Parent synthesizes (context overflow) | Dedicated enhancer reads disk, writes disk |
| Verify | Basic quality checks | Section preservation verification |
| Compound | Not supported | Creates validated `docs/solutions/` files via compound-docs skill |

Alignment Audit: v3 vs Plugin Ecosystem

Performed a full audit of v3 against the compound-engineering plugin (v2.33.0) -- all agents, skills, commands, MCP servers, and conventions.

1. Agent Discovery Completeness -- Aligned

| Check | Status | Evidence |
|---|---|---|
| Glob reaches plugin agents | OK | `~/.claude/plugins/cache//agents//*.md` matches `every-marketplace/compound-engineering/2.33.0/agents/` |
| Category filtering correct | OK | USE: `review/` (14), `research/` (4), `design/` (3), `docs/` (1). SKIP: `workflow/` (5 orchestrators) |
| Always-run agents exist | OK | `security-sentinel.md`, `architecture-strategist.md`, `performance-oracle.md` all verified |
| No valuable agents excluded | OK | v3 includes manifest-matched agents: `code-simplicity-reviewer`, `agent-native-reviewer`, `pattern-recognition-specialist`, language-specific reviewers |

Fixed from v1: v1 said "run ALL agents" (40+ parallel). v3 uses intelligent selection: 3 always-run + manifest-matched, reducing from ~30 agents to 10-15 targeted ones.
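
As a rough illustration of "3 always-run + manifest-matched" selection, the sketch below unions the always-run tier with agents matched from manifest tags, then keeps only agents that discovery actually found. The three always-run names come from the audit table; the tag names (`agent-native`, `refactor`) and the `MATCHERS` mapping are hypothetical and only stand in for whatever matching rules the command uses.

```javascript
// Always-run tier, as listed in the audit table.
const ALWAYS_RUN = ["security-sentinel", "architecture-strategist", "performance-oracle"];

// Hypothetical mapping from manifest tags to manifest-matched agents.
const MATCHERS = {
  "agent-native": ["agent-native-reviewer"],
  refactor: ["code-simplicity-reviewer", "pattern-recognition-specialist"],
};

// Union of always-run and matched agents, limited to what discovery found.
function selectAgents(manifestTags, availableAgents) {
  const matched = manifestTags.flatMap((tag) => MATCHERS[tag] ?? []);
  const wanted = [...new Set([...ALWAYS_RUN, ...matched])];
  const available = new Set(availableAgents);
  return wanted.filter((name) => available.has(name));
}

// Usage: unknown tags match nothing; undiscovered agents are skipped.
const discovered = [...ALWAYS_RUN, "agent-native-reviewer", "code-simplicity-reviewer"];
const selected = selectAgents(["agent-native", "unknown-tag"], discovered);
console.log(selected);
```

Filtering against the discovered list last means a renamed or missing agent file shrinks the selection instead of producing a launch for an agent that does not exist.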

2. Skill Discovery -- Aligned (with fixes applied)

| Check | Status | Evidence |
|---|---|---|
| Glob reaches plugin skills | OK | `~/.claude/plugins/cache/*/skills//SKILL.md` matches all 14 skills |
| Skill-to-domain mappings correct | OK | Fixed: removed `kieran-rails-style` (doesn't exist as a skill -- it's an agent). Added: `andrew-kane-gem-writer`, `dspy-ruby`, `every-style-editor`, `create-agent-skills` |
| Skills with references/ handled | OK | v3 skill agent prompt now globs `references/`, `assets/`, `templates/` subdirectories |

Fixed from v1: v1 only read SKILL.md. v3 agents also read references/*.md, assets/*, templates/* -- critical for skills like agent-native-architecture (14 reference files) and create-agent-skills (11 references + templates + workflows).

3. Command Pipeline -- Aligned (with fixes applied)

| Pipeline | Status | Notes |
|---|---|---|
| `/workflows:plan` to `/deepen-plan` | OK | Plan analyzer handles any markdown format. Plans output to `plans/` directory. |
| `/deepen-plan` to `/workflows:work` | OK | Enhanced plan is valid markdown with Research Insights blocks and `// ENHANCED:` comments. Work command reads it fine. |
| `/deepen-plan` to `/plan_review` | OK | Fixed: v1 offered `/workflows:review` (code review, not plan review). v3 correctly offers `/plan_review`. |
| `/lfg` chain | OK | Fixed: Kept `name: deepen-plan` (not `workflows:deepen-plan`) so `/lfg` and `/slfg` references to `/compound-engineering:deepen-plan` still work. |
| Compound insights | OK | Fixed: v3 now references `compound-docs` skill and its YAML schema for properly validated learning files. |

4. MCP Server Integration -- Aligned

| Check | Status | Evidence |
|---|---|---|
| Context7 tool names | OK | `resolve-library-id` and `query-docs` verified against `plugin.json` mcpServers config |
| No browser automation needed | OK | Research-only command, agent-browser not applicable |
| No missing MCP servers | OK | Context7 is the only MCP server in the plugin |

5. Naming and Convention Alignment -- Aligned (with fixes applied)

| Check | Status | Notes |
|---|---|---|
| Command name | OK | Fixed: Kept `name: deepen-plan` (in `commands/` not `commands/workflows/`). Matches existing plugin structure and `/lfg`/`/slfg` references. |
| Output paths | OK | Plans: original location preserved. Learnings: `docs/solutions/[category]/` with compound-docs schema. |
| Frontmatter format | OK | Fixed: Compound insights now use full compound-docs YAML schema (13 fields) instead of simplified 6-field format. |

6. Philosophy Alignment -- Aligned

| Principle | Status | Evidence |
|---|---|---|
| Compounding | OK | Option 6 creates learnings; Step 3b discovers them in future runs |
| 80/20 plan vs execution | OK | Command is entirely research/planning, never writes code |
| Source priority | OK | `skill > documented-learning > official-docs > community-web` matches `best-practices-researcher` Phase 1-2-3 order |
| Agent specialization | OK | Each agent focuses on its domain, returns structured JSON |
| Parallel-first | OK | All matched agents launch in single parallel batch |
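
The source priority order in the table can be read as a simple ranking used when two recommendations conflict. The sketch below assumes each recommendation carries a `source` field holding one of the four labels; the function names and the `text` field are hypothetical, and unknown sources are ranked last as a defensive choice rather than anything the PR specifies.

```javascript
// Priority order quoted in the table: earlier entries win conflicts.
const SOURCE_PRIORITY = ["skill", "documented-learning", "official-docs", "community-web"];

// Rank a source label; unknown labels sort after every known one (assumption).
function rank(source) {
  const index = SOURCE_PRIORITY.indexOf(source);
  return index === -1 ? SOURCE_PRIORITY.length : index;
}

// Pick the recommendation with the highest-priority source; ties keep the first seen.
function resolveConflict(recommendations) {
  return recommendations.reduce((best, rec) => (rank(rec.source) < rank(best.source) ? rec : best));
}

// Usage: a documented learning outranks a community web result.
const winner = resolveConflict([
  { source: "community-web", text: "Debounce at 300ms" },
  { source: "documented-learning", text: "Debounce at 150ms" },
  { source: "blog-comment", text: "Do not debounce" },
]);
console.log(winner.text);
```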

7. Edge Cases -- Compatible

| Scenario | Status | Notes |
|---|---|---|
| No plugin installed | OK | Glob returns empty, "Handling Sparse Discovery" section acknowledges |
| Custom project agents | OK | `.claude/agents/*.md` glob catches project-local agents |
| Skills with scripts/ | OK | v3 agent prompt globs `references/`, `assets/`, `templates/` |
| Multiple plugins | OK | `~/.claude/plugins/cache//agents//*.md` catches all |
| Plan with no frontmatter | OK | Manifest analyzer parses markdown sections, doesn't require frontmatter |
| Windows paths | OK | Uses project-relative `.deepen/`, Node.js for validation (no Python/bash dependency) |

Key Fixes Applied (v1 to v3)

Context overflow prevention -- v1 pulled all agent output into parent. v3 uses file-based map-reduce.
Skill references/ -- v1 only read SKILL.md. v3 reads full skill tree including references, assets, templates.
kieran-rails-style ghost skill -- v1 referenced a skill that doesn't exist. v3 maps to actual skill names.
/workflows:review vs /plan_review -- v1 offered code review for plan feedback. v3 correctly offers plan review.
Command name preserved -- Kept deepen-plan (not workflows:deepen-plan) for /lfg//slfg compatibility.
Compound insights schema -- v1 had no compound integration. v3 uses full compound-docs YAML schema.
Windows compatibility -- .deepen/ instead of /tmp/. Node.js instead of Python3. Glob/Read instead of find/head.
Hallucination detection -- Agents with empty tools_used flagged, judge downweights confidence by 0.2.
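
The hallucination check in the last item could be sketched roughly as follows. The `tools_used` field and the 0.2 penalty come from the description above; the `recommendations` array, a 0-to-1 `confidence` scale, the `flagged` field, and the function name are assumptions made for illustration.

```javascript
// Penalty applied when an agent reports no tool usage (value from the PR description).
const PENALTY = 0.2;

// Flag results with empty tools_used and lower each recommendation's confidence.
function adjustConfidence(result) {
  const ungrounded = Array.isArray(result.tools_used) === false || result.tools_used.length === 0;
  const recommendations = result.recommendations.map((rec) => ({
    ...rec,
    // Round to two decimals to avoid float noise, and never drop below zero.
    confidence: ungrounded
      ? Math.max(0, Number((rec.confidence - PENALTY).toFixed(2)))
      : rec.confidence,
  }));
  return { ...result, flagged: ungrounded, recommendations };
}

// Usage: an agent that used no tools has its recommendations downweighted.
const adjusted = adjustConfidence({
  agent: "performance-oracle",
  tools_used: [],
  recommendations: [{ section_id: 2, confidence: 0.9 }, { section_id: 3, confidence: 0.1 }],
});
console.log(adjusted.flagged, adjusted.recommendations.map((rec) => rec.confidence));
```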

Test plan

Run /deepen-plan plans/test-plan.md on a Rails plan -- verify .deepen/ directory created with expected JSON files
Verify parent context stays under ~14k tokens of agent output
Run /lfg end-to-end to confirm /compound-engineering:deepen-plan still resolves correctly
Test compound insights option (Step 9d, option 6) creates files matching compound-docs YAML schema
Test on Windows to verify .deepen/ path and Node.js validation work cross-platform
Verify judge removes duplicate recommendations and resolves conflicts with source priority
Run on a plan with no docs/solutions/ to confirm sparse discovery handles gracefully

Breaking Changes Risk

If the plugin updates these, v3 would need updating:

Agent renames -- v3 references agents by filename (e.g., security-sentinel). If agents are renamed, the always-run tier and manifest-matched lists need updating.
compound-docs YAML schema changes -- v3's compound insights option references the schema. If fields are added/removed, the instructions need updating.
Context7 tool name changes -- v3 hardcodes MCP tool names. If Context7 changes its API, the docs-researcher prompts need updating.

These are the same risks the existing /workflows:review and /workflows:plan commands face -- nothing unique to /deepen-plan.

…re (v3)

Replace unbounded v1 agent output with phased file-based map-reduce pattern that keeps parent context under ~12k tokens. Adds plan manifest analysis, validation, judge phase with source attribution priority, and preservation checking. Aligns with plugin ecosystem conventions.

…mplementing

The compound-docs skill already has a validated YAML schema and 7-step process. Instead of reimplementing it inside deepen-plan, offer the user the option to run /workflows:compound themselves.

The deepen-plan command deepens decisions but never challenges them. Real reviewer feedback showed it misses redundant tool params, YAGNI violations built despite being flagged, and misplaced business logic.

Adds two new always-run agents:
- agent-native-architecture-reviewer: routes to skill checklist, anti-patterns, and reference files (not generic prompt)
- project-architecture-challenger: reads CLAUDE.md and challenges every decision against project-specific principles

Also injects PROJECT ARCHITECTURE CONTEXT into all review/research agent prompts so they evaluate against project conventions.

…ents

Validated across two real-world pipeline runs. Key changes:
- Batched agent launches (max 4 pending) to prevent context overflow crashes
- 200-char return cap on agent messages (all analysis in JSON files)
- Version grounding: lockfile > package.json > plan text priority
- Per-section judge parallelization (~21 min -> ~8-10 min)
- Two-part output: Decision Record (reviewers) + Implementation Spec (developers)
- Quality review phase (CoVe pattern) catches self-contradictions and code gaps
- Enhancer resolves conditionals, verifies API versions, checks accessibility
- fast_follow classification bucket for ticketable items
- Convergence signals with [Strong Signal] markers
- Task() failure recovery (retry once on infrastructure errors)
- truncated_count field for judge convergence weighting
- Pipeline checkpoint logging for diagnostics

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
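
The version grounding rule (lockfile > package.json > plan text) amounts to a first-match lookup over sources in priority order. A minimal sketch, assuming the three sources have already been parsed into plain objects keyed by package name; the function name, the `origin` labels, and the sample versions are hypothetical.

```javascript
// Priority order from the commit message: the first source that knows the package wins.
const VERSION_SOURCES = ["lockfile", "packageJson", "planText"];

// Return the most trustworthy known version of a package and where it came from.
function groundVersion(pkg, sources) {
  for (const origin of VERSION_SOURCES) {
    const version = sources[origin]?.[pkg];
    if (version != null) return { pkg, version, origin };
  }
  return { pkg, version: null, origin: "unknown" };
}

// Usage with made-up data: the lockfile's resolved version beats the range in
// package.json, which beats whatever the plan text claims.
const sources = {
  lockfile: { react: "18.3.1" },
  packageJson: { react: "^18.2.0", zod: "^3.23.0" },
  planText: { react: "17", zod: "3", lodash: "4" },
};
console.log(groundVersion("react", sources));
console.log(groundVersion("zod", sources));
console.log(groundVersion("lodash", sources));
```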

Three fixes from v3.4 test run:
1. Split merge judge into Data Prep Agent (haiku, mechanical I/O) + Merge Judge (reasoning only). Data prep reads 20+ files and compiles MERGE_INPUT.json. Merge judge reads one file, focuses on cross-section analysis. Fixes OOM/timeout failures where merge judge was spending half its budget on file reads.
2. Replace all ! operators in bash-embedded node -e scripts with === false and == null patterns. Bash history expansion escapes ! as \! which Node.js rejects as SyntaxError.
3. Add dash normalization to preservation check: em-dashes and en-dashes normalized before comparing section titles. Prevents false positives when enhancer normalizes typography.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
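
Fix 3 can be illustrated with a small title comparison that treats en-dashes and em-dashes as plain hyphens. This is a sketch under assumptions, not the PR's script: the function names are hypothetical, whitespace collapsing is an extra defensive step, and the code uses `=== false` instead of `!` to echo fix 2.

```javascript
// Normalize typographic dashes (U+2013 en-dash, U+2014 em-dash) and whitespace.
function normalizeTitle(title) {
  return title.replace(/[\u2013\u2014]/g, "-").replace(/\s+/g, " ").trim();
}

// Report original section titles that have no match in the enhanced plan.
function missingSections(originalTitles, enhancedTitles) {
  const seen = new Set(enhancedTitles.map(normalizeTitle));
  return originalTitles.filter((title) => seen.has(normalizeTitle(title)) === false);
}

// Usage: the enhancer changed "-" to an em-dash, which should not count as a loss.
const missing = missingSections(
  ["Phase 1 - Types", "Phase 2 - Store", "Rollout"],
  ["Phase 1 \u2014 Types", "Phase 2 \u2013 Store"],
);
console.log(missing);
```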

Log analysis revealed learnings-researcher used string section_ids ("Phase-1-Types-Store") while manifest uses numeric ids (1, 2, 3). Section judges filter on numeric equality, so 3 of 5 learnings recs were silently dropped -- including high-value documented-learning source recs for debounce, frame budgeting, and checkpoint handling.

Fixes:
- OUTPUT RULES now explicitly say "must be a numeric id like 1, 2, 3. NOT a string like Phase-1"
- Instruction #5 warns string IDs will be silently dropped
- Validation script warns on non-numeric section_ids

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
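
A validation warning of the kind this commit describes could look roughly like the sketch below. Only the `section_id` field and the example value come from the commit message; the function name and the flat `recommendations` array shape are assumptions. Note that a numeric string such as `"2"` is also reported, since a strict numeric equality filter would drop it too.

```javascript
// Return every section_id that a strict numeric-equality filter would never match.
function findNonNumericSectionIds(recommendations) {
  return recommendations
    .filter((rec) => Number.isInteger(rec.section_id) === false)
    .map((rec) => rec.section_id);
}

// Usage: string ids are reported so they can be fixed instead of silently dropped.
const badIds = findNonNumericSectionIds([
  { section_id: 1, text: "Debounce input" },
  { section_id: "Phase-1-Types-Store", text: "Frame budgeting" },
  { section_id: "2", text: "Checkpoint handling" },
]);
if (badIds.length > 0) {
  console.warn(`Non-numeric section_ids: ${JSON.stringify(badIds)}`);
}
```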

kieranklaassen · 2026-02-17T20:32:19Z

This is cool, still deciding how to incorporate since it does add a new pattern but will try it out a bit

kieranklaassen

Code Review: PR #178 — Rewrite /deepen-plan with context-managed map-reduce (v3)

Reviewers: architecture-strategist, code-simplicity-reviewer, pattern-recognition-specialist, learnings-researcher + manual analysis

Summary

The core idea — file-based map-reduce where agents write JSON to .deepen/ and return only a completion signal — is a genuine architectural improvement that solves a real context overflow problem. The cross-platform improvements (.deepen/ instead of /tmp/, Glob instead of find, Node.js instead of Python) are also welcome.

However, the PR wraps this good idea in significant overengineering. At 1,191 lines, it is 2x the next-largest command in the plugin (workflows:plan at 557 lines). An estimated ~670 lines (56%) could be removed while preserving the core innovation. Additionally, the PR needs a rebase — the version numbers are stale.

🔴 P1 — Critical (Blocks Merge)

1. Version Regression — Needs Rebase

Main is at v2.35.2. This PR bumps to v2.34.0 in both plugin.json and marketplace.json. The CHANGELOG adds a ## [2.34.0] entry, but that version already exists on main (Gemini CLI target from PR #190).

Fix: Rebase on main, bump to v2.36.0, and update the CHANGELOG entry accordingly.
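
The regression is easy to confirm mechanically. The sketch below compares plain `major.minor.patch` strings; it is a hypothetical helper that ignores pre-release and build metadata, so it is not a full semver implementation.

```javascript
// Compare two major.minor.patch versions; returns -1, 0, or 1.
function compareVersions(a, b) {
  const parse = (version) => version.replace(/^v/, "").split(".").map(Number);
  const partsA = parse(a);
  const partsB = parse(b);
  for (let i = 0; i < 3; i += 1) {
    if (partsA[i] !== partsB[i]) return partsA[i] < partsB[i] ? -1 : 1;
  }
  return 0;
}

// The PR's bump is behind main; the suggested bump is ahead of it.
console.log(compareVersions("v2.34.0", "v2.35.2")); // -1
console.log(compareVersions("v2.36.0", "v2.35.2")); // 1
```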

2. Significant Overengineering (~670 removable lines)

The simplicity review identified 7 areas of unnecessary complexity that could be simplified without losing the core map-reduce innovation:

| Section | Lines | Why Remove |
|---|---|---|
| Judge pipeline (6a-6e) | ~230 | Enhancer agent can read agent JSON files directly and deduplicate as it goes. Three sub-agent types, two JSON schemas, and a validation script exist to produce input that the enhancer reads anyway. |
| Quality Review / CoVe (7b) | ~120 | This is a plan-deepening command, not code review. `/plan_review` already exists for plan quality. Checking "defensive stacking" in a plan doc is solving a problem the output format created. |
| Inline Node.js scripts | ~80 | Claude can read JSON and verify structure natively. Validation scripts add cross-platform concerns for no benefit over "Read each JSON file, verify required fields." |
| Checkpoint logging | ~60 | Diagnostic infrastructure for debugging the command itself. Users see failures in real-time. Not needed in the main flow. |
| Two-part output (Decision Record + Impl Spec) | ~100 | Downstream `/workflows:work` expects the simple `### Research Insights` format from v1. The two-audience split has no consumer. |
| Batching boilerplate | ~30 | The 200-char return cap already solves the problem. One sentence ("Launch in batches of 4-6 if many agents matched") replaces 30 lines. |
| Over-specified manifest | ~30 | 8 boolean flags per section (`has_code_examples`, `has_ui_components`, etc.) are never referenced downstream. Drop them. |

Recommendation: Simplify to 5 phases (Parse → Discover → Research → Enhance → Present) matching the ~500-550 line range of other workflow commands. Merge validation/judging/quality review into the enhancer prompt.

For comparison:

| Command | Lines | JSON Schemas | Validation Scripts |
|---|---|---|---|
| `workflows:plan` | 557 | 0 | 0 |
| `workflows:review` | 528 | 0 | 0 |
| `deepen-plan` v1 | 546 | 0 | 0 |
| `deepen-plan` v3 | 1,191 | 5 | 3 |

🟡 P2 — Important (Should Fix)

3. Duplicate Step 8 Numbering (lines 319-320)

8. **truncated_count is REQUIRED (default 0).**
8. **CRITICAL — YOUR RETURN MESSAGE TO PARENT MUST BE UNDER 200 CHARACTERS.**

Two items numbered "8" in the OUTPUT_RULES block. The executing agent may deprioritize one.

Fix: Renumber to 8 and 9.

4. `model: haiku` Not Supported (line 690)

", model: haiku)

Claude Code's Task() tool does not accept a model: parameter. This will either be ignored or error. No other command in the plugin uses this.

Fix: Remove the parameter or add a comment that it's aspirational.

5. Step 5a/5b Behavior Inconsistency

Step 5a says: "Re-launch missing agents before proceeding"
Step 5b silently deletes invalid files with fs.unlinkSync(fp) without re-launching

These should follow the same strategy. Currently, missing files get retried but corrupt files get silently dropped.
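
One way to make the two steps consistent is to treat missing and invalid output the same way and build a single re-launch list. The sketch below is an assumption about how that could look, not the PR's script: `agentsToRelaunch` is a hypothetical name, file contents are passed in as strings so the logic stays independent of the filesystem, and "valid" is reduced to "parses as JSON and has a `recommendations` array".

```javascript
// Return agents whose output is missing, unparseable, or structurally invalid.
function agentsToRelaunch(expectedAgents, fileContents) {
  return expectedAgents.filter((name) => {
    const raw = fileContents[name];
    if (raw == null) return true; // missing file: re-launch
    try {
      const parsed = JSON.parse(raw);
      // Invalid shape gets the same treatment as a missing file.
      return Array.isArray(parsed.recommendations) === false;
    } catch {
      return true; // corrupt JSON: re-launch instead of silently deleting
    }
  });
}

// Usage: one valid file, one corrupt file, one missing file.
const relaunch = agentsToRelaunch(
  ["security-sentinel", "performance-oracle", "architecture-strategist"],
  {
    "security-sentinel": JSON.stringify({ tools_used: ["Read"], recommendations: [] }),
    "performance-oracle": "{ truncated",
  },
);
console.log(relaunch);
```

Whether corrupt files are retried or dropped matters less than applying one rule to both cases and reporting which agents were affected.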

6. `+ SHARED_CONTEXT + OUTPUT_RULES` Pseudocode

Lines 373, 383, 393, 410, 437, 461 use string concatenation syntax:

" + SHARED_CONTEXT + OUTPUT_RULES)

This is pseudocode — Claude Code doesn't support string concatenation in Task() calls. The executing agent must mentally expand these references, which adds cognitive load and risk of omission. Consider inlining a shortened version or adding explicit expansion instructions.

🔵 P3 — Nice-to-Have

7. Token Budget Inconsistency

Line 30: "Parent context stays under ~15k tokens"
Appendix: "Total parent from agents: ~8,500-13,000"

Pick one source of truth. The appendix is more detailed and credible.

8. Missing AskUserQuestion Tool Reference

The post-action options (line 1165) don't mention using the AskUserQuestion tool, breaking the pattern used by workflows:plan and the old deepen-plan.

9. No `compound-engineering.local.md` Integration

Unlike workflows:review which reads review agents from settings, this command has a fixed agent list. Consider whether it should respect the same config.

10. Consider Command-to-Skill Conversion

Per project memory: "Commands do NOT support progressive disclosure." A 1,191-line prompt loads entirely on every invocation. If restructured as a skill with SKILL.md + references/ files for each phase's agent templates, the initial context load would be much smaller. The command becomes a thin wrapper that invokes the skill.

✅ What the PR Does Well

Core innovation is sound: File-based map-reduce (.deepen/ directory) genuinely solves context overflow
Cross-platform safety: .deepen/ instead of /tmp/, Glob instead of find, Node.js instead of Python
Eliminates v1 conflict: Old file had contradictory "run ALL agents" (Section 5) vs. selective matching (Sections 1-4). PR resolves this entirely
Version grounding: Reading lockfiles for actual versions instead of trusting plan text is a good idea
Manifest-matched agent selection: Intelligent agent matching replaces brute-force "run 40+ agents"
Frontmatter and naming conventions: Correctly preserved, follows all plugin patterns
PR description quality: Thorough alignment audit, architecture comparison table, breaking changes analysis

Recommended Path Forward

Rebase on main and fix version to 2.36.0
Simplify to ~500-550 lines by removing judge pipeline, quality review, inline scripts, and checkpoint logging
Fix the three P2 items (duplicate numbering, model:haiku, step inconsistency)
The result: the core map-reduce innovation at a complexity level consistent with the rest of the plugin

🤖 Generated with Claude Code

ultraon · 2026-02-25T13:45:24Z

This one is really good and works. It consumes significantly fewer tokens on the main agent.

Drewx-Design added 3 commits February 12, 2026 14:13

fix: delegate compound insights to /workflows:compound instead of rei…

52c60f3


Drewx-Design marked this pull request as draft February 13, 2026 15:49

Drewx-Design and others added 3 commits February 14, 2026 14:52

Drewx-Design marked this pull request as ready for review February 16, 2026 14:20

kieranklaassen reviewed Feb 24, 2026

View reviewed changes

Drewx-Design marked this pull request as draft February 26, 2026 15:05

Drewx-Design mentioned this pull request Mar 1, 2026

Rewrite workflows:review with context-managed map-reduce architecture #157

Closed

5 tasks

Drewx-Design wants to merge 6 commits into EveryInc:main from Drewx-Design:feat/deepen-plan-v3-context-managed
