fix(ce-plan): compress synthesis confirmation to prose + call-outs by tmchow · Pull Request #819 · EveryInc/compound-engineering-plugin

tmchow · 2026-05-11T05:54:08Z

Summary

The synthesis gate before research/plan-write now surfaces only the decisions worth weighing in on: a 1-3 line prose summary plus 0-3 "Call outs" instead of the full Stated/Inferred/Out audit. When the prose fully captures the scope with no forks to flag, the gate skips entirely and the agent announces it's auto-proceeding.

Before, the confirmation regularly produced 15+ bullets for a moderate plan, even when granularity rules were followed. The volume made approval feel like rubber-stamping rather than a real checkpoint.

What changed

The synthesis is now a two-stage shape:

Internal three-bucket draft (Stated / Inferred / Out of scope): the agent's comprehensive thinking surface. Dissolves into plan body sections as before (Requirements, Key Technical Decisions or ## Assumptions in headless, Scope Boundaries).
Chat-time presentation: prose summary plus 0-3 "Call outs," derived from the internal draft via a keep test.

Each candidate call-out passes an affirmability test (can the user evaluate this without reading code?) and one of four categories: real fork, non-obvious behavioral choice, non-obvious exclusion, or cheap-now-expensive-later correction. Mechanical bets and implementation-flow specifics are cut before reaching chat.
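The keep test above can be pictured as a two-part filter. This is a minimal sketch only: the rule itself lives in skill prose that the agent follows, not in code, so `Candidate`, `passes_keep_test`, `select_call_outs`, and the sample call-out texts are all hypothetical names and data introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Category(Enum):
    """The four categories named in the PR description."""
    REAL_FORK = "real fork"
    NON_OBVIOUS_BEHAVIOR = "non-obvious behavioral choice"
    NON_OBVIOUS_EXCLUSION = "non-obvious exclusion"
    CHEAP_NOW_EXPENSIVE_LATER = "cheap-now-expensive-later correction"


@dataclass
class Candidate:
    text: str
    # Affirmability: can the user evaluate this without reading code?
    affirmable: bool
    # None stands for mechanical bets and implementation-flow specifics.
    category: Optional[Category]


def passes_keep_test(candidate: Candidate) -> bool:
    """A candidate survives only if it is affirmable AND fits a category."""
    return candidate.affirmable and candidate.category is not None


def select_call_outs(candidates: List[Candidate]) -> List[Candidate]:
    """Cut everything that fails the keep test before it reaches chat."""
    return [c for c in candidates if passes_keep_test(c)]


# Hypothetical internal-draft items, invented for this example.
drafts = [
    Candidate("Sync on save vs. sync on a schedule", True, Category.REAL_FORK),
    Candidate("Name the helper file sync_utils", False, None),
    Candidate("Admin accounts are excluded from the export", True,
              Category.NON_OBVIOUS_EXCLUSION),
    Candidate("Return HTTP 409 on conflict", False,
              Category.NON_OBVIOUS_BEHAVIOR),
]

kept = select_call_outs(drafts)
print([c.text for c in kept])
```

Only the first and third items survive: the second has no category, and the fourth fits a category but cannot be evaluated without reading code.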

Design decisions

Cap is tiered with a hard 6 ceiling that triggers re-cut, not cap-raising. Lightweight tops out at 3, Standard at 4, Deep at 6. Above 6, the synthesis is misshapen. Usually 2-3 of the call-outs are sub-decisions of one larger fork, so the rule directs the agent to collapse to higher abstraction rather than raise the cap.
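As a rough illustration of the tiered cap, the sketch below encodes the three upper bounds and the hard ceiling as plain checks. The function names and the lowercase tier keys are assumptions made for this example; the PR expresses the rule as prose in the skill reference, and what the agent does when a count sits above its tier cap but at or below 6 is not spelled out here, so the sketch only reports the two conditions.

```python
# Upper bounds per plan depth, as stated in the design decision above.
TIER_CAPS = {"lightweight": 3, "standard": 4, "deep": 6}

# Above this count the synthesis is treated as misshapen, whatever the tier.
HARD_CEILING = 6


def within_tier_cap(depth: str, count: int) -> bool:
    """True when the number of call-outs fits the cap for this plan depth."""
    return count <= TIER_CAPS[depth]


def needs_recut(count: int) -> bool:
    """True when the count exceeds the hard ceiling.

    The response is to collapse sub-decisions into a higher-level fork,
    never to raise the cap.
    """
    return count > HARD_CEILING


print(within_tier_cap("standard", 4))
print(within_tier_cap("lightweight", 4))
print(needs_recut(7))
```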

Conditional skip is allowed but never silent. When zero call-outs survive, the agent emits a mandatory "Planning: [prose]. No open decisions for you to weigh in on, proceeding to [next phase]" announcement. The "why" must be visible. A default-to-keep rule on borderline call-outs keeps the failure mode bounded.
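The skip-but-announce behavior might be sketched as follows. The announcement wording is taken from the paragraph above; the function name, the `(blocking, message)` return shape, and the layout of the non-empty case are assumptions for illustration rather than the skill's actual output format.

```python
from typing import List, Tuple


def present_synthesis(prose: str, call_outs: List[str],
                      next_phase: str) -> Tuple[bool, str]:
    """Return (blocking, message) for the chat-time presentation.

    blocking is False when the gate is skipped, but a message is always
    produced, so the agent never proceeds silently.
    """
    summary = prose.rstrip(".")
    if not call_outs:
        # Zero surviving call-outs: skip the question, keep the "why" visible.
        return False, (
            f"Planning: {summary}. No open decisions for you to weigh in on, "
            f"proceeding to {next_phase}"
        )
    # Assumed layout for the gated case: prose first, then the call-outs.
    bullets = "\n".join(f"- {item}" for item in call_outs)
    return True, f"{summary}.\n\nCall outs:\n{bullets}"


blocking, message = present_synthesis("Rename the config flag.", [], "research")
print(blocking)
print(message)
```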

The affirmability test surfaces in SKILL.md, not only in the reference. The Phase 0.7 and 5.1.5 stubs carry the test inline so it loads on every invocation, even when the reference loads shallow. References can be skipped; SKILL.md is always loaded.

Soft-cut tracks call-outs by decision dimension, not surface wording. Stage 2 re-derivation after a revision can rephrase or merge call-outs, so identity needs to be the underlying fork. When a re-cut collapses multiple call-outs into one, the combined call-out inherits the "touched" status of any constituent.
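One way to picture identity-by-dimension is below. The `CallOut` fields, the `same_decision` and `collapse` helpers, and the sample dimensions are hypothetical; the sketch assumes only what the paragraph above states, namely that identity follows the underlying fork and that a merged call-out is touched if any constituent was.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CallOut:
    dimension: str        # the underlying fork; this is the identity
    wording: str          # surface text; free to change between re-derivations
    touched: bool = False


def same_decision(a: CallOut, b: CallOut) -> bool:
    """Two call-outs are the same decision when their dimension matches,
    regardless of how they are phrased."""
    return a.dimension == b.dimension


def collapse(parts: List[CallOut], dimension: str, wording: str) -> CallOut:
    """Merge several call-outs into one at a higher level of abstraction.

    The merged call-out inherits "touched" from any constituent.
    """
    return CallOut(dimension, wording, touched=any(p.touched for p in parts))


before = CallOut("sync-trigger", "Sync on save vs. on a schedule", touched=True)
after = CallOut("sync-trigger", "When should sync run?")
print(same_decision(before, after))

merged = collapse(
    [before, CallOut("sync-retry", "Retry failed syncs automatically?")],
    dimension="sync-policy",
    wording="Overall sync policy: trigger and retry behavior",
)
print(merged.touched)
```

Keying on `dimension` rather than `wording` is what keeps a rephrased call-out from looking new after a re-cut.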

Headless mode is unchanged in routing. The internal draft still dissolves into Requirements / Assumptions / Scope Boundaries. Stage 2 is moot when there's no synchronous user.
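The routing of the internal draft can be summarized as a small lookup. The bucket-to-section pairing is read off the parenthetical in "What changed" by taking the buckets and sections in order, which is an interpretation rather than something the PR states line by line; the function name and bucket keys are hypothetical.

```python
def plan_section(bucket: str, headless: bool) -> str:
    """Map an internal-draft bucket to the plan body section it dissolves into."""
    if bucket == "stated":
        return "Requirements"
    if bucket == "inferred":
        # Headless runs have no synchronous user, so inferred items are
        # recorded as assumptions instead of confirmed decisions.
        return "Assumptions" if headless else "Key Technical Decisions"
    if bucket == "out_of_scope":
        return "Scope Boundaries"
    raise ValueError(f"unknown bucket: {bucket}")


for bucket in ("stated", "inferred", "out_of_scope"):
    print(bucket, "->", plan_section(bucket, headless=True))
```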

Test plan

bun test and bun run release:validate pass.

Behavioral changes to skill prose are not exercised by automated tests. Verify by running /ce-plan on:

A trivial prompt (expect auto-proceed announcement, no gate)
A moderate prompt (expect prose plus 1-3 call-outs)
A complex prompt that previously produced the volume problem (expect compression to 2-3 call-outs, the rest dissolved into the plan body)

The synthesis gate before research/plan-write was producing too much volume for users to weigh in on: full Stated/Inferred/Out buckets with 15+ bullets even when granularity rules were followed.

Restructure into a two-stage shape:

- Stage 1 (internal): the three-bucket draft the agent uses to think comprehensively; routes into plan body sections as before
- Stage 2 (chat-time): 1-3 line prose summary plus 0-3 "Call outs", only the forks where another reasonable agent might choose differently

Add a keep test (affirmability + four categories: real fork, non-obvious behavioral choice, non-obvious exclusion, cheap-now-expensive-later) that gates each call-out. Cap is tiered by plan depth with a hard 6 ceiling; above that, re-cut at higher abstraction rather than raising the cap. When zero call-outs survive, skip the blocking question and emit a mandatory auto-proceed announcement so the agent never proceeds silently.

Tighten granularity rules with an explicit anti-pattern list for call-outs (file paths, flags, JSON shapes, HTTP codes, implementation flow) and surface the affirmability test in the SKILL.md phase stubs so it loads reliably. Soft-cut now tracks call-outs by decision dimension rather than surface wording so re-cuts don't reset revision counts.

Headless mode unchanged in routing: the internal draft still dissolves into Requirements / Assumptions / Scope Boundaries.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 63a175bc06


- Align top-level call-out count statements with the tiered cap table. The two-stage shape paragraph, the stage-2 structure list, and the synthesis-as-plan-pitch anti-pattern now all defer to the table under "How many call-outs are right?" so the agent receives a single deterministic limit (Lightweight: 0-3, Standard: 1-4, Deep: 2-6).

chatgpt-codex-connector Bot reviewed May 11, 2026


Comment thread plugins/compound-engineering/skills/ce-plan/references/synthesis-summary.md Outdated

tmchow merged commit 60c1c93 into main May 11, 2026
2 checks passed

github-actions Bot mentioned this pull request May 11, 2026

chore: release main #817

Open

tmchow merged 2 commits into main from tmchow/ce-plan-confirmation-noise
