feat(ce-replan-beta): two-phase re-brainstorm + re-plan from main #785

Draft
kieranklaassen wants to merge 17 commits into main from feat/ce-replan-beta

kieranklaassen (Collaborator) commented May 6, 2026

Summary

Long-running PRs accumulate decisions that compound. Once new learnings emerge — review back-and-forth, code reading, a fresh brainstorm — the existing plan is grounded in assumptions that no longer hold, and neither recovery move available today fits cleanly: running /ce-plan from scratch loses the PR's good work, and patching the plan in place silently inherits the framing that should have been re-questioned.

/ce-replan-beta is a new beta skill for that moment. It takes a PR (auto-detects the current branch's, or accepts an explicit number) and runs a two-phase re-brainstorm → re-plan flow. Phase one forks the original *-requirements.md into a new dated revision with R-IDs carried forward stably; phase two derives a fresh full-redo plan from main. Original PR, plan, and brainstorm are preserved untouched — the new artifacts supersede them by reference. The skill performs no Git operations.

Why a beta and what changed mid-flight

The skill was designed, shipped, dogfooded, and rewritten in one session. v1 (commits 916d80f → 3f6b835) implemented a single-doc delta-replan against the existing PR's tree. The first real run on cora PR #2382 surfaced that the user's verb maps to "redo from main with the latest understanding folded in," not "adjust forward from where the PR is." The user's correction ("no it replan so start from main no base don we redo the full plan") drove a v2 redesign.

v2 (commits 4722057 → 3b0ab86) is in this same PR. Two phases instead of six, two artifacts instead of one combined doc, R-IDs as the load-bearing anchor across revisions. The chain of supersedes: links makes the loop walkable: original brainstorm ← v2 (forked) brainstorm; original plan ← v2 plan. The v1 docs are kept as historical context with superseded pointers.
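A minimal sketch of walking that chain, assuming each doc carries a single `supersedes: <path>` line in its frontmatter. The helper name `walk_supersedes` and the exact frontmatter layout are assumptions for illustration, not taken from the skill itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: follow `supersedes:` pointers from a doc back to the root.
# Assumes one `supersedes: <path>` line per doc, with paths resolvable from the
# current directory; the real frontmatter layout may differ.
walk_supersedes() {
  local doc="$1" next hops=0
  while [ -n "$doc" ] && [ -f "$doc" ] && [ "$hops" -lt 50 ]; do
    printf '%s\n' "$doc"
    # Take the first supersedes: line and strip the key plus leading whitespace.
    next="$(sed -n 's/^supersedes:[[:space:]]*//p' "$doc" | head -n 1)"
    doc="$next"
    hops=$((hops + 1))   # hop limit guards against accidental cycles
  done
}
```

Calling `walk_supersedes` on the newest brainstorm prints one path per line, newest first, and stops at the first doc that has no `supersedes:` entry.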

Why two phases (and not one combined doc)

Because the brainstorm/plan separation in the rest of this plugin (ce-brainstorm → ce-plan) exists specifically to keep WHAT and HOW from collapsing into the same surface. Conflating them in a replan produces v1's failure: a doc that annotated requirements inside the plan, leaving the original brainstorm to rot. v2 refreshes the requirements layer first, then derives the plan from it. R-IDs carry stably across the fork: R1 stays R1 across revisions, revised requirements keep their ID with new wording, and discards leave gaps. That stability is what makes the compounding loop work.

What ships

| Component | Role |
| --- | --- |
| SKILL.md | Phased workflow: mode detection, discovery via four scripts, re-brainstorm (Phase 2a/3a/4a), re-plan (Phase 2b/3b/4b), handoff with branch base committed |
| references/rebrainstorm-workflow.md | Four-step re-derivation pattern at the requirements scope, R-ID stability rule, anti-patterns, brief-view worked example, legacy fallback for brainstorms without R-IDs |
| references/rebrainstorm-template.md | Forked brainstorm output template: supersedes: + revision: frontmatter, [unchanged from rev N] / [revised from rev N] markers, ## Discarded Requirements section preserving R-ID gaps |
| references/replan-template.md | New plan output template: states the always-from-main rule prominently, drops the per-requirement annotation block (now in the brainstorm), retains Cherry-Pick Guidance / Discarded Approaches / Supersedes / New Learnings sections |
| scripts/detect-pr.sh | Auto-detect or explicit PR; exit code 2 sentinel for the no-PR routing |
| scripts/fetch-pr-context.sh | GraphQL bundle of PR metadata, threads, reviews, comments, commits. v2 adds // [] defaults so missing keys produce structured empty arrays (the cora run silently absorbed three jq exit-5s) |
| scripts/find-original-plan.sh | Scores docs/plans/ filenames by branch-name fragment match; prefers explicit PR-body links |
| scripts/find-original-brainstorm.sh | New in v2. Prefers the plan's origin: frontmatter; falls back to topic-fragment scoring against docs/brainstorms/ |

Key decisions

  • Always full redo, no shape question. The verb "replan" maps to "redo from main" in user mental models. Adding a "delta or full?" question would let the v1 failure mode resurface.
  • Two phases, sequential synthesis. Mirrors the ce-brainstorm → ce-plan separation. Conflating WHAT and HOW into a single confirmation is what v1 got wrong.
  • Fork the brainstorm, never mutate. Symmetric with how the skill treats the original PR and original plan. Each fork is dated and links back via supersedes:; the chain itself is the audit trail.
  • R-ID stability is load-bearing. Original IDs preserved across the fork. Revisions keep their ID. Discards leave gaps. New IDs continue from max+1. No renumbering, ever.
  • Skip the brainstorm phase when there's nothing to fork. When a plan came from scratch (no origin:), Phase 2a/3a/4a are skipped and only the new plan is written. The skill doesn't synthesize a brainstorm out of thin air.
  • Branch base committed in handoff. The handoff passes main to ce-work as the branch base, so ce-work doesn't re-ask the user; that re-ask was the friction point in the cora session.
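
The R-ID rule (new IDs continue from max+1, gaps are never refilled) can be sketched as below. The function name `next_r_id` is hypothetical, and the sketch assumes IDs have the form `R<digits>`. The caller has to pass every ID ever issued, discarded ones included, or a discarded maximum would be reused.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given every R-ID seen across ALL revisions (active and
# discarded), print the next free ID. Gaps left by discards are never refilled.
next_r_id() {
  local id n max=0
  for id in "$@"; do
    n="${id#R}"                 # strip the leading "R"
    case "$n" in
      ''|*[!0-9]*) continue ;;  # ignore anything that is not R<digits>
    esac
    n=$((10#$n))                # force base 10 so "R08" is not read as octal
    if [ "$n" -gt "$max" ]; then max="$n"; fi
  done
  printf 'R%d\n' $((max + 1))
}
```

For a brainstorm that issued R1 through R5 and later discarded R3, `next_r_id R1 R2 R3 R4 R5` yields R6, and R3 stays retired.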

Inputs


Compound Engineering
Claude Code

kieranklaassen and others added 9 commits May 4, 2026 15:48
… dev server

Upload changes:
- Add R2 (Cloudflare R2) as a permanent upload destination using AWS CLI
- In headless/background mode, auto-upload to R2 when env vars are set
  (R2_ACCESS_KEY_ID, R2_SECRET_ACCESS_KEY, R2_BUCKET, R2_ENDPOINT, R2_PUBLIC_URL)
  without any user confirmation step
- In interactive mode, offer R2 as the first option in the destination menu
- Fall back to catbox if R2 upload fails

Browser reel tier:
- In headless mode, auto-start the dev server in background and poll for
  readiness (30s timeout) instead of asking the user to start it
- Track server PID for cleanup in Step 4

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Captures requirements and implementation plan for ce-replan-beta,
a PR-anchored replanning skill that re-grounds at the brainstorm
tier rather than patching existing plans in place.

Creates the skill directory and SKILL.md with the beta-skills
framework frontmatter (name, [BETA] description, disable-model-
invocation, argument-hint) plus the phase outline. Phase content
is wired in U5 once references and scripts exist.

- detect-pr.sh: targets explicit PR or auto-detects current branch's,
  emits PR metadata JSON; exit code 2 sentinel for no-PR routing.
- fetch-pr-context.sh: GraphQL bundle of PR metadata, review threads,
  review bodies, top-level comments, and commits; minimal filtering so
  the agent decides what's signal.
- find-original-plan.sh: scores docs/plans/ filenames by branch-name
  fragment matches; prefers explicit PR-body link when given. Emits
  the top candidate so the synthesis checkpoint can confirm/correct.

Documents the four-step pattern: read artifacts (PR first, plan
last), re-derive the problem frame from user discussion language,
mark every original requirement [unchanged]/[revise]/[discard], and
compose a three-bucket synthesis. Includes anti-pattern callouts
(diff-against-plan, critique-mode, brainstorm-from-zero) and a
worked example based on the brief-view scenario.

Defines the single canonical output template: frontmatter (with
original_pr/original_plan/supersedes), summary, re-grounded problem
frame, requirements with [unchanged]/[revise]/[discard] annotations,
discarded approaches, cherry-pick guidance table, supersedes block
with diff narrative, new learnings inventory, then standard ce-plan
body sections, then a suggested fresh branch name. Filename uses
the beta-skills framework's -beta-plan.md suffix to avoid colliding
with stable ce-plan output.

Adds the full phase content: mode detection (PR# / blank / no-PR
routing), discovery via the three scripts, re-grounding (loads
references/regrounding-workflow.md), synthesis checkpoint, doc
write (loads references/doc-template.md), and handoff menu with
inline routing. Documents pipeline-mode behavior — skip synthesis
prompt, route Inferred bets to ## Assumptions, skip handoff menu.
Cross-platform interaction via AskUserQuestion plus equivalents.

Surfaces ce-replan-beta in the Beta / Experimental table next to
ce-polish-beta. release:validate updates plugin counts automatically
during release-please runs; no manual count edits needed.

ce-code-review autofix pass (run 20260506-075859-f3ee7ae0):

- find-original-plan.sh: '-' moved to the front of the tr SET so it is
  literal. The set was previously written as '/-_+.', so tr interpreted
  the '-' between '/' and '_' as a range, sweeping in uppercase letters
  and digits.
  Branches like JIRA-123-bug had their fragments silently destroyed by
  the unintended range translation; lowercase-only branch names happened
  to survive. Verified the fix produces correct fragment splits for both
  cases.
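
The range pitfall described above can be reproduced in isolation. This is a sketch, not the script's actual code: the function names are hypothetical, mapping separators to spaces is an assumed way of splitting, and the `--` guard in the fixed form is an assumption about how a set starting with '-' is kept from being parsed as an option.

```shell
#!/usr/bin/env bash
# Split a branch name into fragments by mapping separators to spaces.
# LC_ALL=C keeps range semantics byte-based and predictable.

# Buggy form: '/-_' is a range from '/' (0x2F) to '_' (0x5F). It covers the
# digits and uppercase letters, but not '-' (0x2D) itself.
split_buggy() {
  printf '%s' "$1" | LC_ALL=C tr '/-_+.' ' '
}

# Fixed form: '-' comes first in the set, so it is a literal hyphen.
# The `--` ends option parsing so the leading '-' is not read as a flag.
split_fixed() {
  printf '%s' "$1" | LC_ALL=C tr -- '-/_+.' '     '
}
```

With the fixed set, `split_fixed 'JIRA-123-bug'` gives the three fragments JIRA, 123, and bug. With the buggy set, the uppercase letters and digits are the characters that get translated while the hyphens pass through untouched, which is why lowercase-only branch names appeared to work.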
@chatgpt-codex-connector

💡 Codex Review

1. **Upload to R2 (public URL)** -- upload to Cloudflare R2 for permanent PR embedding (available when R2 env vars are set)
2. **Upload to catbox (public URL)** -- promote to catbox permanent hosting for PR embedding
3. **Save locally** -- save to a stable OS-temp path (/tmp/compound-engineering/ce-demo-reel/)
4. **Recapture** -- provide instructions on what to change
5. **Proceed without evidence** -- set evidence to null and proceed

P1: Reduce destination menu to four blocking-tool options

This menu now defines five choices when R2 is configured, but upload-and-approval.md says to use blocking question tools first; those tools cap options at four in our target platforms, so this path can fail to render or execute the question instead of letting the user choose an upload destination. This is also explicitly disallowed by the plugins/compound-engineering/AGENTS.md rule for interactive question design (max 4 options), so the new R2 branch should be merged/split to stay within four options or switched to the documented numbered-chat overflow fallback.


if ! gh pr view "$PR_NUMBER" --json "$PR_FIELDS" "${REPO_FLAG[@]}"; then
  exit 1
fi

P2: Map explicit missing PRs to the no-PR sentinel

The explicit PR path always exits with status 1 on gh pr view failure, while the skill contract treats “explicit PR number does not exist” as the same no-PR flow that should redirect users without writing a plan. As written, a typoed or deleted PR number aborts as a hard CLI error instead of taking the intended no-PR route, so this path should classify not-found responses and return exit code 2 just like auto-detect does.
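
One way the suggested classification could look, as a sketch only. Both function names are hypothetical, and the matched message text is an assumption about how gh reports a missing PR; it should be checked against the installed gh version before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical script-side helper: choose an exit status from gh's stderr text.
# ASSUMPTION: a missing PR produces a message containing
# "Could not resolve to a PullRequest". Verify against your gh version.
status_for_gh_failure() {
  case "$1" in
    *"Could not resolve to a PullRequest"*) echo 2 ;;  # no-PR sentinel
    *) echo 1 ;;                                       # genuine hard error
  esac
}

# Hypothetical caller-side routing on the script's exit status.
route_for_status() {
  case "$1" in
    0) echo "continue" ;;
    2) echo "no-pr-route" ;;
    *) echo "hard-error" ;;
  esac
}
```

Keeping 2 reserved for "no PR" lets the caller tell a typoed PR number apart from an authentication or network failure without parsing output.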

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Rewrites the brainstorm and plan from scratch as v2 docs after the
first real run on cora PR #2382 surfaced that v1's delta-shape didn't
match the user's verb. v2 is a two-phase re-brainstorm + re-plan flow
that always produces a from-main plan; original artifacts are
preserved by reference.

- Add v2 brainstorm at docs/brainstorms/2026-05-06-ce-replan-skill-rebrainstorm-requirements.md
  (revision: 2, supersedes the v1 brainstorm). 19 requirements
  organized around the two-phase shape, R-ID stability rule,
  fork-not-mutate discipline, and from-main baseline.
- Add v2 plan at docs/plans/2026-05-06-002-replan-ce-replan-beta-beta-plan.md
  (type: replan, original_pr: 785, supersedes the v1 plan). 7
  implementation units; cherry-pick guidance for v1 code that
  survives.
- Mark v1 brainstorm and v1 plan as superseded with pointers to the
  v2 docs. v1 docs retain their v2-Shape audit trail as historical
  context for the design transition.

The user's correction that drove the rewrite: "no it replan so start
from main no base don we redo the full plan" (cora session
10b929fb-c03f-4daf-b675-32c00ac44b43, 2026-05-06).

Updates the skill description to reflect the two-phase
re-brainstorm + re-plan output. Adds allowed-tools frontmatter
pinned per script filename so the runtime Bash invocations in the
forthcoming Phase 1 (U6) skip the per-call permission prompt.
Rewrites the intro paragraph; phase content is intentionally
empty here and lands in U6.

fetch-pr-context.sh: add // [] defaults on every collection
traversal so missing keys produce structured empty arrays rather
than triggering jq exit code 5 (which the cora run silently
absorbed three times). Same defense for nullable .author.login on
deleted accounts.
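
A small sketch of the `// []` defense, assuming jq is installed. The JSON shape, field names, and function names here are illustrative assumptions, not the actual GraphQL response handled by fetch-pr-context.sh.

```shell
#!/usr/bin/env bash
# Illustrative only: `.data.comments.nodes` is an assumed shape.
# Both functions read JSON on stdin.

# Without a default, iterating a missing collection makes jq fail
# ("Cannot iterate over null") with exit status 5.
comments_unsafe() {
  jq -c '[.data.comments.nodes[] | .body]'
}

# With `// []`, a missing or null collection becomes an empty array,
# so the filter succeeds and emits [].
comments_safe() {
  jq -c '[(.data.comments.nodes // [])[] | .body]'
}
```

Feeding `{"data":{}}` to `comments_safe` produces an empty array, while the same input makes `comments_unsafe` exit non-zero.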

find-original-brainstorm.sh: new script. Given a plan path, prefer
the plan frontmatter's origin: field; fall back to scoring
docs/brainstorms/*-requirements.md by topic-fragment match against
the plan's filename topic. Same scoring shape as
find-original-plan.sh (with '-' first in the tr SET so it is
literal, not a range).
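
The fragment-match scoring shared by both finder scripts might look roughly like this. It is a sketch under assumptions: the function name `score_candidate` is hypothetical, and the real scripts' weighting and tie-breaking are not described in this PR and may differ.

```shell
#!/usr/bin/env bash
# Hypothetical scoring sketch: count how many branch-name fragments appear
# as substrings of a candidate filename.
score_candidate() {
  local branch="$1" file="$2" frag score=0
  # Split on separators; '-' is first in the set so it stays literal, and
  # `--` keeps the set from being parsed as an option.
  for frag in $(printf '%s' "$branch" | LC_ALL=C tr -- '-/_+.' '     '); do
    case "$file" in
      *"$frag"*) score=$((score + 1)) ;;
    esac
  done
  printf '%d\n' "$score"
}
```

For the branch feat/ce-replan-beta, a plan named docs/plans/2026-05-06-002-replan-ce-replan-beta-beta-plan.md scores 3 (ce, replan, beta), so it would outrank an unrelated plan that matches no fragment.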
Replaces v1's regrounding-workflow.md (deleted) with a re-brainstorm
reference at the requirements scope. Documents the four-step pattern
(read artifacts in order with brainstorm last; re-derive problem
frame from user discussion language; walk every requirement assigning
[unchanged]/[revised]/[discarded] with R-ID stability; compose
three-bucket synthesis), the load-bearing R-ID stability rule, the
anti-patterns (diff-against-original, framing inheritance, critique
mode, brainstorm-from-zero, renumbering), the legacy fallback for
brainstorms without R-IDs, and a worked example based on the cora
brief-view scenario.

124 lines.

Loaded by Phase 4a. Documents the filename pattern
(<topic>-rebrainstorm-requirements.md), frontmatter contract
(supersedes + revision), full section order, and the new
## Discarded Requirements section unique to forks.

Discipline checks: every original R-ID must be accounted for
(active list with [unchanged]/[revised] marker, or discarded
section with [discarded] marker — never silent drop). New R-IDs
continue from max+1; no ID reuse. Frontmatter supersedes: names
the immediately-prior brainstorm; chain walks to root.
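
The "never silent drop" check can be sketched as a set difference. The function name `unaccounted_r_ids` and the space-separated argument format are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical check: every original R-ID must appear in the fork as either
# active or discarded. Each argument is a space-separated list of IDs.
# Prints the IDs that were silently dropped, one per line.
unaccounted_r_ids() {
  local original="$1" active="$2" discarded="$3" id
  for id in $original; do
    # Padding with spaces makes the match whole-word, so R1 never matches R10.
    case " $active $discarded " in
      *" $id "*) ;;                 # accounted for
      *) printf '%s\n' "$id" ;;     # missing from both lists
    esac
  done
}
```

An empty result means the fork accounts for every original requirement. For originals R1 to R4 with R1 and R2 active and R4 discarded, the check reports R3.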
Replaces v1's doc-template.md (deleted) with replan-template.md
that:

- Drops the per-requirement [unchanged]/[revised]/[discarded]
  block (now in the forked brainstorm; plan only references
  R-IDs from origin:).
- Hoists the always-from-main rule to the top so the load-bearing
  discipline is unmissable. Plan units' Files:, Approach:, and
  Test scenarios: are written for the main baseline; code on the
  original PR's branch must be named in Cherry-Pick Guidance.
- Adds explicit cherry-pick reference idiom: "Reuses path/... from
  original PR commit <sha> — see Cherry-Pick Guidance."
- Documents type: replan in frontmatter (distinguishes from
  feat/fix/refactor for downstream tooling) plus original_pr
  and the origin: pointer to the forked brainstorm.

Discipline checks: no silent assumption that PR-branch code is on
main; Discarded Approaches each name a specific learning, not
generic rationale.

Replaces v1's six-phase delta-shaped body with the v2 two-phase
flow:

- Phase 0: mode detection (PR# / blank / no-PR routing).
- Phase 1: discovery via four scripts + plan-merge-state probe.
  Uses ${CLAUDE_SKILL_DIR}/scripts/X.sh paths so the runtime Bash
  tool resolves them correctly (bare relative paths fail
  empirically per AGENTS.md). The allowed-tools frontmatter pins
  per-script entries so users skip the per-call permission prompt.
- Phase 2a/3a/4a: re-brainstorm phase, loaded from
  references/rebrainstorm-workflow.md and written via
  references/rebrainstorm-template.md. Skipped entirely when no
  upstream brainstorm exists (R3 from the brainstorm) — re-plan
  runs against the original plan + PR + learnings instead.
- Phase 2b/3b/4b: re-plan phase, loaded via
  references/replan-template.md. Always-from-main baseline; plan
  units cite R-IDs from origin:.
- Phase 5: handoff with explicit hard rule against unsolicited
  "what is not in this plan" summaries (R19 from the brainstorm,
  matching the cora-session friction). Branch base committed to
  main; ce-work invocation passes both the plan path and the
  branch base so ce-work doesn't re-ask the user.

Synthesis checkpoints stay prose, not menus — option sets bias
the answer at the synthesis layer. Pipeline mode skips both
synthesis prompts and the handoff menu; Inferred bets land in
## Assumptions in each artifact.

Updates the Beta Skills row to reflect the v2 two-phase shape
(re-brainstorm + re-plan, R-IDs carried forward, full redo from
main) and clarifies that all three original artifacts (PR, plan,
brainstorm) are preserved as superseded by reference.

bun test green (1340 pass / 0 fail). bun run release:validate clean.

kieranklaassen changed the title from "feat(ce-replan-beta): replan from PR with brainstorm-tier re-grounding" to "feat(ce-replan-beta): two-phase re-brainstorm + re-plan from main" May 6, 2026
@chatgpt-codex-connector

💡 Codex Review


P1: Fail PR-context fetch when upstream API call fails

The script uses gh api graphql ... | jq ... under set -e only, so a failure in gh api can be masked by the pipeline and still exit successfully if jq returns 0. In that case the skill continues with empty/invalid PR context instead of taking the documented hard-error path, which can lead to replans grounded on missing review/commit data. Add set -o pipefail (or explicit status checks) so upstream API failures stop execution.
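
The masking effect is easy to demonstrate without touching the network. In this sketch `false` stands in for a failing `gh api graphql` call and `cat` for a jq filter that still exits 0; the function name `pipeline_result` is hypothetical.

```shell
#!/usr/bin/env bash
# $1 is "on" or "off". Runs in a subshell so the option does not leak
# into the caller's shell.
pipeline_result() {
  (
    if [ "$1" = "on" ]; then set -o pipefail; else set +o pipefail; fi
    # `false` fails, `cat` succeeds. Without pipefail the pipeline's status
    # is cat's status, so the upstream failure is hidden.
    if false | cat; then echo "success"; else echo "failure"; fi
  )
}
```

`pipeline_result off` reports success even though the first command failed, while `pipeline_result on` reports failure. Under `set -e`, only the second behavior stops the script.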


5. **Proceed without evidence** -- set evidence to null and proceed

P1: Keep destination question within blocking-tool option cap

This menu now defines five choices, but the referenced blocking question tools (AskUserQuestion/request_user_input) only support up to four options per question. When R2 is available, the agent is instructed to use the blocking tool first and will hit a schema/runtime error instead of presenting the choice flow, breaking interactive upload routing. Split this into sequential questions or collapse options so each prompt stays within the 4-option limit.

kieranklaassen marked this pull request as draft May 7, 2026 05:06