docs: codex-pivot spec rev 2 — dual-surface (App + Action), shadow-mode phase#278
Rev 2 incorporates spike #275 findings and the project-reviewer feedback on the conceptual v1. Architectural change: pr-review retires entirely (handled by the Codex GitHub App under subscription, zero in-repo YAML); the four write-side workflows migrate to openai/codex-action with API billing; the verb router collapses entirely since the App handles @codex review / @codex address feedback natively. Adds a mandatory shadow-mode phase with written kill criteria between App enablement and Claude pr-review retirement. Quality gate becomes a thin workflow filtered on the Codex bot's review state, escaping the severity-regex trap from #271. Refs #273, #275, #277
Claude finished @cbeaulieu-gt's task in 2m 13s
PR Review Complete
Reviewed the incremental changes in this push (640e20a → 243a4ee).
Analysis
This push refines the timing model for sub-issue #M (external consumer audit) and resolves Question 3 from §13. The changes address a critical sequencing concern: decoupling audit completion from the OAuth deadline while still guaranteeing consumers receive 30 days notice before GHCR image deletion.
Changes in this push:
Findings
🟢 Medium - Potential ambiguity in "in parallel with #L" phrasing
Location:
Issue: The phrase "Survey post-release in parallel with
Current text:
- **#M — Audit external consumers pinned to `v2`.** Survey post-release in
  parallel with `#L` (GHCR-deletion clock prep). Notify each consumer...

Suggested refinement:
- **#M — Audit external consumers pinned to `v2`.** Survey post-release,
  preceding #L's 30-day grace window. Notify each consumer...

Or more explicitly:
- **#M — Audit external consumers pinned to `v2`.** Survey post-release to
  prepare for #L (GHCR deletion begins 30 days after audit completes). Notify...

Why it matters: Clear sequencing prevents misunderstandings during execution. The current phrasing could be read as "both happen around the same time," when the actual model is: #M completes → 30-day wait → #L executes.
✅ Positive: Cross-references are accurate
✅ Positive: Decoupling from OAuth deadline is sound
Summary
The incremental changes are coherent and well-cross-referenced. The timing resolution addresses a legitimate sequencing concern raised by project-reviewer. One minor ambiguity in phrasing could be clarified, but the underlying model is sound.
Verdict: APPROVE
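The sequencing model named in the review (#M completes → 30-day wait → #L executes) comes down to simple date arithmetic. The following is a minimal sketch of that reading, not the spec's implementation; the function and constant names are hypothetical, and only the 30-day figure comes from the review text.

```python
from datetime import date, timedelta
from typing import Optional

# 30-day notice period taken from the review text; the name is illustrative.
GRACE_DAYS = 30


def earliest_ghcr_deletion(audit_completed: date) -> date:
    """Earliest date #L may run: #M completion plus the grace window."""
    return audit_completed + timedelta(days=GRACE_DAYS)


def may_delete(today: date, audit_completed: Optional[date]) -> bool:
    """#L stays blocked until #M has completed and the grace window has elapsed."""
    if audit_completed is None:
        # The audit has not finished, so the 30-day clock has not started.
        return False
    return today >= earliest_ghcr_deletion(audit_completed)
```

Under this reading, the deletion clock never starts from the release date; it starts only when the audit is recorded as complete.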
…NCERN + 2 NIT)

BLOCKING:
- examples/** and docs/consumer-onboarding.md DO exist on main (false negative in planner's verification step); add to touches, correct false claim, expand sub-issue #J's scope to cover them.
- Shadow-mode duration logic rewritten: 7 days AND >=7 PRs binding floor; >=30 PRs is advisory, not blocking -- closes the deadline trap.
- codex-gate.yml YAML draft hardened: empty-SHA guard + description field + workflow-level statuses:write permission + filter to chatgpt-codex-connector[bot].

CONCERN:
- openai/codex-action @v1 pinning policy rescoped (precondition was unsatisfiable).
- AGENTS.md sub-issue #A gains concrete acceptance test.
- New sub-issue #O gates synchronize-event verification before Claude retirement.

NIT:
- S6 GHCR deletion sequencing note added.
- S9 verb-router-retirement atomicity note added.

Refs #273, #278 (this PR), #275 (spike findings unchanged).
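The gate hardening listed above (empty-SHA guard, description field, filter to `chatgpt-codex-connector[bot]`) can be sketched as a pure decision function. This is an illustration only, since the actual `codex-gate.yml` is not shown here. It assumes review objects shaped like the GitHub REST API's pull request reviews (`user.login`, `state`, `commit_id`, `submitted_at`). Two behaviors are assumptions rather than anything the spec states: matching reviews to the head SHA, and reporting `pending` when the bot has not reviewed that commit.

```python
from typing import Dict, List, Tuple

CODEX_BOT = "chatgpt-codex-connector[bot]"


def gate_state(head_sha: str, reviews: List[Dict]) -> Tuple[str, str]:
    """Return a (commit-status state, description) pair for the PR head commit."""
    if not head_sha:
        # Empty-SHA guard: refuse to post a status against an unknown commit.
        return ("error", "Head SHA is empty; re-run the gate once the PR head is known")

    # Keep only reviews from the Codex bot on the current head commit.
    # Matching on commit_id is an assumption made for this sketch.
    bot_reviews = [
        r for r in reviews
        if (r.get("user") or {}).get("login") == CODEX_BOT
        and r.get("commit_id") == head_sha
    ]
    if not bot_reviews:
        # Assumption: no bot review yet means "pending", not "success".
        return ("pending", "No Codex review on this commit yet")

    # submitted_at is an ISO 8601 UTC timestamp, so string order is time order.
    latest = max(bot_reviews, key=lambda r: r["submitted_at"])
    if latest["state"] == "CHANGES_REQUESTED":
        return ("failure", "Codex requested changes")
    return ("success", "Codex review state is " + latest["state"])
```

Because the decision reads only the review state, it needs no severity parsing of the review body.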
MAJOR:
- Shadow mode duration logic clarified (7d AND 7 PRs binding; 14d max; 30 PRs aspirational)
- #M sequencing contradiction resolved: 30d GHCR clock gated on #M completion, not v3 release

Medium:
- #O sub-issue gains explicit timing (run during #A/#B, before shadow window)

Nits:
- Empty-SHA error message includes recovery hint
- Redundant if-condition retained with explanatory comment
- §8 observation methodology specifies structured PR-by-PR logging schema
- #J acceptance test method specifies throwaway consumer-test repo
- BLOCKING-#4 verification step now shows the git ls-tree command + 2026-05-20 date

Findings from claude-action-runner[bot] review of e062774 → 07a08f0.

Refs #273, #278

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
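The duration rule in this commit message (7d AND 7 PRs binding; 14d max; 30 PRs aspirational) can be expressed as three independent checks. The sketch below assumes nothing beyond those numbers. The type and function names are hypothetical, and it does not decide what happens when the 14-day cap is reached with the floor unmet, since the commit message does not say.

```python
from dataclasses import dataclass

MIN_DAYS = 7           # binding floor on elapsed days
MIN_PRS = 7            # binding floor on observed PRs
MAX_DAYS = 14          # hard cap on the shadow window
ASPIRATIONAL_PRS = 30  # advisory target, never blocking


@dataclass
class WindowStatus:
    floor_met: bool         # both binding conditions hold
    at_max: bool            # the 14-day cap has been reached
    aspirational_met: bool  # advisory only; does not gate the decision


def shadow_window_status(days_elapsed: int, prs_observed: int) -> WindowStatus:
    """Evaluate the shadow-mode window against the stated thresholds."""
    return WindowStatus(
        floor_met=days_elapsed >= MIN_DAYS and prs_observed >= MIN_PRS,
        at_max=days_elapsed >= MAX_DAYS,
        aspirational_met=prs_observed >= ASPIRATIONAL_PRS,
    )
```

Keeping the advisory target as a separate field makes it visible in reports without letting it block the cutover decision.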
Claude hit the turn limit before finishing this review. The PR may be too large or complex for the current settings.
What happened: The review was cut off at 30 turns (PR has 1 files changed).
Options:
The fix-up in 640e20a applied Option B to §9 row 12 (conditional 30-day clock) but left two references with Option-A "before-release" framing: #M's body and §13 Q3. This commit propagates Option B to both — #M runs post-release in parallel with #L; the GHCR-deletion clock is gated on #M completion; §13 Q3 marked resolved. Refs #273, #278
Summary
Rev 2 of the Codex pivot spec, incorporating:
- … `chatgpt-codex-connector[bot]` ran `git ls-files | rg` in sandbox to validate a claim).
- … `unverified:` with resolution path.

Architectural change v1 → v2
Dual-surface architecture (was: single-surface API-billed migration):
- `pr-review` retires entirely. No replacement workflow file in repo. Codex GitHub App handles it cloud-side under subscription. Zero YAML, zero composite action, zero API billing for the review surface.
- `apply-fix`, `lint-failure`, `ci-failure`, `tag-respond` migrate to `openai/codex-action@v1.8` with `OPENAI_API_KEY`. API-billed (low-volume since failure-/comment-triggered).
- The verb router (`tag-claude/`, `claude-command-router/`, `check-auth/`) collapses entirely, since the Codex App handles `@codex review` and `@codex address that feedback` natively.

New phase: Shadow mode
Mandatory shadow-mode evaluation between App enablement and Claude `pr-review` retirement. Written kill criteria (≥7 days OR ≥30 PRs window; revert if Codex misses a BLOCKING finding Claude caught, or false-positive rate >2× Claude's, or latency >30 min on >20% of PRs). Addresses the BLOCKING-#3 transition-window finding without forcing a synthetic-stimulus decision.
Quality gate replacement
New `codex-gate.yml` filters on `chatgpt-codex-connector[bot]` review state `!= CHANGES_REQUESTED`. Replaces both `claude-pr-review/quality-gate` and its shadow variant. No severity-regex parsing, which escapes the #271 trap.
Open items
Spec carries 5 `unverified:` claims to be resolved during shadow mode or post-cutover measurement (monthly API spend, severity-to-state mapping, synchronize-event behavior, finding-density delta, external-consumer audit). 5 new gating unknowns (§13) flagged for user decision before sub-issues are filed.
Review path
Per harness rule (project-planner spec output → project-reviewer auto-fires), I'll dispatch the reviewer after this PR is up. If the reviewer returns significant findings, I'll ask the user about an inquisitor pass.
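For reference, the three revert conditions in the shadow-mode kill criteria above (a missed BLOCKING finding, false-positive rate >2× Claude's, latency >30 min on >20% of PRs) can be sketched as a single check over per-PR observations. The record schema below is hypothetical; the spec's own logging schema is not reproduced here. "False-positive rate" is taken to mean false positives per observed PR, which is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PrObservation:
    # Hypothetical per-PR record; field names are illustrative only.
    codex_missed_blocking: bool  # Claude caught a BLOCKING finding that Codex missed
    codex_false_positives: int
    claude_false_positives: int
    codex_latency_min: float


def should_revert(observations: List[PrObservation]) -> bool:
    """Return True if any of the three written kill criteria is met."""
    if not observations:
        return False

    # Criterion 1: any missed BLOCKING finding triggers a revert.
    if any(o.codex_missed_blocking for o in observations):
        return True

    # Criterion 2: both reviewers saw the same PRs, so comparing totals
    # is equivalent to comparing per-PR rates.
    codex_fp = sum(o.codex_false_positives for o in observations)
    claude_fp = sum(o.claude_false_positives for o in observations)
    if codex_fp > 2 * claude_fp:
        return True

    # Criterion 3: more than 20% of PRs took longer than 30 minutes.
    slow = sum(1 for o in observations if o.codex_latency_min > 30)
    return slow / len(observations) > 0.20
```

One consequence of this reading: if Claude records zero false positives, any Codex false positive trips the second criterion, so a real evaluation may want a minimum sample before applying it.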
Refs
Closes — (none yet; spec lands first, sub-issues file after review)
Refs #273 (epic), #275 (spike findings incorporated), #277 (research doc — companion deliverable)
🤖 Generated by Claude Code on behalf of @cbeaulieu-gt