Open
17 commits
0eff7a4
feat(ce-code-review-beta): add Codex CLI delegation for mid-tier revi…
davidalee May 4, 2026
0f54f60
fix(code-review-beta): harden Codex delegation contracts
davidalee May 4, 2026
fcfd7d1
fix(code-review-beta): secure delegated reviewer prompts
davidalee May 5, 2026
8edf815
fix(code-review-beta): address review findings
davidalee May 5, 2026
ab7a284
fix(code-review-beta): apply walk-through review fixes
davidalee May 5, 2026
9a84c7d
fix(code-review-beta): apply review-panel hardening fixes
davidalee May 5, 2026
512b7c4
refactor(code-review-beta): extract delegation content and tighten wo…
davidalee May 5, 2026
9b124c0
fix(code-review-beta): keep stable review references unchanged
davidalee May 5, 2026
8317471
fix(code-review-beta): harden delegation script portability
davidalee May 6, 2026
1f00421
fix(code-review-beta): make integrity-check-config fail closed via ex…
davidalee May 6, 2026
1016ed5
fix(code-review-beta): harden delegation workflow doc
davidalee May 6, 2026
dc8c60b
Merge branch 'main' into feat/ce-code-review-beta-codex-delegation
davidalee May 6, 2026
dc5b8d4
fix(code-review-beta): drop broken docs link in README beta row
davidalee May 6, 2026
51217e5
fix(code-review-beta): sync review-output-template with stable
davidalee May 6, 2026
0192f9d
fix(code-review-beta): harden resolve-base.sh path resolution and rem…
davidalee May 6, 2026
c09b90d
fix(code-review-beta): parse PR base host-agnostically in resolve-bas…
davidalee May 11, 2026
a6a97ec
chore(code-review-beta): resync ce-swift-ios-reviewer to post-#803 de…
davidalee May 11, 2026
1 change: 1 addition & 0 deletions plugins/compound-engineering/README.md
@@ -96,6 +96,7 @@ The primary entry points for engineering work, invoked as slash commands. Detail

| Skill | Description |
|-------|-------------|
| `ce-code-review-beta` | Same as `/ce-code-review` but delegates mid-tier persona reviewers to Codex CLI to conserve session tokens; high-stakes reviewers (correctness, security, adversarial) stay on the session model |
| [`ce-polish-beta`](../../docs/skills/ce-polish-beta.md) | Human-in-the-loop polish phase after /ce-code-review — verifies review + CI, starts a dev server from `.claude/launch.json`, generates a testable checklist, and dispatches polish sub-agents for fixes. Emits stacked-PR seeds for oversized work |
| `/lfg` | Full autonomous engineering workflow |

@@ -1,6 +1,6 @@
---
name: ce-swift-ios-reviewer
-description: Conditional code-review persona, selected when the diff touches Swift files (.swift), SwiftUI views, UIKit controllers, iOS entitlements, privacy manifests, Core Data model bundles, SPM manifests, storyboards/XIBs, or semantic build-setting/target/signing changes inside .pbxproj. Reviews Swift and iOS code for SwiftUI correctness, state management, memory safety, Swift concurrency, Core Data threading, and accessibility.
+description: Conditional code-review persona, selected when the diff touches Swift files, SwiftUI/UIKit views, iOS entitlements, privacy manifests, Core Data models, SPM manifests, storyboards/XIBs, or semantic .pbxproj changes. Reviews for SwiftUI correctness, state management, memory safety, Swift concurrency, Core Data threading, and accessibility.
model: inherit
tools: Read, Grep, Glob, Bash, Write
color: blue
@@ -0,0 +1,63 @@
# ce-code-review-beta — Beta Status

This skill is the experimental Codex-delegation lane for `ce-code-review`. It exists in parallel with stable `ce-code-review`, and the two MUST converge on a graduation/sunset decision rather than living forever as siblings.

## What "beta" means here

- `disable-model-invocation: true` — manual user-invocation only; not auto-fired by other skills.
- Only delegation behavior diverges. Non-delegation paths (scope detection, intent discovery, reviewer selection, merge/dedup, validation, synthesis, fix routing) follow stable `ce-code-review`.
- Shared references are byte-equal and enforced by `tests/review-skill-contract.test.ts` parity checks. Drift in shared files is a test failure, not a feature.
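
The byte-equal rule above can be pictured with a small comparison helper. This is a minimal sketch, not the actual contents of `tests/review-skill-contract.test.ts`: the names `FileBytes`, `bytesEqual`, and `findParityDrift` are hypothetical, and the sample bytes stand in for files that a real check would read from the two skill directories.

```typescript
// Hypothetical sketch: the real parity checks live in
// tests/review-skill-contract.test.ts and may be structured differently.
type FileBytes = Record<string, Uint8Array>; // relative path -> raw contents

// True only when both buffers have the same length and identical bytes.
function bytesEqual(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Returns the shared paths that are missing on either side
// or differ by at least one byte.
function findParityDrift(
  shared: string[],
  stableFiles: FileBytes,
  betaFiles: FileBytes
): string[] {
  return shared.filter((path) => {
    const s = stableFiles[path];
    const b = betaFiles[path];
    return s === undefined || b === undefined || !bytesEqual(s, b);
  });
}

// Illustrative contents only (assumption): two shared references,
// one identical and one drifted by a single byte.
const stableSample: FileBytes = {
  "findings-schema.json": new Uint8Array([123, 125]),
  "diff-scope.md": new Uint8Array([35, 32, 65]),
};
const betaSample: FileBytes = {
  "findings-schema.json": new Uint8Array([123, 125]),
  "diff-scope.md": new Uint8Array([35, 32, 66]),
};

const driftedPaths = findParityDrift(
  ["findings-schema.json", "diff-scope.md"],
  stableSample,
  betaSample
);
console.log(driftedPaths); // any entry here would be treated as a test failure
```

Comparing raw bytes rather than decoded text means that whitespace or line-ending changes also count as drift, which matches the "byte-equal" wording.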

## Graduation criteria

Promote delegation behavior into stable `ce-code-review` (and delete this skill) when ALL of the following hold across at least 20 manual review runs (logged via Mixed-Model Attribution in Coverage):

1. **Quality parity:** Delegated reviewers' findings are not materially worse than local-lane equivalents. Operationalize with a side-by-side run on the same PR — count P0/P1 finding overlap, false positive rate, and missed issues. Acceptable threshold: >=80% finding overlap on critical findings, no systematic miss class.
2. **Operational reliability:** <5% of delegated reviewer runs hit the circuit breaker, timeout cancellation path, or preflight failure across the sample. Cancellation is confirmed (not "unable to confirm") in >=95% of timeouts.
3. **Schema stability:** No major-version bumps to `findings-schema.json` (`_meta.schema_version`) needed during the beta period. Producers and consumers stayed in agreement.
4. **No security regressions:** No findings against the delegation lane in adversarial code review. The Self-Review Prompt Integrity Gate has tripped at least once and behaved correctly when it did.
5. **User feedback:** No outstanding open issues against `ce-code-review-beta` that block stable adoption.
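
The thresholds in the list above can be checked mechanically once the manual log is tallied. The following is a sketch under stated assumptions: the `BetaSample` shape and every field name are hypothetical (no such structure exists in the skill today), and zero timeouts is treated as satisfying the cancellation threshold.

```typescript
// Hypothetical tallies from the manual run log; every field name is illustrative.
interface BetaSample {
  manualRuns: number;
  localCriticalFindings: number; // P0/P1 findings raised by the local lane
  sharedCriticalFindings: number; // of those, also raised by the delegated lane
  systematicMissClass: boolean;
  delegatedRuns: number;
  failedDelegatedRuns: number; // circuit breaker, timeout cancellation, or preflight failure
  timeouts: number;
  confirmedCancellations: number;
  schemaMajorBumps: number;
  adversarialFindings: number;
  integrityGateTrippedCorrectly: boolean;
  blockingIssues: number;
}

// Returns the criteria that are NOT yet satisfied; an empty list means
// every criterion holds for this sample.
function unmetGraduationCriteria(s: BetaSample): string[] {
  const unmet: string[] = [];
  if (s.manualRuns < 20) unmet.push("sample size");

  // Integer cross-multiplication avoids floating-point edge cases at the thresholds.
  const overlapOk = s.sharedCriticalFindings * 100 >= s.localCriticalFindings * 80;
  if (!overlapOk || s.systematicMissClass) unmet.push("quality parity");

  const failureOk =
    s.delegatedRuns > 0 && s.failedDelegatedRuns * 100 < s.delegatedRuns * 5;
  // Assumption: with zero timeouts the >=95% cancellation threshold holds vacuously.
  const cancelOk = s.confirmedCancellations * 100 >= s.timeouts * 95;
  if (!failureOk || !cancelOk) unmet.push("operational reliability");

  if (s.schemaMajorBumps > 0) unmet.push("schema stability");
  if (s.adversarialFindings > 0 || !s.integrityGateTrippedCorrectly) {
    unmet.push("no security regressions");
  }
  if (s.blockingIssues > 0) unmet.push("user feedback");
  return unmet;
}

// Made-up numbers for illustration only.
const exampleSample: BetaSample = {
  manualRuns: 22,
  localCriticalFindings: 50,
  sharedCriticalFindings: 42,
  systematicMissClass: false,
  delegatedRuns: 120,
  failedDelegatedRuns: 5,
  timeouts: 4,
  confirmedCancellations: 4,
  schemaMajorBumps: 0,
  adversarialFindings: 0,
  integrityGateTrippedCorrectly: true,
  blockingIssues: 1,
};
console.log(unmetGraduationCriteria(exampleSample));
```

Returning the list of unmet criteria, rather than a single boolean, keeps the "ALL of the following" rule visible: graduation is blocked for as long as the list is non-empty.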

When the criteria are met, the graduation PR should:
- Move delegation logic from beta `SKILL.md` and `references/codex-delegation-workflow.md` into stable `ce-code-review/SKILL.md` (under a `delegate:codex` argument or config flag, not as a default).
- Delete `plugins/compound-engineering/skills/ce-code-review-beta/` entirely.
- Run the removal procedure below.

## Sunset criteria

Delete this skill (without graduation) when ANY of the following hold:

1. **Quality regression that cannot be closed:** After two root-cause-and-fix attempts, delegated reviewers still consistently miss >=20% of the findings that the local lane catches.
2. **Operational instability that cannot be closed:** Circuit breaker / timeout / cancellation failures persist at a rate above 5% across consecutive runs for two months despite mitigation attempts.
3. **Codex CLI behavior shift:** Upstream Codex changes (sandbox, schema, auth model) make the delegation contract untenable to maintain.
4. **No user adoption:** No one (including the maintainer) has run `ce-code-review-beta` in 60 days. A beta no one uses is dead weight.
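
Because sunset fires when ANY criterion holds, the check is a disjunction rather than a conjunction. A minimal sketch, assuming a hypothetical `SunsetInputs` record that someone fills in by hand from the manual log (none of these field names exist in the repository):

```typescript
// Hypothetical inputs; all field names are illustrative assumptions.
interface SunsetInputs {
  fixAttempts: number; // root-cause + fix attempts made so far
  localFindings: number; // findings the local lane caught
  missedByDelegated: number; // of those, missed by delegated reviewers
  monthsAboveFailureThreshold: number; // consecutive months with failures above 5%
  codexContractUntenable: boolean; // maintainer judgment call
  lastRun: Date;
  now: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days elapsed between two instants.
function daysBetween(from: Date, to: Date): number {
  return Math.floor((to.getTime() - from.getTime()) / MS_PER_DAY);
}

// Returns every sunset criterion that currently holds; a non-empty list
// means the skill should be deleted without graduation.
function sunsetReasons(i: SunsetInputs): string[] {
  const reasons: string[] = [];
  const missRateHigh =
    i.localFindings > 0 && i.missedByDelegated * 100 >= i.localFindings * 20;
  if (i.fixAttempts >= 2 && missRateHigh) reasons.push("quality regression");
  if (i.monthsAboveFailureThreshold >= 2) reasons.push("operational instability");
  if (i.codexContractUntenable) reasons.push("Codex CLI behavior shift");
  if (daysBetween(i.lastRun, i.now) >= 60) reasons.push("no user adoption");
  return reasons;
}

// Made-up example: healthy metrics, but the last run was 60 days ago.
const exampleInputs: SunsetInputs = {
  fixAttempts: 0,
  localFindings: 40,
  missedByDelegated: 3,
  monthsAboveFailureThreshold: 0,
  codexContractUntenable: false,
  lastRun: new Date(Date.UTC(2026, 0, 1)),
  now: new Date(Date.UTC(2026, 2, 2)),
};
console.log(sunsetReasons(exampleInputs));
```

The quality check only counts after two fix attempts, mirroring the "cannot be closed" qualifier: a high miss rate on its own is a bug to fix, not yet a reason to sunset.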

Sunset PR: delete the skill, run the removal procedure below, document the lessons in `docs/solutions/skill-design/codex-delegation-tradeoffs.md`.

## Telemetry

The Mixed-Model Attribution Coverage section (per `references/codex-delegation-workflow.md`) is the only structured telemetry source. It records which reviewers ran on which lane, which preflight gate fired, and any post-circuit-breaker fallback events. Aggregating this across runs requires manual log-keeping today; if delegation usage grows beyond a handful of reviewers, surface the Coverage data as machine-readable JSON in the run artifact at that time.
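
If the Coverage data is ever emitted as JSON, one possible shape is sketched below. Everything here is an assumption for illustration: the `CoverageRecord` fields, the lane labels `"local"` and `"delegated"`, the gate name, and the reviewer name `example-mid-tier` are all hypothetical, and the current workflow defines no such schema.

```typescript
// Hypothetical JSON shape for one run's Coverage data (not an existing schema).
type Lane = "local" | "delegated";

interface CoverageRecord {
  runId: string;
  reviewers: { name: string; lane: Lane }[];
  preflightGateFired: string | null; // name of the gate, or null if none fired
  fallbackEvents: { reviewer: string; reason: string }[]; // post-circuit-breaker fallbacks
}

interface CoverageSummary {
  runs: number;
  localReviewerRuns: number;
  delegatedReviewerRuns: number;
  runsWithPreflightGate: number;
  fallbackEvents: number;
}

// Aggregates per-run records into the counts that would otherwise
// have to be collected by manual log-keeping.
function summarizeCoverage(records: CoverageRecord[]): CoverageSummary {
  const summary: CoverageSummary = {
    runs: records.length,
    localReviewerRuns: 0,
    delegatedReviewerRuns: 0,
    runsWithPreflightGate: 0,
    fallbackEvents: 0,
  };
  for (const record of records) {
    for (const reviewer of record.reviewers) {
      if (reviewer.lane === "delegated") summary.delegatedReviewerRuns += 1;
      else summary.localReviewerRuns += 1;
    }
    if (record.preflightGateFired !== null) summary.runsWithPreflightGate += 1;
    summary.fallbackEvents += record.fallbackEvents.length;
  }
  return summary;
}

// Illustrative artifact content; names and values are made up.
const artifactJson = JSON.stringify([
  {
    runId: "run-001",
    reviewers: [
      { name: "correctness", lane: "local" },
      { name: "example-mid-tier", lane: "delegated" },
    ],
    preflightGateFired: null,
    fallbackEvents: [],
  },
  {
    runId: "run-002",
    reviewers: [{ name: "example-mid-tier", lane: "local" }],
    preflightGateFired: "example-gate",
    fallbackEvents: [{ reviewer: "example-mid-tier", reason: "circuit breaker open" }],
  },
]);

const parsedRecords = JSON.parse(artifactJson) as CoverageRecord[];
console.log(summarizeCoverage(parsedRecords));
```

Keeping one record per run, rather than one per reviewer, ties the preflight gate and fallback events to the run they belong to.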

## Removal procedure

When deleting this skill (graduation OR sunset):

1. Delete `plugins/compound-engineering/skills/ce-code-review-beta/` (whole directory).
2. Add `ce-code-review-beta` to `STALE_SKILL_DIRS` in `src/utils/legacy-cleanup.ts` so flat-install artifacts get swept on plugin upgrade.
3. Add the skill name to `EXTRA_LEGACY_ARTIFACTS_BY_PLUGIN["compound-engineering"]` in `src/data/plugin-legacy-artifacts.ts`.
4. Remove tests scoped to `ce-code-review-beta` from `tests/review-skill-contract.test.ts`. Keep stable-side equivalents.
5. If graduating, update stable `ce-code-review` in the same PR with the migrated delegation behavior (gated by config or argument, not default).
6. Update `plugins/compound-engineering/README.md` skill count.
7. Run `bun run release:validate` and confirm clean.
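
Steps 2 and 3 are easy to forget because they touch files outside the skill directory. The sketch below shows one way a pre-merge check could flag a missed registration. It assumes, without confirmation from the source, that `STALE_SKILL_DIRS` is a list of directory names and that `EXTRA_LEGACY_ARTIFACTS_BY_PLUGIN` maps a plugin name to a list of artifact names. The function takes both as parameters rather than importing them, and `missingLegacyRegistrations` is a hypothetical name.

```typescript
// Assumed shapes (not confirmed by the source): a list of directory names
// and a plugin-name -> artifact-names map. Both are passed in as parameters.
function missingLegacyRegistrations(
  skill: string,
  plugin: string,
  staleSkillDirs: readonly string[],
  extraLegacyArtifactsByPlugin: Readonly<Record<string, readonly string[]>>
): string[] {
  const missing: string[] = [];
  if (staleSkillDirs.indexOf(skill) === -1) {
    missing.push("STALE_SKILL_DIRS");
  }
  const artifacts = extraLegacyArtifactsByPlugin[plugin] ?? [];
  if (artifacts.indexOf(skill) === -1) {
    missing.push("EXTRA_LEGACY_ARTIFACTS_BY_PLUGIN");
  }
  return missing;
}

// Illustrative state after step 2 was done but step 3 was skipped.
// "some-older-skill" is a made-up placeholder entry.
const stillMissing = missingLegacyRegistrations(
  "ce-code-review-beta",
  "compound-engineering",
  ["some-older-skill", "ce-code-review-beta"],
  { "compound-engineering": ["some-older-skill"] }
);
console.log(stillMissing);
```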

## What does NOT diverge between stable and beta

- `findings-schema.json` (parity-enforced)
- `subagent-template.md` (parity-enforced)
- `diff-scope.md` (parity-enforced)
- `persona-catalog.md` (parity-enforced; lane column is informational in stable)
- `synthesis-rubric.md`, `architecture-patterns.md`, `walk-through-rubric.md`, `dispatch-fixers.md`, `validation-pass.md` (parity-enforced when present)
- Stage 5 merge/dedup, Stage 6 synthesis, Stage 7+ fix routing
- Headless error envelopes, mode-detection rules, finding numbering, residual-summary contract

If a future PR is tempted to drift one of these between stable and beta, the question to answer first is: "is this divergence load-bearing for delegation, or is it bit-rot?" If it's the latter, fix both sides or fix neither.