You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Enhancement: ce-deep-review-beta — verified cross-model deep review of high-stakes plans
Submitting this the way your contributions policy asks for — as an issue + an illustrative reference PR (#858), not a merge request. No need to approve the fork CI or merge anything on my account; flagging it in case it's useful for you (or your Claude/Codex review) to pick up, re-implement, or ignore.
What it is
A turnkey skill that runs a high-stakes plan through the Claude ce-doc-review panel, then — after one consent gate — fans the plan across non-Claude reviewer CLIs for decorrelated findings, verdict-tags each cross-model finding against the plan with a deterministic quote-grep backstop, and writes a reconciled verified <plan>.deep-review.md sidecar.
Pipeline:
Phase 0 — detect available arms (codex + agy; offline detection, no API calls, no secret leakage).
Phase 1 — Claude ce-doc-review panel (no egress); fail-stops if the panel didn't complete.
Phase 2 — single consent gate: gitleaks content preview + per-vendor opt-in whose option labels carry the egress verb (load-bearing — see below).
Phase 3 — dispatch only the consented arms across the same six lenses the panel uses, parallel across models; a deselected vendor is never sent the plan.
Phase 3.5 — deterministic quote-grep backstop assigns each finding CONFIRMED / NOT-FOUND-IN-DOC / NEEDS-HUMAN. It's authoritative and model-blind (the verdict never sees the producing model), so a model verifier can't inherit the confabulation it's meant to catch. CONFIRMED certifies the quoted evidence exists — not that the finding is correct.
Phase 4 — reconcile into the verified sidecar (data-loss-safe rotation; decision-changing union of panel + CONFIRMED cross-model findings).
Why it might interest you
It's a productized version of the "fan a plan across models and keep only the decorrelated, grounded findings" pattern, with the verifier-contamination failure mode designed out (deterministic backstop, not an LLM judge).
One thing the build surfaced that may be generally useful regardless of this skill: Claude Code's auto-mode permission classifier is consent-scope-keyed, not path-keyed. A cross-model egress dispatch is blocked even with allowed-tools set, unless the in-conversation consent is legible to the classifier — which is why the consent-gate option labels carry the egress verb + vendor (Send the plan to codex (OpenAI)) rather than a bare model name. Decision record is in the branch under docs/solutions/skill-design/.
Reference implementation
Branch / PR: feat(ce-deep-review): verified cross-model deep-review skill (RU1–RU6) + the eval harness behind it #858 (also includes the cross-model evaluation harness the skill was built on — a decision tool that measures whether cross-model review is worth shipping before building it; the harness's own finding on this question was inconclusive/underpowered, so it's offered as methodology, not a build recommendation).
It's a beta skill (-beta suffix + disable-model-invocation: true) — opt-in, never auto-fires.
Green locally (bun test, release:validate in sync); the automated Codex review on the PR is fully resolved.
Happy to split this into smaller issues, convert any part to a plain writeup, or close it if an issue isn't the surface you want for this. Thanks for making the plugin — it's what this was built on.
Enhancement:
ce-deep-review-beta— verified cross-model deep review of high-stakes plansSubmitting this the way your contributions policy asks for — as an issue + an illustrative reference PR (#858), not a merge request. No need to approve the fork CI or merge anything on my account; flagging it in case it's useful for you (or your Claude/Codex review) to pick up, re-implement, or ignore.
What it is
A turnkey skill that runs a high-stakes plan through the Claude
ce-doc-reviewpanel, then — after one consent gate — fans the plan across non-Claude reviewer CLIs for decorrelated findings, verdict-tags each cross-model finding against the plan with a deterministic quote-grep backstop, and writes a reconciled verified<plan>.deep-review.mdsidecar.Pipeline:
codex+agy; offline detection, no API calls, no secret leakage).ce-doc-reviewpanel (no egress); fail-stops if the panel didn't complete.Why it might interest you
allowed-toolsset, unless the in-conversation consent is legible to the classifier — which is why the consent-gate option labels carry the egress verb + vendor (Send the plan to codex (OpenAI)) rather than a bare model name. Decision record is in the branch underdocs/solutions/skill-design/.Reference implementation
-betasuffix +disable-model-invocation: true) — opt-in, never auto-fires.bun test,release:validatein sync); the automated Codex review on the PR is fully resolved.Happy to split this into smaller issues, convert any part to a plain writeup, or close it if an issue isn't the surface you want for this. Thanks for making the plugin — it's what this was built on.