Parent epic: #273 · Gating sub-issue under #294 · Hard gate on #A (#279)
Why this exists
Issue #294 surfaced a spec gap: with the Codex pivot, the central-prompt property of the Claude model is lost unless Codex offers an equivalent mechanism. Research narrowed the most promising candidate to Codex Cloud environment + setup script that fetches a canonical AGENTS.md from glitchwerks/github-actions. That path is currently rated "plausible, with third-party supporting evidence" (see #294 correction comment) — the chain "setup script writes file → Codex sees it during review" is inferred from openai/codex#20093, not empirically verified.
This issue is the empirical test that turns "plausible" into "verified" (or rules it out).
Hypothesis
If a Codex Cloud environment is configured for a repo with a setup script that writes a file (or curls a remote file) into the workspace as ./AGENTS.md before review fires, then @codex review on a PR in that repo will follow the guidance in that file.
Experiment design
Setup
- Throwaway test repo: create `cbeaulieu-gt/codex-pivot-env-test` (or reuse an existing throwaway). Plain repo, minimal scaffolding (one source file in whatever language is convenient).
- Install Codex GitHub App on the test repo.
- Configure a Codex Cloud environment for the test repo. In its setup-script field, paste a script that writes a sentinel `AGENTS.md` to the workspace:
cat > ./AGENTS.md <<'EOF'
# AGENTS.md
## Review guidelines
Flag any function named exactly `sentinel_canary_xyz` as a **P0 issue** with the exact text:
"CANARY DETECTED — env-script chain works".
Apply this rule to every changed file in the diff.
EOF
The sentinel is chosen so that it is unmistakable in Codex's review output and unlikely to be flagged by Codex's defaults.
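If the review later fails to flag the sentinel, it helps to know whether the file actually existed once setup finished. The following is a minimal sketch of a check that could be appended to the setup script after the heredoc. It assumes that setup-script output is visible somewhere in the Codex Cloud environment logs, which this issue has not confirmed; `verify_agents_md` is a hypothetical helper name, not part of Codex.

```shell
# Hypothetical helper: confirm the sentinel file exists and carries the rule.
verify_agents_md() {
  # $1 = path to check; defaults to ./AGENTS.md as written by the heredoc.
  verify_path=${1:-./AGENTS.md}
  if [ ! -s "$verify_path" ]; then
    echo "setup-check: $verify_path missing or empty in $(pwd)" >&2
    return 1
  fi
  if ! grep -qF 'sentinel_canary_xyz' "$verify_path"; then
    echo "setup-check: $verify_path has no sentinel rule" >&2
    return 1
  fi
  # Record where the file landed so a failed review can be traced later.
  echo "setup-check: $verify_path ok in $(pwd), $(wc -c < "$verify_path") bytes"
}
```

Calling `verify_agents_md` as the last line of the setup script would separate "script never ran" from "script ran but the reviewer did not read the file", provided the log line can be found.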
Test PR
Open a PR on the test repo that adds a single function literally named sentinel_canary_xyz. The function body is irrelevant.
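One way to prepare that PR from a local clone is sketched below. It assumes `git` and the GitHub CLI (`gh`) are installed and authenticated; the file name `canary.py`, the branch name `canary-test`, the PR title, and both function names are illustrative choices, not requirements of the experiment.

```shell
# Hypothetical helper: write a source file containing only the sentinel function.
make_canary_file() {
  # $1 = path of the new source file (illustrative default).
  canary_path=${1:-canary.py}
  cat > "$canary_path" <<'EOF'
def sentinel_canary_xyz():
    """Body is irrelevant; only the name matters for the experiment."""
    return None
EOF
}

# Hypothetical helper: branch, commit, push, and open the PR.
# Run from inside a clone of the test repo.
open_canary_pr() {
  git switch -c canary-test &&
  make_canary_file canary.py &&
  git add canary.py &&
  git commit -m "Add sentinel_canary_xyz for env-script experiment" &&
  git push -u origin canary-test &&
  gh pr create --title "Canary: env-script chain test" \
    --body "Adds sentinel_canary_xyz to exercise the AGENTS.md sentinel rule."
}
```

Neither function runs on its own; `open_canary_pr` would be invoked once, by hand, after the Codex Cloud environment is configured.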
Trigger and observation
- Wait for auto-review (if enabled) or invoke `@codex review` explicitly.
- Inspect Codex's review post.
Outcome decision matrix
| Codex output | Conclusion | Next action |
|---|---|---|
| Review flags `sentinel_canary_xyz` as P0 with the canary text | Env-script chain is verified. Setup script ran, wrote AGENTS.md to a workspace the reviewer reads, and reviewer followed the guidance. | Commit to the env-script path as the primary centralization mechanism. Update #294 status to "✅ Verified." Reshape #A (#279) to author canonical AGENTS.md + document consumer-side env setup-script template. |
| Review flags it with different wording but cites a P0 rule | Chain works; Codex paraphrases. Still a verified positive. | Same as above; note the paraphrase behavior. |
| Review does NOT flag `sentinel_canary_xyz` at all | Chain does not work. Either setup script doesn't run on the App-review path, or it runs but writes to a different workspace, or Codex doesn't re-resolve AGENTS.md after the setup phase. | Fall back to sync-workflow hybrid (option 1 in #294). File a follow-up to investigate which sub-step failed (env-side debug: did the script run? did the file exist post-setup?). |
| Review flags `sentinel_canary_xyz` but cites Codex's default rules, not the canary text | Ambiguous. Codex may have flagged it for an unrelated reason. | Re-run with a more distinctive sentinel (e.g., flag-text contains a UUID). |
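The matrix above can be approximated by a small text classifier as a first pass before a human reads the review. This is a rough sketch only: the outcome labels and the function name `classify_review` are hypothetical, and treating any mention of "P0" as evidence of the paraphrase row is a heuristic assumption that a careful reading of the review should override.

```shell
# Hypothetical helper: read review text on stdin, print one outcome label.
classify_review() {
  review_text=$(cat)
  if ! printf '%s\n' "$review_text" | grep -qF 'sentinel_canary_xyz'; then
    # Row 3: the sentinel function is never mentioned.
    echo "not-flagged"
  elif printf '%s\n' "$review_text" | grep -qF 'CANARY DETECTED'; then
    # Row 1: the canary text from AGENTS.md appears in the review.
    echo "verified"
  elif printf '%s\n' "$review_text" | grep -qF 'P0'; then
    # Row 2 (heuristic): flagged as P0 but in different wording.
    echo "verified-paraphrase"
  else
    # Row 4: flagged, but nothing ties it to the sentinel rule.
    echo "ambiguous"
  fi
}
```

The review text can come from any source, for example by saving the review body to a file and running `classify_review < review.txt`; `gh pr view --comments` may also work as a source if the review body appears in its output.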
Variants to run if the primary test passes
Only run these if the primary test verifies the chain — otherwise skip:
- Variant 1: remote fetch. Replace the heredoc with `curl -fsSL https://raw.githubusercontent.com/glitchwerks/github-actions/main/AGENTS.md > ./AGENTS.md` (or a public test gist). Confirms the curl-from-canonical-source pattern works, not just a static heredoc.
- Variant 2: `.codex/REVIEW.md` resolution. Try writing the file to `.codex/REVIEW.md` instead of `./AGENTS.md`. Confirms which paths Codex resolves.
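A plain `curl ... > ./AGENTS.md` truncates the destination even when the download fails, which could turn a network error into a false negative. The sketch below downloads to a temporary file first and only moves it into place when the fetch succeeds and the result is non-empty. The helper name `fetch_agents_md` is hypothetical, and whether the setup phase has network access to the source URL is an assumption the variant itself would test.

```shell
# Hypothetical helper: fetch guidance to a temp file, install only on success.
fetch_agents_md() {
  # $1 = source URL, $2 = destination path (defaults to ./AGENTS.md).
  fetch_url=$1
  fetch_dest=${2:-./AGENTS.md}
  fetch_tmp=$(mktemp) || return 1
  # -f makes curl exit non-zero on HTTP errors; -s test rejects empty bodies.
  if curl -fsSL "$fetch_url" -o "$fetch_tmp" && [ -s "$fetch_tmp" ]; then
    mv "$fetch_tmp" "$fetch_dest"
  else
    rm -f "$fetch_tmp"
    echo "fetch_agents_md: could not fetch $fetch_url" >&2
    return 1
  fi
}

# Variant 1 (not executed here):
#   fetch_agents_md https://raw.githubusercontent.com/glitchwerks/github-actions/main/AGENTS.md
# Variant 2 (not executed here), same content at the alternate path:
#   mkdir -p .codex && fetch_agents_md "$SOURCE_URL" .codex/REVIEW.md
```

Taking the destination as a parameter lets both variants reuse one function, so a difference in review behavior can be attributed to the path rather than to the fetch logic.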
Acceptance
Gating
- Hard gate on #A (#279). Do not start AGENTS.md authoring (#A) until this experiment lands with a recommendation.
- Soft gate on #J (#288). Consumer onboarding docs depend on whether the env-script pattern or the sync-workflow pattern is the documented setup, so #J's content can't finalize until #Q resolves.
Execution note
This experiment requires a human in the Codex Cloud UI (creating the environment, pasting the setup script); it is not currently scriptable from this repo's CI. Recommendation: cbeaulieu-gt drives the UI portion, with agent assistance for PR setup and result interpretation.
🤖 Generated by Claude Code on behalf of @cbeaulieu-gt