feat: PR automation — opt-in --create-pr flag for both pipelines#75

Merged
jramos merged 6 commits into main from feat/pr-automation on May 25, 2026

jramos commented on May 25, 2026

Summary

Closes the last remaining item from the May 03 phase-1 next-steps doc: auto-generating pull requests against the source skill/tool repository when an evolved artifact passes the deploy gate. Opt-in via --create-pr. The PR carries the evolved artifact (written atomically) and a body summarizing the gate decision and metrics.

What ships

  • New module evolution/core/pr_automation.py: create_pr(...), find_git_root(...), two block-builder helpers (disabled_pr_block, pr_block_from_result), and PRResult dataclass. Stdlib-only (no PyGithub); shells out to git + gh following the existing run_benchmark_hook pattern.
  • Five new CLI flags on both evolve_skill and evolve_tool:
    • --create-pr / --no-create-pr (default off)
    • --pr-base-branch (default main)
    • --pr-branch-prefix (default evolve/)
    • --pr-draft (default off)
    • --pr-allow-dirty (default off)
  • gate_decision.json::pr_created block records every outcome (status: created / skipped / failed / disabled) with a stable 5-field shape so downstream consumers can index payload["pr_created"]["url"] unconditionally.
  • build_run_inputs gains a required create_pr: bool so reruns are reproducible from the artifact alone.
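
For reviewers who want to see the flag surface in one place, here is a minimal sketch of how the five flags and their defaults could be registered with argparse. The helper name add_pr_flags and the help strings are assumptions for illustration; the actual parser setup in evolve_skill / evolve_tool may differ.

```python
import argparse


def add_pr_flags(parser: argparse.ArgumentParser) -> None:
    """Register the five PR-automation flags (hypothetical helper name)."""
    # BooleanOptionalAction (Python 3.9+) generates both --create-pr
    # and --no-create-pr from a single declaration.
    parser.add_argument(
        "--create-pr",
        action=argparse.BooleanOptionalAction,
        default=False,
        help="Open a PR when the evolved artifact passes the deploy gate.",
    )
    parser.add_argument(
        "--pr-base-branch",
        default="main",
        help="Branch the PR targets.",
    )
    parser.add_argument(
        "--pr-branch-prefix",
        default="evolve/",
        help="Prefix for the generated branch name.",
    )
    parser.add_argument(
        "--pr-draft",
        action="store_true",
        help="Open the PR as a draft.",
    )
    parser.add_argument(
        "--pr-allow-dirty",
        action="store_true",
        help="Proceed even if the working tree has uncommitted changes.",
    )


parser = argparse.ArgumentParser(prog="evolve_skill")
add_pr_flags(parser)

defaults = parser.parse_args([])
opted_in = parser.parse_args(["--create-pr", "--pr-draft"])
```

Keeping registration in one shared function is one way to hold the help text identical across both pipelines.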

Design choices baked into this PR

  • Personal-use direct-push only. Fork-and-PR is a deliberate future workstream; the orchestration helper's signature (base_branch, no fork kwargs) is shaped so adding --pr-fork-owner later is one kwarg pair, not a refactor.
  • Branch from origin/<base>, not local HEAD. Eliminates a class of bugs: stale local main carrying unrelated commits into the PR diff, user on a feature branch committing to the wrong place, working-tree drift polluting the branch.
  • Refuse dirty working tree by default. git status --porcelain check before any mutation; escape hatch --pr-allow-dirty.
  • Atomic artifact writes via tempfile + os.replace (same-filesystem) — matches MCPManifestSource.apply_evolved's existing atomicity pattern.
  • 4-char secrets.token_hex(2) suffix on branch names so back-to-back runs can't collide at second precision.
  • Claude Code plugin cache refused (find_git_root returns None, so create_pr returns skipped with a clear reason).
  • Single-write gate_decision.json: pr_created is built before the (one and only) write_gate_decision call. No re-write.
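
Three of the choices above can be sketched in a few lines of stdlib Python. This is an illustration under assumptions, not the code in pr_automation.py: the function names drop the leading underscore, the branch-name layout (prefix, slug, timestamp, suffix) is a guess, and find_git_root is shown as a plain walk up the directory tree.

```python
import os
import secrets
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def branch_name(prefix: str, slug: str, now: datetime) -> str:
    """Second-precision timestamp plus a 4-char random hex suffix.

    The exact layout is an assumption; only the suffix scheme
    (secrets.token_hex(2)) comes from the PR description.
    """
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return f"{prefix}{slug}-{stamp}-{secrets.token_hex(2)}"


def atomic_copy(src: Path, dest: Path) -> None:
    """Copy src over dest via a temp file in dest's directory + os.replace."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Creating the temp file next to dest keeps both on one filesystem,
    # which is what makes os.replace atomic.
    fd, tmp = tempfile.mkstemp(dir=dest.parent, prefix=dest.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(src.read_bytes())
        os.replace(tmp, dest)
    except BaseException:
        # Remove the temp file if anything failed before the replace.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def find_git_root(start: Path) -> Optional[Path]:
    """Return the nearest ancestor that contains .git, or None."""
    current = start.resolve()
    for candidate in (current, *current.parents):
        if (candidate / ".git").exists():
            return candidate
    return None


example = branch_name("evolve/", "my-skill", datetime.now(timezone.utc))
```

A reader of dest sees either the old file or the complete new one, never a partial write. A None from find_git_root is what lets the caller report skipped instead of attempting any git mutation.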

Test plan

  • 14 new unit + 1 integration test in tests/core/test_pr_automation.py — the integration test creates a real bare repo + working clone, runs the full orchestration with ONLY gh pr create mocked at the subprocess boundary, asserts the new branch reaches the bare remote and the file content matches.
  • 2 new validation-flow tests per side (4 total) assert the pr_created block shape in gate_decision.json under --create-pr off + skipped paths.
  • Full suite green locally: 1166 passed (1162 baseline + 14 new — the integration test counts as one).
  • CLI flags visible: evolve_skill --help and evolve_tool --help show all 5.
  • CI green across 3.10 / 3.11 / 3.12 / 3.13.

Smoke deliberately skipped

A real-GitHub end-to-end smoke was not run. The integration test exercises the full real-git path (init bare repo, clone, branch, atomic copy, commit, push) and only mocks the gh pr create subprocess invocation. The gh side is a well-known one-shot subprocess call with stderr captured and surfaced on failure — if the user's gh auth works (verified locally), the real failure modes (auth, missing repo) fail loud rather than silent. Open to running a real-fork smoke as a follow-up if needed.
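
To make the "mocked at the subprocess boundary" claim concrete, here is a rough sketch of a gh pr create wrapper with an injectable runner. The function name open_pr, its signature, and the reason strings are assumptions for illustration; only the gh flags themselves (--base, --head, --title, --body, --draft) are real gh options.

```python
import subprocess
from typing import Any, Callable, Optional, Tuple


def open_pr(
    branch: str,
    base: str,
    title: str,
    body: str,
    draft: bool = False,
    runner: Callable[..., Any] = subprocess.run,
    timeout: int = 60,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (url, None) on success or (None, reason) on failure."""
    cmd = [
        "gh", "pr", "create",
        "--base", base,
        "--head", branch,
        "--title", title,
        "--body", body,
    ]
    if draft:
        cmd.append("--draft")
    try:
        res = runner(cmd, capture_output=True, text=True, timeout=timeout)
    except FileNotFoundError:
        return None, "gh executable not found on PATH"
    except subprocess.TimeoutExpired:
        return None, f"gh pr create timed out after {timeout}s"
    if res.returncode != 0:
        # Surface stderr so auth or missing-repo failures are visible.
        return None, f"gh pr create failed (exit {res.returncode}): {res.stderr.strip()}"
    # gh prints the new PR's URL on stdout when it succeeds.
    return res.stdout.strip(), None


def fake_gh_ok(cmd, **kwargs):
    """Test double standing in for a successful gh invocation."""
    return subprocess.CompletedProcess(cmd, 0, stdout="https://example.invalid/pr/1\n", stderr="")


def fake_gh_auth_error(cmd, **kwargs):
    """Test double standing in for a gh auth failure."""
    return subprocess.CompletedProcess(cmd, 1, stdout="", stderr="not logged in\n")


created = open_pr("evolve/demo", "main", "Evolved artifact", "body", runner=fake_gh_ok)
failed = open_pr("evolve/demo", "main", "Evolved artifact", "body", runner=fake_gh_auth_error)
```

With the runner injected, a test can exercise both outcomes without touching the network, while the real git steps run unmocked.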

Constraints respected

  • Tool/skill symmetry: 5 CLI flags mirrored exactly between files; help text byte-identical
  • Stable pr_created shape across all 4 statuses (no downstream .get() ceremony required)
  • pyproject.toml unchanged: stdlib + already-present Rich Console
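
The stable-shape constraint can be illustrated with a short sketch. The five field names (status, reason, branch, commit_sha, url) come from this PR; the PRResult field defaults and the exact helper signatures are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class PRResult:
    """Outcome of one PR attempt (field defaults are assumed)."""

    status: str  # "created", "skipped", or "failed"
    reason: Optional[str] = None
    branch: Optional[str] = None
    commit_sha: Optional[str] = None
    url: Optional[str] = None


def disabled_pr_block() -> Dict[str, Any]:
    """Block recorded when --create-pr is off: same five keys, all None."""
    return {
        "status": "disabled",
        "reason": None,
        "branch": None,
        "commit_sha": None,
        "url": None,
    }


def pr_block_from_result(result: PRResult) -> Dict[str, Any]:
    """Block recorded for created / skipped / failed outcomes."""
    return {
        "status": result.status,
        "reason": result.reason,
        "branch": result.branch,
        "commit_sha": result.commit_sha,
        "url": result.url,
    }


skipped = pr_block_from_result(PRResult(status="skipped", reason="working tree is dirty"))
disabled = disabled_pr_block()
```

Because every status yields the same keys, a consumer can read payload["pr_created"]["url"] directly and get None when no PR exists.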

jramos added 6 commits May 25, 2026 11:08
Adds evolution/core/pr_automation.py with create_pr(), find_git_root(),
PRResult, and a small set of private helpers (_branch_name, _atomic_copy,
_format_pr_body). The helper packages the four manual steps after a
successful evolve — branch off origin/main, copy in the evolved artifact,
commit/push, gh pr create — behind one orchestration call.

Helper is unused in production this commit; the CLI flag that wires it
into evolve_skill/evolve_tool lands separately. Skip / fail paths return
structured PRResults with the branch and commit SHA populated where
applicable so users can recover from partial progress without re-running.

Address review feedback on c9f2f5c:
- Push reason-string formatting into _run_git so every callsite collapses
  to `return PRResult(status="failed", reason=res)`. Eliminates the
  inconsistent "git status timed out" vs "git checkout error: <repr>"
  split that the prior tuple-of-(ok, exception-or-completed) caused.
- _branch_name drops the dead `datetime | str` overload; only datetime
  is ever passed.
- _format_pr_body drops the duplicate `### Closed-loop tasks` section
  (info already in the headline).
- Drop the dead `else ""` branch on the delta line (delta is guaranteed
  non-None inside the surrounding `if baseline and evolved` block).

A successful deploy now optionally branches the source repo, commits the
evolved artifact, pushes, and opens a PR via gh — collapsing the prior
manual copy/branch/commit/push/PR cycle into one CLI flag. Off by
default; --pr-base-branch, --pr-branch-prefix, --pr-draft, and
--pr-allow-dirty tune the PR shape. The pr_created outcome lands in
gate_decision.json in the same single write as the rest of the decision
block so calibration scripts can grep one source of truth.

…ility

Address review feedback on 4624757:
- The disabled default was {"status": "disabled"} only; the created/
  skipped/failed branches build a full 5-field dict. Downstream
  consumers doing payload["pr_created"]["url"] would KeyError on the
  default path. Add reason/branch/commit_sha/url = None so the shape
  is stable across all four statuses.
- Tighten the disabled-block tests to assert the full 5-field equality
  rather than just the status string.

…tring

Address review feedback on the wiring commit:
- Add disabled_pr_block() and pr_block_from_result() to pr_automation
  so the 5-field shape lives in one place. Both evolve modules import
  and use them, eliminating six byte-identical dict literals.
- Move evolved_*.md / evolved_manifest.json writes to before the PR
  hook, guarded by `if growth_pass:`. The post-table block now relies
  on those writes instead of redoing them, eliminating the duplicate
  write on the --create-pr deploy path.
- EvolutionConfig.create_pr gets a docstring explaining why the field
  is kept on the dataclass even though no current code path reads it
  (CLI flag carries the per-run boolean directly via create_pr_flag
  kwarg; field reserved for future ergonomic-default support).

README PR-review-guardrail section now mentions the new --create-pr
opt-in flag, the personal-use-direct-push scope, and the campaign-loop
caveat. Memory entries updated:
- project_path_e_to_deploy_gate_arc_shipped.md marks PR automation
  shipped (last item from May 03 phase-1 next-steps doc)
- project_pr_automation_shipped.md (new) captures the design
  constraints that bit during implementation
jramos merged commit 8f34d8e into main on May 25, 2026
4 checks passed
jramos deleted the feat/pr-automation branch on May 25, 2026 18:14