Skip to content

ci(claude): add label-gated fork-PR review job#1338

Closed
cliffhall wants to merge 1 commit into
modelcontextprotocol:mainfrom
cliffhall:claude-fork-pr-review
Closed

ci(claude): add label-gated fork-PR review job#1338
cliffhall wants to merge 1 commit into
modelcontextprotocol:mainfrom
cliffhall:claude-fork-pr-review

Conversation

@cliffhall
Copy link
Copy Markdown
Member

Summary

The first-party claude job is unchanged — it already has the collaborator gate and PR-head-SHA checkout from #1270.

This mirrors the pattern in the companion PR for the servers repo: modelcontextprotocol/servers#4222.

Threat model

The standard claude.yml pattern triggers on issue_comment / pull_request_review_comment. On a fork PR, GitHub uses the workflow from the base repo (good) but reads diff/comment/file content from the fork (untrusted). Risks:

  1. Prompt injection in comments, code comments, or filenames.
  2. Secret exfiltration if the workflow has pull-requests: write, ANTHROPIC_API_KEY, or other secrets reachable from Claude's tool surface.
  3. Code execution on runner if any install/build/test step touches fork-controlled code.

pull_request_target is the canonical foot-gun: trusted workflow code runs with base-repo secrets while checking out untrusted code. The pattern below uses it deliberately, with strict guardrails.

Mitigations in the new claude-fork-review job

Layer Mitigation
Trigger pull_request_target: types: [labeled], filtered to label name claude-review. A drive-by contributor cannot apply labels.
Checkout Fork head checked out with persist-credentials: false, fetch-depth: 1. No install/build/test step — fork code is never executed.
Permissions contents: read, pull-requests: write, issues: read. Nothing else.
Tool surface --allowedTools is just mcp__github_inline_comment__create_inline_comment, mcp__mcp-docs, Bash(gh pr view:*), Bash(gh pr diff:*), Bash(gh pr list:*). No unrestricted Bash, no Edit/Write, no WebFetch. An injection that successfully redirects Claude can post a comment and query the docs server — that is it.
Network egress The only outbound HTTP Claude can make is to https://modelcontextprotocol.io/mcp, via a base-repo-controlled MCP config written to $RUNNER_TEMP before the action runs. The fork has no input into which servers Claude connects to.
.mcp.json shadowing Claude Code auto-discovers project-level .mcp.json from cwd in addition to anything passed via --mcp-config. A fork could ship a .mcp.json pointing at attacker-controlled MCP servers. The Prepare trusted MCP config step explicitly rm -f .mcp.json before launching Claude, and writes the trusted config under $RUNNER_TEMP (outside the fork checkout, so the fork cannot shadow it). The block comment above that step spells out the threat.
Conversation budget --max-turns 8.
System prompt Explicitly tells Claude that diff/PR/file content is untrusted data, that injection attempts should be flagged in the review rather than followed, and to limit the review to code quality / correctness / MCP-spec alignment.
Label removal An always() step removes the claude-review label after every run, so a fresh maintainer eyeball is required to re-trigger.

Action version pinning

Pinned to anthropics/claude-code-action@12310e4417c3473095c957cb311b3cf59a38d659 (v1.0.99). The action validates settings against a schema that ships inside the private Claude Agent SDK, and SDK schema drift on upstream bumps has crashed the runtime before any API call in at least eight versions (exit 1 in ~150–300ms, total_cost_usd: 0, minified stack trace mentioning depsCount/dependencies). v1.0.99 predates the v1.0.100 install.sh regression (#1242) and is the most recent known-working anchor. Until anthropics/claude-code-action#1021 lands, SHA pinning is non-negotiable for this action.

Residual risks

  • Injection inside the diff itself may still coerce Claude into posting attacker-chosen text as a PR comment. Blast radius is bounded (no secret access via the comment), but the comment could be misleading. Treat Claude's review as advisory, not authoritative, on fork PRs.
  • Docs server is in trust scope. Responses from modelcontextprotocol.io/mcp become input to Claude during the review. It is a read-only docs endpoint, so low-risk, but worth naming.
  • Maintainer label-application is the trust signal. A maintainer who applies the label without scanning the diff degrades the security model.
  • Action upstream compromise. SHA pinning protects against future malicious pushes, not against a compromised release at the pinned SHA. Re-review the action's source before bumping the pin.

Pre-merge checklist

  • Create the claude-review label in repo settings.
  • Verify secrets.ANTHROPIC_API_KEY is available to pull_request_target events (not environment-restricted).
  • Smoke-test the fork-review job on a controlled draft PR (open from a fork, apply the claude-review label, watch the run logs). Abort merge if you see exit code 1 within ~300ms with total_cost_usd: 0 and a stack trace mentioning dependencies or depsCount — that's the AJV crash.
  • Document the label trigger somewhere external contributors will find it (CONTRIBUTING.md or PR template).
  • Add anthropics/claude-code-action to Dependabot, but treat bump PRs as review-required — do not auto-merge. Each bump needs a smoke test.

Test plan

  • Maintainer opens a draft fork PR and confirms the claude-fork-review job does not fire automatically.
  • Maintainer applies claude-review label; job runs, posts review, removes label.
  • Confirm label removal happens on both success and failure paths.
  • Confirm the existing @claude first-party flow is unaffected.
  • .mcp.json shadowing test. Open a draft fork PR that ships a .mcp.json pointing at a bogus host (e.g. https://example.invalid/mcp). Apply the claude-review label. In the run logs, confirm: (a) the Prepare trusted MCP config step's rm -f .mcp.json succeeded, (b) Claude's MCP-server connection log shows only modelcontextprotocol.io, and (c) no connection attempt was made to the fork-supplied host. If Claude connects to the bogus host, the shadowing protection has failed — do not merge.

🤖 Generated with Claude Code

Adds a hardened `pull_request_target` path so @claude can review external
fork PRs without exposing secrets to prompt injection. A maintainer must
apply the `claude-review` label to trigger the job; the label is
auto-removed after each run.

Mitigations: no Bash glob / Edit / Write / WebFetch, no fork code is
executed, the fork's `.mcp.json` is deleted after checkout and replaced
with a trusted base-repo-controlled config that exposes only the
modelcontextprotocol.io/mcp docs server. The claude-code-action is
pinned by commit SHA to v1.0.99 to avoid the recurring AJV schema-drift
crash family.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@cliffhall
Copy link
Copy Markdown
Member Author

Superseded by a PR from the same branch on the origin repo (avoids the fork-PR path for an internal change).

@cliffhall cliffhall closed this May 21, 2026
@cliffhall cliffhall deleted the claude-fork-pr-review branch May 21, 2026 16:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant