ci(claude): add label-gated fork-PR review job by cliffhall · Pull Request #1338 · modelcontextprotocol/inspector

cliffhall · 2026-05-21T16:20:57Z

Summary

Adds a hardened, label-gated pull_request_target path so @claude can review external fork PRs without exposing secrets to prompt injection.
The fork-review job exposes a single outbound network destination — the mcp-docs MCP server at https://modelcontextprotocol.io/mcp — for protocol lookups during review. No WebFetch, no other MCP servers, no unrestricted Bash.
Pins anthropics/claude-code-action to v1.0.99 by commit SHA (12310e4417c3473095c957cb311b3cf59a38d659) to avoid the recurring AJV-schema-drift crash family (#852/#872/#892/#902/#947/#965/#980/#1013, root cause tracked in anthropics/claude-code-action#1021).

The first-party claude job is unchanged — it already has the collaborator gate and PR-head-SHA checkout from #1270.

This mirrors the pattern in the companion PR for the servers repo: modelcontextprotocol/servers#4222.

Threat model

The standard claude.yml pattern triggers on issue_comment / pull_request_review_comment. On a fork PR, GitHub uses the workflow from the base repo (good) but reads diff/comment/file content from the fork (untrusted). Risks:

Prompt injection in comments, code comments, or filenames.
Secret exfiltration if the workflow has pull-requests: write, ANTHROPIC_API_KEY, or other secrets reachable from Claude's tool surface.
Code execution on runner if any install/build/test step touches fork-controlled code.

pull_request_target is the canonical foot-gun: trusted workflow code runs with base-repo secrets while checking out untrusted code. The pattern below uses it deliberately, with strict guardrails.

Mitigations in the new `claude-fork-review` job

| Layer | Mitigation |
| --- | --- |
| Trigger | `pull_request_target: types: [labeled]`, filtered to label name `claude-review`. A drive-by contributor cannot apply labels. |
| Checkout | Fork head checked out with `persist-credentials: false`, `fetch-depth: 1`. No install/build/test step, so fork code is never executed. |
| Permissions | `contents: read`, `pull-requests: write`, `issues: read`. Nothing else. |
| Tool surface | `--allowedTools` is just `mcp__github_inline_comment__create_inline_comment`, `mcp__mcp-docs`, `Bash(gh pr view:*)`, `Bash(gh pr diff:*)`, `Bash(gh pr list:*)`. No unrestricted `Bash`, no `Edit`/`Write`, no `WebFetch`. An injection that successfully redirects Claude can post a comment and query the docs server, and nothing more. |
| Network egress | The only outbound HTTP Claude can make is to `https://modelcontextprotocol.io/mcp`, via a base-repo-controlled MCP config written to `$RUNNER_TEMP` before the action runs. The fork has no input into which servers Claude connects to. |
| `.mcp.json` shadowing | Claude Code auto-discovers project-level `.mcp.json` from cwd in addition to anything passed via `--mcp-config`. A fork could ship a `.mcp.json` pointing at attacker-controlled MCP servers. The Prepare trusted MCP config step explicitly runs `rm -f .mcp.json` before launching Claude, and writes the trusted config under `$RUNNER_TEMP` (outside the fork checkout, so the fork cannot shadow it). The block comment above that step spells out the threat. |
| Conversation budget | `--max-turns 8`. |
| System prompt | Explicitly tells Claude that diff/PR/file content is untrusted data, that injection attempts should be flagged in the review rather than followed, and to limit the review to code quality / correctness / MCP-spec alignment. |
| Label removal | An `always()` step removes the `claude-review` label after every run, so a fresh maintainer eyeball is required to re-trigger. |
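
The `.mcp.json` shadowing and network-egress rows can be illustrated with a minimal shell sketch of what a "Prepare trusted MCP config" step might do. This is not the workflow's actual step: the function name, its two arguments, and the `mcp-config.json` file name are assumptions made for illustration. The server name `mcp-docs` is inferred from the `mcp__mcp-docs` entry in the allowed-tools list.

```shell
# Hypothetical helper: delete any fork-supplied project config and write the
# trusted config outside the checkout. Prints the path of the file it wrote.
prepare_trusted_mcp_config() {
  checkout_dir=$1   # untrusted fork checkout
  trusted_dir=$2    # assumed to be "$RUNNER_TEMP" when run inside the job

  # Remove the fork's project-level config so it cannot be auto-discovered.
  rm -f "$checkout_dir/.mcp.json" || return 1

  # The only MCP server the review is meant to reach.
  cat > "$trusted_dir/mcp-config.json" <<'JSON' || return 1
{
  "mcpServers": {
    "mcp-docs": {
      "type": "http",
      "url": "https://modelcontextprotocol.io/mcp"
    }
  }
}
JSON
  printf '%s\n' "$trusted_dir/mcp-config.json"
}
```

Inside the job this would presumably be called as `prepare_trusted_mcp_config "$GITHUB_WORKSPACE" "$RUNNER_TEMP"`, with the printed path handed to Claude via `--mcp-config`. Writing outside the checkout is the point: nothing the fork commits can overwrite that file.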

Action version pinning

Pinned to anthropics/claude-code-action@12310e4417c3473095c957cb311b3cf59a38d659 (v1.0.99). The action validates settings against a schema that ships inside the private Claude Agent SDK, and SDK schema drift on upstream bumps has crashed the runtime before any API call in at least eight versions (exit 1 in ~150–300ms, total_cost_usd: 0, minified stack trace mentioning depsCount/dependencies). v1.0.99 predates the v1.0.100 install.sh regression (#1242) and is the most recent known-working anchor. Until anthropics/claude-code-action#1021 lands, SHA pinning is non-negotiable for this action.
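
One way to keep the pin from silently regressing is a small check that flags any `uses:` reference to the action that is not a full 40-character commit SHA. The sketch below is illustrative only and is not part of this PR; the function name and its arguments are hypothetical, and it assumes unquoted `uses:` values.

```shell
# Hypothetical helper: print every reference to the given action that is not
# pinned to a 40-hex-character commit SHA. Exits 0 if any are found, 1 if none.
find_unpinned_action_refs() {
  workflows_dir=$1   # e.g. .github/workflows
  action=$2          # e.g. anthropics/claude-code-action

  # First grep: every line that uses the action.
  # Second grep: drop lines where the ref is exactly 40 hex characters,
  # followed by whitespace (such as a trailing "# v1.0.99") or end of line.
  grep -rnE "uses:[[:space:]]*${action}@" "$workflows_dir" \
    | grep -vE "${action}@[0-9a-f]{40}([[:space:]]|\$)"
}
```

For example, `find_unpinned_action_refs .github/workflows anthropics/claude-code-action` prints nothing when every reference is SHA-pinned, and lists the offending `file:line:` entries when a tag such as `@v1` slips in.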

Residual risks

Injection inside the diff itself may still coerce Claude into posting attacker-chosen text as a PR comment. Blast radius is bounded (no secret access via the comment), but the comment could be misleading. Treat Claude's review as advisory, not authoritative, on fork PRs.
Docs server is in trust scope. Responses from modelcontextprotocol.io/mcp become input to Claude during the review. It is a read-only docs endpoint, so low-risk, but worth naming.
Maintainer label-application is the trust signal. A maintainer who applies the label without scanning the diff degrades the security model.
Action upstream compromise. SHA pinning protects against future malicious pushes, not against a compromised release at the pinned SHA. Re-review the action's source before bumping the pin.

Pre-merge checklist

Create the claude-review label in repo settings.
Verify secrets.ANTHROPIC_API_KEY is available to pull_request_target events (not environment-restricted).
Smoke-test the fork-review job on a controlled draft PR (open from a fork, apply the claude-review label, watch the run logs). Abort merge if you see exit code 1 within ~300ms with total_cost_usd: 0 and a stack trace mentioning dependencies or depsCount — that's the AJV crash.
Document the label trigger somewhere external contributors will find it (CONTRIBUTING.md or PR template).
Add anthropics/claude-code-action to Dependabot, but treat bump PRs as review-required — do not auto-merge. Each bump needs a smoke test.
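
The smoke-test abort condition above can be checked mechanically against a saved copy of the run log. The following is a rough sketch: the function name is hypothetical, the exact log format is an assumption (it accepts both `total_cost_usd: 0` and a JSON-style `"total_cost_usd": 0`), and it does not check the ~300ms timing.

```shell
# Hypothetical helper: succeed (exit 0) when a saved run log shows the crash
# signature, i.e. zero API spend plus a stack trace mentioning depsCount or
# dependencies. A nonzero cost such as 0.0421 does not match.
looks_like_schema_drift_crash() {
  log_file=$1
  grep -Eq '"?total_cost_usd"?[[:space:]]*:[[:space:]]*0([^.0-9]|$)' "$log_file" \
    && grep -Eq 'depsCount|dependencies' "$log_file"
}
```

A bump PR could then be gated with something like `if looks_like_schema_drift_crash run.log; then echo "AJV crash signature, do not merge"; fi`, where `run.log` is whatever file the logs were saved to.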

Test plan

Maintainer opens a draft fork PR and confirms the claude-fork-review job does not fire automatically.
Maintainer applies claude-review label; job runs, posts review, removes label.
Confirm label removal happens on both success and failure paths.
Confirm the existing @claude first-party flow is unaffected.
.mcp.json shadowing test. Open a draft fork PR that ships a .mcp.json pointing at a bogus host (e.g. https://example.invalid/mcp). Apply the claude-review label. In the run logs, confirm: (a) the Prepare trusted MCP config step's rm -f .mcp.json succeeded, (b) Claude's MCP-server connection log shows only modelcontextprotocol.io, and (c) no connection attempt was made to the fork-supplied host. If Claude connects to the bogus host, the shadowing protection has failed — do not merge.
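
Checks (b) and (c) of the shadowing test could be scripted against a downloaded run log along these lines. This is only a sketch: the function name is hypothetical, and it assumes the log mentions the host of every MCP server Claude connects to, which should be confirmed against a real run before relying on it.

```shell
# Hypothetical helper: pass only if the trusted host appears in the log and
# the fork-supplied host does not.
assert_only_trusted_mcp_host() {
  log_file=$1
  trusted_host=$2   # e.g. modelcontextprotocol.io
  bogus_host=$3     # e.g. example.invalid

  # (b) the trusted docs server must show up in the log.
  grep -Fq "$trusted_host" "$log_file" || return 1

  # (c) the fork-supplied host must not show up at all.
  if grep -Fq "$bogus_host" "$log_file"; then
    return 1
  fi
}
```

A nonzero exit from `assert_only_trusted_mcp_host run.log modelcontextprotocol.io example.invalid` would correspond to the "do not merge" outcome described above.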

🤖 Generated with Claude Code

@claude

Adds a hardened `pull_request_target` path so @claude can review external fork PRs without exposing secrets to prompt injection. A maintainer must apply the `claude-review` label to trigger the job; the label is auto-removed after each run. Mitigations: no Bash glob / Edit / Write / WebFetch, no fork code is executed, the fork's `.mcp.json` is deleted after checkout and replaced with a trusted base-repo-controlled config that exposes only the modelcontextprotocol.io/mcp docs server. The claude-code-action is pinned by commit SHA to v1.0.99 to avoid the recurring AJV schema-drift crash family.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cliffhall · 2026-05-21T16:27:12Z

Superseded by a PR from the same branch on the origin repo (avoids the fork-PR path for an internal change).

cliffhall closed this May 21, 2026

cliffhall deleted the claude-fork-pr-review branch May 21, 2026 16:28

cliffhall wants to merge 1 commit into modelcontextprotocol:main from cliffhall:claude-fork-pr-review
