feat(runtime): abstract Claude Code behind pluggable agent runtime by ifireball · Pull Request #1780 · fullsend-ai/fullsend

ifireball · 2026-06-02T13:15:04Z

Summary

Add internal/runtime with a Runtime interface and ClaudeRuntime as the default implementation.
Runner uses BootstrapInput and optional ClaudeHooksBootstrap (via harness adapter) instead of calling Claude Code directly from cli/run.
Narrow security.ClaudeSandboxHooks for hook generation; host-side scans stay in the runner.

Test plan

go test ./internal/runtime/... ./internal/security/...
go test ./internal/cli/ -run 'TestNewHarnessBootstrap|TestBuildRunCommand|TestBuildScanContext'
Smoke fullsend run against a harness agent in CI or locally

Made with Cursor

github-actions · 2026-06-02T13:16:40Z

Site preview

Preview: https://c7b5769a-site.fullsend-ai.workers.dev

Commit: ad214b192d2fd0596548b19bee8c6d7e467feb6f

Introduce internal/runtime with BootstrapInput and ClaudeHooksBootstrap interfaces so the runner can support other agent backends later. Move Claude bootstrap, execution, progress, and transcript handling out of cli/run while keeping host-side security scans in the runner. Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>

fullsend-ai-review · 2026-06-02T13:31:20Z

Review

Findings

Medium

[test-inadequate] internal/runtime/claude_test.go:TestBuildPluginConfigs_SinglePlugin — Test coverage was reduced during the move from internal/cli/run_test.go. The old test verified owner, version, source, category, installed_plugins qualified name, and settings.json enabledPlugins. The new test only checks plugin name and lspServers presence, dropping 5+ structural assertions on marketplace config output. The underlying buildPluginConfigs logic is unchanged, so these omissions could mask regressions if the config structure evolves.
Remediation: Restore the dropped assertions — verify mkt["owner"] is non-nil, plugin entry has version/source/category fields, installed_plugins.json contains the qualified name, and settings.json enabledPlugins contains the qualified name set to true.
[edge-case] internal/cli/run.go:396 — TOCTOU window between host-side content scan (scanRuntimeContent) and sandbox upload (rt.Bootstrap). An adversary who can modify host files between scan and upload could bypass the injection scan. The code comment acknowledges this ("Host files could change between scan and upload"). Pre-existing issue — the refactor preserved the previous behavior without closing the gap.
Remediation: Read file content once into memory, scan it, then pass the scanned bytes to Bootstrap for upload, eliminating the TOCTOU window.
[edge-case] internal/runtime/claude.go:Bootstrap — Security hooks (Tirith, SSRF, canary, secret redaction, unicode normalization) are only installed when BootstrapInput also implements ClaudeHooksBootstrap (via type assertion at line ~63). When the assertion fails, Bootstrap silently returns nil with no warning. A future runtime adapter that does not implement this extension will silently skip all sandbox-side security hooks. docs/runtimes.md documents this, but there is no programmatic enforcement or log warning.
Remediation: Add a log warning when the ClaudeHooksBootstrap type assertion fails, e.g., fmt.Fprintf(os.Stderr, "WARNING: BootstrapInput does not implement ClaudeHooksBootstrap; sandbox security hooks will not be installed\n").

Low

[test-inadequate] internal/runtime/claude_test.go:TestBuildPluginConfigs_ConfigStructure — Old test verified known_marketplaces.json contents (source repo field). New test only checks path suffixes, dropping content verification.
[test-inadequate] internal/runtime/claude_test.go:TestBuildPluginConfigs_EmptyPluginList — Old test verified both marketplace.json plugins array (nil check) and settings.json enabledPlugins. New test only checks settings.json, dropping the marketplace assertion.
[injection-vuln] internal/runtime/claude.go:bootstrapPlugins — Plugin directory basenames from harness config are interpolated into shell echo commands without quoting. If a plugin directory name contains shell metacharacters, the command could break or inject. Pre-existing pattern moved from cli/run.go; risk is low because this runs inside the sandbox and harness validation rejects unusual characters.

Info

[sub-agent-failure] N/A — The intent-coherence, style-conventions, and docs-currency sub-agents were not dispatched: model claude-sonnet-4-5@20250929 is unavailable on the Vertex deployment in prior runs. These are sonnet-tier agents; their absence does not block the review.

Previous run

Review

Findings

Medium

[test-inadequate] internal/runtime/claude_test.go:TestBuildPluginConfigs_SinglePlugin — Test coverage was reduced during the move from internal/cli/run_test.go. The old test verified owner, version, source, category, installed_plugins qualified name, and settings.json enabledPlugins. The new test only checks plugin name and lspServers presence, dropping 5+ structural assertions on marketplace config output. The underlying buildPluginConfigs logic is unchanged, so these omissions could mask regressions if the config structure evolves.
Remediation: Restore the dropped assertions — verify mkt["owner"] is non-nil, plugin entry has version/source/category fields, installed_plugins.json contains the qualified name, and settings.json enabledPlugins contains the qualified name set to true.
[edge-case] internal/cli/run.go:396 — TOCTOU window between host-side content scan (scanRuntimeContent) and sandbox upload (rt.Bootstrap). An adversary who can modify host files between scan and upload could bypass the injection scan. The code comment acknowledges this ("Host files could change between scan and upload"). Pre-existing issue — the refactor preserved the previous behavior without closing the gap.
Remediation: Read file content once into memory, scan it, then pass the scanned bytes to Bootstrap for upload, eliminating the TOCTOU window.
[edge-case] internal/runtime/claude.go:Bootstrap — Security hooks (Tirith, SSRF, canary, secret redaction, unicode normalization) are only installed when BootstrapInput also implements ClaudeHooksBootstrap (via type assertion). A future runtime that does not implement this extension will silently skip all sandbox-side security hooks with no warning. runtimes.md documents this, but there is no programmatic enforcement.
Remediation: Add a log warning or runtime interface method (e.g., HasSandboxHooks() bool) so the runner can detect when sandbox hooks are absent while the harness has security enabled.

Low

[test-inadequate] internal/runtime/claude_test.go:TestBuildPluginConfigs_ConfigStructure — Old test verified known_marketplaces.json contents (source repo field). New test only checks path suffixes, dropping content verification.
[test-inadequate] internal/runtime/claude_test.go:TestBuildPluginConfigs_EmptyPluginList — Old test verified both marketplace.json plugins array and settings.json enabledPlugins. New test only checks settings.json, dropping the marketplace assertion.
[injection-vuln] internal/runtime/claude.go:bootstrapPlugins — Plugin directory basenames from harness config are interpolated into shell echo commands without quoting. If a plugin directory name contains single quotes or shell metacharacters, the command could break. Pre-existing pattern; risk is low because this runs inside the sandbox and harness validation likely rejects unusual characters.

Info

[sub-agent-failure] N/A — The intent-coherence, style-conventions, and docs-currency sub-agents did not return findings: model claude-sonnet-4-5@20250929 is not available on the Vertex deployment. These are sonnet-tier agents; their failure does not block the review.

Previous run (2)

Review

Findings

Low

[error-handling] internal/runtime/claude.go:65 — ClaudeRuntime.Bootstrap() passes input.AgentPath() directly to sandbox.Upload() without an empty-string guard. The parallel function scanRuntimeContent (in bootstrap_scan.go) validates AgentPath() != "" at the top, but Bootstrap() does not. Similarly, the SkillDirs() loop in Bootstrap() does not skip empty entries, unlike the scanning code which explicitly continues on empty strings. Upstream harness validation (Validate() + ValidateFilesExist()) prevents empty paths in practice, so this is defense-in-depth, not a runtime risk.
Remediation: Add if input.AgentPath() == "" { return fmt.Errorf("agent path is required") } at the top of Bootstrap(), and add if skillPath == "" { continue } in the skill upload loop, consistent with scanRuntimeContent.
[TOCTOU] internal/cli/run.go:396 — The TOCTOU window between scanRuntimeContent() (host-side injection scan) and rt.Bootstrap() (upload) persists from the prior code structure. The inline comment acknowledges the runner owns the host FS. Accepted risk, not a regression.

Info

[test-adequacy] internal/runtime/claude_test.go — All 19 previously-deleted tests from run_test.go are ported to claude_test.go (TestBuildRunCommand_*, TestBuildPluginConfigs_*). Assertions are equivalent. No test weakening detected.
[security-preserved] internal/runtime/claude.go — Security hook installation (installClaudeHooks), command injection defenses (single-quote escaping in buildRunCommand), output sanitization (SanitizeOutput), and the security gate type-assertion (ClaudeHooksBootstrap) are all functionally equivalent to the removed code. The dual-settings-file contract is now explicitly documented in a code comment. No new injection surfaces or privilege escalation paths introduced.
[security-preserved] internal/security/hooks.go — GenerateClaudeSettings and HookFiles now accept ClaudeSandboxHooks instead of *harness.Harness. Default-enabled behavior, hook ordering, and nil-safety checks are preserved. No change in security semantics.
[sub-agent-failure] N/A — The style-conventions, intent-coherence, and docs-currency sub-agents did not return findings: model claude-sonnet-4-5@20250929 unavailable on the Vertex deployment. These are non-critical review dimensions (sonnet-tier).

Previous run (3)

Review

Findings

Low

[edge-case] internal/cli/bootstrap_scan.go:41 — scanRuntimeContent does not validate that BootstrapInput fields are non-empty before calling os.ReadFile. An empty AgentPath() would produce a confusing error rather than a clear validation message. Upstream harness validation likely prevents this, but the function doesn't enforce its own preconditions.
[error-handling] internal/runtime/claude.go:175 — bootstrapPlugins writes settings.json to configDir/settings.json while installClaudeHooks writes to SandboxWorkspace/.claude/settings.json. These are different paths today (not a bug), but the dual-settings-file contract is undocumented. If a future refactor unifies the paths, one would silently overwrite the other, losing either plugin config or security hooks.
[TOCTOU] internal/cli/run.go:396 — The refactoring widens the TOCTOU window between scanRuntimeContent() (host-side injection scan) and rt.Bootstrap() (upload). The code acknowledges this: "Host files could change between scan and upload; the runner owns the host FS here." Accepted risk, not a regression.

Info

[test-adequacy] internal/runtime/claude_test.go — Prior medium-severity finding about test coverage reduction is resolved. All 19 previously-deleted tests from run_test.go are now ported to claude_test.go, including PluginDirEscapesQuotes, DebugEscapesQuotes, NoDoubleSpaces (6 subtests), NoLspJSON, InvalidLspJSON, EmptyLspJSON, ConfigStructure, and EmptyPluginList.
[security-preserved] internal/runtime/claude.go — Security hook installation (installClaudeHooks), command injection defenses (single-quote escaping in buildRunCommand), output sanitization (SanitizeOutput), and security gate logic are all functionally equivalent to the removed code. No new injection surfaces or privilege escalation paths introduced.
[sub-agent-failure] N/A — The style-conventions, intent-coherence, and docs-currency sub-agents did not return findings: model claude-sonnet-4-5@20250929 unavailable on the Vertex deployment. These are non-critical review dimensions (sonnet-tier).

Previous run (4)

Review

Findings

Medium

[test-inadequate] internal/runtime/claude_test.go — Significant test coverage reduction during refactor. The PR removes 19 tests from run_test.go (TestBuildClaudeCommand_* and TestBuildPluginConfigs_*) but only adds 9 replacement tests in claude_test.go. Dropped edge-case tests include: quote escaping in plugin dirs (PluginDirEscapesQuotes), debug flag escaping (DebugEscapesQuotes), double-space prevention (NoDoubleSpaces with 6 subtests), multiple plugin dirs, disabled debug, and malformed/missing .lsp.json handling (NoLspJSON, InvalidLspJSON, EmptyLspJSON). These covered important defensive behavior in command construction.
Remediation: Port the missing tests into internal/runtime/claude_test.go. At minimum, re-add the quote-escaping tests, the NoDoubleSpaces table test, and the error-handling plugin config tests.

Low

[error-handling-gap] internal/runtime/claude.go — bootstrapPlugins writes settings.json to configDir/settings.json while installClaudeHooks writes to SandboxWorkspace/.claude/settings.json. This is the same behavior as the old code (not a regression), but the dual-settings-file contract is undocumented and fragile if paths are ever unified.
[edge-case] internal/cli/bootstrap_scan.go:41 — scanRuntimeContent does not validate that BootstrapInput fields are non-empty before calling os.ReadFile. An empty AgentPath() would produce a confusing error rather than a clear validation message. Upstream harness validation likely prevents this, but the function doesn't enforce its own preconditions.

Info

[sub-agent-failure] N/A — The style-conventions, intent-coherence, and docs-currency sub-agents did not return findings: model claude-sonnet-4-5@20250929 unavailable on the Vertex deployment. These are non-critical review dimensions (sonnet-tier).

waynesun09

Review Squad Report — 7 agents, 14 findings after dedup, 4 false positives removed

Agents: 2x claude-coder, 2x claude-researcher, 2x gemini-code-review, 1x cursor-code-review

Clean mechanical refactor that successfully abstracts Claude Code behind a Runtime interface. The architecture and code movement are sound — no behavioral bugs detected.

Summary of findings (HIGH: 3, MEDIUM: 6, LOW: 4, INFO: 1)

High:

runtime → harness dependency inversion via BoolDefault (5-agent consensus)
8 test cases dropped without replacement, including shell-injection-preventing quote escaping tests (5-agent consensus)
ClaudeHooksBootstrap in shared interface file couples portable layer to Claude (2 agents)

Medium:
4. No tests for 99 lines of security-critical bootstrap_scan.go (5-agent consensus)
5. TOCTOU regression — scan and upload no longer atomic per file (3 agents)
6. Runtime interface too wide at 10 methods + inconsistent SandboxName passing (5-agent consensus)
7. ClaudeSandboxHooks exports mutable harness.SandboxHooks pointer (5-agent consensus)
8. Duplicate ExtractTranscripts — deprecated copy with no remaining callers (4 agents)

See inline comments for details and suggestions.

ralphbean

I think this needs a few changes before we can merge. See inline comments.

Rebase onto main (remote resource resolve, --offline) while keeping the runtime abstraction. Port dropped command/plugin tests, drop duplicate sandbox.ExtractTranscripts, remove runtime→harness import, and move ClaudeHooksBootstrap to claude_bootstrap.go. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>

Add docs/runtimes.md, split TranscriptHandler from Runtime, narrow ClaudeSandboxHooks accessors, add bootstrap_scan tests, and document scan/upload TOCTOU follow-up. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>

Reject empty agent path before host-side scan. Document why plugin settings and security hooks use separate settings.json paths. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>

Validate agent path in Bootstrap, skip empty skill/plugin dirs, use ConfigDir/WorkspaceDir methods instead of sandbox constants, and unexport sanitizeOutput. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>

ifireball · 2026-06-03T09:07:22Z

All review threads from the first pass are resolved, and follow-ups from @maruiz93 / @rh-hemartin / fullsend-ai-review are addressed in 8bf10e4c (bootstrap path validation, WorkspaceDir(), unexported sanitizeOutput, merge with main).

Still blocking merge: CHANGES_REQUESTED from @ralphbean — could you take another look and re-approve when satisfied? @waynesun09 same if anything remains on your side.

TOCTOU hardening remains tracked in #1801 (accepted risk for this PR).

bootstrapCommon is runner-level; Claude hook dirs belong in installClaudeHooks behind ClaudeHooksBootstrap. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>

ifireball requested a review from waynesun09 June 2, 2026 13:15

ifireball self-assigned this Jun 2, 2026

ifireball requested review from maruiz93 and ralphbean June 2, 2026 13:15

ifireball marked this pull request as ready for review June 2, 2026 13:15

github-actions Bot deployed to site-preview June 2, 2026 13:16 View deployment

ifireball force-pushed the cursor/bb4bb4d3 branch from 3e31589 to d3e9c49 Compare June 2, 2026 13:22

github-actions Bot deployed to site-preview June 2, 2026 13:24 View deployment

fullsend-ai-review Bot added the requires-manual-review Review requires human judgment label Jun 2, 2026

waynesun09 reviewed Jun 2, 2026

View reviewed changes

ralphbean requested changes Jun 2, 2026

View reviewed changes

Comment thread internal/runtime/runtime.go

Comment thread internal/runtime/claude_test.go

Comment thread internal/runtime/runtime.go Outdated

github-actions Bot deployed to site-preview June 2, 2026 18:06 View deployment

fullsend-ai-review Bot approved these changes Jun 2, 2026

View reviewed changes

Comment thread internal/cli/bootstrap_scan.go

Comment thread internal/runtime/claude.go

Comment thread internal/cli/run.go

fullsend-ai-review Bot added ready-for-merge All reviewers approved — ready to merge and removed requires-manual-review Review requires human judgment labels Jun 2, 2026

ifireball mentioned this pull request Jun 2, 2026

Harden host-side runtime scan/upload atomicity #1801

Open

github-actions Bot deployed to site-preview June 2, 2026 18:16 View deployment

ifireball requested review from ralphbean and waynesun09 June 2, 2026 18:22

github-actions Bot deployed to site-preview June 2, 2026 18:23 View deployment

fullsend-ai-review Bot approved these changes Jun 2, 2026

View reviewed changes

Comment thread internal/runtime/claude.go

Comment thread internal/cli/run.go

fullsend-ai-review Bot added ready-for-merge All reviewers approved — ready to merge and removed ready-for-merge All reviewers approved — ready to merge labels Jun 2, 2026

maruiz93 reviewed Jun 2, 2026

View reviewed changes

Comment thread internal/runtime/claude.go

Comment thread internal/runtime/sanitize.go Outdated

maruiz93 mentioned this pull request Jun 2, 2026

Investigate deterministic orchestration for intra-stage multi-agent pipelines #1817

Open

rh-hemartin reviewed Jun 3, 2026

View reviewed changes

Comment thread docs/runtimes.md

rh-hemartin reviewed Jun 3, 2026

View reviewed changes

Comment thread internal/cli/run.go Outdated

rh-hemartin reviewed Jun 3, 2026

View reviewed changes

Comment thread internal/runtime/claude.go Outdated

rh-hemartin reviewed Jun 3, 2026

View reviewed changes

Comment thread internal/runtime/claude.go Outdated

ifireball and others added 2 commits June 3, 2026 12:05

Merge branch 'main' into cursor/bb4bb4d3

d895415

github-actions Bot deployed to site-preview June 3, 2026 09:08 View deployment

fullsend-ai-review Bot added requires-manual-review Review requires human judgment and removed ready-for-merge All reviewers approved — ready to merge labels Jun 3, 2026

fix(cli): create .claude/hooks only when installing security hooks

ad214b1

bootstrapCommon is runner-level; Claude hook dirs belong in installClaudeHooks behind ClaudeHooksBootstrap. Signed-off-by: Barak Korren <bkorren@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>

github-actions Bot deployed to site-preview June 3, 2026 09:20 View deployment

fullsend-ai-review Bot added requires-manual-review Review requires human judgment and removed requires-manual-review Review requires human judgment labels Jun 3, 2026

Conversation

ifireball commented Jun 2, 2026

Summary

Test plan

Uh oh!

github-actions Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Site preview

Uh oh!

fullsend-ai-review Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review

Findings

Medium

Low

Info

Review

Findings

Medium

Low

Info

Review

Findings

Low

Info

Review

Findings

Low

Info

Review

Findings

Medium

Low

Info

Uh oh!

waynesun09 left a comment

Choose a reason for hiding this comment

Review Squad Report — 7 agents, 14 findings after dedup, 4 false positives removed

Summary of findings (HIGH: 3, MEDIUM: 6, LOW: 4, INFO: 1)

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ralphbean left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ifireball commented Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

github-actions Bot commented Jun 2, 2026 •

edited

Loading

fullsend-ai-review Bot commented Jun 2, 2026 •

edited

Loading