Skip to content

feat(runtime): abstract Claude Code behind pluggable agent runtime#1780

Open
ifireball wants to merge 7 commits into
fullsend-ai:mainfrom
ifireball:cursor/bb4bb4d3
Open

feat(runtime): abstract Claude Code behind pluggable agent runtime#1780
ifireball wants to merge 7 commits into
fullsend-ai:mainfrom
ifireball:cursor/bb4bb4d3

Conversation

@ifireball
Copy link
Copy Markdown
Contributor

Summary

  • Add internal/runtime with a Runtime interface and ClaudeRuntime as the default implementation.
  • Runner uses BootstrapInput and optional ClaudeHooksBootstrap (via harness adapter) instead of calling Claude Code directly from cli/run.
  • Narrow security.ClaudeSandboxHooks for hook generation; host-side scans stay in the runner.

Test plan

  • go test ./internal/runtime/... ./internal/security/...
  • go test ./internal/cli/ -run 'TestNewHarnessBootstrap|TestBuildRunCommand|TestBuildScanContext'
  • Smoke fullsend run against a harness agent in CI or locally

Made with Cursor

@ifireball ifireball requested a review from waynesun09 June 2, 2026 13:15
@ifireball ifireball self-assigned this Jun 2, 2026
@ifireball ifireball requested review from maruiz93 and ralphbean June 2, 2026 13:15
@ifireball ifireball marked this pull request as ready for review June 2, 2026 13:15
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

Site preview

Preview: https://c7b5769a-site.fullsend-ai.workers.dev

Commit: ad214b192d2fd0596548b19bee8c6d7e467feb6f

Introduce internal/runtime with BootstrapInput and ClaudeHooksBootstrap
interfaces so the runner can support other agent backends later. Move
Claude bootstrap, execution, progress, and transcript handling out of
cli/run while keeping host-side security scans in the runner.

Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@fullsend-ai-review
Copy link
Copy Markdown

fullsend-ai-review Bot commented Jun 2, 2026

Review

Findings

Medium

  • [test-inadequate] internal/runtime/claude_test.go:TestBuildPluginConfigs_SinglePlugin — Test coverage was reduced during the move from internal/cli/run_test.go. The old test verified owner, version, source, category, installed_plugins qualified name, and settings.json enabledPlugins. The new test only checks plugin name and lspServers presence, dropping 5+ structural assertions on marketplace config output. The underlying buildPluginConfigs logic is unchanged, so these omissions could mask regressions if the config structure evolves.
    Remediation: Restore the dropped assertions — verify mkt["owner"] is non-nil, plugin entry has version/source/category fields, installed_plugins.json contains the qualified name, and settings.json enabledPlugins contains the qualified name set to true.

  • [edge-case] internal/cli/run.go:396 — TOCTOU window between host-side content scan (scanRuntimeContent) and sandbox upload (rt.Bootstrap). An adversary who can modify host files between scan and upload could bypass the injection scan. The code comment acknowledges this ("Host files could change between scan and upload"). Pre-existing issue — the refactor preserved the previous behavior without closing the gap.
    Remediation: Read file content once into memory, scan it, then pass the scanned bytes to Bootstrap for upload, eliminating the TOCTOU window.

  • [edge-case] internal/runtime/claude.go:Bootstrap — Security hooks (Tirith, SSRF, canary, secret redaction, unicode normalization) are only installed when BootstrapInput also implements ClaudeHooksBootstrap (via type assertion at line ~63). When the assertion fails, Bootstrap silently returns nil with no warning. A future runtime adapter that does not implement this extension will silently skip all sandbox-side security hooks. docs/runtimes.md documents this, but there is no programmatic enforcement or log warning.
    Remediation: Add a log warning when the ClaudeHooksBootstrap type assertion fails, e.g., fmt.Fprintf(os.Stderr, "WARNING: BootstrapInput does not implement ClaudeHooksBootstrap; sandbox security hooks will not be installed\n").

Low

  • [test-inadequate] internal/runtime/claude_test.go:TestBuildPluginConfigs_ConfigStructure — Old test verified known_marketplaces.json contents (source repo field). New test only checks path suffixes, dropping content verification.

  • [test-inadequate] internal/runtime/claude_test.go:TestBuildPluginConfigs_EmptyPluginList — Old test verified both marketplace.json plugins array (nil check) and settings.json enabledPlugins. New test only checks settings.json, dropping the marketplace assertion.

  • [injection-vuln] internal/runtime/claude.go:bootstrapPlugins — Plugin directory basenames from harness config are interpolated into shell echo commands without quoting. If a plugin directory name contains shell metacharacters, the command could break or inject. Pre-existing pattern moved from cli/run.go; risk is low because this runs inside the sandbox and harness validation rejects unusual characters.

Info

  • [sub-agent-failure] N/A — The intent-coherence, style-conventions, and docs-currency sub-agents were not dispatched: model claude-sonnet-4-5@20250929 is unavailable on the Vertex deployment in prior runs. These are sonnet-tier agents; their absence does not block the review.
Previous run

Review

Findings

Medium

  • [test-inadequate] internal/runtime/claude_test.go:TestBuildPluginConfigs_SinglePlugin — Test coverage was reduced during the move from internal/cli/run_test.go. The old test verified owner, version, source, category, installed_plugins qualified name, and settings.json enabledPlugins. The new test only checks plugin name and lspServers presence, dropping 5+ structural assertions on marketplace config output. The underlying buildPluginConfigs logic is unchanged, so these omissions could mask regressions if the config structure evolves.
    Remediation: Restore the dropped assertions — verify mkt["owner"] is non-nil, plugin entry has version/source/category fields, installed_plugins.json contains the qualified name, and settings.json enabledPlugins contains the qualified name set to true.

  • [edge-case] internal/cli/run.go:396 — TOCTOU window between host-side content scan (scanRuntimeContent) and sandbox upload (rt.Bootstrap). An adversary who can modify host files between scan and upload could bypass the injection scan. The code comment acknowledges this ("Host files could change between scan and upload"). Pre-existing issue — the refactor preserved the previous behavior without closing the gap.
    Remediation: Read file content once into memory, scan it, then pass the scanned bytes to Bootstrap for upload, eliminating the TOCTOU window.

  • [edge-case] internal/runtime/claude.go:Bootstrap — Security hooks (Tirith, SSRF, canary, secret redaction, unicode normalization) are only installed when BootstrapInput also implements ClaudeHooksBootstrap (via type assertion). A future runtime that does not implement this extension will silently skip all sandbox-side security hooks with no warning. runtimes.md documents this, but there is no programmatic enforcement.
    Remediation: Add a log warning or runtime interface method (e.g., HasSandboxHooks() bool) so the runner can detect when sandbox hooks are absent while the harness has security enabled.

Low

  • [test-inadequate] internal/runtime/claude_test.go:TestBuildPluginConfigs_ConfigStructure — Old test verified known_marketplaces.json contents (source repo field). New test only checks path suffixes, dropping content verification.

  • [test-inadequate] internal/runtime/claude_test.go:TestBuildPluginConfigs_EmptyPluginList — Old test verified both marketplace.json plugins array and settings.json enabledPlugins. New test only checks settings.json, dropping the marketplace assertion.

  • [injection-vuln] internal/runtime/claude.go:bootstrapPlugins — Plugin directory basenames from harness config are interpolated into shell echo commands without quoting. If a plugin directory name contains single quotes or shell metacharacters, the command could break. Pre-existing pattern; risk is low because this runs inside the sandbox and harness validation likely rejects unusual characters.

Info

  • [sub-agent-failure] N/A — The intent-coherence, style-conventions, and docs-currency sub-agents did not return findings: model claude-sonnet-4-5@20250929 is not available on the Vertex deployment. These are sonnet-tier agents; their failure does not block the review.
Previous run (2)

Review

Findings

Low

  • [error-handling] internal/runtime/claude.go:65ClaudeRuntime.Bootstrap() passes input.AgentPath() directly to sandbox.Upload() without an empty-string guard. The parallel function scanRuntimeContent (in bootstrap_scan.go) validates AgentPath() != "" at the top, but Bootstrap() does not. Similarly, the SkillDirs() loop in Bootstrap() does not skip empty entries, unlike the scanning code which explicitly continues on empty strings. Upstream harness validation (Validate() + ValidateFilesExist()) prevents empty paths in practice, so this is defense-in-depth, not a runtime risk.
    Remediation: Add if input.AgentPath() == "" { return fmt.Errorf("agent path is required") } at the top of Bootstrap(), and add if skillPath == "" { continue } in the skill upload loop, consistent with scanRuntimeContent.

  • [TOCTOU] internal/cli/run.go:396 — The TOCTOU window between scanRuntimeContent() (host-side injection scan) and rt.Bootstrap() (upload) persists from the prior code structure. The inline comment acknowledges the runner owns the host FS. Accepted risk, not a regression.

Info

  • [test-adequacy] internal/runtime/claude_test.go — All 19 previously-deleted tests from run_test.go are ported to claude_test.go (TestBuildRunCommand_*, TestBuildPluginConfigs_*). Assertions are equivalent. No test weakening detected.

  • [security-preserved] internal/runtime/claude.go — Security hook installation (installClaudeHooks), command injection defenses (single-quote escaping in buildRunCommand), output sanitization (SanitizeOutput), and the security gate type-assertion (ClaudeHooksBootstrap) are all functionally equivalent to the removed code. The dual-settings-file contract is now explicitly documented in a code comment. No new injection surfaces or privilege escalation paths introduced.

  • [security-preserved] internal/security/hooks.goGenerateClaudeSettings and HookFiles now accept ClaudeSandboxHooks instead of *harness.Harness. Default-enabled behavior, hook ordering, and nil-safety checks are preserved. No change in security semantics.

  • [sub-agent-failure] N/A — The style-conventions, intent-coherence, and docs-currency sub-agents did not return findings: model claude-sonnet-4-5@20250929 unavailable on the Vertex deployment. These are non-critical review dimensions (sonnet-tier).

Previous run (3)

Review

Findings

Low

  • [edge-case] internal/cli/bootstrap_scan.go:41scanRuntimeContent does not validate that BootstrapInput fields are non-empty before calling os.ReadFile. An empty AgentPath() would produce a confusing error rather than a clear validation message. Upstream harness validation likely prevents this, but the function doesn't enforce its own preconditions.

  • [error-handling] internal/runtime/claude.go:175bootstrapPlugins writes settings.json to configDir/settings.json while installClaudeHooks writes to SandboxWorkspace/.claude/settings.json. These are different paths today (not a bug), but the dual-settings-file contract is undocumented. If a future refactor unifies the paths, one would silently overwrite the other, losing either plugin config or security hooks.

  • [TOCTOU] internal/cli/run.go:396 — The refactoring widens the TOCTOU window between scanRuntimeContent() (host-side injection scan) and rt.Bootstrap() (upload). The code acknowledges this: "Host files could change between scan and upload; the runner owns the host FS here." Accepted risk, not a regression.

Info

  • [test-adequacy] internal/runtime/claude_test.go — Prior medium-severity finding about test coverage reduction is resolved. All 19 previously-deleted tests from run_test.go are now ported to claude_test.go, including PluginDirEscapesQuotes, DebugEscapesQuotes, NoDoubleSpaces (6 subtests), NoLspJSON, InvalidLspJSON, EmptyLspJSON, ConfigStructure, and EmptyPluginList.

  • [security-preserved] internal/runtime/claude.go — Security hook installation (installClaudeHooks), command injection defenses (single-quote escaping in buildRunCommand), output sanitization (SanitizeOutput), and security gate logic are all functionally equivalent to the removed code. No new injection surfaces or privilege escalation paths introduced.

  • [sub-agent-failure] N/A — The style-conventions, intent-coherence, and docs-currency sub-agents did not return findings: model claude-sonnet-4-5@20250929 unavailable on the Vertex deployment. These are non-critical review dimensions (sonnet-tier).

Previous run (4)

Review

Findings

Medium

  • [test-inadequate] internal/runtime/claude_test.go — Significant test coverage reduction during refactor. The PR removes 19 tests from run_test.go (TestBuildClaudeCommand_* and TestBuildPluginConfigs_*) but only adds 9 replacement tests in claude_test.go. Dropped edge-case tests include: quote escaping in plugin dirs (PluginDirEscapesQuotes), debug flag escaping (DebugEscapesQuotes), double-space prevention (NoDoubleSpaces with 6 subtests), multiple plugin dirs, disabled debug, and malformed/missing .lsp.json handling (NoLspJSON, InvalidLspJSON, EmptyLspJSON). These covered important defensive behavior in command construction.
    Remediation: Port the missing tests into internal/runtime/claude_test.go. At minimum, re-add the quote-escaping tests, the NoDoubleSpaces table test, and the error-handling plugin config tests.

Low

  • [error-handling-gap] internal/runtime/claude.gobootstrapPlugins writes settings.json to configDir/settings.json while installClaudeHooks writes to SandboxWorkspace/.claude/settings.json. This is the same behavior as the old code (not a regression), but the dual-settings-file contract is undocumented and fragile if paths are ever unified.

  • [edge-case] internal/cli/bootstrap_scan.go:41scanRuntimeContent does not validate that BootstrapInput fields are non-empty before calling os.ReadFile. An empty AgentPath() would produce a confusing error rather than a clear validation message. Upstream harness validation likely prevents this, but the function doesn't enforce its own preconditions.

Info

  • [sub-agent-failure] N/A — The style-conventions, intent-coherence, and docs-currency sub-agents did not return findings: model claude-sonnet-4-5@20250929 unavailable on the Vertex deployment. These are non-critical review dimensions (sonnet-tier).

@fullsend-ai-review fullsend-ai-review Bot added the requires-manual-review Review requires human judgment label Jun 2, 2026
Copy link
Copy Markdown
Contributor

@waynesun09 waynesun09 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Squad Report — 7 agents, 14 findings after dedup, 4 false positives removed

Agents: 2x claude-coder, 2x claude-researcher, 2x gemini-code-review, 1x cursor-code-review

Clean mechanical refactor that successfully abstracts Claude Code behind a Runtime interface. The architecture and code movement are sound — no behavioral bugs detected.

Summary of findings (HIGH: 3, MEDIUM: 6, LOW: 4, INFO: 1)

High:

  1. runtime → harness dependency inversion via BoolDefault (5-agent consensus)
  2. 8 test cases dropped without replacement, including shell-injection-preventing quote escaping tests (5-agent consensus)
  3. ClaudeHooksBootstrap in shared interface file couples portable layer to Claude (2 agents)

Medium:
4. No tests for 99 lines of security-critical bootstrap_scan.go (5-agent consensus)
5. TOCTOU regression — scan and upload no longer atomic per file (3 agents)
6. Runtime interface too wide at 10 methods + inconsistent SandboxName passing (5-agent consensus)
7. ClaudeSandboxHooks exports mutable harness.SandboxHooks pointer (5-agent consensus)
8. Duplicate ExtractTranscripts — deprecated copy with no remaining callers (4 agents)

See inline comments for details and suggestions.

Comment thread internal/runtime/claude.go
Comment thread internal/runtime/claude_test.go
Comment thread internal/runtime/bootstrap.go Outdated
Comment thread internal/cli/bootstrap_scan.go
Comment thread internal/cli/run.go
Comment thread internal/runtime/runtime.go
Comment thread internal/security/claude_hooks.go
Comment thread internal/sandbox/sandbox.go Outdated
Copy link
Copy Markdown
Contributor

@ralphbean ralphbean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs a few changes before we can merge. See inline comments.

Comment thread internal/runtime/runtime.go
Comment thread internal/runtime/claude_test.go
Comment thread internal/runtime/runtime.go Outdated
Rebase onto main (remote resource resolve, --offline) while keeping the
runtime abstraction. Port dropped command/plugin tests, drop duplicate
sandbox.ExtractTranscripts, remove runtime→harness import, and move
ClaudeHooksBootstrap to claude_bootstrap.go.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Comment thread internal/cli/bootstrap_scan.go
Comment thread internal/runtime/claude.go
Comment thread internal/cli/run.go
@fullsend-ai-review fullsend-ai-review Bot added ready-for-merge All reviewers approved — ready to merge and removed requires-manual-review Review requires human judgment labels Jun 2, 2026
Add docs/runtimes.md, split TranscriptHandler from Runtime, narrow
ClaudeSandboxHooks accessors, add bootstrap_scan tests, and document
scan/upload TOCTOU follow-up.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Reject empty agent path before host-side scan. Document why plugin
settings and security hooks use separate settings.json paths.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Comment thread internal/runtime/claude.go
Comment thread internal/cli/run.go
@fullsend-ai-review fullsend-ai-review Bot added ready-for-merge All reviewers approved — ready to merge and removed ready-for-merge All reviewers approved — ready to merge labels Jun 2, 2026
Comment thread internal/runtime/claude.go
Comment thread internal/runtime/sanitize.go Outdated
Comment thread docs/runtimes.md
Comment thread internal/cli/run.go Outdated
Comment thread internal/runtime/claude.go Outdated
Comment thread internal/runtime/claude.go Outdated
ifireball and others added 2 commits June 3, 2026 12:05
Validate agent path in Bootstrap, skip empty skill/plugin dirs, use
ConfigDir/WorkspaceDir methods instead of sandbox constants, and
unexport sanitizeOutput.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@ifireball
Copy link
Copy Markdown
Contributor Author

All review threads from the first pass are resolved, and follow-ups from @maruiz93 / @rh-hemartin / fullsend-ai-review are addressed in 8bf10e4c (bootstrap path validation, WorkspaceDir(), unexported sanitizeOutput, merge with main).

Still blocking merge: CHANGES_REQUESTED from @ralphbean — could you take another look and re-approve when satisfied? @waynesun09 same if anything remains on your side.

TOCTOU hardening remains tracked in #1801 (accepted risk for this PR).

@fullsend-ai-review fullsend-ai-review Bot added requires-manual-review Review requires human judgment and removed ready-for-merge All reviewers approved — ready to merge labels Jun 3, 2026
bootstrapCommon is runner-level; Claude hook dirs belong in
installClaudeHooks behind ClaudeHooksBootstrap.

Signed-off-by: Barak Korren <bkorren@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@fullsend-ai-review fullsend-ai-review Bot added requires-manual-review Review requires human judgment and removed requires-manual-review Review requires human judgment labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

requires-manual-review Review requires human judgment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants