Skip to content

Commit 07cc5b5

Browse files
authored
Feat execute bash (#119)
* feat: add execute_bash tool — sandboxed bash interpreter via just-bash Adds a new execute_bash tool that runs bash commands inside the Hyperlight micro-VM using just-bash (pure TypeScript bash interpreter). Commands run in hardware-isolated QuickJS with no host process access. 40+ Unix commands available: ls, cat, grep, jq, sed, awk, sort, find, tree, diff, head, tail, wc, cut, tr, xargs, tee, cp, mkdir, touch, base64, md5sum, date, env vars, pipes, redirects, and more. Architecture: - just-bash bundled via esbuild with node: API stubs (1.4 MB) - Polyfills for URL, Buffer, process, AbortController, crypto, setTimeout - Auto-registered internal handler (sys-bash-runner) - Stateless: fresh Bash instance per call - Heap auto-expanded to 64 MB on first use - Rebuilds automatically as part of npm run build:modules Key design decisions: - Tool, not module: LLM calls execute_bash({command}) directly - No in-memory FS: all file ops go through host:fs-read/host:fs-write - curl requires fetch plugin enabled (actionable error if not) - Unsupported commands (rm, mv, python3, sqlite3) return clear errors - Command list in tool description for strong LLM guidance Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * feat: add execute_bash to system message workflow guidance The LLM didn't know about execute_bash because the system prompt only mentioned register_handler + execute_javascript. Added bash tool guidance between WORKFLOW and DIRECT FILE I/O sections. Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * feat: wire execute_bash to show-code and code log Shows '$ command' inline when show-code is enabled and writes bash commands to the code log file alongside JS handler code. Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * fix: add execute_bash to ALLOWED_TOOLS gating list Without this, the tool exists but the SDK blocks the LLM from calling it. Absolute rookie mistake. Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * fix: add execute_bash to all skill allowed-tools lists When a skill is active, the SDK only exposes tools listed in the skill's allowed-tools. execute_bash was missing from all 9 skills, so the LLM couldn't see or use it when any skill was active. Also adds a pattern-integrity test that verifies every defineTool in index.ts has a corresponding entry in ALLOWED_TOOLS — catches this class of bug automatically. Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * chore: gitignore output dirs and temp bundle file Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * fix: add execute_bash to availableTools — the THIRD allowlist The SDK has THREE places that control tool visibility: 1. tools[] — implementation registration 2. ALLOWED_TOOLS — pre-execution gate (tool-gating.ts) 3. availableTools[] — tells SDK what to show the model execute_bash was in 1 and 2 but not 3, so the model literally couldn't see it. Added test that verifies every defineTool has entries in BOTH ALLOWED_TOOLS AND availableTools. Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * feat: native RDRAND-backed crypto and Math.random for sandbox Adds hardware random number generation to the Hyperlight QuickJS runtime using x86_64 RDRAND instruction: - crypto.getRandomValues(Uint8Array) — fills array with HW random bytes - crypto.randomUUID() — generates RFC 4122 v4 UUIDs with proper version/variant bits, backed by RDRAND (not a fake hex generator) - Math.random() — overrides QuickJS built-in with RDRAND-backed implementation, enabling awk rand() in just-bash Also fixes execute_bash: - Auto-configures heap (64MB) and buffers (2048KB) at startup when ha:bash module is present (not lazily on first call) - Stubs sql.js to reduce bundle size - Removes JS crypto polyfill (now native) - Fixes curl listed as 'NOT AVAILABLE' in tool description (curl is supported when fetch plugin is enabled) Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * fix: isolate bash from JS sandbox — dedicated sandbox + blocked from JS APIs - Bash runs in its own dedicated sandbox (getBashSandbox) with only ha:bash loaded — no pptx/pdf/xlsx memory competition - Excluded ha:bash from main JS sandbox module loading - Filtered bash from list_modules output (LLM won't see it) - module_info('bash') returns clear error: 'use execute_bash tool' - Fixed contradictory system message ('No bash' vs bash tool section) - JS handlers cannot import ha:bash — it only exists in the bash sandbox Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * fix: actionable memory error guidance for both execute_javascript and execute_bash - onPostToolUse now handles execute_bash memory errors (was JS only) - Clearly tells LLM to increase SCRATCH for 'Out of physical memory' and HEAP for 'malloc failed' — with specific configure_sandbox call - configure_sandbox description now lists common memory errors and which setting to change for each Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * feat: crypto.subtle.digest + Buffer.toString(encoding) for bash sandbox - Add sha1/sha2 RustCrypto crates (no_std, audited) to native-globals - Implement crypto.subtle.digest('SHA-1'|'SHA-256', data) → Promise<ArrayBuffer> matching the Web Crypto API — fixes sha256sum/sha1sum in just-bash - Fix Buffer polyfill: HaBuffer subclass with .toString(encoding) supporting base64, latin1/binary, hex, utf-8 — fixes silent data corruption in curl (Basic auth), yq, and unguarded Buffer.from().toString() paths - Restore md5sum/sha256sum to BASH_SUPPORTED_COMMANDS (were removed when crypto.subtle was missing — now backed by RustCrypto) - Add POSIX-only flag caveat to execute_bash description - Update large-output guidance to mention execute_bash as alternative to handler-based file processing Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * fix: use format! instead of alloc::format! for clippy without hyperlight cfg The CI runs clippy without the hyperlight cfg flag, so 'extern crate alloc' is not active and alloc::format! fails with E0433. Plain format! works in both std and no_std+alloc contexts via the prelude. Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * fix: address all Copilot PR review comments (#119) - lib.rs: update module docs — RDRAND not xorshift128+ PRNG - lib.rs: add cfg(target_arch=x86_64) guard on rdrand64 with fallback - lib.rs: use #[macro_use] extern crate alloc for format! in no_std - lib.rs: fix crypto.subtle.digest to respect byteOffset/byteLength on ArrayBufferViews (subarray would hash wrong data) - node-path-stub.mjs: fix normalize('foo/..') returning 'foo' instead of '.' — pop when out.length > 0, preserve '..' for relative paths - index.ts: sync plugins to bashSandbox via syncPluginsToSandbox() so curl/fs-read/fs-write/MCP work in execute_bash - index.ts: reset bashSandbox on reset_sandbox (was leaking old sandbox) - index.ts: separate memory error guidance for bash vs JS sandbox (bash has fixed limits, configure_sandbox doesn't affect it) - system-message.ts: fix 'No curl' → 'same plugins', add crypto to AVAILABLE GLOBALS (getRandomValues, randomUUID, subtle.digest) - Remove unused scripts/bash-bundle/build-ha-bash.mjs Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * fix: add curl to bash commands, improve /sessions and /resume UX - Add 'curl' to BASH_SUPPORTED_COMMANDS — it was in just-bash but missing from the allowlist, causing 'command not found' - /sessions now shows full session UUIDs (not truncated) so users can copy-paste into /resume - /sessions -all now works as alias for --all - /resume without an ID shows interactive numbered picker (top 20 most recent sessions) — type a number to resume, like az CLI subscription selector Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * fix: wire fetch plugin into bash curl + fix arrow keys in REPL - curl was in BASH_SUPPORTED_COMMANDS but just-bash's curl needs a fetch function to make HTTP requests. Build a Web Fetch adapter over host:fetch (fetchText/post) and pass it to the Bash constructor. curl now works when the fetch plugin is enabled. - Fix arrow keys printing ^[[A after slash commands: use rl.prompt() instead of raw process.stdout.write() in questionCapturingPaste() so readline properly manages terminal raw mode Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * fix: native module discovery, curl, FS wiring, SIGILL, docs, tests Native modules (ha:html, ha:markdown, ha:image, ha:ziplib): - Add NATIVE_MODULES to sandbox getAvailableModules() so validator accepts imports (was rejecting 'ha:html is not available') - listModules() and loadModule() now find native Rust modules via .json metadata + .d.ts exports (no .js file needed) - 8 new tests: native module visibility, metadata, handler registration Bash curl: - Replace broken lazy-loaded curl with defineCommand custom command that bridges host:fetch (just-bash's lazy loader breaks in QuickJS) - Fix FetchResult type: Record<string,string> not Map for headers Bash filesystem: - Wire host:fs-read/fs-write into bash via hybrid IFileSystem adapter (delegates to plugins, falls back to InMemoryFs) - Add cross-tool file access guidance to tool description - Remove html-to-markdown from bash (no proper no_std crate available) Build fixes: - build.mjs: fix require() in ESM — use imported unlinkSync - build-modules.js: move bash bundle before hash update step - Export defineCommand from entry.mjs for custom bash commands Process fixes: - Clean up bashSandbox before exit (prevents SIGILL on shutdown) - Add 'Shutting down — please wait...' message on exit/SIGINT - /sessions -all alias, full UUID display, interactive /resume picker Docs: - Update README, ARCHITECTURE, SECURITY, USAGE, HOW-IT-WORKS for execute_bash tool (SDK bash still blocked, sandboxed bash available) Commands: - Add missing commands: strings, split, sha1sum, gzip/gunzip/zcat, xan, pwd, false - Remove rm/mv/ln/rmdir (no delete/rename in security model) - Custom 'commands' command lists actual available commands Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> * fix: native module tests fail on CI — copy .d.ts/.json in beforeAll Tests for native module discovery (html, markdown, image, ziplib) assumed ~/.hyperagent/modules/ already had metadata files from a previous agent run. On CI the directory is empty. Added beforeAll that copies .json and .d.ts from builtin-modules/ to the modules dir before testing, matching what loadAllModules() does at startup. Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com> --------- Signed-off-by: Simon Davies <simongdavies@users.noreply.github.com>
1 parent 22256d1 commit 07cc5b5

46 files changed

Lines changed: 3036 additions & 153 deletions

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

.gitignore

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -52,4 +52,6 @@ plugins/shared/*.js
5252
plugins/plugin-schema-types.d.ts
5353
plugins/plugin-schema-types.js
5454
plugins/host-modules.d.ts
55-
output-hyperagent**/**
55+
output-hyperagent**/**scripts/bash-bundle/_tmp_bundle.js
56+
output-hyperagent-*/
57+
scripts/bash-bundle/_tmp_bundle.js

Justfile

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -148,6 +148,10 @@ setup: ensure-tools install
148148
build: install
149149
@echo "✅ Build complete — run 'just start' to launch the agent"
150150

151+
# Rebuild the ha:bash bundle from just-bash (only needed when just-bash updates)
152+
build-bash:
153+
node scripts/bash-bundle/build.mjs
154+
151155
# Build everything in release mode (hyperlight-js, guest runtime, NAPI addon)
152156
build-release: install-release
153157
@echo "✅ Release build complete — run 'just start-release' to launch"

README.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -125,6 +125,7 @@ The sandbox has no direct filesystem, network, shell, or process access. Capabil
125125
| ---------------- | ---------------------------------------------------------------- |
126126
| Files | `fs-read` and `fs-write` plugins with path jails |
127127
| HTTP | `fetch` plugin with domain allowlists and SSRF checks |
128+
| Bash commands | `execute_bash` — sandboxed pure-JS interpreter (ls, grep, jq, curl, etc.) |
128129
| Reusable code | `ha:*` system and user modules |
129130
| External systems | MCP servers exposed as typed `host:mcp-*` modules |
130131
| Bigger jobs | Profiles that raise limits; profile tools can enable plugin sets |
@@ -235,7 +236,7 @@ MCP servers are not Hyperlight-sandboxed; they run as normal host processes. Rev
235236
HyperAgent is designed to make generated-code execution less terrifying, not magically safe.
236237

237238
- **Hardware isolation:** JavaScript runs in Hyperlight micro-VMs.
238-
- **Tool gating:** SDK built-ins like shell, edit, and grep are blocked; the model gets HyperAgent-specific tools.
239+
- **Tool gating:** SDK built-ins like shell, edit, and grep are blocked; the model gets HyperAgent-specific tools including a sandboxed bash interpreter (`execute_bash`).
239240
- **Code validation:** Generated JavaScript is checked before execution.
240241
- **No ambient host access:** Files, network, and external systems require explicit plugins or MCP connections.
241242
- **Plugin auditing:** Plugin code is audited before use.

builtin-modules/bash.json

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
{
2+
"name": "bash",
3+
"description": "Sandboxed bash interpreter powered by just-bash. Provides 40+ Unix commands (ls, grep, jq, curl, sed, awk, etc.) running entirely in JavaScript inside the Hyperlight micro-VM.",
4+
"author": "system",
5+
"mutable": false,
6+
"type": "script",
7+
"sourceHash": "sha256:31219cbf85cf26d1",
8+
"hints": {
9+
"overview": "Pure-JS bash interpreter for the sandbox. Used internally by the execute_bash tool — not intended for direct import in handlers."
10+
}
11+
}

docs/ARCHITECTURE.md

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,7 @@ This document describes Hyperagent's system architecture.
1212
│ streaming | infinite sessions | multi-model │
1313
├───────────────────────────────────────────────────────┤
1414
│ Tool Gating Layer │
15-
│ (blocks all SDK built-in tools like bash/edit)
15+
│ (blocks SDK built-in tools like edit/grep/write)
1616
├───────────────────────────────────────────────────────┤
1717
│ Custom Tools │
1818
│ register_handler, execute_javascript, ask_user │
@@ -51,8 +51,9 @@ The main agent file handles:
5151
### Tool Gating (`src/agent/tool-gating.ts`)
5252

5353
Intercepts all tool calls from the LLM and:
54-
- Blocks most GitHub Copilot SDK built-in tools (bash, edit, grep, read, write)
55-
- Allows only registered custom tools
54+
- Blocks most GitHub Copilot SDK built-in tools (edit, grep, read, write)
55+
- Allows only registered custom tools (including `execute_bash` which runs
56+
a sandboxed pure-JS bash interpreter, not a host shell)
5657
- Logs blocked attempts for debugging
5758

5859
### Sandbox Tool (`src/sandbox/tool.js`)

docs/HOW-IT-WORKS.md

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -105,9 +105,10 @@ All code runs in Hyperlight micro-VMs:
105105

106106
The LLM cannot escape the sandbox:
107107

108-
- Most GitHub Copilot SDK built-in tools (bash, edit, grep, read, write) are **blocked** the exceptions being `ask_user` (questions) and `report_intent` (protocol)
109-
- All functionality comes from custom Hyperagent tools (`execute_javascript`, `register_handler`, etc.)
110-
- Even if the LLM tries to use bash, it won't work
108+
- Most GitHub Copilot SDK built-in tools (edit, grep, read, write) are **blocked** the exceptions being `ask_user` (questions) and `report_intent` (protocol)
109+
- The SDK's built-in bash tool is also blocked — HyperAgent provides its own `execute_bash` which runs a pure-JS bash interpreter inside the sandbox
110+
- All functionality comes from custom Hyperagent tools (`execute_javascript`, `register_handler`, `execute_bash`, etc.)
111+
- Even if the LLM tries to use the SDK's bash, it won't work — but `execute_bash` is available for shell-style data processing
111112

112113
### Code Validation
113114

docs/SECURITY.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,13 +12,15 @@ Hyperagent implements defense-in-depth security through multiple layers. No sing
1212

1313
The Copilot SDK provides built-in tools (bash, edit, grep, read, write) that would allow arbitrary code execution. Hyperagent blocks most of them, allowing only safe tools like `ask_user` (for user interaction) and `report_intent` (protocol).
1414

15+
Hyperagent provides its own `execute_bash` tool — a pure-JS bash interpreter (just-bash) running **inside** the Hyperlight sandbox. This is NOT the SDK's bash tool and does NOT have host shell access. It supports ~50 common commands (ls, grep, jq, curl, sed, awk, etc.) with the same isolation guarantees as `execute_javascript`.
16+
1517
**Implementation** (`src/agent/tool-gating.ts`):
1618
- Intercepts all tool calls from the LLM
1719
- Maintains an allowlist of custom tools plus safe SDK tools
1820
- Rejects any tool not on the allowlist
1921
- Logs blocked attempts for debugging
2022

21-
**Effect**: The LLM cannot escape the sandbox by calling SDK tools. Even if prompted to "run bash", the tool call is rejected.
23+
**Effect**: The LLM cannot escape the sandbox by calling SDK tools. Even if prompted to "run bash", the SDK's bash tool is rejected. The sandboxed `execute_bash` runs inside a micro-VM with no host shell access.
2224

2325
### Layer 2: Hyperlight Micro-VMs
2426

docs/USAGE.md

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -108,13 +108,15 @@ Toggle options at runtime without restarting. Type `/` and press Tab for complet
108108

109109
## LLM Tools
110110

111-
The agent registers custom tools that the LLM can call. All SDK built-in tools
112-
(bash, grep, edit, etc.) are **blocked** by the tool gating layer.
111+
The agent registers custom tools that the LLM can call. Most SDK built-in tools
112+
(grep, edit, etc.) are **blocked** by the tool gating layer. The SDK's bash tool
113+
is also blocked — `execute_bash` is Hyperagent's own sandboxed bash interpreter.
113114

114115
| Tool | Purpose |
115116
| -------------------- | -------------------------------------------------------------------- |
116117
| `register_handler` | Register named JavaScript handler code in the sandbox |
117118
| `execute_javascript` | Execute a registered handler with optional event data |
119+
| `execute_bash` | Run bash commands in a sandboxed pure-JS interpreter (just-bash) |
118120
| `delete_handler` | Remove a handler from the sandbox |
119121
| `get_handler_source` | Retrieve handler source for inspection or editing |
120122
| `edit_handler` | Surgically edit an existing handler |

0 commit comments

Comments
 (0)