fix(security): close rm flag-ordering bypasses by kushalsai-01 · Pull Request #2718 · OpenHands/software-agent-sdk

kushalsai-01 · 2026-04-06T09:36:30Z

Summary

This PR closes a security gap where rm commands could bypass safeguards by reordering -r and -f flags or inserting additional options between them.

The validation logic now correctly detects recursive and force flags regardless of ordering or intervening arguments, preventing evasion of the safety checks.

Changes

Normalize and validate rm flag combinations independent of order
Handle cases with interleaved flags and options
Strengthen command parsing to prevent bypass scenarios
Add regression tests covering reported evasion patterns

Testing

Added unit tests for:
- rm -rf, rm -fr
- Flags separated by other options (e.g., rm -r --verbose -f)
- Edge cases involving mixed/invalid ordering
All existing tests pass

Notes

This change is backward-compatible and only affects validation logic for potentially destructive rm commands.

Checklist

Tests added/updated
Verified changes locally
CI passing (or expected to pass)

kushalsai-01 · 2026-04-06T09:40:22Z

Hi @maintainers, I’ve addressed the issue related to rm recursive-force flag ordering bypass and added regression tests for the reported cases.

Could you please review when you get a chance? Happy to make any changes if needed.

Copilot

Pull request overview

This PR aims to close a security gap in the SDK’s defense-in-depth analyzers where destructive rm invocations could evade detection by reordering -r/-f flags or inserting other options between them.

Changes:

Updated rm -rf detection regexes to allow intervening options and any flag ordering.
Extended policy-rail catastrophic-delete detection to use the updated rm flag detection.
Added regression tests for intervening/split-flag variants.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File	Description
`openhands-sdk/openhands/sdk/security/defense_in_depth/pattern.py`	Updates the HIGH-risk `rm -rf` pattern to detect re-ordered/interleaved flags.
`openhands-sdk/openhands/sdk/security/defense_in_depth/policy_rails.py`	Updates the catastrophic-delete rail’s `rm` recursive+force detection logic.
`tests/sdk/security/defense_in_depth/test_pattern.py`	Adds new HIGH test cases for intervening/split `rm` flags.
`tests/sdk/security/defense_in_depth/test_policy_rails.py`	Adds new catastrophic-delete rail tests for intervening/split `rm` flags.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-06T09:41:22Z

+        r"\brm\b(?:(?![;&|]).){0,120}"
+        r"(?:(?:--recursive\b|-[^\s-]*[rR][^\s]*)"
+        r"(?:(?![;&|]).){0,120}(?:--force\b|-[^\s-]*f[^\s]*)"
+        r"|(?:--force\b|-[^\s-]*f[^\s]*)"
+        r"(?:(?![;&|]).){0,120}(?:--recursive\b|-[^\s-]*[rR][^\s]*))",


The new rm -rf detector relies on multiple .{0,120} bounded spans. This still allows bypass by inserting a long positional argument / option value (e.g., a >120-char filename) between rm and -r/-f, or between the two flags, which GNU rm accepts and would evade this regex. Consider switching to a token-based scan (anchor on rm, then scan the rest of the same command segment for recursive+force flags) or otherwise remove the fixed-length dependency so detection can’t be defeated by long-but-valid arguments.

Suggested change

r"\brm\b(?:(?![;&|]).){0,120}"

r"(?:(?:--recursive\b|-[^\s-]*[rR][^\s]*)"

r"(?:(?![;&|]).){0,120}(?:--force\b|-[^\s-]*f[^\s]*)"

r"|(?:--force\b|-[^\s-]*f[^\s]*)"

r"(?:(?![;&|]).){0,120}(?:--recursive\b|-[^\s-]*[rR][^\s]*))",

r"\brm\b"

r"(?=[^;&|]*(?:--recursive\b|-[^\s-]*[rR][^\s]*))"

r"(?=[^;&|]*(?:--force\b|-[^\s-]*f[^\s]*))"

r"[^;&|]*",

Copilot · 2026-04-06T09:41:22Z

+        r"\brm\b(?:(?![;&|]).){0,120}"
+        r"(?:(?:--recursive\b|-[^\s-]*[rR][^\s]*)"
+        r"(?:(?![;&|]).){0,120}(?:--force\b|-[^\s-]*f[^\s]*)"
+        r"|(?:--force\b|-[^\s-]*f[^\s]*)"
+        r"(?:(?![;&|]).){0,120}(?:--recursive\b|-[^\s-]*[rR][^\s]*))",


The (?![;&|]) tempered-dot treats any & as a command boundary, but & is also valid inside redirections (e.g., 2>&1, &>/dev/null). An attacker can place such a redirection between -r and -f (e.g., rm -r 2>&1 -f /) and bypass detection while the command remains a single rm invocation. Consider narrowing the boundary logic to actual shell operators (;, |, &&, background &) rather than any raw & character, or use a small shell-operator tokenizer for the segment.

Copilot · 2026-04-06T09:41:22Z

        has_recursive_force = bool(
            re.search(
-                r"\brm\s+(?:-[frR]{2,}|-[rR]\s+-f|-f\s+-[rR]"
-                r"|--recursive\s+--force|--force\s+--recursive)\b",
+                r"\brm\b(?:(?![;&|]).){0,120}"
+                r"(?:(?:--recursive\b|-[^\s-]*[rR][^\s]*)"
+                r"(?:(?![;&|]).){0,120}(?:--force\b|-[^\s-]*f[^\s]*)"
+                r"|(?:--force\b|-[^\s-]*f[^\s]*)"
+                r"(?:(?![;&|]).){0,120}(?:--recursive\b|-[^\s-]*[rR][^\s]*))",
                seg,
                ci,
            )


This same complex rm -rf regex is duplicated here and in pattern.py. With the added complexity (tempered dots, ordering alternation), keeping two copies increases the risk of future drift/fixes being applied in only one analyzer. Consider factoring the rm recursive+force detection into a shared helper/constant used by both analyzers.

Copilot · 2026-04-06T09:41:23Z

            re.search(
-                r"\brm\s+(?:-[frR]{2,}|-[rR]\s+-f|-f\s+-[rR]"
-                r"|--recursive\s+--force|--force\s+--recursive)\b",
+                r"\brm\b(?:(?![;&|]).){0,120}"
+                r"(?:(?:--recursive\b|-[^\s-]*[rR][^\s]*)"
+                r"(?:(?![;&|]).){0,120}(?:--force\b|-[^\s-]*f[^\s]*)"
+                r"|(?:--force\b|-[^\s-]*f[^\s]*)"
+                r"(?:(?![;&|]).){0,120}(?:--recursive\b|-[^\s-]*[rR][^\s]*))",
                seg,
                ci,
            )


Like pattern.py, this rm -rf detector depends on fixed .{0,120} windows. A long-but-valid argument between rm and -r/-f (or between the flags) can exceed the bound and prevent the catastrophic-delete rail from firing. Since this rail gates RAIL_CATASTROPHIC_DELETE, consider a token-based scan within the segment (up to the actual command operator boundary) so the rail can’t be bypassed by argument length.

Copilot · 2026-04-06T09:41:23Z

+

The PR description/linked issue call out “post-argument flags” bypasses (e.g., rm / -rf, rm /etc -r -f), but the new regression tests added here don’t cover those cases. Adding explicit cases for flags appearing after positional arguments (and ideally a redirection-interleaving case like rm -r 2>&1 -f /) would better ensure the bypass class is closed.

Suggested change

def test_catastrophic_delete_root_with_post_argument_flags(self):

decision = _evaluate_rail("rm / -rf")

assert decision.outcome == SecurityRisk.HIGH

assert decision.rule_name == RAIL_CATASTROPHIC_DELETE

def test_catastrophic_delete_non_root_target_with_post_argument_flags(self):

decision = _evaluate_rail("rm /etc -r -f")

assert decision.outcome == SecurityRisk.HIGH

assert decision.rule_name == RAIL_CATASTROPHIC_DELETE

def test_catastrophic_delete_with_redirection_interleaved_between_flags(self):

decision = _evaluate_rail("rm -r 2>&1 -f /")

assert decision.outcome == SecurityRisk.HIGH

assert decision.rule_name == RAIL_CATASTROPHIC_DELETE

Copilot · 2026-04-06T09:41:23Z

 _HIGH_CASES = [
    ("rm -rf /", "rm -rf"),
    ("rm -fr /home", "rm -fr"),
    ("rm -r -f /tmp", "rm -r -f"),
+    ("rm --no-preserve-root -rf /", "rm with intervening long flag"),
+    ("rm --preserve-root=all -r -f /tmp", "rm split flags with long option"),
    ("rm --recursive --force /", "rm --recursive --force"),
    ("sudo rm secret.db", "sudo rm"),


New HIGH cases cover intervening long options, but the reported “post-argument flags” bypasses (e.g., rm / -rf, rm /etc -r -f) still aren’t represented in this parametrized list. Adding those commands (and a redirection-interleaving variant like rm -r 2>&1 -f /) would make the regression suite align with the PR description and ensure ordering/position invariance holds.

kushalsai-01 · 2026-04-06T09:48:53Z

Thanks for the detailed review.

I see the concerns around the bounded regex (.{0,120}) allowing bypass via long arguments, as well as edge cases like redirection interleaving and post-argument flags. I’ll rework the detection to remove fixed-length constraints and move toward a more robust approach, along with adding the suggested regression tests and reducing duplication.

Pushing an update shortly.

kushalsai-01 · 2026-04-06T09:50:48Z

Addressed the review feedback:

Replaced regex with token-based parsing (shlex)
Removed fixed-length bounds to prevent bypass via long arguments
Handled redirections and command segmentation correctly
Added regression tests for post-argument flags and edge cases
Refactored shared detection logic

Workflows are awaiting approval—once approved, CI should run.

Would appreciate a re-review.

Close flag-ordering bypasses by matching rm recursive and force flags in any order with intervening options, and add regression tests for the reported evasion cases. Made-with: Cursor

Replace bounded rm regex matching with shared token-based shell segment parsing, reuse it across pattern and policy analyzers, and add regression tests for reordered flags, redirections, long arguments, and substring safety. Made-with: Cursor

kushalsai-01 · 2026-04-07T04:10:02Z

Hi @maintainers — could you please approve workflows so CI can run? I’ve rebased onto the latest main. Thanks.

all-hands-bot · 2026-04-12T12:25:32Z

[Automatic Post]: It has been a while since there was any activity on this PR. @kushalsai-01, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

all-hands-bot · 2026-04-16T12:39:03Z

[Automatic Post]: This PR seems to be currently waiting for review. @copilot-pull-request-reviewer[bot], could you please take a look when you have a chance?

Fieldnote-Echo · 2026-04-19T19:25:20Z

@kushalsai-01, thanks for picking this up. As the issue author, the scope in #2707 is met: your changes catch intervening flags (rm --no-preserve-root -rf /), post-argument flags (rm / -rf), compact/reversed short flags (-rf, -fr), and long flags in any order. Bonus, your shlex-based tokenization also closes path-qualified commands (/bin/rm -rf /) and quote-based evasions (r"m" -rf /) that were on my follow-up list but out of scope for #2707.

A few things I liked specifically:

Single _has_rm_recursive_force shared across PatternSecurityAnalyzer and the catastrophic-delete rail, no duplicated detection logic
Env-assignment prefix handling (FOO=bar rm -rf /) and redirection-token skip (rm -r 2>&1 -f /) are both nice touches I didn't think to call out
Negative tests on germ -rf /, rm -r alone, and rm -f alone, the kind of guard that prevents FP regressions later
Conservative docstring ("does not attempt a full shell grammar") sets expectations for the next person

One note for reviewers: I support the tree-sitter direction in #2721, but that's a cross-package migration on its own timeline. This PR solves a real bypass class now and is merge-shaped and clean.

Your CI hasn't been able to run yet because workflows need maintainer approval for first-time contributors. @enyst, could you approve the workflow runs on this so we get a CI signal?

all-hands-bot · 2026-04-25T12:27:15Z

[Automatic Post]: It has been a while since there was any activity on this PR. @kushalsai-01, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

all-hands-bot · 2026-04-30T12:48:08Z

[Automatic Post]: It has been a while since there was any activity on this PR. @kushalsai-01, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

all-hands-bot · 2026-05-06T12:53:05Z

[Automatic Post]: It has been a while since there was any activity on this PR. @kushalsai-01, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

all-hands-bot · 2026-05-11T13:20:04Z

[Automatic Post]: It has been a while since there was any activity on this PR. @kushalsai-01, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

Fieldnote-Echo · 2026-05-13T17:29:07Z

@kushalsai-01, quick context update.

The maintainers are moving the broader shell-parsing layer to tree-sitter-bash (issue #2721, phased plan in VascoSch92's comment). Phase 1 (PR #3237) is already open and drops the GPLv3 bashlex dependency. Phase 2 migrates the security analyzer onto the shared parser.

That doesn't make this PR obsolete. It makes it the demonstration that motivated the bigger shift. Showing the bypass class was real and that regex couldn't close it is the work that put a structural parser on the table. The shlex direction was the right call for the surface available at the time, and the negative tests (germ -rf /, rm -r alone, rm -f alone) set the false-positive discipline the tree-sitter version will inherit.

If you're interested in helping with the migration, Vasco split Phase 2 into focused chunks (2a-2f). The pieces that overlap with what you built here are Phase 2c (policy_rails.py rewrite, which extends your _has_rm_recursive_force direction to the full rail) and Phase 2d (replaces the hand-rolled shlex parsing with tree-sitter queries). Either would build on your work directly.

Thanks for picking this up. Your PR is the reason the discussion moved past "another regex tweak" to "structural fix."

Copilot AI review requested due to automatic review settings April 6, 2026 09:36

Copilot started reviewing on behalf of kushalsai-01 April 6, 2026 09:37 View session

Copilot AI reviewed Apr 6, 2026

View reviewed changes

VascoSch92 mentioned this pull request Apr 6, 2026

Proposal: Replace regex-based shell command analysis with tree-sitter-bash #2721

Open

kushalsai-01 added 2 commits April 7, 2026 09:31

fix(security): harden rm recursive-force detection

8889ad8

Close flag-ordering bypasses by matching rm recursive and force flags in any order with intervening options, and add regression tests for the reported evasion cases. Made-with: Cursor

kushalsai-01 force-pushed the fix/security-rm-flag-ordering-bypass branch from bbcbbd9 to 594a039 Compare April 7, 2026 04:01

+    def test_catastrophic_delete_root_with_post_argument_flags(self):
+        decision = _evaluate_rail("rm / -rf")
+        assert decision.outcome == SecurityRisk.HIGH
+        assert decision.rule_name == RAIL_CATASTROPHIC_DELETE
+    def test_catastrophic_delete_non_root_target_with_post_argument_flags(self):
+        decision = _evaluate_rail("rm /etc -r -f")
+        assert decision.outcome == SecurityRisk.HIGH
+        assert decision.rule_name == RAIL_CATASTROPHIC_DELETE
+    def test_catastrophic_delete_with_redirection_interleaved_between_flags(self):
+        decision = _evaluate_rail("rm -r 2>&1 -f /")
+        assert decision.outcome == SecurityRisk.HIGH
+        assert decision.rule_name == RAIL_CATASTROPHIC_DELETE

Conversation

kushalsai-01 commented Apr 6, 2026

Summary

Changes

Testing

Notes

Checklist

Uh oh!

kushalsai-01 commented Apr 6, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

kushalsai-01 commented Apr 6, 2026

Uh oh!

kushalsai-01 commented Apr 6, 2026

Uh oh!

kushalsai-01 commented Apr 7, 2026

Uh oh!

all-hands-bot commented Apr 12, 2026

Uh oh!

all-hands-bot commented Apr 16, 2026

Uh oh!

Fieldnote-Echo commented Apr 19, 2026

Uh oh!

all-hands-bot commented Apr 25, 2026

Uh oh!

all-hands-bot commented Apr 30, 2026

Uh oh!

all-hands-bot commented May 6, 2026

Uh oh!

all-hands-bot commented May 11, 2026

Uh oh!

Fieldnote-Echo commented May 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants