Skip to content

fix(security): close rm flag-ordering bypasses#2718

Open
kushalsai-01 wants to merge 2 commits into
OpenHands:mainfrom
kushalsai-01:fix/security-rm-flag-ordering-bypass
Open

fix(security): close rm flag-ordering bypasses#2718
kushalsai-01 wants to merge 2 commits into
OpenHands:mainfrom
kushalsai-01:fix/security-rm-flag-ordering-bypass

Conversation

@kushalsai-01
Copy link
Copy Markdown
Contributor

Fixes #2707

Summary

This PR closes a security gap where rm commands could bypass safeguards by reordering -r and -f flags or inserting additional options between them.

The validation logic now correctly detects recursive and force flags regardless of ordering or intervening arguments, preventing evasion of the safety checks.

Changes

  • Normalize and validate rm flag combinations independent of order
  • Handle cases with interleaved flags and options
  • Strengthen command parsing to prevent bypass scenarios
  • Add regression tests covering reported evasion patterns

Testing

  • Added unit tests for:
    • rm -rf, rm -fr
    • Flags separated by other options (e.g., rm -r --verbose -f)
    • Edge cases involving mixed/invalid ordering
  • All existing tests pass

Notes

This change is backward-compatible and only affects validation logic for potentially destructive rm commands.

Checklist

  • Tests added/updated
  • Verified changes locally
  • CI passing (or expected to pass)

Copilot AI review requested due to automatic review settings April 6, 2026 09:36
@kushalsai-01
Copy link
Copy Markdown
Contributor Author

Hi @maintainers, I’ve addressed the issue related to rm recursive-force flag ordering bypass and added regression tests for the reported cases.

Could you please review when you get a chance? Happy to make any changes if needed.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to close a security gap in the SDK’s defense-in-depth analyzers where destructive rm invocations could evade detection by reordering -r/-f flags or inserting other options between them.

Changes:

  • Updated rm -rf detection regexes to allow intervening options and any flag ordering.
  • Extended policy-rail catastrophic-delete detection to use the updated rm flag detection.
  • Added regression tests for intervening/split-flag variants.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File Description
openhands-sdk/openhands/sdk/security/defense_in_depth/pattern.py Updates the HIGH-risk rm -rf pattern to detect re-ordered/interleaved flags.
openhands-sdk/openhands/sdk/security/defense_in_depth/policy_rails.py Updates the catastrophic-delete rail’s rm recursive+force detection logic.
tests/sdk/security/defense_in_depth/test_pattern.py Adds new HIGH test cases for intervening/split rm flags.
tests/sdk/security/defense_in_depth/test_policy_rails.py Adds new catastrophic-delete rail tests for intervening/split rm flags.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +73 to +77
r"\brm\b(?:(?![;&|]).){0,120}"
r"(?:(?:--recursive\b|-[^\s-]*[rR][^\s]*)"
r"(?:(?![;&|]).){0,120}(?:--force\b|-[^\s-]*f[^\s]*)"
r"|(?:--force\b|-[^\s-]*f[^\s]*)"
r"(?:(?![;&|]).){0,120}(?:--recursive\b|-[^\s-]*[rR][^\s]*))",
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new rm -rf detector relies on multiple .{0,120} bounded spans. This still allows bypass by inserting a long positional argument / option value (e.g., a >120-char filename) between rm and -r/-f, or between the two flags, which GNU rm accepts and would evade this regex. Consider switching to a token-based scan (anchor on rm, then scan the rest of the same command segment for recursive+force flags) or otherwise remove the fixed-length dependency so detection can’t be defeated by long-but-valid arguments.

Suggested change
r"\brm\b(?:(?![;&|]).){0,120}"
r"(?:(?:--recursive\b|-[^\s-]*[rR][^\s]*)"
r"(?:(?![;&|]).){0,120}(?:--force\b|-[^\s-]*f[^\s]*)"
r"|(?:--force\b|-[^\s-]*f[^\s]*)"
r"(?:(?![;&|]).){0,120}(?:--recursive\b|-[^\s-]*[rR][^\s]*))",
r"\brm\b"
r"(?=[^;&|]*(?:--recursive\b|-[^\s-]*[rR][^\s]*))"
r"(?=[^;&|]*(?:--force\b|-[^\s-]*f[^\s]*))"
r"[^;&|]*",

Copilot uses AI. Check for mistakes.
Comment on lines +73 to +77
r"\brm\b(?:(?![;&|]).){0,120}"
r"(?:(?:--recursive\b|-[^\s-]*[rR][^\s]*)"
r"(?:(?![;&|]).){0,120}(?:--force\b|-[^\s-]*f[^\s]*)"
r"|(?:--force\b|-[^\s-]*f[^\s]*)"
r"(?:(?![;&|]).){0,120}(?:--recursive\b|-[^\s-]*[rR][^\s]*))",
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The (?![;&|]) tempered-dot treats any & as a command boundary, but & is also valid inside redirections (e.g., 2>&1, &>/dev/null). An attacker can place such a redirection between -r and -f (e.g., rm -r 2>&1 -f /) and bypass detection while the command remains a single rm invocation. Consider narrowing the boundary logic to actual shell operators (;, |, &&, background &) rather than any raw & character, or use a small shell-operator tokenizer for the segment.

Copilot uses AI. Check for mistakes.
Comment on lines 87 to 96
has_recursive_force = bool(
re.search(
r"\brm\s+(?:-[frR]{2,}|-[rR]\s+-f|-f\s+-[rR]"
r"|--recursive\s+--force|--force\s+--recursive)\b",
r"\brm\b(?:(?![;&|]).){0,120}"
r"(?:(?:--recursive\b|-[^\s-]*[rR][^\s]*)"
r"(?:(?![;&|]).){0,120}(?:--force\b|-[^\s-]*f[^\s]*)"
r"|(?:--force\b|-[^\s-]*f[^\s]*)"
r"(?:(?![;&|]).){0,120}(?:--recursive\b|-[^\s-]*[rR][^\s]*))",
seg,
ci,
)
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This same complex rm -rf regex is duplicated here and in pattern.py. With the added complexity (tempered dots, ordering alternation), keeping two copies increases the risk of future drift/fixes being applied in only one analyzer. Consider factoring the rm recursive+force detection into a shared helper/constant used by both analyzers.

Copilot uses AI. Check for mistakes.
Comment on lines 88 to 96
re.search(
r"\brm\s+(?:-[frR]{2,}|-[rR]\s+-f|-f\s+-[rR]"
r"|--recursive\s+--force|--force\s+--recursive)\b",
r"\brm\b(?:(?![;&|]).){0,120}"
r"(?:(?:--recursive\b|-[^\s-]*[rR][^\s]*)"
r"(?:(?![;&|]).){0,120}(?:--force\b|-[^\s-]*f[^\s]*)"
r"|(?:--force\b|-[^\s-]*f[^\s]*)"
r"(?:(?![;&|]).){0,120}(?:--recursive\b|-[^\s-]*[rR][^\s]*))",
seg,
ci,
)
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like pattern.py, this rm -rf detector depends on fixed .{0,120} windows. A long-but-valid argument between rm and -r/-f (or between the flags) can exceed the bound and prevent the catastrophic-delete rail from firing. Since this rail gates RAIL_CATASTROPHIC_DELETE, consider a token-based scan within the segment (up to the actual command operator boundary) so the rail can’t be bypassed by argument length.

Copilot uses AI. Check for mistakes.
Comment on lines +89 to 90


Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description/linked issue call out “post-argument flags” bypasses (e.g., rm / -rf, rm /etc -r -f), but the new regression tests added here don’t cover those cases. Adding explicit cases for flags appearing after positional arguments (and ideally a redirection-interleaving case like rm -r 2>&1 -f /) would better ensure the bypass class is closed.

Suggested change
def test_catastrophic_delete_root_with_post_argument_flags(self):
decision = _evaluate_rail("rm / -rf")
assert decision.outcome == SecurityRisk.HIGH
assert decision.rule_name == RAIL_CATASTROPHIC_DELETE
def test_catastrophic_delete_non_root_target_with_post_argument_flags(self):
decision = _evaluate_rail("rm /etc -r -f")
assert decision.outcome == SecurityRisk.HIGH
assert decision.rule_name == RAIL_CATASTROPHIC_DELETE
def test_catastrophic_delete_with_redirection_interleaved_between_flags(self):
decision = _evaluate_rail("rm -r 2>&1 -f /")
assert decision.outcome == SecurityRisk.HIGH
assert decision.rule_name == RAIL_CATASTROPHIC_DELETE

Copilot uses AI. Check for mistakes.
Comment on lines 287 to 294
_HIGH_CASES = [
("rm -rf /", "rm -rf"),
("rm -fr /home", "rm -fr"),
("rm -r -f /tmp", "rm -r -f"),
("rm --no-preserve-root -rf /", "rm with intervening long flag"),
("rm --preserve-root=all -r -f /tmp", "rm split flags with long option"),
("rm --recursive --force /", "rm --recursive --force"),
("sudo rm secret.db", "sudo rm"),
Copy link

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New HIGH cases cover intervening long options, but the reported “post-argument flags” bypasses (e.g., rm / -rf, rm /etc -r -f) still aren’t represented in this parametrized list. Adding those commands (and a redirection-interleaving variant like rm -r 2>&1 -f /) would make the regression suite align with the PR description and ensure ordering/position invariance holds.

Copilot uses AI. Check for mistakes.
@kushalsai-01
Copy link
Copy Markdown
Contributor Author

Thanks for the detailed review.

I see the concerns around the bounded regex (.{0,120}) allowing bypass via long arguments, as well as edge cases like redirection interleaving and post-argument flags. I’ll rework the detection to remove fixed-length constraints and move toward a more robust approach, along with adding the suggested regression tests and reducing duplication.

Pushing an update shortly.

@kushalsai-01
Copy link
Copy Markdown
Contributor Author

Addressed the review feedback:

  • Replaced regex with token-based parsing (shlex)
  • Removed fixed-length bounds to prevent bypass via long arguments
  • Handled redirections and command segmentation correctly
  • Added regression tests for post-argument flags and edge cases
  • Refactored shared detection logic

Workflows are awaiting approval—once approved, CI should run.

Would appreciate a re-review.

Close flag-ordering bypasses by matching rm recursive and force flags in any order with intervening options, and add regression tests for the reported evasion cases.

Made-with: Cursor
Replace bounded rm regex matching with shared token-based shell segment parsing, reuse it across pattern and policy analyzers, and add regression tests for reordered flags, redirections, long arguments, and substring safety.

Made-with: Cursor
@kushalsai-01 kushalsai-01 force-pushed the fix/security-rm-flag-ordering-bypass branch from bbcbbd9 to 594a039 Compare April 7, 2026 04:01
@kushalsai-01
Copy link
Copy Markdown
Contributor Author

Hi @maintainers — could you please approve workflows so CI can run? I’ve rebased onto the latest main. Thanks.

@all-hands-bot
Copy link
Copy Markdown
Collaborator

[Automatic Post]: It has been a while since there was any activity on this PR. @kushalsai-01, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

@all-hands-bot
Copy link
Copy Markdown
Collaborator

[Automatic Post]: This PR seems to be currently waiting for review. @copilot-pull-request-reviewer[bot], could you please take a look when you have a chance?

@Fieldnote-Echo
Copy link
Copy Markdown
Contributor

@kushalsai-01, thanks for picking this up. As the issue author, the scope in #2707 is met: your changes catch intervening flags (rm --no-preserve-root -rf /), post-argument flags (rm / -rf), compact/reversed short flags (-rf, -fr), and long flags in any order. Bonus, your shlex-based tokenization also closes path-qualified commands (/bin/rm -rf /) and quote-based evasions (r"m" -rf /) that were on my follow-up list but out of scope for #2707.

A few things I liked specifically:

  • Single _has_rm_recursive_force shared across PatternSecurityAnalyzer and the catastrophic-delete rail, no duplicated detection logic
  • Env-assignment prefix handling (FOO=bar rm -rf /) and redirection-token skip (rm -r 2>&1 -f /) are both nice touches I didn't think to call out
  • Negative tests on germ -rf /, rm -r alone, and rm -f alone, the kind of guard that prevents FP regressions later
  • Conservative docstring ("does not attempt a full shell grammar") sets expectations for the next person

One note for reviewers: I support the tree-sitter direction in #2721, but that's a cross-package migration on its own timeline. This PR solves a real bypass class now and is merge-shaped and clean.

Your CI hasn't been able to run yet because workflows need maintainer approval for first-time contributors. @enyst, could you approve the workflow runs on this so we get a CI signal?

@all-hands-bot
Copy link
Copy Markdown
Collaborator

[Automatic Post]: It has been a while since there was any activity on this PR. @kushalsai-01, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

3 similar comments
@all-hands-bot
Copy link
Copy Markdown
Collaborator

[Automatic Post]: It has been a while since there was any activity on this PR. @kushalsai-01, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

@all-hands-bot
Copy link
Copy Markdown
Collaborator

[Automatic Post]: It has been a while since there was any activity on this PR. @kushalsai-01, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

@all-hands-bot
Copy link
Copy Markdown
Collaborator

[Automatic Post]: It has been a while since there was any activity on this PR. @kushalsai-01, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

@Fieldnote-Echo
Copy link
Copy Markdown
Contributor

@kushalsai-01, quick context update.

The maintainers are moving the broader shell-parsing layer to tree-sitter-bash (issue #2721, phased plan in VascoSch92's comment). Phase 1 (PR #3237) is already open and drops the GPLv3 bashlex dependency. Phase 2 migrates the security analyzer onto the shared parser.

That doesn't make this PR obsolete. It makes it the demonstration that motivated the bigger shift. Showing the bypass class was real and that regex couldn't close it is the work that put a structural parser on the table. The shlex direction was the right call for the surface available at the time, and the negative tests (germ -rf /, rm -r alone, rm -f alone) set the false-positive discipline the tree-sitter version will inherit.

If you're interested in helping with the migration, Vasco split Phase 2 into focused chunks (2a-2f). The pieces that overlap with what you built here are Phase 2c (policy_rails.py rewrite, which extends your _has_rm_recursive_force direction to the full rail) and Phase 2d (replaces the hand-rolled shlex parsing with tree-sitter queries). Either would build on your work directly.

Thanks for picking this up. Your PR is the reason the discussion moved past "another regex tweak" to "structural fix."

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Two-phase rm detection to eliminate flag-ordering bypasses

4 participants