feat(security): defense-in-depth security analyzers by Fieldnote-Echo · Pull Request #2472 · OpenHands/software-agent-sdk

Fieldnote-Echo · 2026-03-16T22:08:09Z

Summary

Promotes the defense-in-depth security analyzer from a standalone example into first-class SDK analyzers under openhands.sdk.security.

This lands a deterministic, local, composable security analyzer family in the SDK. It does not change default wiring or claim universal safety as a default-on control.

Public analyzers

PatternSecurityAnalyzer — regex-based threat detection with two-corpus scanning. Shell-destructive patterns scan executable fields only; injection patterns scan all fields including reasoning text. Lives in openhands.sdk.security.defense_in_depth.
PolicyRailSecurityAnalyzer — deterministic segment-aware rails for composed threats: fetch-to-exec, raw-disk-op, catastrophic-delete. Lives in openhands.sdk.security.defense_in_depth.
EnsembleSecurityAnalyzer — pure combiner with max-severity fusion. No detection logic of its own. UNKNOWN preserved as first-class. Lives at openhands.sdk.security.ensemble (top-level, composes any analyzers — not limited to defense-in-depth siblings).

All three are importable from openhands.sdk.security.

Key changes

SecurityRisk comparison methods (__lt__, __gt__, __le__, __ge__) explicitly defined with shared _RISK_ORDER dict so max() works naturally on risk levels. @total_ordering cannot help here due to str, Enum MRO — documented in code.
EnsembleSecurityAnalyzer moved to top-level openhands.sdk.security.ensemble (per reviewer feedback — it composes analyzers generally, not just defense-in-depth siblings)
propagate_unknown: bool = False added to ensemble — default preserves current semantics; True makes any child UNKNOWN propagate immediately for stricter environments
RailDecision simplified: RailOutcome enum removed, outcome is now SecurityRisk directly (HIGH/LOW). reason field preserved for observability.
Rails docstring clarifies why they are separate from patterns: interpretability and maintainability of composed conditions, not regex capability
Ensemble is a pure combiner — no enable_policy_rails, no extraction, no normalization
Rails are their own analyzer (PolicyRailSecurityAnalyzer)
Two-corpus invariant preserved: shell patterns never scan reasoning text
Normalization expanded with broader invisible character set (same defensive category, more complete)
Stable detector/rule IDs as string constants (15 DET_* + 3 RAIL_*)
Serialization round-trip tests follow SDK convention (direct, polymorphic, container-field, roundtrip-then-detect, nested ensemble, propagate_unknown survives round-trip)
Example thinned from 1002 lines to ~44-line SDK import demonstration (renumbered to 47_)
Old conftest.py importlib hack deleted
Accidental docs-draft-defense-in-depth.mdx removed

What this does NOT do

No new dependencies
No changes to ConversationState, VerificationSettings, or agent wiring
No default-on behavior — additive only, users opt in
No homoglyph/confusable replacement (deferred)
No credential-path, permission-change, or encoded-payload heuristics (deferred)

Test results

218 passed, 5 strict xfailed, 0 failures. All pre-commit hooks pass (ruff, pyright, import rules).

Companion docs PR: OpenHands/docs#402

Test plan

uv run pytest tests/sdk/security/ -v — 218 pass, 5 xfail
Serialization round-trips verified (including propagate_unknown=True)
Two-corpus invariant tested
UNKNOWN handling tested (never promoted, never collapsed; propagate_unknown flag matrix covered)
Stable detector/rule IDs structurally asserted
No Dependabot alerts introduced

…al tests Single-file example implementing layered security analysis for agent tool calls: whitelisted extraction with field boundaries, Unicode normalization (NFKC + invisible/bidi stripping), segment-aware deterministic policy rails, regex pattern scanning, and max-severity ensemble fusion. Includes baseline test suite (93 tests) covering extraction, normalization, rails, pattern classification, ensemble fusion, and confirmation policy. Adversarial test suite (32 tests) with TDD-driven bug fixes, strict xfails documenting irreducible limitations, and hostile input stress tests. Closes OpenHands#2067

The 30k-char extraction cap silently drops content past the limit. Integrators who don't read the adversarial test suite won't know they're exposed if their agent processes large inputs.

The digit-prefixed example filename (45_...) requires importlib to import. Move the hack into conftest.py so it runs once at collection time; both test files now just reference sys.modules.

enyst · 2026-03-16T22:26:04Z

@OpenHands Do a /codereview on this PR. Investigate deeply the relevant code in the codebase.

openhands-ai · 2026-03-16T22:26:29Z

I'm on it! enyst can track my progress at all-hands.dev

enyst

I did a deeper pass on the example + tests and found two important correctness issues that look worth fixing before merge. Current targeted tests/examples pass, but both of the cases below reproduce locally with small ActionEvent probes.

[examples/01_standalone_sdk/45_defense_in_depth_security.py, Lines 153-161 and 563-577] 🔍 Correctness: _extract_segments() feeds thought, reasoning_content, and summary into the same corpus that shell-destructive rails/patterns scan. That means non-executable reasoning text can flip an otherwise safe action to HIGH. Repro: an action whose actual command is ls /tmp but whose thought says I should avoid rm -rf / currently returns HIGH in both PatternSecurityAnalyzer and EnsembleSecurityAnalyzer. That seems to undercut the stated goal of avoiding reasoning-related false positives. I think the safest fix is to keep shell/permission detection scoped to executable fields (tool_call.arguments, maybe tool metadata) and, if you still want thought/summary coverage, handle that in a separate prompt-injection/text-only path.

[examples/01_standalone_sdk/45_defense_in_depth_security.py, Lines 345-346 and 446] 🛠️ Correctness: the raw-disk dd detection only matches if=... of=/dev/.... dd operands are order-independent, so common destructive forms like dd of=/dev/sda if=/dev/zero and dd bs=1M of=/dev/sda if=/dev/zero currently fall through as LOW end-to-end (rail PASS, pattern LOW, ensemble LOW). Please make this check order-independent and add a regression test for of= before if=.

Fieldnote-Echo · 2026-03-16T22:39:36Z

Thank you @enyst for the feedback. I am drafting fixes and will push again when they are ready.

Finding 1: shell-destructive patterns now scan only executable fields (tool_name, tool_call.name, tool_call.arguments). Thought, reasoning, and summary are scanned only for injection/social-engineering patterns. Prevents false positives when an agent reasons about dangerous commands it chose not to run. Finding 2: dd raw-disk detection now matches of=/dev/ regardless of operand order. Adds regression tests for reversed and interleaved operands.

Fieldnote-Echo · 2026-03-16T22:53:28Z

@enyst Thanks OpenHands for the thorough review. Both findings are fixed in c398cf2: extraction now splits into executable and text corpora so shell patterns never see reasoning content, and the dd rail matches of=/dev/ regardless of operand order.

Co-authored-by: openhands <openhands@all-hands.dev>

Companion to OpenHands/software-agent-sdk#2472.

Fieldnote-Echo · 2026-03-17T21:13:20Z

Companion docs PR: OpenHands/docs#402

csmith49

Thanks for the contribution here! I'll leave some file-level comments/questions, but I also have a small project org. concern with the tests.

It's definitely a change in how things are organized to add SDK tests to things that are only in the examples. The current pattern is that the examples surface things in the SDK, so we don't need "example tests" because they're captured by the standard unit tests.

We could definitely just roll with this, but an alternative I prefer would be to move the relevant security analyzers to the SDK proper and just have the example surface how to use/configure them. They're general enough and useful enough that people building on the SDK might want to use them without the dynamic module loading tricks relied upon in the examples.

ETA: another alternative might be turning the example into a folder and package the tests directly with it instead of in tests/sdk.

@xingyaoww Any thoughts on the organization here?

examples/01_standalone_sdk/45_defense_in_depth_security.py

Fieldnote-Echo · 2026-03-18T21:51:51Z

@csmith49 Thank you Calvin for the substantive feedback. It seems the requested changes only make sense if this is moved into the SDK proper, so I will wait for feedback from @xingyaoww before pushing any new commits.

xingyaoww · 2026-03-19T17:45:37Z

prefer would be to move the relevant security analyzers to the SDK proper and just have the example surface how to use/configure them. They're general enough and useful enough that people building on the SDK might want to use them without the dynamic module loading tricks relied upon in the examples.

Agree with this! We could potentially just come up with a new type of SecurityAnalyzer instead of dumping all the logic in the example script.

I'd love to get this actually into SDK, maybe even as a security analyzer that always run since it doesn't seems to consume too much resource 🙏

Fieldnote-Echo · 2026-03-20T17:48:13Z

@xingyaoww @csmith49 Solid direction, thank you. I'll refactor it into a new SecurityAnalyzer type in the SDK. Before I push anything, I'll study the existing analyzer module structure to make sure it fits the current patterns. Any preferences on where it should live under openhands_aci/?

csmith49 · 2026-03-20T17:54:28Z

I expect most everything can live as a sub-module in openhands.sdk.security, much like the grayswan security analyzer we already have. But your contribution will represent a significant portion of that module, so do what makes sense for this approach!

all-hands-bot · 2026-03-26T12:34:02Z

[Automatic Post]: It has been a while since there was any activity on this PR. @Fieldnote-Echo, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

all-hands-bot · 2026-03-26T12:34:10Z

[Automatic Post]: It has been a while since there was any activity on this PR. @Fieldnote-Echo, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

Lead with the problem being solved, then the solution, then caveats. Module and class docstrings now explain what the reader needs to DO (use the analyzer) before explaining how it works internally. Docs guide restructured: problem -> what you get -> quick start -> composition -> design rationale -> limitations.

…-depth-sdk-promotion

Fieldnote-Echo · 2026-03-30T23:13:09Z

@csmith49 All four review comments are addressed in the latest push. The analyzers now live in openhands.sdk.security.defense_in_depth as SDK proper - ensemble is a pure combiner, rails are their own analyzer, and SecurityRisk.__lt__ enables natural max() ordering. Ready for re-review when you have a chance.

enyst · 2026-03-30T23:53:28Z

Just to note, the docs and review gate in CI are not mandatory, please don't be concerned about them - though the docs PR is most welcome of course!

…ter option

- _walk_json_strings: let RecursionError propagate to _extract_exec_segments raw-argument fallback instead of returning [] (Codex P1 — silent false-negative path for deeply nested payloads) - U+2028/U+2029: keep as whitespace for \s+ collapse instead of stripping (Codex P2 — stripping merges tokens, bypasses word-boundary regex detectors) - Downgrade per-match logs from info to debug to prevent log flooding in high-volume environments (Gemini — rate-limiting deferred)

Fieldnote-Echo · 2026-03-31T05:47:43Z

@xingyaoww One rollout note for later discussion, not for this PR: if the team eventually wants broader/default use, I think the main follow-ups are observability of why something was flagged and some real-world false-positive tuning on coding workflows. I kept those out of this PR so the SDK promotion stays tight, but I’m happy to open follow-up issues once this lands. These are more product decisions requiring OpenHands judgement and deployment experience specifically.

Fieldnote-Echo · 2026-03-31T14:54:57Z

@enyst Okay team, I believe this and the docs companion PR are "review-ready" - Docs PR: OpenHands/docs#402

openhands-sdk/openhands/sdk/security/risk.py

openhands-sdk/openhands/sdk/security/ensemble.py

openhands-sdk/openhands/sdk/security/defense_in_depth/ensemble.py

openhands-sdk/openhands/sdk/security/defense_in_depth/policy_rails.py

openhands-sdk/openhands/sdk/security/defense_in_depth/__init__.py

openhands-sdk/openhands/sdk/security/defense_in_depth/policy_rails.py

docs-draft-defense-in-depth.mdx

csmith49 · 2026-04-01T15:35:32Z

Hey @Fieldnote-Echo, this is coming along great. I left some comments (mostly suggestions), once you get a chance to look through those and see if they're relevant we can get this merged. Excited to get onto the observability/tuning so we can make these default as well!

- Move EnsembleSecurityAnalyzer to openhands.sdk.security.ensemble (top-level, composes any analyzers — not limited to defense-in-depth) - Add propagate_unknown: bool = False to ensemble (opt-in strict mode where any child UNKNOWN propagates immediately) - Simplify RailDecision: drop RailOutcome enum, outcome is now SecurityRisk directly (HIGH/LOW). reason field preserved. - Deduplicate SecurityRisk ordering via shared _RISK_ORDER dict and _check_comparable helper. Add __le__/__ge__ for complete surface. @total_ordering cannot help here due to str Enum MRO. - Update PolicyRailSecurityAnalyzer docstring: clarify rails are separate from patterns for interpretability, not regex capability - Delete accidental docs-draft-defense-in-depth.mdx 218 passed, 5 xfailed, 0 failures. Coding-Agent: claude-code Model: claude-opus-4-6

Document the new ensemble propagate_unknown flag in the stricter environments section and update the UNKNOWN explanation to cover both default and strict modes. Companion to OpenHands/software-agent-sdk#2472. Coding-Agent: claude-code Model: claude-opus-4-6

Fieldnote-Echo · 2026-04-03T15:47:05Z

@csmith49 Thank you for the polish and suggestions! I left one convo open in case you'd like a rename for clarity. Other than that, everything else is resolved.

csmith49

Much needed security analyzers and infrastructure to help fold them easily into our existing infra. Nice work!

Fieldnote-Echo mentioned this pull request Mar 16, 2026

docs(examples): Custom SecurityAnalyzerBase with defense-in-depth patterns #2067

Open

Fieldnote-Echo and others added 3 commits March 16, 2026 17:15

docs: surface _EXTRACT_HARD_CAP tradeoff for downstream integrators

e18fc80

The 30k-char extraction cap silently drops content past the limit. Integrators who don't read the adversarial test suite won't know they're exposed if their agent processes large inputs.

Merge branch 'main' into feat/defense-in-depth-security-analyzer

3433b1e

refactor(tests): extract importlib hack into conftest.py

1e50a1a

The digit-prefixed example filename (45_...) requires importlib to import. Move the hack into conftest.py so it runs once at collection time; both test files now just reference sys.modules.

enyst reviewed Mar 16, 2026

View reviewed changes

This comment was marked as duplicate.

Sign in to view

Fieldnote-Echo and others added 2 commits March 17, 2026 15:41

style: apply pre-commit ruff-format fixes

90bc325

style: fix ruff-format in conftest.py

86ce55f

Co-authored-by: openhands <openhands@all-hands.dev>

Fieldnote-Echo added a commit to Fieldnote-Echo/docs that referenced this pull request Mar 17, 2026

docs: add defense-in-depth security analyzer section

181da50

Companion to OpenHands/software-agent-sdk#2472.

Fieldnote-Echo mentioned this pull request Mar 17, 2026

docs: add defense-in-depth security analyzer section OpenHands/docs#402

Merged

enyst requested review from csmith49 and neubig March 18, 2026 08:03

csmith49 requested changes Mar 18, 2026

View reviewed changes

csmith49 reviewed Mar 18, 2026

View reviewed changes

examples/01_standalone_sdk/45_defense_in_depth_security.py Outdated Show resolved Hide resolved

csmith49 reviewed Mar 18, 2026

View reviewed changes

examples/01_standalone_sdk/45_defense_in_depth_security.py Outdated Show resolved Hide resolved

csmith49 reviewed Mar 18, 2026

View reviewed changes

examples/01_standalone_sdk/45_defense_in_depth_security.py Outdated Show resolved Hide resolved

csmith49 reviewed Mar 18, 2026

View reviewed changes

examples/01_standalone_sdk/45_defense_in_depth_security.py Outdated Show resolved Hide resolved

Fieldnote-Echo added 3 commits March 30, 2026 17:44

Merge remote-tracking branch 'upstream/main' into refactor/defense-in…

007c820

…-depth-sdk-promotion

test: add missing DET_EXEC_CODE_SUBPROCESS stable ID assertion

c7b0391

Fieldnote-Echo changed the title ~~feat(examples): defense-in-depth security analyzer~~ feat(security): defense-in-depth security analyzers Mar 30, 2026

Fieldnote-Echo requested a review from csmith49 March 30, 2026 23:13

Fieldnote-Echo added 3 commits March 31, 2026 00:02

Merge branch 'main' into feat/defense-in-depth-security-analyzer

fd5af05

fix: use HIGH as default confirmation threshold, show MEDIUM as stric…

a885b66

…ter option