Skip to content

Comments

Fix: Security analyzer ignores LLM security_risk when no analyzer is configured#2130

Open
juanmichelini wants to merge 4 commits intomainfrom
openhands/fix-security-analyzer-ignore-llm-risk
Open

Fix: Security analyzer ignores LLM security_risk when no analyzer is configured#2130
juanmichelini wants to merge 4 commits intomainfrom
openhands/fix-security-analyzer-ignore-llm-risk

Conversation

@juanmichelini
Copy link
Collaborator

@juanmichelini juanmichelini commented Feb 19, 2026

Summary

Fixes #1957

Previously, when llm_security_analyzer=False (or when no security analyzer was configured via conversation.set_security_analyzer()), the agent would still evaluate and use the security_risk value provided by the LLM. This caused actions to be incorrectly flagged with security risks even when security analysis was explicitly disabled.

Root Cause

The _extract_security_risk() method in agent.py was extracting and using the LLM-provided security_risk value regardless of whether a security analyzer was configured. The code only checked if the analyzer was an instance of LLMSecurityAnalyzer but didn't handle the case when security_analyzer is None.

Solution

Modified _extract_security_risk() to return SecurityRisk.UNKNOWN when security_analyzer is None, effectively ignoring any security_risk value provided by the LLM when no analyzer is configured.

Why add_security_risk_prediction=True is NOT changed

The add_security_risk_prediction=True parameter in openhands/sdk/agent/utils.py is intentionally kept unchanged. This parameter ensures the security_risk field is always included in tool schemas for consistency, even when the security analyzer is disabled. The schema consistency helps:

  1. LLMs have a consistent tool signature to work with
  2. Weaker models can still omit the field without breaking validation
  3. Runtime behavior is controlled by _extract_security_risk(), not the schema

The fix ensures that the field is properly ignored at runtime when no analyzer is set, which is the correct behavior as documented in the code comments.

Changes

  1. openhands-sdk/openhands/sdk/agent/agent.py: Added early return in _extract_security_risk() to return UNKNOWN when security_analyzer is None
  2. tests/sdk/agent/test_extract_security_risk.py: Updated test expectations to verify that security_risk is ignored (returns UNKNOWN) when no analyzer is set
  3. tests/sdk/agent/test_security_policy_integration.py: Enhanced existing test to verify the fix for the reported issue

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
    • Yes, updated existing tests and enhanced integration test
  • If there is an example, have you run the example to make sure that it works?
    • N/A - this is a bug fix, not a new example
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
    • Yes, all tests pass
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
    • N/A - this is a bug fix that restores expected behavior, no doc changes needed
  • Is the github CI passing?
    • Pre-commit checks passed locally, awaiting CI

@juanmichelini can click here to continue refining the PR


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Architectures Base Image Docs / Tags
java amd64, arm64 eclipse-temurin:17-jdk Link
python amd64, arm64 nikolaik/python-nodejs:python3.12-nodejs22 Link
golang amd64, arm64 golang:1.21-bookworm Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:16daf37-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-16daf37-python \
  ghcr.io/openhands/agent-server:16daf37-python

All tags pushed for this build

ghcr.io/openhands/agent-server:16daf37-golang-amd64
ghcr.io/openhands/agent-server:16daf37-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:16daf37-golang-arm64
ghcr.io/openhands/agent-server:16daf37-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:16daf37-java-amd64
ghcr.io/openhands/agent-server:16daf37-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:16daf37-java-arm64
ghcr.io/openhands/agent-server:16daf37-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:16daf37-python-amd64
ghcr.io/openhands/agent-server:16daf37-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:16daf37-python-arm64
ghcr.io/openhands/agent-server:16daf37-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:16daf37-golang
ghcr.io/openhands/agent-server:16daf37-java
ghcr.io/openhands/agent-server:16daf37-python

About Multi-Architecture Support

  • Each variant tag (e.g., 16daf37-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 16daf37-python-amd64) are also available if needed

… set

Previously, when llm_security_analyzer=False (or when no security analyzer
was configured), the agent would still evaluate and use the security_risk
value provided by the LLM. This caused actions to be incorrectly flagged
with security risks even when security analysis was disabled.

This fix modifies _extract_security_risk() to return SecurityRisk.UNKNOWN
when security_analyzer is None, effectively ignoring any security_risk
value provided by the LLM when no analyzer is configured.

The schema still includes security_risk field (via add_security_risk_prediction=True)
for consistency, but the field is now properly ignored at runtime when
no analyzer is set.

Fixes #1957

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions
Copy link
Contributor

github-actions bot commented Feb 19, 2026

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-agent-server/openhands/agent_server
   server_details_router.py48785%19–20, 70, 86–87, 89–90
openhands-agent-server/openhands/agent_server/docker
   build.py37123038%67, 70–71, 85, 90, 94–96, 100, 105, 122–127, 130, 153, 155, 163, 165–166, 168–172, 174, 178, 181, 184–187, 189–191, 193, 195, 197, 199, 203–204, 289–290, 294–295, 299–300, 304–305, 318, 320–328, 393, 402, 414, 418–422, 424, 436–437, 440, 455–457, 459–463, 465, 469–472, 474, 478–479, 483–484, 487, 490–491, 494, 496–501, 503, 507–515, 520–524, 528, 536–538, 546–549, 551–552, 554–555, 557, 572–573, 575, 577–578, 581–583, 585, 587, 595, 599–601, 607, 611, 621–622, 624, 626, 631, 635, 637–645, 647–648, 650–653, 660–661, 666, 669, 676, 681, 686, 692, 697, 704–705, 710, 715, 721, 726, 735, 738–744, 747–749, 752, 765–770, 772–773, 775–777, 780–781, 785–788, 790, 799, 810, 813, 829–834, 836–837, 839–843, 845, 852, 856
openhands-sdk/openhands/sdk/agent
   agent.py2333684%94, 98, 238–240, 242, 272–273, 280–281, 313, 366–367, 369, 409, 554–555, 560, 572–573, 578–579, 598–599, 601, 629–630, 637–638, 642, 650–651, 688, 694, 706, 713
openhands-sdk/openhands/sdk/workspace/remote
   base.py701775%40–45, 86–90, 167–169, 183–185
TOTAL19174502673% 

@juanmichelini juanmichelini marked this pull request as ready for review February 19, 2026 18:55
Copy link
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Good taste - Clean, pragmatic bug fix. The early return in _extract_security_risk() properly separates "schema includes field" from "runtime uses field value". Tests reproduce the actual bug from #1957 and verify the fix. LGTM!

@juanmichelini
Copy link
Collaborator Author

@OpenHands @XZ-X has tested it but still sees “security risk” as “low”, not “unknown” as expected from this PR. Please double check the tests are testing correctly, and rerun them. Also do a manual integration test to first reproduce the issue, then make sure the fix addresses the issue. If not reconsider it.

@openhands-ai
Copy link

openhands-ai bot commented Feb 19, 2026

I'm on it! juanmichelini can track my progress at all-hands.dev

Copy link
Collaborator

@enyst enyst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did the LLM return any security risk, if no analyzer was configured?

This report documents:
- Comprehensive testing of the fix (29/29 tests pass)
- Manual integration testing confirming the fix works correctly
- Architecture explanation of how security_risk extraction works
- Troubleshooting guide for @XZ-X to verify the fix in their environment
- Possible causes if still seeing 'low' instead of 'unknown'

The fix is correct and thoroughly tested. If @XZ-X is still experiencing
the issue, it's likely due to not using the code from this PR branch or
environmental factors (plugins, caching, etc).

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions
Copy link
Contributor

github-actions bot commented Feb 20, 2026

PR Artifacts Cleaned Up

The .pr/ directory has been automatically removed.

@openhands-ai
Copy link

openhands-ai bot commented Feb 20, 2026

Summary of Investigation

I thoroughly investigated PR #2130 in response to @XZ-X's report that they still see "security risk" as "low" instead of "unknown" despite the fix.

What I Did

  1. ✅ Verified All Tests Pass

    • Ran all 29 security-related tests - all pass
    • Tests specifically verify that when llm_security_analyzer=False and no analyzer is set, LLM-provided security_risk values (LOW/MEDIUM/HIGH) are ignored and return UNKNOWN
  2. ✅ Created Manual Integration Tests

  3. ✅ Deep Code Analysis

    • Traced the complete code flow from LLM response through security_risk extraction
    • Confirmed the fix logic is correct: early return when security_analyzer is None
    • Verified there's no code path that auto-creates a security analyzer

Conclusion

The fix is CORRECT and thoroughly tested. The implementation properly ignores LLM-provided security_risk values when no analyzer is configured.

If @XZ-X is still seeing "low" instead of "unknown", it's due to environmental factors:

  • Most likely: Not using code from this PR branch
  • Possibly: Plugins/custom code setting a security analyzer
  • Possibly: Looking at wrong event type (non-ActionEvent)
  • Possibly: Stale Python bytecode cache

Changes Made

I added .pr/verification-report.md containing:

  • Complete test results proving the fix works
  • Architecture explanation of security_risk handling
  • Troubleshooting guide with debug code for @XZ-X
  • Step-by-step verification steps

This report has been committed and pushed to the PR branch for @XZ-X to review.

No changes to the actual fix code were needed - the fix is correct as implemented.

@enyst
Copy link
Collaborator

enyst commented Feb 21, 2026

Why did the LLM return any security risk, if no analyzer was configured?

Ah, I think the answer is here:

This parameter ensures the security_risk field is always included in tool schemas, even when the security analyzer is disabled

OK 🤔

I believe Rohit had implemented this, if you need a second opinion, but SGTM. @juanmichelini

@github-actions
Copy link
Contributor

github-actions bot commented Feb 22, 2026

API breakage checks (Griffe)

Result: Passed

Action log

@XZ-X
Copy link

XZ-X commented Feb 23, 2026

@openhands-agent I still see security_risks as LOW/HIGH (not UNKNOWN). Here's my script:

from openhands.sdk import (
    LLM,
    Agent,
    Conversation,
    Event,
    Message,
    TextContent,
)
from openhands.sdk.logger import setup_logging
from openhands.tools.preset.default import get_default_tools
from openhands.workspace import DockerWorkspace


workspace = DockerWorkspace(
    # base_image="nikolaik/python-nodejs:python3.12-nodejs22",
    server_image="ghcr.io/openhands/agent-server:latest-python",
    working_dir="/workspace",
    detach_logs=False,
)


task = open("danger.txt").read()
prompt = """
    You are a helpful assistant that can help with coding tasks
    Here is the task:
    {task}
""".format(
    task=task
)
msg = Message(
    role="user",
    content=[TextContent(text=prompt)],
)
llm = LLM(
    model="Qwen3-Coder-30B-A3B",
    base_url="....",
    api_key="...",
    custom_llm_provider="openai",
)
system_prompt_kwargs: dict[str, object] = {"cli_mode": False, "llm_security_analyzer": False}
agent = Agent(
    llm=llm,
    tools=get_default_tools(enable_browser=False),
    system_prompt_kwargs=system_prompt_kwargs,
)
conversation = Conversation(
    agent=agent,
    workspace=workspace,
    # callbacks=[lambda ev: logger.info("Event: %s", ev)],
    callbacks=[],
    max_iteration_per_run=3,
)

conversation.send_message(msg)
try:
    conversation.run()
except Exception as e:
    # should be exceed max step error, it's okay.
    pass
    # import traceback
    # traceback.print_exc()
events: list[Event] = list(conversation.state.events)
for event in events:
    if hasattr(event, "security_risk"):
        print(event.security_risk)

workspace.cleanup()

And I validated my repo commit id as follows:

% git rev-parse HEAD
e988c6aa0470038bee45878f1c1d1d2211c08167

I also execute make build before running the test. Did I miss something? How to check the exact version I'm running?

@enyst
Copy link
Collaborator

enyst commented Feb 23, 2026

@OpenHands investigate and fix #2130 (comment)

@openhands-ai
Copy link

openhands-ai bot commented Feb 23, 2026

I'm on it! enyst can track my progress at all-hands.dev

Co-authored-by: openhands <openhands@all-hands.dev>
@openhands-ai
Copy link

openhands-ai bot commented Feb 23, 2026

Summary (new work since last update)

I investigated the PR comment report (“still seeing security_risk=LOW/HIGH”) and confirmed the most likely cause is version mismatch: the repro script uses DockerWorkspace(server_image="ghcr.io/openhands/agent-server:latest-python"), so the agent logic executes in the remote agent-server container, not the local checkout where the PR fix exists. Therefore git rev-parse HEAD in the client repo does not guarantee the running agent-server has the fix.

To make this easy to verify (and to unblock debugging), I implemented and pushed agent-server build/version introspection:

  • Enhanced GET /server_info to include:
    • sdk_version, tools_version, workspace_version
    • build_git_sha, build_git_ref (from env)
    • python_version
  • Updated agent-server docker build helper to pass OPENHANDS_BUILD_GIT_SHA / OPENHANDS_BUILD_GIT_REF as build args.
  • Updated agent-server Dockerfile to expose those build args as environment variables.
  • Added RemoteWorkspace.get_server_info() in the SDK so users can programmatically print the agent-server metadata from a DockerWorkspace.

Checklist

  • Request addressed: investigated the report and implemented a concrete fix to diagnose “what version am I running?” in DockerWorkspace/agent-server setups.
  • Changes are concise: only touches server info endpoint + build metadata plumbing + small SDK helper.
  • Quality gates: ran pre-commit on touched files and ran targeted pytest covering impacted areas.
  • Pushed updates to the existing PR branch.

Git / PR update

This enables the reporter to verify whether their running latest-python image actually contains the _extract_security_risk() change (it won’t until the image is built from a commit that includes it), and to confirm the exact build SHA/ref of the agent-server they’re executing against.

@enyst
Copy link
Collaborator

enyst commented Feb 23, 2026

Oh, I see. @juanmichelini and @XZ-X , please feel free to use or remove the agent's last commit, as you see fit. It just tries to show info in docker

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Security analyzer reports security risks when llm_security_analyzer is set to False

5 participants