Fix event search performance and kind filter bug by rbren · Pull Request #2174 · OpenHands/software-agent-sdk

rbren · 2026-02-23T02:26:03Z

Summary

This PR fixes critical performance issues and a bug in the events search endpoint that caused:

Kind filter never matching - The filter compared against full module paths instead of simple class names
Slow searches - Holding FIFOLock during full O(n) disk scans blocked agent execution

Key Changes

Bug Fixes

Kind filter comparison - Changed from comparing f"{event.__class__.__module__}.{event.__class__.__name__}" to just event.__class__.__name__. Clients send simple class names like 'MessageEvent', not full module paths.
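The mismatch can be reproduced in a few lines. This is a minimal sketch: the `MessageEvent` class here is a local stand-in, and the helper names `qualified_kind` and `simple_kind` are hypothetical, not taken from `event_service.py`.

```python
class MessageEvent:
    """Local stand-in for an SDK event class (illustration only)."""


def qualified_kind(event: object) -> str:
    # Old comparison key: module path plus class name.
    cls = event.__class__
    return f"{cls.__module__}.{cls.__name__}"


def simple_kind(event: object) -> str:
    # New comparison key: class name only.
    return event.__class__.__name__


event = MessageEvent()
requested_kind = "MessageEvent"  # what a client sends

# The qualified name always contains a dot, so it can never equal a bare name.
old_match = qualified_kind(event) == requested_kind
new_match = simple_kind(event) == requested_kind
```

Here `old_match` is false and `new_match` is true, which is the behavior change the fix describes.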

Performance Improvements

Removed FIFOLock during search - Search now operates on a snapshot of event log length (GIL makes int reads atomic), avoiding blocking the agent runner thread
Early exit for TIMESTAMP_DESC - DESC queries now scan from end of event log, making "last N events" queries O(limit) instead of O(n)
O(1) cursor lookup - Uses EventLog.get_index() for pagination cursor lookup when available
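The three changes above can be sketched together as follows. This is an illustration under stated assumptions, not the code from the PR: `TinyEventLog` is a hypothetical stand-in for the SDK's `EventLog` (only the `get_index` name is taken from the description), and the cursor is assumed to be the id of the first event of the next page.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    id: str
    kind: str


class TinyEventLog:
    """Hypothetical append-only log with an id -> index map."""

    def __init__(self) -> None:
        self._events: list[Event] = []
        self._index_by_id: dict[str, int] = {}

    def append(self, event: Event) -> None:
        self._index_by_id[event.id] = len(self._events)
        self._events.append(event)

    def __len__(self) -> int:
        return len(self._events)

    def __getitem__(self, index: int) -> Event:
        return self._events[index]

    def get_index(self, event_id: str) -> int:
        # Dictionary lookup instead of a linear scan for the cursor.
        return self._index_by_id[event_id]


def search_desc(log, limit, kind=None, page_id=None):
    """Newest-first search; returns (items, next_page_id, examined)."""
    snapshot_len = len(log)  # read once; later appends are ignored
    start = log.get_index(page_id) if page_id is not None else snapshot_len - 1
    items, examined, next_page_id = [], 0, None
    for i in range(start, -1, -1):  # scan from the end of the log
        event = log[i]
        examined += 1
        if kind is not None and event.kind != kind:
            continue
        if len(items) == limit:
            next_page_id = event.id  # first match of the next page
            break  # early exit: older events are never touched
        items.append(event)
    return items, next_page_id, examined


log = TinyEventLog()
for n in range(10_000):
    log.append(Event(id=f"e{n}", kind="MessageEvent" if n % 2 == 0 else "ActionEvent"))

page1, cursor, examined1 = search_desc(log, limit=10)
page2, _, examined2 = search_desc(log, limit=10, page_id=cursor)
```

With no filter, each page examines only `limit + 1` events regardless of log size. A selective `kind` filter can still force a longer scan, so the O(limit) bound applies to unfiltered or frequently matching queries.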

Tests Added

New performance tests in test_event_service_perf.py that fail fast for slow queries:

test_kind_filter_uses_simple_class_name - Verifies simple class names work
test_search_with_limit_completes_quickly - Verifies early exit optimization
test_desc_search_last_10_events_fast - Verifies DESC scan from end
test_kind_filter_with_many_events_is_fast - Verifies kind filter performance
test_pagination_cursor_lookup_is_fast - Verifies O(1) cursor lookup
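One common way to write a test of this kind is to time the call and assert against a generous budget. The sketch below is an assumption about the general shape only; the helper `assert_completes_within`, the `last_n` function, and the one-second budget are all hypothetical and not taken from `test_event_service_perf.py`.

```python
import time


def assert_completes_within(fn, budget_seconds):
    """Run fn() and fail with a clear message if it exceeds the time budget."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    assert elapsed < budget_seconds, (
        f"took {elapsed:.3f}s, budget was {budget_seconds:.3f}s"
    )
    return result


def last_n(events, n):
    # Stand-in for a "last N events" query: slice from the end, newest first.
    return events[-n:][::-1]


events = list(range(100_000))
newest = assert_completes_within(lambda: last_n(events, 10), budget_seconds=1.0)
```

A loose budget keeps the test stable on slow CI machines while still catching a regression from O(limit) back to a full scan with disk reads.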

Testing

uv run pytest tests/agent_server/test_event_service.py tests/agent_server/test_event_service_perf.py -v
# 67 passed

Relaxed Constraints

Per the issue, some constraints about the endpoint being perfectly up-to-date have been relaxed:

Search operates on a snapshot of event count, so new events appended during search may not be included
This is acceptable as it avoids blocking agent execution with the FIFOLock
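The relaxed guarantee can be illustrated with a plain list standing in for the event log. This is a sketch of the snapshot idea only; the function name `iter_snapshot` is hypothetical.

```python
def iter_snapshot(events):
    """Yield only the events that existed when iteration started."""
    snapshot_len = len(events)  # captured once, before the first yield
    for i in range(snapshot_len):
        yield events[i]


log = ["e0", "e1", "e2"]
seen = []
for event in iter_snapshot(log):
    seen.append(event)
    if event == "e0":
        # Simulates the agent appending a new event while a search is running.
        log.append("e3")
```

After the loop, `seen` holds the three original events and `log` holds four: the search result is consistent but may lag behind the log by whatever was appended during the scan.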


Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.12-nodejs22`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:bee4fe1-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-bee4fe1-python \
  ghcr.io/openhands/agent-server:bee4fe1-python

All tags pushed for this build

ghcr.io/openhands/agent-server:bee4fe1-golang-amd64
ghcr.io/openhands/agent-server:bee4fe1-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:bee4fe1-golang-arm64
ghcr.io/openhands/agent-server:bee4fe1-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:bee4fe1-java-amd64
ghcr.io/openhands/agent-server:bee4fe1-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:bee4fe1-java-arm64
ghcr.io/openhands/agent-server:bee4fe1-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:bee4fe1-python-amd64
ghcr.io/openhands/agent-server:bee4fe1-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:bee4fe1-python-arm64
ghcr.io/openhands/agent-server:bee4fe1-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:bee4fe1-golang
ghcr.io/openhands/agent-server:bee4fe1-java
ghcr.io/openhands/agent-server:bee4fe1-python

About Multi-Architecture Support

Each variant tag (e.g., bee4fe1-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., bee4fe1-python-amd64) are also available if needed

Key changes:
- Fix kind filter to use simple class name (e.g., 'MessageEvent') instead of fully qualified module path. The old code compared against full paths like 'openhands.sdk.event.llm_convertible.message.MessageEvent' which never matched the simpler class names sent by clients.
- Remove FIFOLock acquisition during event search to avoid blocking the agent runner thread during O(n) disk scans. The search now operates on a snapshot of the event log length (GIL makes int reads atomic).
- Add early exit optimization for TIMESTAMP_DESC queries that scan from the end of the event log, reducing search time for 'last N events' queries.
- Add pagination cursor O(1) lookup using EventLog.get_index() when available, with fallback to linear search for test mocks.
- Update _count_events_sync with same kind filter fix.
- Add performance tests that fail fast for slow queries:
  * test_kind_filter_uses_simple_class_name
  * test_search_with_limit_completes_quickly
  * test_desc_search_last_10_events_fast
  * test_kind_filter_with_many_events_is_fast
  * test_pagination_cursor_lookup_is_fast
- Update existing tests to use simple class names in kind filter assertions.

Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-02-23T02:26:28Z

API breakage checks (Griffe)

Result: Passed

Action log

github-actions · 2026-02-23T02:27:40Z

Coverage Report

File	Stmts	Miss	Cover	Missing
openhands-agent-server/openhands/agent_server
event_service.py	348	99	71%	55–56, 74–76, 84–91, 94–97, 117, 141–142, 165–166, 185, 192, 249, 263–264, 272, 325–326, 330, 338, 341, 389–390, 398–399, 415, 417, 421–423, 427, 436–437, 439, 443, 449, 451, 459–464, 600, 602–603, 607, 621–623, 625, 629–632, 636–639, 647–650, 669–670, 672–679, 681–682, 691–692, 694–695, 702–703, 705–706, 710, 716, 733–734
TOTAL	18778	8588	54%

When the agent is actively running, it holds the state lock for potentially long periods (during LLM calls, tool execution, etc.). This caused WebSocket subscriptions and event lookups to block, making the UI appear frozen.

Changes:
- subscribe_to_events: Use non-blocking lock acquisition. If lock is held, send minimal state update (just execution_status) instead of blocking.
- _get_event_sync: Remove lock acquisition entirely. Event reading is safe without lock since events are immutable once appended.
- Add test: test_subscribe_does_not_block_when_lock_held verifies that WebSocket subscriptions complete quickly even when lock is held.

Co-authored-by: openhands <openhands@all-hands.dev>
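The non-blocking acquisition pattern described in this commit can be sketched with the standard library. This is an illustration, not the code from `event_service.py`: a plain `threading.Lock` stands in for the SDK's FIFOLock, and `build_state_update` and the dictionary layout are hypothetical.

```python
import threading

state_lock = threading.Lock()  # stand-in for the conversation state lock


def build_state_update(state: dict) -> dict:
    """Return a full update if the lock is free, else a minimal one."""
    if state_lock.acquire(blocking=False):  # returns False immediately if held
        try:
            return {"kind": "full", **state}
        finally:
            state_lock.release()
    # Lock is held (e.g. the agent is mid-step): do not wait, send less.
    return {"kind": "minimal", "execution_status": state["execution_status"]}


state = {"execution_status": "running", "event_count": 3}

free_update = build_state_update(state)

state_lock.acquire()  # simulate the agent holding the lock
try:
    held_update = build_state_update(state)
finally:
    state_lock.release()
```

The subscriber gets an answer in both cases; when the lock is held it only loses detail, which matches the trade-off the commit message describes.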

rbren · 2026-02-23T02:52:59Z

@OpenHands fix the failing gh actions

openhands-ai · 2026-02-23T02:53:18Z

I'm on it! rbren can track my progress at all-hands.dev

Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai · 2026-02-23T02:54:31Z

Summary

I fixed the failing GitHub Actions pre-commit check on PR #2174.

Issue

The Pre-commit checks/pre-commit workflow was failing due to a Ruff formatting violation in event_service.py.

Fix Applied

Changed a multi-line logger.debug() call to a single line:

-                logger.debug(
-                    "Lock held during subscribe, sending minimal state update"
-                )
+                logger.debug("Lock held during subscribe, sending minimal state update")

Verification

✅ Ran pre-commit locally - all checks passed
✅ Change is minimal and directly addresses the formatting issue
✅ Committed with proper co-author attribution
✅ Already pushed to fix-event-search-perf branch

The GitHub Actions should now pass with this formatting fix.

Fix formatting in event_service.py

68c3be1

Co-authored-by: openhands <openhands@all-hands.dev>

Fix event search performance and kind filter bug #2174

rbren wants to merge 3 commits into main from fix-event-search-perf
