perf(agent-server): cut conversation search/count to page-slice work by csmith49 · Pull Request #3260 · OpenHands/software-agent-sdk

csmith49 · 2026-05-14T15:04:44Z

Summary

Addresses #3142 (one of the unresolved sub-issues under the perf tracking issue #3153 that had no PR attached).

ConversationService._search_conversations and _count_conversations previously iterated every entry in _event_services, awaited get_state() on each, and composed a full ConversationInfo (which serializes the whole state via model_dump) before applying pagination. That made GET /api/conversations and the count endpoint O(N) in the total number of conversations regardless of limit.

This PR is the minimum change that decouples the per-request cost from total conversation count, without yet introducing the larger sorted-metadata index suggested in the issue (that refactor can build on top of this).

Changes

_search_conversations now sorts candidates by stored.created_at / stored.updated_at, which already live on StoredConversation metadata. We no longer need a state load (or _compose_conversation_info / model_dump) for ordering. Only the actual page slice composes ConversationInfo.
_count_conversations short-circuits to len(_event_services) when no execution_status filter is supplied, skipping the per-conversation state fan-out entirely.
Behavior contracts are preserved: sort orders, status filtering, pagination, and next_page_id semantics are unchanged.

Tests

All 75 existing test_conversation_service.py tests continue to pass, as do the 67 test_conversation_router{,_acp}.py tests.
Added two regression tests guarding the new behavior:
- test_count_no_filter_skips_state_loads — verifies get_state() is never awaited when counting without a filter.
- test_search_skips_state_for_non_page_items — verifies get_state() is awaited only for conversations actually returned in the page slice.

Notes for reviewers

The state load is still required for execution_status filtering in _search_conversations, since StoredConversation does not currently track that. A future improvement (the "sorted metadata index" idea from the issue) could push that field onto stored metadata and remove the remaining O(N) walk; this PR intentionally keeps the surface area small.
Sort keys come from StoredConversation.created_at/updated_at, which are the same values previously copied into ConversationInfo by _compose_conversation_info, so observable ordering is unchanged.

This PR was created by an AI agent (OpenHands) on behalf of the user.

@csmith49 can click here to continue refining the PR

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.13-nodejs22-slim`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:ea32c95-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-ea32c95-python \
  ghcr.io/openhands/agent-server:ea32c95-python

All tags pushed for this build

ghcr.io/openhands/agent-server:ea32c95-golang-amd64
ghcr.io/openhands/agent-server:ea32c95df98989c3d9ada0f72770cab7dec2e1ef-golang-amd64
ghcr.io/openhands/agent-server:fix-3142-conversation-search-count-perf-golang-amd64
ghcr.io/openhands/agent-server:ea32c95-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:ea32c95-golang-arm64
ghcr.io/openhands/agent-server:ea32c95df98989c3d9ada0f72770cab7dec2e1ef-golang-arm64
ghcr.io/openhands/agent-server:fix-3142-conversation-search-count-perf-golang-arm64
ghcr.io/openhands/agent-server:ea32c95-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:ea32c95-java-amd64
ghcr.io/openhands/agent-server:ea32c95df98989c3d9ada0f72770cab7dec2e1ef-java-amd64
ghcr.io/openhands/agent-server:fix-3142-conversation-search-count-perf-java-amd64
ghcr.io/openhands/agent-server:ea32c95-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:ea32c95-java-arm64
ghcr.io/openhands/agent-server:ea32c95df98989c3d9ada0f72770cab7dec2e1ef-java-arm64
ghcr.io/openhands/agent-server:fix-3142-conversation-search-count-perf-java-arm64
ghcr.io/openhands/agent-server:ea32c95-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:ea32c95-python-amd64
ghcr.io/openhands/agent-server:ea32c95df98989c3d9ada0f72770cab7dec2e1ef-python-amd64
ghcr.io/openhands/agent-server:fix-3142-conversation-search-count-perf-python-amd64
ghcr.io/openhands/agent-server:ea32c95-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:ea32c95-python-arm64
ghcr.io/openhands/agent-server:ea32c95df98989c3d9ada0f72770cab7dec2e1ef-python-arm64
ghcr.io/openhands/agent-server:fix-3142-conversation-search-count-perf-python-arm64
ghcr.io/openhands/agent-server:ea32c95-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:ea32c95-golang
ghcr.io/openhands/agent-server:ea32c95df98989c3d9ada0f72770cab7dec2e1ef-golang
ghcr.io/openhands/agent-server:fix-3142-conversation-search-count-perf-golang
ghcr.io/openhands/agent-server:ea32c95-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:ea32c95-java
ghcr.io/openhands/agent-server:ea32c95df98989c3d9ada0f72770cab7dec2e1ef-java
ghcr.io/openhands/agent-server:fix-3142-conversation-search-count-perf-java
ghcr.io/openhands/agent-server:ea32c95-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:ea32c95-python
ghcr.io/openhands/agent-server:ea32c95df98989c3d9ada0f72770cab7dec2e1ef-python
ghcr.io/openhands/agent-server:fix-3142-conversation-search-count-perf-python
ghcr.io/openhands/agent-server:ea32c95-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

Each variant tag (e.g., ea32c95-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., ea32c95-python-amd64) are also available if needed

Both ``_search_conversations`` and ``_count_conversations`` previously iterated every entry in ``_event_services``, awaited ``get_state()`` on each, and composed a full ``ConversationInfo`` (which serializes the whole state via ``model_dump``) before applying pagination. That made ``GET /api/conversations`` and ``count`` O(N) in the total number of conversations regardless of ``limit``. - ``_search_conversations`` now sorts by ``stored.created_at`` / ``stored.updated_at`` (which already live on ``StoredConversation`` metadata), so we no longer need a state load to order results. Only the actual page slice composes ``ConversationInfo``. - ``_count_conversations`` short-circuits to ``len(_event_services)`` when no ``execution_status`` filter is supplied, skipping the state fan-out entirely. This is the minimum change needed to make these endpoints cheap at scale without introducing a separate sorted metadata index (the bigger refactor suggested in the issue can build on top of this). Refs #3142 Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-05-14T15:05:19Z

Python API breakage checks — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-05-14T15:05:33Z

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-05-14T15:06:59Z

Coverage Report •

File	Stmts	Miss	Cover	Missing
openhands-agent-server/openhands/agent_server
conversation_service.py	586	116	80%	144–145, 154, 180–181, 185–186, 191, 303–304, 335, 338, 345–351, 378, 384, 478, 484, 489, 495, 503–504, 513–516, 525, 537, 545, 568–569, 607–608, 612, 637–641, 643–644, 647–648, 651–656, 754, 761–765, 768–769, 773–777, 780–781, 785–789, 792–793, 815–816, 820–821, 823–825, 827, 830, 838–842, 845, 852–857, 859–860, 874, 884, 888, 890–891, 896–897, 903–904, 912, 927–928, 958, 973, 1001, 1295, 1298
TOTAL	26862	11684	56%

all-hands-bot

Clean performance optimization that decouples API cost from total conversation count. The key insight - sorting by StoredConversation metadata before loading full state - is well-executed.

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟢 LOW

This is a backend API performance optimization with comprehensive regression tests. The optimization avoids loading unnecessary conversation state while preserving all existing behavior: sort orders, pagination, and filtering semantics are unchanged. All 142 existing tests pass, and two new tests guard against performance regressions.

VERDICT:
✅ Worth merging: Solves a real O(N) performance issue with minimal, well-tested changes.

KEY INSIGHT:
Good data structure analysis: sorting metadata is already on StoredConversation, so full state deserialization can be deferred to the page slice.

all-hands-bot

✅ QA Report: PASS

Verified that conversation search and count operations use the optimized implementation and preserve all existing behavior contracts.

Does this PR achieve its stated goal?

Yes. This PR successfully reduces the computational cost of conversation search and count operations from O(N) to O(page_size) and O(1) respectively. The optimization eliminates unnecessary state loads by sorting on metadata fields (created_at/updated_at) that are already available on StoredConversation, and only composing full ConversationInfo for the actual page slice returned. The count endpoint with no filter now short-circuits to len(_event_services) without touching any conversation state. All existing behavior (sort orders, pagination, status filtering) is preserved.

Phase	Result
Environment Setup	✅ Dependencies installed, tests runnable
CI Status	✅ All checks passing (77 agent-server tests, 62 router tests, all other CI green)
Functional Verification	✅ All tests pass, optimization verified by regression tests

Functional Verification

Test 1: Regression tests verify the optimization

Step 1 — Run the new performance regression tests:

Ran uv run pytest tests/agent_server/test_conversation_service.py::TestConversationServiceSearchCountPerf -v:

tests/agent_server/test_conversation_service.py::TestConversationServiceSearchCountPerf::test_count_no_filter_skips_state_loads PASSED [ 50%]
tests/agent_server/test_conversation_service.py::TestConversationServiceSearchCountPerf::test_search_skips_state_for_non_page_items PASSED [100%]

======================== 2 passed, 5 warnings in 0.60s =========================

Interpretation: The two new regression tests explicitly verify:

test_count_no_filter_skips_state_loads — confirms that counting without a filter never calls get_state() on any conversation
test_search_skips_state_for_non_page_items — confirms that searching only calls get_state() on the conversations in the returned page slice, not all conversations

These tests use mocks to assert that get_state() is only called when necessary, proving the optimization is in place.

Test 2: Existing behavior is preserved

Step 1 — Run all conversation service tests:

Ran uv run pytest tests/agent_server/test_conversation_service.py -v:

======================== 77 passed, 5 warnings in 1.60s =========================

All 77 tests pass, including:

Sort order tests (CREATED_AT, CREATED_AT_DESC, UPDATED_AT, UPDATED_AT_DESC)
Pagination tests (page_id handling, next_page_id generation)
Status filtering tests (execution_status filtering)
Combined filter and sort tests

Step 2 — Run all conversation router tests (REST API):

Ran uv run pytest tests/agent_server/test_conversation_router.py -v:

======================== 62 passed, 5 warnings in 2.22s =========================

All 62 router tests pass, including:

test_search_conversations_default_params
test_search_conversations_with_all_params
test_count_conversations_no_filter
test_count_conversations_with_status_filter

Interpretation: The comprehensive test suite confirms that all API contracts are preserved. Sort orders, pagination, and filtering all work exactly as before. The optimization is a pure internal refactor with no externally observable behavior changes.

Test 3: Code inspection confirms the implementation

Reviewed the implementation in conversation_service.py:

_search_conversations changes:

Sorts candidates by stored.created_at or stored.updated_at directly from metadata (lines 421-425)
Only loads state and composes ConversationInfo for items in the page slice (lines 437-440)
Still loads state for status filtering when execution_status is provided (lines 415-418)

_count_conversations changes:

Fast path: returns len(self._event_services) immediately when no filter (lines 463-464)
Only iterates and loads state when execution_status filter is provided (lines 467-470)

Interpretation: The implementation matches the PR description exactly. Sorting uses metadata, state loading is deferred until needed, and count has a fast path.

Issues Found

None.

csmith49 marked this pull request as ready for review May 14, 2026 18:25

all-hands-bot approved these changes May 14, 2026

View reviewed changes

all-hands-bot reviewed May 14, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(agent-server): cut conversation search/count to page-slice work#3260

perf(agent-server): cut conversation search/count to page-slice work#3260
csmith49 wants to merge 1 commit into
mainfrom
fix/3142-conversation-search-count-perf

csmith49 commented May 14, 2026 •

edited by github-actions Bot

Loading

Uh oh!

github-actions Bot commented May 14, 2026

Uh oh!

github-actions Bot commented May 14, 2026

Uh oh!

github-actions Bot commented May 14, 2026

Uh oh!

all-hands-bot left a comment

Uh oh!

all-hands-bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

csmith49 commented May 14, 2026 • edited by github-actions Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Tests

Notes for reviewers

Uh oh!

github-actions Bot commented May 14, 2026

Python API breakage checks — ✅ PASSED

Uh oh!

github-actions Bot commented May 14, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Uh oh!

github-actions Bot commented May 14, 2026

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

✅ QA Report: PASS

Does this PR achieve its stated goal?

Test 1: Regression tests verify the optimization

Test 2: Existing behavior is preserved

Test 3: Code inspection confirms the implementation

Issues Found

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

csmith49 commented May 14, 2026 •

edited by github-actions Bot

Loading