perf(sdk): cache deserialized events in EventLog to eliminate O(N²) per-step cost #3263
csmith49 wants to merge 1 commit
Conversation
…tep cost

EventLog._get_single_item() and __iter__() called Event.model_validate_json() on every access, causing redundant Pydantic deserialization across the 3+ full-history passes that Agent.step() performs each step (get_unmatched_actions, View.from_events, enforce_properties). With N events and S steps this is O(N×S) deserialization calls — O(N²) total work per conversation.

Add an _event_cache dict[int, Event] that stores deserialized events by index. Since events are immutable once written, the cache is always valid:

- __getitem__ / _get_single_item: check cache before reading from disk
- __iter__: check cache before reading from disk; populate on miss
- append: cache the event directly (object already in hand)
- _scan_and_build_index: clear cache on full index rebuild

After the first iteration in a step, all subsequent passes hit the cache and skip both FileStore I/O and model_validate_json deserialization.

Partial fix for #3134

Co-authored-by: openhands <openhands@all-hands.dev>
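For readers who want the shape of the change, here is a minimal sketch of the caching pattern described above. It is not the SDK's actual EventLog: the Event model and the storage dict are simplified stand-ins for the real Event type and FileStore, and only the four integration points named in the commit message are shown.

```python
from pydantic import BaseModel


class Event(BaseModel):
    """Simplified stand-in for the SDK's Event model."""
    id: str
    content: str


class CachingEventLog:
    """Sketch of the pattern: cache deserialized events by index."""

    def __init__(self) -> None:
        self._store: dict[int, str] = {}          # stand-in for FileStore: index -> JSON
        self._event_cache: dict[int, Event] = {}  # index -> deserialized Event
        self._length = 0

    def append(self, event: Event) -> None:
        # The object is already in hand, so cache it alongside the write.
        self._store[self._length] = event.model_dump_json()
        self._event_cache[self._length] = event
        self._length += 1

    def __len__(self) -> int:
        return self._length

    def __getitem__(self, index: int) -> Event:
        # Check the cache before reading from storage and re-running Pydantic.
        cached = self._event_cache.get(index)
        if cached is not None:
            return cached
        event = Event.model_validate_json(self._store[index])
        self._event_cache[index] = event
        return event

    def __iter__(self):
        for i in range(self._length):
            yield self[i]  # a miss deserializes once; later passes hit the cache

    def _scan_and_build_index(self) -> None:
        # A full index rebuild may change the index -> event mapping, so drop the cache.
        self._event_cache.clear()
```

Because events are append-only and immutable, a cached object can never go stale except across an index rebuild, which is exactly the one place the cache is cleared.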
Python API breakage checks — ✅ PASSED
REST API breakage checks (OpenAPI) — ✅ PASSED
all-hands-bot
left a comment
🟢 Good taste - Clean performance fix that eliminates O(N²) deserialization cost.
Key strengths:
- Simple cache implementation with correct invalidation (cleared on index rebuild)
- Events are immutable once written, so caching is safe
- Tests verify both cached and uncached paths appropriately
- No API changes or breaking behavior
Risk Assessment (Overall PR): 🟢 LOW
Internal performance optimization with no behavioral changes. Cache grows with conversation length but this is an acceptable tradeoff for eliminating quadratic deserialization cost. Good test coverage ensures correctness.
VERDICT:
✅ Worth merging - Solves real O(N²) problem with minimal, correct code.
KEY INSIGHT:
Since events are immutable after write, caching deserialized objects is safe and eliminates repeated JSON parsing overhead - a textbook performance win.
all-hands-bot
left a comment
✅ QA Report: PASS
EventLog caching implementation successfully eliminates O(N²) deserialization cost with zero behavioral changes.
Does this PR achieve its stated goal?
Yes. The PR set out to "cache deserialized events in EventLog to eliminate O(N²) per-step cost" (issue #3134). Functional verification confirms:
- Caching works correctly: Events are cached on first access (append or iteration), and subsequent accesses return the exact same object (identity check passes).
- Performance improvement delivered: With caching, Agent.step()'s 3+ full-history passes per step now achieve a 66.7% cache hit rate. The second and third passes hit the cache entirely, eliminating redundant model_validate_json() calls.
- O(N²) → O(N) transformation: Before the fix, N events × S steps = O(N×S) deserializations. After, each event is deserialized once regardless of how many passes scan the history.
- Zero behavioral changes: All 21 EventLog tests pass, integration with Conversation works correctly, and cache invalidation on index rebuild is handled properly.
The cache implementation is sound: events are immutable once written, so caching is safe. Cache clearing on _scan_and_build_index() ensures consistency after index rebuilds.
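That invalidation behaviour can be illustrated with the CachingEventLog stand-in from the sketch earlier in this PR; this is an illustration of the described semantics, not the SDK's test code.

```python
# After an index rebuild the cache is empty, so the next access
# re-deserializes instead of returning a possibly stale object.
log = CachingEventLog()
log.append(Event(id="e1", content="hello"))

before = log[0]
log._scan_and_build_index()            # simulates a full index rebuild
assert len(log._event_cache) == 0      # cache cleared
after = log[0]                         # re-read and re-deserialized from storage
assert after == before and after is not before
```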
| Phase | Result |
|---|---|
| Environment Setup | ✅ Dependencies installed, project builds successfully |
| CI Status | ✅ All critical tests passing (sdk-tests, tools-tests, workspace-tests, agent-server-tests, pre-commit, API breakage checks) |
| Functional Verification | ✅ 6/6 verification tests passed; performance benchmark confirms caching effectiveness |
Functional Verification
Test 1: Cache Identity
Step 1 — Verify caching works:
Ran custom verification script that creates an EventLog, appends an event, and accesses it multiple times:
event = create_test_event("test-1", "Hello World")
log.append(event)
first = log[0]
second = log[0]
third = log[0]Result:
✓ PASS: Repeated access returns cached object (identity check passed)
Object ID: 140237719028432
This confirms that all three accesses returned the exact same Python object (same memory address), proving no re-deserialization occurred.
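For reference, the identity check behind that result is a single assertion; a minimal sketch reusing `first`, `second`, and `third` from the snippet above (`is` compares object identity, not value equality):

```python
# True only if all three lookups returned the same in-memory object,
# i.e. the second and third access skipped deserialization entirely.
assert first is second is third
print(f"Object ID: {id(first)}")
```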
Test 2: Iteration Populates Cache
Step 1 — Verify iteration caching:
Created EventLog with 5 events, cleared cache, then iterated:
```python
log._event_cache.clear()  # Force cold iteration
events_from_iter = list(log)
```

Result:
Cache cleared. Size: 0
After iteration, cache size: 5
✓ PASS: Indexed access after iteration returns cached objects
This confirms iteration populates the cache, and subsequent indexed access (log[i]) returns the same cached objects.
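A sketch of how that cross-check can be expressed, assuming the same `log` and `events_from_iter` as in the snippet above:

```python
# After one cold iteration, each indexed access should return the exact
# object that the iteration produced and cached.
for i, cached in enumerate(events_from_iter):
    assert log[i] is cached
assert len(log._event_cache) == len(events_from_iter)  # cache fully populated
```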
Test 3: Cache Persistence
Step 1 — Verify cache survives multiple iterations:
Ran three consecutive iterations and verified object identity:
```python
first_pass = list(log)
second_pass = list(log)
third_pass = list(log)
```

Result:
✓ PASS: Multiple iterations return same cached objects
All three iterations returned identical objects (identity check), confirming the cache persists correctly.
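The pairwise identity check could be written as follows (same `first_pass`, `second_pass`, and `third_pass` as above):

```python
# Every position must hold the identical object across all three passes.
for a, b, c in zip(first_pass, second_pass, third_pass):
    assert a is b is c
```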
Test 4: Append Caches Directly
Step 1 — Verify append() caches the event:
Created an event, appended it, then retrieved it:
event = create_test_event("original", "Original content")
log.append(event)
retrieved = log[0]Result:
✓ PASS: Append caches the event object directly
Original: 140237719028432, Retrieved: 140237719028432
The appended and retrieved objects have the same ID, confirming append() caches directly without a disk roundtrip.
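That check reduces to one identity assertion (same `event` and `retrieved` as in the snippet above):

```python
# append() stored the object itself, so retrieval must return that exact
# object rather than a copy deserialized from disk.
assert retrieved is event
print(f"Original: {id(event)}, Retrieved: {id(retrieved)}")
```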
Test 5: Performance Improvement
Step 1 — Simulate Agent.step() behavior:
Created EventLog with 50 events and performed 3 full-history passes:
```python
pass1 = list(log)  # First pass: cache miss, deserialize from disk
pass2 = list(log)  # Second pass: cache hit
pass3 = list(log)  # Third pass: cache hit
```

Result:
✓ Completed 3 full passes over 50 events in 0.0000s
Pass 1: 50 events
Pass 2: 50 events
Pass 3: 50 events
✓ PASS: All passes used cached objects (no re-deserialization)
All three passes returned identical cached objects. The near-zero time confirms caching eliminates deserialization overhead.
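As an illustration of how such a measurement could be taken, here is a rough timing sketch built on the CachingEventLog and Event stand-ins from earlier in this PR; absolute numbers will differ from the benchmark results below.

```python
import time

# Mirror Agent.step()'s three full-history passes over a 50-event log.
log = CachingEventLog()
for i in range(50):
    log.append(Event(id=f"e{i}", content=f"event {i}"))
log._event_cache.clear()        # simulate a fresh process: pass 1 is cold

start = time.perf_counter()
pass1 = list(log)               # cold: deserializes each event once
pass2 = list(log)               # warm: served entirely from _event_cache
pass3 = list(log)               # warm
elapsed = time.perf_counter() - start

assert all(a is b is c for a, b, c in zip(pass1, pass2, pass3))
print(f"3 passes over {len(pass1)} events in {elapsed:.4f}s")
```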
Performance Benchmark Results:
Ran performance benchmark with varying conversation sizes:
| Events | Total Reads | Time (s) | Events/sec |
|---|---|---|---|
| 10 | 50 | 0.000011 | 4,660,338 |
| 25 | 125 | 0.000010 | 12,787,512 |
| 50 | 250 | 0.000014 | 18,078,897 |
| 100 | 500 | 0.000025 | 19,972,876 |
Cache Effectiveness Over Time:
| Step | Events | Cache Size | Cache Hit % |
|---|---|---|---|
| 1 | 5 | 5 | 66.7 |
| 2 | 10 | 10 | 66.7 |
| 3 | 20 | 20 | 66.7 |
| 4 | 30 | 30 | 66.7 |
| 5 | 50 | 50 | 66.7 |
The 66.7% hit rate is expected: first pass misses cache (deserializes), subsequent passes hit cache (2/3 = 66.7%).
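The arithmetic behind that figure, as a small sketch:

```python
# With P full passes over N events per step, only the first pass misses:
# hits = (P - 1) * N out of P * N total reads, i.e. a (P - 1) / P hit rate.
passes, events = 3, 50
hits, total = (passes - 1) * events, passes * events
print(f"hit rate = {hits}/{total} = {hits / total:.1%}")  # -> 66.7%
```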
Object Identity Across Accesses:
| Index | Access 1 | Access 2 | Same Object? |
|---|---|---|---|
| 0 | 139698293671488 | 139698293671488 | ✓ YES |
| 1 | 139697644551808 | 139697644551808 | ✓ YES |
| 2 | 139697644549888 | 139697644549888 | ✓ YES |
| 3 | 139697644551088 | 139697644551088 | ✓ YES |
| 4 | 139697644549408 | 139697644549408 | ✓ YES |
Every indexed access returned the exact same object, confirming caching works correctly.
Test 6: Integration with Conversation
Step 1 — Verify caching in real Conversation context:
Created a full Conversation with Agent and sent a message:
```python
conversation = Conversation(agent=agent, workspace="/tmp")
conversation.send_message("Hello!")
event_log = conversation._state.events
```

Result:
✓ PASS: Conversation EventLog caching works (2 events)
The EventLog used by Conversation correctly cached events, and repeated iterations returned identical objects.
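A sketch of that integration check, reusing `event_log` from the Conversation snippet above (agent construction omitted; the `_state.events` access mirrors the verification script):

```python
# Two consecutive iterations over the conversation's EventLog should yield
# the identical cached objects, with no re-deserialization in between.
first_pass = list(event_log)
second_pass = list(event_log)
assert len(first_pass) == 2                                  # per the result above
assert all(a is b for a, b in zip(first_pass, second_pass))
```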
Issues Found
None.
This QA report was created by an AI agent (OpenHands) on behalf of the user.
Summary
Partial fix for #3134 — perf: O(N²) total cost per conversation from full-history re-scan every step (from tracking issue #3153).
Problem
EventLog._get_single_item() and __iter__() call Event.model_validate_json() on every access — full Pydantic deserialization from disk each time. Agent.step() performs 3+ full-history passes per step (get_unmatched_actions, View.from_events, enforce_properties), so the same events are deserialized multiple times per step. With N events and S steps this is O(N×S) deserialization calls — O(N²) total work per conversation.
Solution
Add an _event_cache: dict[int, Event] to EventLog that stores deserialized events by index. Since events are immutable once written, the cache is always valid.
Cache integration points
- _get_single_item
- __iter__
- append
- _scan_and_build_index
After the first iteration in a step, all subsequent passes (View construction, property enforcement, unmatched action scan) hit the cache and skip both FileStore I/O and model_validate_json deserialization entirely.
Changes
- openhands-sdk/.../conversation/event_store.py: add _event_cache dict; integrate into _get_single_item, __iter__, append, _scan_and_build_index
- tests/sdk/conversation/test_event_store.py
Testing
This PR was created by an AI agent (OpenHands) on behalf of the user.
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- eclipse-temurin:17-jdk
- nikolaik/python-nodejs:python3.13-nodejs22-slim
- golang:1.21-bookworm

Pull (multi-arch manifest)

```bash
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:dd432c3-python
```

Run
All tags pushed for this build
About Multi-Architecture Support
- dd432c3-python is a multi-arch manifest supporting both amd64 and arm64
- Architecture-specific tags (dd432c3-python-amd64) are also available if needed