perf(agent-server): add idle conversation eviction to reduce memory #3256

Open
csmith49 wants to merge 1 commit into main from fix/idle-conversation-eviction-3141

Collaborator

@csmith49 csmith49 commented May 14, 2026

Summary

Addresses #3141 - Adds configurable eviction for finished conversations to prevent unbounded memory growth in long-running agent servers.

Problem

The _event_services dict holds every active conversation with no TTL, idle timeout, max size, or background GC task. A conversation that finishes its work remains fully loaded in memory until explicitly deleted via the API or the server restarts. This causes memory to grow monotonically in long-running servers.

Solution

Add a background eviction task that periodically checks and removes idle finished conversations from memory. Evicted conversations can be re-hydrated from disk on next access.
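
As a rough illustration of the background task described above, the following is a minimal sketch of a periodic asyncio loop. `EvictionLoop` and `run_cycle` are hypothetical names chosen for this example; they are not taken from the PR, whose actual task wiring may differ.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class EvictionLoop:
    """Hypothetical helper: awaits `run_cycle` once per interval until stopped."""

    def __init__(
        self,
        run_cycle: Callable[[], Awaitable[None]],
        interval_seconds: float = 60.0,  # the PR states a 60-second period
    ) -> None:
        self._run_cycle = run_cycle
        self._interval = interval_seconds
        self._task: Optional[asyncio.Task] = None

    def start(self) -> None:
        # Must be called from inside a running event loop.
        if self._task is None:
            self._task = asyncio.create_task(self._loop())

    async def stop(self) -> None:
        if self._task is None:
            return
        self._task.cancel()
        try:
            await self._task
        except asyncio.CancelledError:
            pass  # expected: we cancelled the task ourselves
        self._task = None

    async def _loop(self) -> None:
        while True:
            await asyncio.sleep(self._interval)
            try:
                await self._run_cycle()
            except Exception:
                # One failed cycle should not end the loop; real code would log here.
                pass
```

Cancellation is raised as `asyncio.CancelledError`, which is not a subclass of `Exception` on current Python versions, so the inner `except Exception` does not swallow the shutdown signal.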

New Configuration Options

| Option | Type | Default | Description |
|---|---|---|---|
| idle_timeout_seconds | int \| None | None | Time in seconds after which a finished/idle conversation will be evicted from memory. Minimum 60s. Set to None to disable. |
| max_loaded_conversations | int \| None | None | Maximum number of conversations to keep loaded in memory. When exceeded, the least recently active finished conversations are evicted first. Set to None to disable. |
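
The two options and the 60-second minimum can be modelled with a small standard-library sketch. `EvictionSettings` and `eviction_enabled` are hypothetical names for illustration only; the PR's own config class and validation mechanism are not shown on this page, and no minimum is assumed for `max_loaded_conversations` because none is stated.

```python
from dataclasses import dataclass
from typing import Optional

MIN_IDLE_TIMEOUT_SECONDS = 60  # minimum stated in the option description


@dataclass(frozen=True)
class EvictionSettings:
    """Hypothetical stand-in for the two new options; None disables a policy."""

    idle_timeout_seconds: Optional[int] = None
    max_loaded_conversations: Optional[int] = None

    def __post_init__(self) -> None:
        timeout = self.idle_timeout_seconds
        if timeout is not None and timeout < MIN_IDLE_TIMEOUT_SECONDS:
            raise ValueError(
                "idle_timeout_seconds must be greater than or equal to "
                f"{MIN_IDLE_TIMEOUT_SECONDS}, got {timeout}"
            )

    @property
    def eviction_enabled(self) -> bool:
        # The background task is only needed when at least one policy is set.
        return (
            self.idle_timeout_seconds is not None
            or self.max_loaded_conversations is not None
        )
```

With both fields left at their defaults, `eviction_enabled` is `False`, which matches the opt-in behaviour described above.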

Implementation Details

  • Background Task: The eviction loop runs every 60 seconds when either eviction policy is enabled
  • Eligibility: Only terminal-state conversations (FINISHED, ERROR, STUCK) are candidates for eviction
  • Priority: When evicting due to max_loaded_conversations, the terminal-state conversations with the oldest updated_at are evicted first
  • Preservation: Non-terminal conversations (RUNNING, IDLE, PAUSED, WAITING_FOR_CONFIRMATION) are never evicted, however long they have been inactive
  • Re-hydration: Evicted conversations remain on disk and are automatically re-loaded when the service restarts or when accessed
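
The eligibility and priority rules above can be sketched as a pure selection function. This is an illustration under assumptions, not the PR's code: `LoadedConversation`, `select_evictions`, the string status values, and the strict `<` comparison at the timeout boundary are all hypothetical choices.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

TERMINAL_STATES = {"FINISHED", "ERROR", "STUCK"}  # states listed in the PR


@dataclass
class LoadedConversation:
    """Hypothetical record of what the eviction pass needs to know."""

    status: str
    updated_at: datetime


def select_evictions(
    loaded: Dict[str, LoadedConversation],
    now: datetime,
    idle_timeout_seconds: Optional[int],
    max_loaded_conversations: Optional[int],
) -> List[str]:
    """Return the ids to evict, oldest first. Non-terminal ids are never returned."""
    # Candidates: terminal-state conversations only, ordered by oldest updated_at.
    candidates = sorted(
        (cid for cid, conv in loaded.items() if conv.status in TERMINAL_STATES),
        key=lambda cid: loaded[cid].updated_at,
    )
    evict: List[str] = []

    # Phase 1: idle timeout.
    if idle_timeout_seconds is not None:
        cutoff = now - timedelta(seconds=idle_timeout_seconds)
        evict = [cid for cid in candidates if loaded[cid].updated_at < cutoff]

    # Phase 2: size cap, applied to whatever phase 1 left loaded.
    if max_loaded_conversations is not None:
        excess = len(loaded) - len(evict) - max_loaded_conversations
        if excess > 0:
            already = set(evict)
            remaining = [cid for cid in candidates if cid not in already]
            evict.extend(remaining[:excess])

    return evict
```

Because only terminal-state conversations are candidates, the size cap is best-effort: if every loaded conversation is non-terminal, nothing is evicted even when the cap is exceeded.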

Example Usage

# Enable eviction after 5 minutes of inactivity
export OH_IDLE_TIMEOUT_SECONDS=300

# Or limit to max 100 loaded conversations
export OH_MAX_LOADED_CONVERSATIONS=100

# Or both (they work together)
export OH_IDLE_TIMEOUT_SECONDS=300
export OH_MAX_LOADED_CONVERSATIONS=100
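
For readers who want to see how an unset variable maps to a disabled policy, here is a minimal standard-library sketch. `read_optional_int` is a hypothetical helper; the agent server's own settings loader is not shown in this PR and may parse these variables differently.

```python
import os
from typing import Mapping, Optional


def read_optional_int(name: str, env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Return the variable as an int, or None when it is unset or empty."""
    raw = env.get(name, "").strip()
    return int(raw) if raw else None


# Unset variables yield None, which corresponds to "policy disabled".
idle_timeout_seconds = read_optional_int("OH_IDLE_TIMEOUT_SECONDS")
max_loaded_conversations = read_optional_int("OH_MAX_LOADED_CONVERSATIONS")
```

A non-numeric value raises `ValueError` from `int()`, so a misconfiguration surfaces at startup rather than being silently ignored.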

Testing

Added 10 unit tests covering:

  • Eviction task lifecycle (start/stop)
  • Idle timeout eviction
  • Max loaded conversations eviction
  • Preservation of non-terminal conversations
  • Combined policy behavior
  • Re-hydration from disk after eviction

All existing test_conversation_service.py tests pass.


This PR was created by an AI agent (OpenHands) on behalf of the user.


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
|---|---|---|---|
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22-slim | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:5a85df3-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-5a85df3-python \
  ghcr.io/openhands/agent-server:5a85df3-python

All tags pushed for this build

ghcr.io/openhands/agent-server:5a85df3-golang-amd64
ghcr.io/openhands/agent-server:5a85df30abcee0a45a6183347f7ea95c7a3ea0a1-golang-amd64
ghcr.io/openhands/agent-server:fix-idle-conversation-eviction-3141-golang-amd64
ghcr.io/openhands/agent-server:5a85df3-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:5a85df3-golang-arm64
ghcr.io/openhands/agent-server:5a85df30abcee0a45a6183347f7ea95c7a3ea0a1-golang-arm64
ghcr.io/openhands/agent-server:fix-idle-conversation-eviction-3141-golang-arm64
ghcr.io/openhands/agent-server:5a85df3-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:5a85df3-java-amd64
ghcr.io/openhands/agent-server:5a85df30abcee0a45a6183347f7ea95c7a3ea0a1-java-amd64
ghcr.io/openhands/agent-server:fix-idle-conversation-eviction-3141-java-amd64
ghcr.io/openhands/agent-server:5a85df3-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:5a85df3-java-arm64
ghcr.io/openhands/agent-server:5a85df30abcee0a45a6183347f7ea95c7a3ea0a1-java-arm64
ghcr.io/openhands/agent-server:fix-idle-conversation-eviction-3141-java-arm64
ghcr.io/openhands/agent-server:5a85df3-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:5a85df3-python-amd64
ghcr.io/openhands/agent-server:5a85df30abcee0a45a6183347f7ea95c7a3ea0a1-python-amd64
ghcr.io/openhands/agent-server:fix-idle-conversation-eviction-3141-python-amd64
ghcr.io/openhands/agent-server:5a85df3-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:5a85df3-python-arm64
ghcr.io/openhands/agent-server:5a85df30abcee0a45a6183347f7ea95c7a3ea0a1-python-arm64
ghcr.io/openhands/agent-server:fix-idle-conversation-eviction-3141-python-arm64
ghcr.io/openhands/agent-server:5a85df3-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:5a85df3-golang
ghcr.io/openhands/agent-server:5a85df30abcee0a45a6183347f7ea95c7a3ea0a1-golang
ghcr.io/openhands/agent-server:fix-idle-conversation-eviction-3141-golang
ghcr.io/openhands/agent-server:5a85df3-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:5a85df3-java
ghcr.io/openhands/agent-server:5a85df30abcee0a45a6183347f7ea95c7a3ea0a1-java
ghcr.io/openhands/agent-server:fix-idle-conversation-eviction-3141-java
ghcr.io/openhands/agent-server:5a85df3-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:5a85df3-python
ghcr.io/openhands/agent-server:5a85df30abcee0a45a6183347f7ea95c7a3ea0a1-python
ghcr.io/openhands/agent-server:fix-idle-conversation-eviction-3141-python
ghcr.io/openhands/agent-server:5a85df3-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

  • Each variant tag (e.g., 5a85df3-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 5a85df3-python-amd64) are also available if needed

Add configurable eviction for finished conversations to prevent unbounded
memory growth in long-running servers. This addresses issue #3141.

New configuration options:
- idle_timeout_seconds: Time after which a finished conversation will be
  evicted from memory (min 60s, default None/disabled)
- max_loaded_conversations: Maximum conversations to keep in memory; when
  exceeded, oldest finished conversations are evicted first (default None)

Implementation:
- Background eviction task runs every 60 seconds when either policy is enabled
- Only terminal-state conversations (FINISHED, ERROR, STUCK) are eligible
- Evicted conversations are saved to disk and can be rehydrated on next access
- Running/idle conversations are never evicted

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions
Contributor

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: PASSED

Action log

@github-actions
Contributor

Python API breakage checks — ✅ PASSED

Result: PASSED

Action log

@github-actions
Contributor

Coverage

Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| openhands-agent-server/openhands/agent_server | | | | |
| config.py | 72 | 2 | 97% | 29, 42 |
| conversation_service.py | 652 | 122 | 81% | 144–145, 154, 180–181, 185–186, 191, 303–304, 338, 341, 348–354, 381, 387, 483, 489, 494, 500, 508–509, 518–521, 530, 542, 550, 573–574, 612–613, 617, 642–646, 648–649, 652–653, 656–661, 759, 766–770, 773–774, 778–782, 785–786, 790–794, 797–798, 820–821, 825–826, 828–830, 832, 835, 843–847, 850, 857–862, 864–865, 879, 889, 893, 895–896, 901–902, 908–909, 917, 932–933, 970, 993, 1001, 1046, 1067, 1079–1080, 1101, 1131, 1425, 1428 |
| TOTAL | 26894 | 11691 | 56% | |

@csmith49 csmith49 marked this pull request as ready for review May 14, 2026 13:55
Collaborator

@all-hands-bot all-hands-bot left a comment

🟢 Good taste - Clean, pragmatic solution to unbounded memory growth

This PR addresses a real production issue with a well-designed implementation. The eviction logic is straightforward: a background task periodically removes idle finished conversations from memory while preserving active ones.

Strengths:

  • Opt-in with sensible defaults (both policies disabled by default)
  • Only evicts terminal-state conversations (FINISHED, ERROR, STUCK)
  • Defensive programming (None checks, safe dict operations)
  • Comprehensive test coverage (10 tests covering all scenarios)
  • Clean, readable code with appropriate logging
  • Re-hydration support for evicted conversations

[RISK ASSESSMENT]
⚠️ Risk Assessment: 🟡 MEDIUM

Changes core conversation service lifecycle with background state mutation. Risk is mitigated by:

  • Feature is opt-in (defaults to None, no behavior change unless explicitly enabled)
  • Only affects terminal conversations, never running/active ones
  • Comprehensive test coverage including edge cases
  • Defensive programming patterns throughout
  • No impact on agent decision-making, prompts, or benchmark behavior

The implementation accepts minor race condition inaccuracies in count tracking (between idle timeout and max loaded eviction phases) as a pragmatic trade-off for simplicity. This is acceptable for memory management where approximate enforcement is sufficient.

VERDICT:
Worth merging - Solves a real problem (monotonic memory growth in long-running servers) with a simple, testable solution.

KEY INSIGHT:
Good example of pragmatic systems programming - prioritizes simplicity and observability (counts, debug logging) over perfect accuracy in non-critical timing windows.

Collaborator

@all-hands-bot all-hands-bot left a comment

✅ QA Report: PASS

All eviction features work as designed. Conversations are correctly evicted based on idle timeout and max loaded limits, running conversations are preserved, and evicted conversations can be re-hydrated from disk.

Does this PR achieve its stated goal?

Yes. The PR set out to "add configurable eviction for finished conversations to prevent unbounded memory growth in long-running agent servers," and it delivers exactly that. I verified the implementation by:

  1. Creating multiple conversations with different states and idle times
  2. Manually triggering eviction cycles
  3. Confirming that only terminal-state conversations (FINISHED, ERROR, STUCK) are evicted
  4. Verifying that running/idle conversations are never evicted
  5. Testing re-hydration from disk after eviction
  6. Confirming environment variable configuration works correctly

The feature works end-to-end as documented.

| Phase | Result |
|---|---|
| Environment Setup | uv sync --dev succeeded |
| CI Status | ✅ All checks passing (build, pre-commit, API compatibility) |
| Functional Verification | ✅ All 4 functional scenarios verified + 10 unit tests passed |

Functional Verification

Test 1: Idle Timeout Eviction

Setup:

ConversationService(
    idle_timeout_seconds=60,
    max_loaded_conversations=None
)

Test execution:

  1. Created a conversation and marked it FINISHED
  2. Set updated_at to 2 minutes ago (beyond 60s timeout)
  3. Manually triggered _run_eviction_cycle()

Result:

✅ SUCCESS: Conversation was evicted from memory
Log: "Evicted 1 idle conversation(s) from memory; 0 remaining"

This confirms that conversations idle for longer than idle_timeout_seconds are correctly evicted.


Test 2: Max Loaded Conversations

Setup:

ConversationService(
    idle_timeout_seconds=None,
    max_loaded_conversations=2
)

Test execution:

  1. Created 3 conversations (exceeds max of 2)
  2. Marked all as FINISHED with different idle times:
    • Conv 1: 30 minutes idle (oldest)
    • Conv 2: 20 minutes idle
    • Conv 3: 10 minutes idle (newest)
  3. Triggered eviction cycle

Result:

✅ SUCCESS: Oldest conversation was evicted, 2 remain
Log: "Evicted 1 idle conversation(s) from memory; 2 remaining"

This confirms that when the max is exceeded, the oldest (most idle) finished conversation is evicted first.


Test 3: Running Conversations Not Evicted

Setup:

ConversationService(idle_timeout_seconds=60)

Test execution:

  1. Created a conversation in IDLE state (non-terminal)
  2. Set updated_at to 2 minutes ago (beyond timeout)
  3. Triggered eviction cycle

Result:

✅ SUCCESS: Running conversation was preserved

This confirms that non-terminal conversations (RUNNING, IDLE, PAUSED, WAITING_FOR_CONFIRMATION) are never evicted, regardless of how long they've been idle.


Test 4: Conversation Re-hydration

Test execution:

  1. Session 1: Created conversation, marked FINISHED, evicted it
  2. Session 2: Restarted ConversationService

Result:

✅ SUCCESS: Conversation was re-hydrated from disk
Log: "Resumed conversation 4d9ab504-756e-4f57-bd85-936c25cd6b6e from persistent storage"

This confirms that evicted conversations remain on disk and are automatically re-loaded when the service restarts.
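
The test above exercises re-loading on restart. The PR description also mentions re-loading on access; a minimal sketch of that pattern follows, assuming one JSON file per conversation. `ConversationCache`, its methods, and the file layout are hypothetical and are not the agent server's actual persistence format.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Optional


class ConversationCache:
    """Hypothetical in-memory cache backed by one JSON file per conversation."""

    def __init__(self, root: Path) -> None:
        self._root = root
        self._loaded: Dict[str, dict] = {}

    def put(self, cid: str, state: dict) -> None:
        # Write through to disk so eviction never loses state.
        self._loaded[cid] = state
        (self._root / f"{cid}.json").write_text(json.dumps(state))

    def evict(self, cid: str) -> bool:
        # Drops only the in-memory copy; the file on disk is left in place.
        return self._loaded.pop(cid, None) is not None

    def is_loaded(self, cid: str) -> bool:
        return cid in self._loaded

    def get(self, cid: str) -> Optional[dict]:
        state = self._loaded.get(cid)
        if state is None:
            path = self._root / f"{cid}.json"
            if not path.exists():
                return None  # unknown conversation
            state = json.loads(path.read_text())
            self._loaded[cid] = state  # re-hydrate on access
        return state


def demo() -> tuple:
    """Evict a conversation, then read it back through the cache."""
    with tempfile.TemporaryDirectory() as tmp:
        cache = ConversationCache(Path(tmp))
        cache.put("conv-1", {"status": "FINISHED"})
        cache.evict("conv-1")
        loaded_after_evict = cache.is_loaded("conv-1")
        state = cache.get("conv-1")
        return loaded_after_evict, state, cache.is_loaded("conv-1")
```

Writing through to disk in `put` is the design choice that makes eviction safe here: dropping the in-memory entry never discards data that has not already been persisted.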


Test 5: Environment Variable Configuration

Test execution:

export OH_IDLE_TIMEOUT_SECONDS=300
export OH_MAX_LOADED_CONVERSATIONS=100

Result:

✅ idle_timeout_seconds loaded correctly from env: 300
✅ max_loaded_conversations loaded correctly from env: 100

Environment variables work as documented in the PR description.


Test 6: Validation

Test execution:
Attempted to set OH_IDLE_TIMEOUT_SECONDS=30 (below minimum of 60)

Result:

✅ Validation correctly rejected idle_timeout_seconds=30
Error: "greater than or equal to 60"

The minimum 60-second constraint is properly enforced.


Unit Tests

Ran the 10 new unit tests in tests/agent_server/test_conversation_eviction.py:

test_eviction_task_not_started_when_disabled PASSED
test_eviction_task_started_with_idle_timeout PASSED
test_eviction_task_started_with_max_loaded PASSED
test_idle_timeout_evicts_finished_conversation PASSED
test_idle_timeout_does_not_evict_running_conversation PASSED
test_max_loaded_evicts_oldest_finished_first PASSED
test_eviction_preserves_non_terminal_conversations PASSED
test_eviction_loop_runs_periodically PASSED
test_evicted_conversation_can_be_rehydrated PASSED
test_combined_idle_timeout_and_max_loaded PASSED

10 passed in 0.53s

All tests pass, covering task lifecycle, eviction policies, preservation logic, and re-hydration.

Issues Found

None.
