Enable ACPAgent on RemoteRuntime API by simonrosenberg · Pull Request #2190 · OpenHands/software-agent-sdk

simonrosenberg · 2026-02-23T19:48:06Z

Summary

Polymorphic agent deserialization in EventService so ACPAgent payloads work
Eager ACPAgent import in api.py to register in discriminated union
Agent lifecycle cleanup (close() on AgentBase, called from LocalConversation)
Pre-install claude-code-acp in Docker image
Add remote runtime example for ACPAgent

Test plan

Unit tests pass (72/72): test_acp_agent.py + test_agent_loading.py
CI image build (triggered by this PR)
SWE-bench validation with ACPAgent on 5 instances

🤖 Generated with Claude Code

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.12-nodejs22`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:2b3a7ae-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-2b3a7ae-python \
  ghcr.io/openhands/agent-server:2b3a7ae-python

All tags pushed for this build

ghcr.io/openhands/agent-server:2b3a7ae-golang-amd64
ghcr.io/openhands/agent-server:2b3a7ae-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:2b3a7ae-golang-arm64
ghcr.io/openhands/agent-server:2b3a7ae-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:2b3a7ae-java-amd64
ghcr.io/openhands/agent-server:2b3a7ae-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:2b3a7ae-java-arm64
ghcr.io/openhands/agent-server:2b3a7ae-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:2b3a7ae-python-amd64
ghcr.io/openhands/agent-server:2b3a7ae-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:2b3a7ae-python-arm64
ghcr.io/openhands/agent-server:2b3a7ae-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:2b3a7ae-golang
ghcr.io/openhands/agent-server:2b3a7ae-java
ghcr.io/openhands/agent-server:2b3a7ae-python

About Multi-Architecture Support

Each variant tag (e.g., 2b3a7ae-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 2b3a7ae-python-amd64) are also available if needed

- Polymorphic agent deserialization in EventService (type(self.stored.agent) instead of hardcoded Agent) - Eager import of ACPAgent in api.py to register in discriminated union - Add close() lifecycle method to AgentBase, call from LocalConversation.close() - Await conversation close in EventService to ensure subprocess cleanup - Pre-install claude-code-acp in Docker image for ACPAgent support - Add remote runtime example for ACPAgent Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-02-23T19:48:56Z

API breakage checks (Griffe)

Result: Passed

Action log

all-hands-bot

🟢 Good taste - Elegant solution that eliminates special cases

KEY INSIGHT: The polymorphic deserialization (agent_cls = type(self.stored.agent)) is textbook good design - it preserves type information naturally without conditionals, making ACPAgent work seamlessly alongside the base Agent class.

What I like:

Data structure wins: Using Python's type system to handle polymorphism instead of switch statements
Proper lifecycle management: The close() hook is clean extensibility (no-op by default, override where needed)
Correctness fix: The await on run_in_executor (line 609) - you actually need to wait for the cleanup
Solves real problem: ACPAgent subprocess cleanup is necessary, not theoretical

Minor notes:

The eager import in api.py is a necessary hack for Pydantic discriminated unions - acceptable pragmatism
Could add a type guard on agent_cls, but given stored.agent is pre-validated, this is safe

✅ LGTM - Core logic is sound, changes are minimal and well-targeted.

all-hands-bot · 2026-02-23T19:50:38Z

openhands-agent-server/openhands/agent_server/event_service.py

        assert isinstance(workspace, LocalWorkspace)
        Path(workspace.working_dir).mkdir(parents=True, exist_ok=True)
-        agent = Agent.model_validate(
+        agent_cls = type(self.stored.agent)


🟢 Good taste: This polymorphic deserialization is elegant - preserves the concrete type naturally instead of hardcoding Agent.model_validate().

The pattern agent_cls = type(obj) → agent_cls.model_validate() eliminates special cases and makes ACPAgent work seamlessly.

all-hands-bot · 2026-02-23T19:50:38Z

openhands-agent-server/openhands/agent_server/event_service.py

        if self._conversation:
            loop = asyncio.get_running_loop()
-            loop.run_in_executor(None, self._conversation.close)
+            await loop.run_in_executor(None, self._conversation.close)


🟢 Correctness fix: This await was missing - run_in_executor returns a Future that must be awaited. Good catch.

all-hands-bot · 2026-02-23T19:50:38Z

openhands-sdk/openhands/sdk/agent/base.py

        """
        return None
+
+    def close(self) -> None:


🟢 Clean extensibility: Hook pattern done right - no-op by default, subclasses override where needed. No forced coupling.

all-hands-bot · 2026-02-23T19:50:38Z

openhands-agent-server/openhands/agent_server/api.py

 from openhands.agent_server.tool_router import tool_router
 from openhands.agent_server.vscode_router import vscode_router
 from openhands.agent_server.vscode_service import get_vscode_service
+from openhands.sdk.agent import ACPAgent  # noqa: F401  — register in DiscriminatedUnionMixin


🟡 Pragmatic hack: The # noqa: F401 import is necessary for Pydantic discriminated union registration. Not elegant, but solves a real problem. Acceptable trade-off.

The claude-code-acp npm install fails on golang and java base images that don't have npm. Make it conditional on npm availability. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…yloads With ACPAgent registered in the discriminated union, agent payloads now require an explicit "kind" field. Updated 6 test payloads and renumbered the ACP example from 07 to 09 to avoid duplicate. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-02-23T20:06:48Z

Coverage Report •

File	Stmts	Miss	Cover	Missing
openhands-agent-server/openhands/agent_server
api.py	159	12	92%	73, 85, 100, 106, 267, 270, 274–276, 278, 284, 325
event_service.py	323	87	73%	55–56, 74–76, 80–85, 88–91, 106, 210, 229, 286–287, 291, 299, 302, 348–349, 365, 367, 371–373, 377, 386–387, 389, 393, 399, 401, 409–414, 551, 553–554, 558, 572–574, 576, 580–583, 587–590, 598–601, 620–621, 623–630, 632–633, 642–643, 645–646, 653–654, 656–657, 661, 667, 684–685
openhands-sdk/openhands/sdk/agent
acp_agent.py	360	80	77%	179–181, 235–238, 240–241, 268, 270, 274, 280, 291–292, 297, 364, 456–457, 468, 473, 504, 514, 519, 530–533, 539–541, 544–546, 548, 550–551, 553, 555, 560, 569–570, 574–575, 579, 586–592, 602–606, 609, 611, 618–619, 625–629, 633, 635–637, 673, 677–678, 759–760, 770, 782–783, 870–871
base.py	193	19	90%	200, 289, 293–297, 345–347, 357, 367, 375–376, 486, 523–524, 534–535
openhands-sdk/openhands/sdk/conversation/impl
local_conversation.py	336	21	93%	277, 282, 310, 375, 417, 563–564, 567, 713, 721, 723, 727–728, 739, 741–743, 768, 930, 937–938
TOTAL	19350	5082	73%

The CI Agent Server workflow builds binary target images, not source. Update the example to use target_type="binary" accordingly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add 'agent_type' input (default/acp) to run-eval.yml and pass it through to the evaluation repo's eval-job workflow dispatch payload. This enables dispatching SWE-bench evaluations using ACPAgent (Claude Code via ACP). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

SWE-bench prebaked images use Python-only base images without Node.js. The conditional `if command -v npm` silently skipped claude-code-acp installation. Now installs Node.js from nodesource when npm is absent. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ACPAgent now emits a FinishAction ActionEvent + ObservationEvent after each step() completes. This makes it compatible with evaluation frameworks (SWE-bench, etc.) that detect task completion by scanning for FinishAction in the event history. Without this, the benchmarks fake_user_response loop didn't detect completion and sent up to 10 "please continue" messages, and the AgentFinishedCritic marked results as failed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Claude Code ACP defaults to "ask" mode for tool permissions when no settings file exists. This causes the agent to hang waiting for interactive permission approval in headless/eval environments. Add /etc/claude-code/managed-settings.json with allow rules for Edit, Read, and Bash tools so Claude Code can operate without human approval. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Claude Code's internal permission system blocks tools like Write when running in default mode, causing the agent to hang waiting for interactive approval. This happens even when the ACP server's settings allow the tools, because Claude Code checks permissions before making MCP tool calls. Call set_session_mode("bypassPermissions") after creating the session so Claude Code skips all permission checks in headless mode. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add acp_prompt_timeout field (default 1800s) to ACPAgent that wraps the prompt() call with AsyncExecutor's timeout. This prevents the eval from hanging forever when claude-code-acp fails to send the JSON-RPC response after completing its work. Root cause analysis: claude-code-acp's prompt() method awaits sessionUpdate notifications end-to-end through the stdout pipe. If the response is never sent (e.g., due to stdout backpressure or an unhandled result subtype), the Python ACP client blocks indefinitely. Also fixes test assertions for FinishAction emission added in d72fd08. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The default asyncio.StreamReader limit (64 KiB) is too small for ACP session_update notifications that carry large tool-call outputs. When a JSON-RPC line exceeds the limit, readline() raises LimitOverrunError which silently kills the filter pipeline, leaving the prompt() future unresolved forever and eventually deadlocking the subprocess stdout pipe. Root cause confirmed on v10 eval: 59,496 bytes of unread data stuck in the subprocess pipe, both processes sleeping in ep_poll. Increase both the subprocess pipe and filtered_reader limits to 100 MiB. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Auto-detect ACP server type from InitializeResponse.agent_info.name - Map session modes: claude-code → bypassPermissions, codex-acp → full-access - Add acp_session_mode field for manual override - Install codex-acp alongside claude-code-acp in Docker image - Add acp-codex to run-eval.yml agent_type choices Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

codex-acp requires an explicit `conn.authenticate()` call after initialize and before session creation. Auto-detect the auth method from environment variables (CODEX_API_KEY or OPENAI_API_KEY). Without this, codex-acp returns "Authentication required" when trying to create a session in the remote runtime container. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Aligns naming with 'acp-codex': both now follow the acp-{provider} pattern, making the agent type self-documenting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add elapsed time tracking to _record_usage() via add_response_latency() for both main prompt and fork (ask_agent) paths - Capture agent_version from InitializeResponse.agent_info alongside agent_name - Expose agent_name and agent_version as public properties on ACPAgent so benchmarks can include ACP server identity in eval output Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The _BYPASS_MODE_MAP only matched "claude-code" but the actual agent name is "@zed-industries/claude-agent-acp", which doesn't contain that substring. Add "claude-agent" as an additional key so the mode correctly resolves to "bypassPermissions" instead of falling through to the default "full-access" (which claude-agent-acp rejects). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The npm package was renamed from @zed-industries/claude-code-acp to @zed-industries/claude-agent-acp. Update all references: bypass mode map key, Dockerfile, examples, tests, and docstrings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

all-hands-bot approved these changes Feb 23, 2026

View reviewed changes

simonrosenberg and others added 2 commits February 23, 2026 16:57

Fix Dockerfile: conditional npm install for non-Node.js base images

a01d2d8

The claude-code-acp npm install fails on golang and java base images that don't have npm. Make it conditional on npm availability. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

simonrosenberg marked this pull request as draft February 23, 2026 20:04

simonrosenberg and others added 14 commits February 23, 2026 17:37

Fix example: use binary target type to match CI-built images

aa3bad9

The CI Agent Server workflow builds binary target images, not source. Update the example to use target_type="binary" accordingly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix line length in AgentBase.close() docstring

dd7737b

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

rename agent_type 'acp' to 'acp-claude' for consistency

3d9803a

Aligns naming with 'acp-codex': both now follow the acp-{provider} pattern, making the agent type self-documenting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

simonrosenberg force-pushed the feat/acp-remote-runtime branch from 4176666 to b9c42fd Compare February 24, 2026 13:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

Enable ACPAgent on RemoteRuntime API#2190

Enable ACPAgent on RemoteRuntime API#2190
simonrosenberg wants to merge 18 commits intomainfrom
feat/acp-remote-runtime

simonrosenberg commented Feb 23, 2026 •

edited by github-actions bot

Loading

Uh oh!

github-actions bot commented Feb 23, 2026 •

edited

Loading

Uh oh!

all-hands-bot left a comment

Uh oh!

all-hands-bot Feb 23, 2026

Uh oh!

all-hands-bot Feb 23, 2026

Uh oh!

all-hands-bot Feb 23, 2026

Uh oh!

all-hands-bot Feb 23, 2026

Uh oh!

github-actions bot commented Feb 23, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Comments

Conversation

simonrosenberg commented Feb 23, 2026 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

github-actions bot commented Feb 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

API breakage checks (Griffe)

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

all-hands-bot Feb 23, 2026

Choose a reason for hiding this comment

Uh oh!

all-hands-bot Feb 23, 2026

Choose a reason for hiding this comment

Uh oh!

all-hands-bot Feb 23, 2026

Choose a reason for hiding this comment

Uh oh!

all-hands-bot Feb 23, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Feb 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

simonrosenberg commented Feb 23, 2026 •

edited by github-actions bot

Loading

github-actions bot commented Feb 23, 2026 •

edited

Loading

github-actions bot commented Feb 23, 2026 •

edited

Loading