Skip to content

Comments

Enable ACPAgent on RemoteRuntime API#2190

Draft
simonrosenberg wants to merge 18 commits intomainfrom
feat/acp-remote-runtime
Draft

Enable ACPAgent on RemoteRuntime API#2190
simonrosenberg wants to merge 18 commits intomainfrom
feat/acp-remote-runtime

Conversation

@simonrosenberg
Copy link
Collaborator

@simonrosenberg simonrosenberg commented Feb 23, 2026

Summary

  • Polymorphic agent deserialization in EventService so ACPAgent payloads work
  • Eager ACPAgent import in api.py to register in discriminated union
  • Agent lifecycle cleanup (close() on AgentBase, called from LocalConversation)
  • Pre-install claude-code-acp in Docker image
  • Add remote runtime example for ACPAgent

Test plan

  • Unit tests pass (72/72): test_acp_agent.py + test_agent_loading.py
  • CI image build (triggered by this PR)
  • SWE-bench validation with ACPAgent on 5 instances

🤖 Generated with Claude Code


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Architectures Base Image Docs / Tags
java amd64, arm64 eclipse-temurin:17-jdk Link
python amd64, arm64 nikolaik/python-nodejs:python3.12-nodejs22 Link
golang amd64, arm64 golang:1.21-bookworm Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:2b3a7ae-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-2b3a7ae-python \
  ghcr.io/openhands/agent-server:2b3a7ae-python

All tags pushed for this build

ghcr.io/openhands/agent-server:2b3a7ae-golang-amd64
ghcr.io/openhands/agent-server:2b3a7ae-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:2b3a7ae-golang-arm64
ghcr.io/openhands/agent-server:2b3a7ae-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:2b3a7ae-java-amd64
ghcr.io/openhands/agent-server:2b3a7ae-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:2b3a7ae-java-arm64
ghcr.io/openhands/agent-server:2b3a7ae-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:2b3a7ae-python-amd64
ghcr.io/openhands/agent-server:2b3a7ae-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:2b3a7ae-python-arm64
ghcr.io/openhands/agent-server:2b3a7ae-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:2b3a7ae-golang
ghcr.io/openhands/agent-server:2b3a7ae-java
ghcr.io/openhands/agent-server:2b3a7ae-python

About Multi-Architecture Support

  • Each variant tag (e.g., 2b3a7ae-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 2b3a7ae-python-amd64) are also available if needed

- Polymorphic agent deserialization in EventService (type(self.stored.agent)
  instead of hardcoded Agent)
- Eager import of ACPAgent in api.py to register in discriminated union
- Add close() lifecycle method to AgentBase, call from LocalConversation.close()
- Await conversation close in EventService to ensure subprocess cleanup
- Pre-install claude-code-acp in Docker image for ACPAgent support
- Add remote runtime example for ACPAgent

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

github-actions bot commented Feb 23, 2026

API breakage checks (Griffe)

Result: Passed

Action log

Copy link
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Good taste - Elegant solution that eliminates special cases

KEY INSIGHT: The polymorphic deserialization (agent_cls = type(self.stored.agent)) is textbook good design - it preserves type information naturally without conditionals, making ACPAgent work seamlessly alongside the base Agent class.

What I like:

  • Data structure wins: Using Python's type system to handle polymorphism instead of switch statements
  • Proper lifecycle management: The close() hook is clean extensibility (no-op by default, override where needed)
  • Correctness fix: The await on run_in_executor (line 609) - you actually need to wait for the cleanup
  • Solves real problem: ACPAgent subprocess cleanup is necessary, not theoretical

Minor notes:

  • The eager import in api.py is a necessary hack for Pydantic discriminated unions - acceptable pragmatism
  • Could add a type guard on agent_cls, but given stored.agent is pre-validated, this is safe

LGTM - Core logic is sound, changes are minimal and well-targeted.

assert isinstance(workspace, LocalWorkspace)
Path(workspace.working_dir).mkdir(parents=True, exist_ok=True)
agent = Agent.model_validate(
agent_cls = type(self.stored.agent)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Good taste: This polymorphic deserialization is elegant - preserves the concrete type naturally instead of hardcoding Agent.model_validate().

The pattern agent_cls = type(obj)agent_cls.model_validate() eliminates special cases and makes ACPAgent work seamlessly.

if self._conversation:
loop = asyncio.get_running_loop()
loop.run_in_executor(None, self._conversation.close)
await loop.run_in_executor(None, self._conversation.close)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Correctness fix: This await was missing - run_in_executor returns a Future that must be awaited. Good catch.

"""
return None

def close(self) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Clean extensibility: Hook pattern done right - no-op by default, subclasses override where needed. No forced coupling.

from openhands.agent_server.tool_router import tool_router
from openhands.agent_server.vscode_router import vscode_router
from openhands.agent_server.vscode_service import get_vscode_service
from openhands.sdk.agent import ACPAgent # noqa: F401 — register in DiscriminatedUnionMixin
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Pragmatic hack: The # noqa: F401 import is necessary for Pydantic discriminated union registration. Not elegant, but solves a real problem. Acceptable trade-off.

simonrosenberg and others added 2 commits February 23, 2026 16:57
The claude-code-acp npm install fails on golang and java base images
that don't have npm. Make it conditional on npm availability.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…yloads

With ACPAgent registered in the discriminated union, agent payloads
now require an explicit "kind" field. Updated 6 test payloads and
renumbered the ACP example from 07 to 09 to avoid duplicate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@simonrosenberg simonrosenberg marked this pull request as draft February 23, 2026 20:04
@github-actions
Copy link
Contributor

github-actions bot commented Feb 23, 2026

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-agent-server/openhands/agent_server
   api.py1591292%73, 85, 100, 106, 267, 270, 274–276, 278, 284, 325
   event_service.py3238773%55–56, 74–76, 80–85, 88–91, 106, 210, 229, 286–287, 291, 299, 302, 348–349, 365, 367, 371–373, 377, 386–387, 389, 393, 399, 401, 409–414, 551, 553–554, 558, 572–574, 576, 580–583, 587–590, 598–601, 620–621, 623–630, 632–633, 642–643, 645–646, 653–654, 656–657, 661, 667, 684–685
openhands-sdk/openhands/sdk/agent
   acp_agent.py3608077%179–181, 235–238, 240–241, 268, 270, 274, 280, 291–292, 297, 364, 456–457, 468, 473, 504, 514, 519, 530–533, 539–541, 544–546, 548, 550–551, 553, 555, 560, 569–570, 574–575, 579, 586–592, 602–606, 609, 611, 618–619, 625–629, 633, 635–637, 673, 677–678, 759–760, 770, 782–783, 870–871
   base.py1931990%200, 289, 293–297, 345–347, 357, 367, 375–376, 486, 523–524, 534–535
openhands-sdk/openhands/sdk/conversation/impl
   local_conversation.py3362193%277, 282, 310, 375, 417, 563–564, 567, 713, 721, 723, 727–728, 739, 741–743, 768, 930, 937–938
TOTAL19350508273% 

simonrosenberg and others added 14 commits February 23, 2026 17:37
The CI Agent Server workflow builds binary target images, not source.
Update the example to use target_type="binary" accordingly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add 'agent_type' input (default/acp) to run-eval.yml and pass it through
to the evaluation repo's eval-job workflow dispatch payload. This enables
dispatching SWE-bench evaluations using ACPAgent (Claude Code via ACP).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SWE-bench prebaked images use Python-only base images without Node.js.
The conditional `if command -v npm` silently skipped claude-code-acp
installation. Now installs Node.js from nodesource when npm is absent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ACPAgent now emits a FinishAction ActionEvent + ObservationEvent
after each step() completes. This makes it compatible with evaluation
frameworks (SWE-bench, etc.) that detect task completion by scanning
for FinishAction in the event history.

Without this, the benchmarks fake_user_response loop didn't detect
completion and sent up to 10 "please continue" messages, and the
AgentFinishedCritic marked results as failed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude Code ACP defaults to "ask" mode for tool permissions when
no settings file exists. This causes the agent to hang waiting for
interactive permission approval in headless/eval environments.

Add /etc/claude-code/managed-settings.json with allow rules for
Edit, Read, and Bash tools so Claude Code can operate without
human approval.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude Code's internal permission system blocks tools like Write
when running in default mode, causing the agent to hang waiting
for interactive approval. This happens even when the ACP server's
settings allow the tools, because Claude Code checks permissions
before making MCP tool calls.

Call set_session_mode("bypassPermissions") after creating the
session so Claude Code skips all permission checks in headless
mode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add acp_prompt_timeout field (default 1800s) to ACPAgent that wraps
the prompt() call with AsyncExecutor's timeout. This prevents the
eval from hanging forever when claude-code-acp fails to send the
JSON-RPC response after completing its work.

Root cause analysis: claude-code-acp's prompt() method awaits
sessionUpdate notifications end-to-end through the stdout pipe.
If the response is never sent (e.g., due to stdout backpressure
or an unhandled result subtype), the Python ACP client blocks
indefinitely.

Also fixes test assertions for FinishAction emission added in
d72fd08.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The default asyncio.StreamReader limit (64 KiB) is too small for ACP
session_update notifications that carry large tool-call outputs.  When
a JSON-RPC line exceeds the limit, readline() raises LimitOverrunError
which silently kills the filter pipeline, leaving the prompt() future
unresolved forever and eventually deadlocking the subprocess stdout
pipe.

Root cause confirmed on v10 eval: 59,496 bytes of unread data stuck
in the subprocess pipe, both processes sleeping in ep_poll.

Increase both the subprocess pipe and filtered_reader limits to 100 MiB.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Auto-detect ACP server type from InitializeResponse.agent_info.name
- Map session modes: claude-code → bypassPermissions, codex-acp → full-access
- Add acp_session_mode field for manual override
- Install codex-acp alongside claude-code-acp in Docker image
- Add acp-codex to run-eval.yml agent_type choices

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
codex-acp requires an explicit `conn.authenticate()` call after
initialize and before session creation. Auto-detect the auth method
from environment variables (CODEX_API_KEY or OPENAI_API_KEY).
Without this, codex-acp returns "Authentication required" when
trying to create a session in the remote runtime container.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Aligns naming with 'acp-codex': both now follow the
acp-{provider} pattern, making the agent type self-documenting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add elapsed time tracking to _record_usage() via add_response_latency()
  for both main prompt and fork (ask_agent) paths
- Capture agent_version from InitializeResponse.agent_info alongside
  agent_name
- Expose agent_name and agent_version as public properties on ACPAgent
  so benchmarks can include ACP server identity in eval output

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The _BYPASS_MODE_MAP only matched "claude-code" but the actual agent
name is "@zed-industries/claude-agent-acp", which doesn't contain that
substring. Add "claude-agent" as an additional key so the mode correctly
resolves to "bypassPermissions" instead of falling through to the
default "full-access" (which claude-agent-acp rejects).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@simonrosenberg simonrosenberg force-pushed the feat/acp-remote-runtime branch from 4176666 to b9c42fd Compare February 24, 2026 13:55
The npm package was renamed from @zed-industries/claude-code-acp to
@zed-industries/claude-agent-acp. Update all references: bypass mode
map key, Dockerfile, examples, tests, and docstrings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants