Skip to content

Comments

feat(tools): task tool set#2143

Merged
VascoSch92 merged 9 commits intomainfrom
vasco/task-tool-set
Feb 25, 2026
Merged

feat(tools): task tool set#2143
VascoSch92 merged 9 commits intomainfrom
vasco/task-tool-set

Conversation

@VascoSch92
Copy link
Contributor

@VascoSch92 VascoSch92 commented Feb 20, 2026

Summary

Adding the TaskToolSet tool, but without the run_in_background feature, as discussed in #2100.

The arguments currently missing from this API (compared to the Claude Code task tool) are:

  • run_in_background
  • model

A difference between this version of TaskToolSet and DelegateTool is the ability to resume a task. The task_tool_set example demonstrates this mechanism.

Note
The tool description has been updated and tailored to reflect the current functionality (since background tasks are not supported).

(ref issue #2057)

Docs
OpenHands/docs#352

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
  • If there is an example, have you run the example to make sure that it works?
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
  • Is the github CI passing?

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Architectures Base Image Docs / Tags
java amd64, arm64 eclipse-temurin:17-jdk Link
python amd64, arm64 nikolaik/python-nodejs:python3.12-nodejs22 Link
golang amd64, arm64 golang:1.21-bookworm Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:a866cf8-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-a866cf8-python \
  ghcr.io/openhands/agent-server:a866cf8-python

All tags pushed for this build

ghcr.io/openhands/agent-server:a866cf8-golang-amd64
ghcr.io/openhands/agent-server:a866cf8-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:a866cf8-golang-arm64
ghcr.io/openhands/agent-server:a866cf8-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:a866cf8-java-amd64
ghcr.io/openhands/agent-server:a866cf8-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:a866cf8-java-arm64
ghcr.io/openhands/agent-server:a866cf8-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:a866cf8-python-amd64
ghcr.io/openhands/agent-server:a866cf8-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:a866cf8-python-arm64
ghcr.io/openhands/agent-server:a866cf8-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:a866cf8-golang
ghcr.io/openhands/agent-server:a866cf8-java
ghcr.io/openhands/agent-server:a866cf8-python

About Multi-Architecture Support

  • Each variant tag (e.g., a866cf8-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., a866cf8-python-amd64) are also available if needed

@github-actions
Copy link
Contributor

github-actions bot commented Feb 20, 2026

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-tools/openhands/tools/task
   definition.py562653%66, 72–74, 76–78, 80–81, 89, 94–95, 98, 100, 145, 193–194, 196–198, 200, 204–205, 207–208, 214
   manager.py1167337%64–66, 70–72, 79, 81, 84, 87–88, 92–93, 97, 101–104, 107–109, 111, 135–136, 138–139, 144, 150, 157–158, 163–165, 173, 180, 189–191, 199, 205, 215–216, 218–221, 223, 239–240, 242, 244–245, 247, 251–252, 254–257, 259–267, 269, 271, 275–276, 278
TOTAL18933962349% 

@VascoSch92 VascoSch92 marked this pull request as ready for review February 20, 2026 14:04
@enyst enyst added the review-this This label triggers a PR review by OpenHands label Feb 20, 2026
@VascoSch92 VascoSch92 requested a review from enyst February 20, 2026 23:35
@enyst enyst added review-this This label triggers a PR review by OpenHands and removed review-this This label triggers a PR review by OpenHands labels Feb 21, 2026
Copy link
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Acceptable - Feature works and solves a real problem (task resumption). Some data structure complexity and naming could be cleaner, but nothing blocking.

Key Insight: The state management spreads task tracking across multiple data structures when TaskState already has status - consolidating this would eliminate special case tracking.

@enyst
Copy link
Collaborator

enyst commented Feb 21, 2026

@OpenHands Do a /codereview-roasted on this PR. Take a look at some other tool too to see if we follow the patterns in the codebase.

Post your review and your answer as a comment to this PR. Note that it will be rendered as markdown.

Once you are done, take a step back and look at your review and diff with fresh eyes: section by section, was it more minor or major than it seemed to be? Reanalyze. When you're done, post your reanalysis as a follow-up comment.

@openhands-ai
Copy link

openhands-ai bot commented Feb 21, 2026

I'm on it! enyst can track my progress at all-hands.dev

This comment was marked as outdated.

Copy link
Collaborator

enyst commented Feb 21, 2026

Reanalysis (fresh eyes)

Stepping back: I probably came in hot on a couple of things that are more “taste” than “stop-ship”. Here’s how I’d re-rank it after a second pass.

Still major / correctness

  • Registering TaskTool in the global registry is a real runtime bug. This isn’t style — it’s an API contract violation with register_tool(...). If someone tries Tool(name="task"), it will fail tool resolution immediately. That’s genuinely merge-blocking.
  • TaskObservation.visualize error branch returning super().visualize also looks like plain wrong behavior (dead code / unexpected output). Not as catastrophic as the registry issue, but it’s still a correctness fix, not bikeshedding.

Probably not worth blocking (but worth considering)

  • The big inline TASK_TOOL_DESCRIPTION + # noqa: E501 is ugly, but it’s not inherently wrong. Switching to a template is a quality improvement, not a functional requirement.
  • Using tempfile.mkdtemp(...) for the resumable persistence dir is potentially inconsistent with other persistence paths, but in practice it might be totally acceptable for the initial iteration (especially if this is currently positioned as a standalone/experimental tool).
  • The _visualizer / _name access is a layering smell, but it’s also optional functionality (sender label) and could be tightened later if needed.

Net assessment

The core idea (ToolSet factory that creates a task tool backed by a shared manager, with resume support) is sound and matches existing patterns (BrowserToolSet). The PR is close — it just needs that registry fix (and ideally the visualize bug fix) to avoid shipping a broken API surface.

@openhands-ai

This comment was marked as duplicate.

@github-actions
Copy link
Contributor

github-actions bot commented Feb 23, 2026

API breakage checks (Griffe)

Result: Passed

Action log

Copy link
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟡 Acceptable - Feature works and solves a real problem (task resumption). Some data structure complexity and naming could be cleaner, but nothing blocking.

[IMPROVEMENT OPPORTUNITIES]

The previous review flagged an unresolved issue that's still present in manager.py (see inline comment).

VERDICT:
Worth merging - Adds useful task resumption capability with comprehensive tests. The defensive programming pattern is minor and doesn't affect correctness.

KEY INSIGHT:
Task resumption via conversation persistence is a solid feature. The state management is straightforward - the only complexity comes from defensive checks against known types.

@enyst
Copy link
Collaborator

enyst commented Feb 23, 2026

Thank you! LGTM

@simonrosenberg I'd love to know your thoughts on this PR

VascoSch92 and others added 3 commits February 24, 2026 13:31
Co-authored-by: simonrosenberg <157206163+simonrosenberg@users.noreply.github.com>
Co-authored-by: simonrosenberg <157206163+simonrosenberg@users.noreply.github.com>
@VascoSch92
Copy link
Contributor Author

Merging also if [Optional] Docs example /check-examples are not passing.

This is becausse of another example in the code which has not a documentation now.

The example added in the PR has docs.

@VascoSch92 VascoSch92 merged commit 6a74a8a into main Feb 25, 2026
28 of 30 checks passed
@VascoSch92 VascoSch92 deleted the vasco/task-tool-set branch February 25, 2026 09:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

review-this This label triggers a PR review by OpenHands

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants