
[Feature] Agentic toolkit foundation: protocols, parsers, sandbox, REPL #3735

Open
vmoens wants to merge 1 commit into gh/vmoens/266/base from gh/vmoens/266/head

Conversation

@vmoens
Collaborator

@vmoens vmoens commented May 10, 2026

Stack from ghstack (oldest at bottom):

Lands the substrate for an async-first, sandboxed tool-calling stack
under torchrl.envs.llm.agentic. ChatEnv is unchanged; the orchestrator
(ToolCompose) and built-in tools come in follow-up commits.

This commit ships:

  • Protocols and value types: Tool, ToolContext, ToolResult (with
    TextPart/JsonPart/ImagePart/FileRefPart), ToolCallParser, ParsedCall,
    ParseResult, with a stable call_id invariant pinned across all
    parsers.
  • JSON Schema helpers (validate_args, json_schema_from_pydantic).
  • Four pluggable parsers: XMLToolCallParser, JSONToolCallParser,
    OpenAIToolCallParser, AnthropicToolUseParser. All round-trip
    parse->render_call and produce stable call_ids.
  • Pluggable Sandbox with file-per-backend layout: BubblewrapSandbox
    (Linux default), SeatbeltSandbox (macOS default),
    UnsafeSubprocessSandbox (warning-loud fallback), and stubs for
    DockerSandbox / E2BSandbox / ModalSandbox tracked in the package
    TODO list. ResourceLimits.narrow() enforces that per-call limits can
    only tighten construction limits.
  • Pluggable Repl: JupyterRepl (gated _has_jupyter_client; rich
    display, kernel restart, interrupt) and SubprocessRepl (no extra
    dep; persistent variables, error capture, timeout).
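
One way to picture the stable call_id and round-trip properties listed above, as a minimal sketch only: `ToyJSONParser`, `stable_call_id`, and the `ParsedCall` stand-in below are hypothetical and are not the PR's implementation. The payload shape is borrowed from the JSON example quoted later in the review; deriving ids by hashing is an assumption, not something the PR states.

```python
import hashlib
import json
from typing import List, NamedTuple


class ParsedCall(NamedTuple):
    # Hypothetical stand-in for the PR's ParsedCall; the real type may differ.
    name: str
    args: dict
    call_id: str


def stable_call_id(name: str, args: dict, index: int) -> str:
    # Hash the canonical JSON of (position, name, args): same input, same id.
    payload = json.dumps({"index": index, "name": name, "args": args}, sort_keys=True)
    return "call_" + hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


class ToyJSONParser:
    """Illustrative parser, not the PR's JSONToolCallParser."""

    def parse(self, text: str) -> List[ParsedCall]:
        calls = []
        for index, item in enumerate(json.loads(text)["tools"]):
            name, args = item["tool"], item.get("args", {})
            # Keep an explicit id if one was supplied; otherwise derive it.
            call_id = item.get("id") or stable_call_id(name, args, index)
            calls.append(ParsedCall(name, args, call_id))
        return calls

    def render_call(self, call: ParsedCall) -> str:
        return json.dumps({"tool": call.name, "args": call.args, "id": call.call_id})


text = json.dumps({
    "message": "Let me search.",
    "tools": [
        {"tool": "search", "args": {"query": "x"}, "id": "c1"},
        {"tool": "summarize", "args": {"text": "..."}},
    ],
})
parser = ToyJSONParser()
first, second = parser.parse(text), parser.parse(text)
# Re-parsing the same text yields the same ids, including the derived one.
assert [c.call_id for c in first] == [c.call_id for c in second]
# Round trip: render a call, wrap it in a payload, parse it back.
wrapped = json.dumps({"tools": [json.loads(parser.render_call(first[1]))]})
assert parser.parse(wrapped)[0] == first[1]
```

Deriving the id from content and position keeps it reproducible across re-parses without shared state; whether the PR's parsers do this or something else is not shown in the description.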

Tests extend test/llm/test_llm_transforms.py with TestAgenticParsers
(protocol conformance, call_id stability, round-trip), TestAgenticSandbox
(unsafe-warns, timeout, narrow(), platform-skipped fs-escape and
network-deny negatives), and TestAgenticRepl (state persistence, error
capture, restart, timeout, jupyter slow-marked).
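
The tighten-only behaviour of narrow() that TestAgenticSandbox covers could look roughly like the sketch below. The field names (`timeout_s`, `memory_mb`, `allow_network`) and the choice to raise rather than clamp are assumptions made for illustration; the PR description does not spell out either.

```python
from typing import NamedTuple, Optional


class ResourceLimits(NamedTuple):
    # Hypothetical fields; the PR description does not list the real ones.
    timeout_s: float = 30.0
    memory_mb: int = 512
    allow_network: bool = False

    def narrow(
        self,
        timeout_s: Optional[float] = None,
        memory_mb: Optional[int] = None,
        allow_network: Optional[bool] = None,
    ) -> "ResourceLimits":
        """Return per-call limits, rejecting any override that loosens a limit."""
        if timeout_s is not None and timeout_s > self.timeout_s:
            raise ValueError("timeout_s may only decrease")
        if memory_mb is not None and memory_mb > self.memory_mb:
            raise ValueError("memory_mb may only decrease")
        if allow_network and not self.allow_network:
            raise ValueError("network access cannot be re-enabled per call")
        return self._replace(
            timeout_s=self.timeout_s if timeout_s is None else timeout_s,
            memory_mb=self.memory_mb if memory_mb is None else memory_mb,
            allow_network=self.allow_network if allow_network is None else allow_network,
        )


base = ResourceLimits(timeout_s=10.0, memory_mb=256)
per_call = base.narrow(timeout_s=2.0)  # tighter timeout, other limits inherited
```

Raising on a looser override makes a misconfigured call fail loudly; silently clamping to the construction limit would be the other reasonable design.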

Docs add an "Agentic toolkit (preview)" section under
docs/source/reference/llms_envs.rst documenting Tool contracts, parsers,
sandboxing, and stateful REPLs.

Co-Authored-By: Claude Opus 4.7 noreply@anthropic.com

[ghstack-poisoned]
@pytorch-bot

pytorch-bot Bot commented May 10, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3735

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures

As of commit 500e59a with merge base d386287:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed label May 10, 2026
@github-actions github-actions Bot added the Feature, Documentation, and llm/ labels and removed the Feature label May 10, 2026
Comment on lines +39 to +40
(``ToolCompose``) lands in a follow-up commit; this preview ships the
contracts, parsers, sandboxing, and stateful REPLs that it builds on.
Collaborator Author

never ever mention commit strategy in a PR

(``ToolCompose``) lands in a follow-up commit; this preview ships the
contracts, parsers, sandboxing, and stateful REPLs that it builds on.

Tool contracts
Collaborator Author

this comes too fast. I need to see an example first.

Comment on lines +99 to +100
isolation, prefer :class:`DockerSandbox` (real implementation
tracked in the package TODO list).
Collaborator Author

we need to find a better way of doing this than a TODO list...

Comment on lines +723 to +730
# ---------------------------------------------------------------------------
# Agentic toolkit (torchrl.envs.llm.agentic)
# ---------------------------------------------------------------------------

import asyncio  # noqa: E402
import socket  # noqa: E402
import sys  # noqa: E402
import warnings  # noqa: E402
Collaborator Author

this clearly needs to be put in a separate test file!

Collaborator Author

use tensorclass, not dataclasses. we can stack tensorclasses and pass them to tensordicts. They will play nicely with batched envs, trajectories etc.

Collaborator Author

same, use tensorclasses and not dataclasses

Collaborator Author

ditto: tensorclass, not dataclass

Comment on lines +21 to +27
{
    "message": "Let me search.",
    "tools": [
        {"tool": "search", "args": {"query": "x"}, "id": "c1"},
        {"tool": "summarize", "args": {"text": "..."}}
    ]
}
Collaborator Author

how will this be displayed in sphinx?
Worried about the lack of code block or something like that
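
For reference, one common way to make such a payload render as a highlighted block in Sphinx is an explicit `.. code-block:: json` directive inside the docstring. The sketch below uses a hypothetical class name (`JSONPayloadDocSketch`) and only illustrates the docstring layout; it is not the PR's code, and which file the snippet above comes from is not shown here.

```python
import inspect
import json


class JSONPayloadDocSketch:
    """Hypothetical class showing a docstring layout that Sphinx renders as code.

    Expected payload:

    .. code-block:: json

        {
            "message": "Let me search.",
            "tools": [
                {"tool": "search", "args": {"query": "x"}, "id": "c1"},
                {"tool": "summarize", "args": {"text": "..."}}
            ]
        }
    """


# Sanity check: the block after the directive is itself valid JSON.
doc = inspect.getdoc(JSONPayloadDocSketch)
payload = json.loads(doc.split(".. code-block:: json", 1)[1])
```

The directive needs a blank line before the body and a body indented relative to the directive; a paragraph ending in `::` gives a plain literal block under the same indentation rule, without JSON highlighting.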

Comment on lines +25 to +33

{
    "role": "assistant",
    "content": [
        {"type": "text", "text": "Let me search."},
        {"type": "tool_use", "id": "toolu_1",
         "name": "search", "input": {"q": "x"}}
    ]
}
Collaborator Author

ditto: worried about how this will be displayed


- The full message dict::

    {"role": "assistant", "content": "...", "tool_calls": [...]}
Collaborator Author

ditto


- The choice dict::

    {"message": {... "tool_calls": [...]}}
Collaborator Author

ditto


Labels

CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Documentation: Improvements or additions to documentation
Feature: New feature
llm/: LLM-related PR, triggers LLM CI tests


1 participant