[Feature] Agentic toolkit foundation: protocols, parsers, sandbox, REPL #3735
vmoens wants to merge 1 commit into gh/vmoens/266/base
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3735
Note: Links to docs will display an error until the docs builds have been completed.
❌ 4 New Failures as of commit 500e59a with merge base d386287.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
| (``ToolCompose``) lands in a follow-up commit; this preview ships the
| contracts, parsers, sandboxing, and stateful REPLs that it builds on.
never ever mention commit strategy in a PR
| (``ToolCompose``) lands in a follow-up commit; this preview ships the
| contracts, parsers, sandboxing, and stateful REPLs that it builds on.
|
| Tool contracts
this comes too fast. I need to see an example first.
| isolation, prefer :class:`DockerSandbox` (real implementation
| tracked in the package TODO list).
we need to find a better way of doing this than a TODO list...
| # ---------------------------------------------------------------------------
| # Agentic toolkit (torchrl.envs.llm.agentic)
| # ---------------------------------------------------------------------------
|
| import asyncio  # noqa: E402
| import socket  # noqa: E402
| import sys  # noqa: E402
| import warnings  # noqa: E402
this clearly needs to be put in a separate test file!
use tensorclass, not dataclasses. We can stack tensorclasses and pass them to tensordicts, so they play nicely with batched envs, trajectories, etc.
same, use tensorclasses and not dataclasses
ditto: tensorclass, not dataclass
| {
|   "message": "Let me search.",
|   "tools": [
|     {"tool": "search", "args": {"query": "x"}, "id": "c1"},
|     {"tool": "summarize", "args": {"text": "..."}}
|   ]
| }
how will this be displayed in sphinx?
Worried about the lack of code block or something like that
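One possible way to address the rendering concern, sketched under assumptions: place the payload under an explicit `.. code-block:: json` directive so that Sphinx renders it as a literal block instead of reflowing it as prose. `ExampleParser` and `extract_code_block` are hypothetical names used only for illustration; they are not part of the PR. The helper simply checks that the documented payload stays valid JSON.

```python
import inspect
import json
import textwrap


class ExampleParser:
    """Hypothetical parser, used here only to show the docstring layout.

    The model is expected to emit a payload such as:

    .. code-block:: json

        {
          "message": "Let me search.",
          "tools": [
            {"tool": "search", "args": {"query": "x"}, "id": "c1"},
            {"tool": "summarize", "args": {"text": "..."}}
          ]
        }
    """


def extract_code_block(doc: str) -> str:
    """Return the dedented body of the first ``code-block`` directive."""
    lines = doc.splitlines()
    start = next(
        i for i, line in enumerate(lines) if line.startswith(".. code-block::")
    )
    body = []
    for line in lines[start + 1:]:
        # The directive body ends at the first non-blank, unindented line.
        if line.strip() and not line.startswith(" "):
            break
        body.append(line)
    return textwrap.dedent("\n".join(body)).strip()


# inspect.getdoc() strips the docstring's common indentation first.
payload = json.loads(extract_code_block(inspect.getdoc(ExampleParser)))
```

A check like this could live next to the parser tests so that a docstring example cannot drift into invalid JSON unnoticed.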
| {
|   "role": "assistant",
|   "content": [
|     {"type": "text", "text": "Let me search."},
|     {"type": "tool_use", "id": "toolu_1",
|      "name": "search", "input": {"q": "x"}}
|   ]
| }
ditto: worried about how this will be displayed
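For readers following the thread, the message shape quoted above can be consumed roughly as follows. This is a minimal sketch, not the PR's `AnthropicToolUseParser`: `extract_tool_uses` and `extract_text` are hypothetical helpers, and the output keys (`call_id`, `name`, `args`) are assumptions chosen for the example.

```python
from typing import Any


def extract_tool_uses(message: dict[str, Any]) -> list[dict[str, Any]]:
    """Collect one record per ``tool_use`` block in the message content."""
    calls = []
    for block in message.get("content", []):
        if isinstance(block, dict) and block.get("type") == "tool_use":
            calls.append(
                {
                    "call_id": block["id"],          # id supplied by the model
                    "name": block["name"],
                    "args": block.get("input", {}),
                }
            )
    return calls


def extract_text(message: dict[str, Any]) -> str:
    """Concatenate the ``text`` blocks, ignoring everything else."""
    return "".join(
        block["text"]
        for block in message.get("content", [])
        if isinstance(block, dict) and block.get("type") == "text"
    )


message = {
    "role": "assistant",
    "content": [
        {"type": "text", "text": "Let me search."},
        {"type": "tool_use", "id": "toolu_1",
         "name": "search", "input": {"q": "x"}},
    ],
}
calls = extract_tool_uses(message)
text = extract_text(message)
```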
| - The full message dict::
|
|       {"role": "assistant", "content": "...", "tool_calls": [...]}
|
| - The choice dict::
|
|       {"message": {... "tool_calls": [...]}}
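The two accepted shapes listed above can be normalized to a single list of tool calls along these lines. This is a sketch under assumptions: `get_tool_calls` and `decode_call` are hypothetical names, not the PR's `OpenAIToolCallParser`, and the example entry follows the usual OpenAI layout in which `function.arguments` is a JSON-encoded string.

```python
import json
from typing import Any


def get_tool_calls(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Accept either a message dict or a choice dict that wraps one."""
    # A choice dict nests the message under "message"; a message dict does not.
    message = payload.get("message", payload)
    if not isinstance(message, dict):
        raise TypeError("expected a message dict or a choice dict")
    return list(message.get("tool_calls") or [])


def decode_call(tool_call: dict[str, Any]) -> dict[str, Any]:
    """Flatten one tool call and decode its JSON-encoded arguments."""
    function = tool_call["function"]
    return {
        "call_id": tool_call["id"],
        "name": function["name"],
        "args": json.loads(function["arguments"] or "{}"),
    }


tool_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "search", "arguments": "{\"query\": \"x\"}"},
}
message = {"role": "assistant", "content": None, "tool_calls": [tool_call]}
choice = {"message": message}

from_message = [decode_call(c) for c in get_tool_calls(message)]
from_choice = [decode_call(c) for c in get_tool_calls(choice)]
```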
Stack from ghstack (oldest at bottom):
Lands the substrate for an async-first, sandboxed tool-calling stack
under torchrl.envs.llm.agentic. ChatEnv is unchanged; the orchestrator
(ToolCompose) and built-in tools come in follow-up commits.
This commit ships:
- … TextPart/JsonPart/ImagePart/FileRefPart), ToolCallParser, ParsedCall, ParseResult, with a stable call_id invariant pinned across all parsers.
- … OpenAIToolCallParser, AnthropicToolUseParser. All round-trip parse->render_call and produce stable call_ids.
- … (Linux default), SeatbeltSandbox (macOS default), UnsafeSubprocessSandbox (warning-loud fallback), and stubs for DockerSandbox / E2BSandbox / ModalSandbox tracked in the package TODO list. ResourceLimits.narrow() enforces that per-call limits can only tighten construction limits.
- … display, kernel restart, interrupt) and SubprocessRepl (no extra dep; persistent variables, error capture, timeout).
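The "can only tighten" rule for per-call limits could look roughly like the sketch below. Everything here is an assumption made for illustration: the `Limits` class, its field names, and the choice to raise on a looser value (rather than clamp) are not taken from the PR's `ResourceLimits`, and a plain dataclass is used only to keep the example dependency-free.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limits:
    """Hypothetical resource limits; ``None`` means "no limit set"."""

    timeout_s: Optional[float] = None
    memory_mb: Optional[int] = None

    def narrow(self, override: "Limits") -> "Limits":
        """Merge per-call overrides, rejecting any that loosen a limit."""
        merged = {}
        for name in ("timeout_s", "memory_mb"):
            base, new = getattr(self, name), getattr(override, name)
            if new is None:
                # No override: keep the construction-time value.
                merged[name] = base
            elif base is not None and new > base:
                raise ValueError(f"{name}={new} would loosen the limit {base}")
            else:
                merged[name] = new
        return Limits(**merged)


sandbox_limits = Limits(timeout_s=30.0, memory_mb=512)
per_call = sandbox_limits.narrow(Limits(timeout_s=5.0))
```

Raising instead of silently clamping makes a misconfigured call fail loudly, which seems in keeping with the warning-loud fallback described above, but either policy would satisfy the stated invariant.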
Tests extend test/llm/test_llm_transforms.py with TestAgenticParsers
(protocol conformance, call_id stability, round-trip), TestAgenticSandbox
(unsafe-warns, timeout, narrow(), platform-skipped fs-escape and
network-deny negatives), and TestAgenticRepl (state persistence, error
capture, restart, timeout, jupyter slow-marked).
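The call_id stability and round-trip properties named above can be illustrated with a small stand-in parser. This is a sketch under stated assumptions: the description does not say how call_ids are derived, so the content hash in `stable_call_id` is a guess at one workable scheme, and `parse` / `render_call` are hypothetical functions operating on the JSON "tools" shape quoted earlier in the review.

```python
import hashlib
import json
from typing import Any


def stable_call_id(name: str, args: dict[str, Any], index: int) -> str:
    """Derive a deterministic id from the call's content and position."""
    canonical = json.dumps(
        {"tool": name, "args": args, "index": index},
        sort_keys=True,
        separators=(",", ":"),
    )
    return "call_" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def parse(text: str) -> list[dict[str, Any]]:
    """Parse the JSON payload, filling in an id only when one is missing."""
    payload = json.loads(text)
    calls = []
    for index, entry in enumerate(payload.get("tools", [])):
        args = entry.get("args", {})
        call_id = entry.get("id") or stable_call_id(entry["tool"], args, index)
        calls.append({"call_id": call_id, "name": entry["tool"], "args": args})
    return calls


def render_call(call: dict[str, Any]) -> dict[str, Any]:
    """Inverse of ``parse`` for a single call."""
    return {"tool": call["name"], "args": call["args"], "id": call["call_id"]}


text = json.dumps(
    {
        "message": "Let me search.",
        "tools": [
            {"tool": "search", "args": {"query": "x"}, "id": "c1"},
            {"tool": "summarize", "args": {"text": "..."}},
        ],
    }
)
first = parse(text)
second = parse(text)
round_tripped = parse(json.dumps({"tools": [render_call(c) for c in first]}))
```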
Docs add an "Agentic toolkit (preview)" section under
docs/source/reference/llms_envs.rst documenting Tool contracts, parsers,
sandboxing, and stateful REPLs.
Co-Authored-By: Claude Opus 4.7 noreply@anthropic.com