contrib: add @temporalio/strands-agents plugin #2091
Adds a Strands Agents plugin mirroring the Python SDK's strands branch: models, MCP, activityAsTool/activityAsHook helpers, an interrupt-aware failure converter, and streaming via @temporalio/workflow-streams.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds an ava test suite (9 tests) covering the model dispatch path, activityAsTool / activityAsHook, in-workflow tool(), TemporalMCPClient, structured output, the activity-tool interrupt round-trip, the streamingTopic publisher, and a deterministic-history replay assertion.
Wiring required to bundle @strands-agents/sdk 1.3.0 in a workflow:
- load-polyfills installs crypto.randomUUID using workflow.uuid4 so the @ungap/structured-clone polyfill used by Temporal's sink path is replay safe.
- StrandsPlugin.configureBundler ignores `fs`, replaces the SDK's dynamic node:* and MCP-transport imports with an empty stub, and disables async chunks so webpack's JSONP runtime never tries to resolve `self`.
- TemporalMCPClient.listTools returns lightweight TemporalMCPTool wrappers (lazy-required to break the cycle) so the workflow bundle never needs the unexported McpTool from @strands-agents/sdk's index.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
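A minimal sketch of the replay-safe `crypto.randomUUID` idea described in the commit above. This is not the PR's `load-polyfills` source, which is not shown on this page. `installDeterministicRandomUUID` and `UuidHost` are hypothetical names, and the generator is injected so the sketch stays self-contained; inside a workflow the injected function would presumably be `uuid4` from `@temporalio/workflow`.

```typescript
// Hypothetical helper (not from the PR): install a randomUUID implementation
// that delegates to an injected, deterministic generator.
type UuidHost = { crypto?: { randomUUID?: () => string } };

function installDeterministicRandomUUID(host: UuidHost, nextUuid: () => string): void {
  // Reuse an existing crypto object if the host already has one.
  const cryptoObj = host.crypto ?? {};
  // Every caller of host.crypto.randomUUID() now gets ids from nextUuid,
  // so replay produces the same ids as long as nextUuid is deterministic.
  cryptoObj.randomUUID = nextUuid;
  host.crypto = cryptoObj;
}
```

Passing the generator in, rather than importing it, also makes the helper easy to exercise outside the workflow sandbox with a counter-based stub.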
Aligns with sdk-python's `temporalio[strands-agents]` extras and with the sibling @temporalio/openai-agents contrib package.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ler ignores
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@strands-agents/sdk is a third-party package, so there's no reason to bypass the 2-week supply-chain age gate for it (unlike @temporalio/* and nexus-rpc, which we publish and immediately consume). The ^1.3.0 constraint resolves to an already-aged 1.x version.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The activityAsTool, TemporalMCPClient, and in-workflow tool() tests asserted only the stub model's scripted final string, which can't prove the tool actually ran. Capture the conversation the second stream() call sees and assert the tool's output round-tripped back into the loop. The echo test counts occurrences, since echo returns its input verbatim and the input already appears in the tool-use block.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Type the tool/MCP input schemas as JSONSchema, the dynamic activity proxies by their real input types, and JsonBlock payloads as JSONValue, dropping the corresponding `as never` casts. Let the terminal-error table infer its constructor types instead of casting each through never. Left in place the casts that sit on the payload-converter wire boundary or work around SDK types that aren't re-exported (e.g. McpTool).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ON Schema
activityAsTool's inputSchema now takes a JSON Schema or a Zod schema, mirroring Strands' own tool() ergonomics. A shared toJsonSchema helper converts Zod via zod 4's native z.toJSONSchema (pure/deterministic, so sandbox-safe) and passes literal JSON Schemas through. TemporalMCPTool routes its schema through the same helper for symmetry; in practice MCP schemas arrive from the server as JSON across the listTools activity boundary, so the Zod branch is only exercised by activityAsTool. Conversion only: input is not validated against the schema.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Use listToolsActivityName/callToolActivityName instead of re-deriving the
`${server}-listTools`/`${server}-callTool` convention inline, so the
registration keys can't drift from the workflow-side lookup.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
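A small sketch of why shared name helpers prevent drift, assuming the `${server}-listTools` / `${server}-callTool` convention quoted in the commit message. The two helper names come from the commit; their bodies and the `buildMcpActivities` registration helper are assumptions for illustration only.

```typescript
// Bodies assumed from the convention quoted in the commit message.
function listToolsActivityName(server: string): string {
  return `${server}-listTools`;
}

function callToolActivityName(server: string): string {
  return `${server}-callTool`;
}

// Hypothetical registration helper: the worker-side keys are built from the
// same functions the workflow side would use to look the activities up.
function buildMcpActivities<L, C>(server: string, listTools: L, callTool: C): Record<string, L | C> {
  return {
    [listToolsActivityName(server)]: listTools,
    [callToolActivityName(server)]: callTool,
  };
}
```

With both sides going through one function per activity, a change to the naming scheme is made in a single place.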
The per-server callTool activity reconnected on every invocation (connect + listTools + callTool + disconnect), so an agent making several successive MCP calls paid a full handshake, plus a redundant listTools round-trip, per call. Hold a lazily-opened MCP session per server in the activity worker process so successive callTool activities reuse one connection, evicting it after an idle timeout (or on a call error, so a broken session reconnects). Scope the plugin's shutdown eviction to the servers it registered rather than every cached connection.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
// The final string is the stub's scripted second turn, so it can't prove the
// activity ran. Instead capture the second stream() call
I still think this isn't necessarily validating what we need it to. Because it doesn't validate that the activity literally gets called, we wouldn't notice if some bug were introduced where the model returns a tool call but this plugin somehow fails to invoke the activity. Right now this test is more a test of the StubModel impl than of the plugin itself.
Can you wrap the underlying activity (getWeather) def in something that logs the args it was called with so we can assert that here? It should be fairly straightforward to make an edit directly in ./activities/strands to collect that.
Same comment on other similar tests.
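One way the reviewer's suggestion could look, as a hedged sketch rather than the PR's actual test code. `recordCalls` is a hypothetical helper, and the `getWeather` below is a stand-in with an assumed signature, since the real activity in ./activities/strands is not shown here. It relies on the test and the worker sharing one process so the recorded array is visible to the assertion.

```typescript
type AsyncFn<A extends unknown[], R> = (...args: A) => Promise<R>;

// Hypothetical helper: wraps an activity and records the arguments of each call.
function recordCalls<A extends unknown[], R>(fn: AsyncFn<A, R>): { wrapped: AsyncFn<A, R>; calls: A[] } {
  const calls: A[] = [];
  const wrapped = async (...args: A): Promise<R> => {
    calls.push(args); // remember the exact arguments before delegating
    return fn(...args);
  };
  return { wrapped, calls };
}

// Stand-in for the getWeather activity named in the review (signature assumed).
async function getWeather(input: { city: string }): Promise<string> {
  return `Sunny in ${input.city}`;
}

// Register `recorded.wrapped` with the worker; assert on `recorded.calls` in the test.
const recorded = recordCalls(getWeather);
```

A test can then assert both that `recorded.calls` has the expected length and that the first entry matches the tool-use input the stub model emitted, which proves the plugin actually dispatched the activity.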
    counters.disconnects++;
  }
  override async callTool(tool: { name: string }, args: JSONValue): Promise<JSONValue> {
    counters.callTools++;
Let's also track the args here so we can assert they are what we expect.
 * timer resets on every {@link callToolActivityName | callTool} that reuses the
 * connection. Exported for tests.
 */
export const MCP_CONNECTION_IDLE_MS = 5 * 60 * 1000;
This is a good idea! How about we wire this idle timeout through as a plugin config option?
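A sketch of what wiring the timeout through as an option might look like. The option name `mcpConnectionIdleMs`, the `McpConnectionOptions` interface, and `resolveIdleMs` are all hypothetical; the real StrandsPlugin options type is not shown in this thread. The default mirrors the 5-minute constant from the diff.

```typescript
// Default mirrors MCP_CONNECTION_IDLE_MS from the diff (5 minutes).
const DEFAULT_MCP_CONNECTION_IDLE_MS = 5 * 60 * 1000;

// Hypothetical option shape, for illustration only.
interface McpConnectionOptions {
  mcpConnectionIdleMs?: number;
}

// Resolve the effective idle timeout, falling back to the default and
// rejecting values that could never evict or would evict immediately.
function resolveIdleMs(options: McpConnectionOptions = {}): number {
  const idleMs = options.mcpConnectionIdleMs ?? DEFAULT_MCP_CONNECTION_IDLE_MS;
  if (!Number.isFinite(idleMs) || idleMs <= 0) {
    throw new RangeError(`mcpConnectionIdleMs must be a positive number, got ${idleMs}`);
  }
  return idleMs;
}
```

Tests could then pass a very small value through the option instead of importing an exported constant.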
entry = (async () => {
  const client = factory();
  await client.connect();
  const tools = await client.listTools();
I'm not sure how I feel about the preemptive client.listTools() call here. If all the SDK asked for was a connection, it may never call client.listTools(), and this cache warming is wasted effort. Could you extract a generic TTL caching utility and have two separate caches, one for tools and one for clients?
But on another note, is it even right to cache tools at all? MCP servers may redeploy or something and start publishing different tools, no?
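A rough sketch of the generic TTL cache the comment asks about, not the PR's implementation. `TtlCache` and its parameters are hypothetical. The clock is injectable so expiry can be tested without timers, and `slideOnHit` approximates the "timer resets on every reuse" behaviour described for the connection cache.

```typescript
// Hypothetical generic TTL cache keyed by server name.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: Promise<V>; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly slideOnHit: boolean;
  private readonly now: () => number;

  constructor(ttlMs: number, slideOnHit = false, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.slideOnHit = slideOnHit;
    this.now = now;
  }

  // Returns the cached value if it has not expired; otherwise creates a new one.
  getOrCreate(key: string, create: () => Promise<V>): Promise<V> {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      if (this.slideOnHit) hit.expiresAt = this.now() + this.ttlMs; // idle-timeout semantics
      return hit.value;
    }
    const value = create();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    // Drop failed creations so the next caller retries instead of reusing a rejection.
    value.catch(() => {
      if (this.entries.get(key)?.value === value) this.entries.delete(key);
    });
    return value;
  }

  // Explicit eviction, e.g. after a call error on a cached connection.
  delete(key: string): void {
    this.entries.delete(key);
  }
}
```

Under this shape, a client cache could use `slideOnHit = true` while a tools cache (if tools are cached at all) could use a short fixed TTL. A real client cache would also need an eviction hook to disconnect expired sessions, which this sketch leaves out.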
const { client, tools } = await getConnection(server, factory);
const tool = tools.find((t) => t.name === input.toolName);
if (tool === undefined) {
  throw new Error(`MCP tool '${input.toolName}' not found on server '${server}'`);
}
I actually don't think you need this tool call activity to be listing the tools before calling the tool. As best as I can tell, the underlying callTool impl in strands makes MCP tool calls using literally just the tool name and the args which you already have here in the CallToolInput.
Is there another specific reason why this running on Temporal requires doing a tool lookup before calling each time? I could just be missing something (also sorry for not catching this on first review pass)
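A sketch of the lookup-free call the reviewer describes, under the assumption that the client only needs the tool name and args. `McpClientLike` is a minimal structural stand-in modelled on the `callTool(tool: { name: string }, args: JSONValue)` signature quoted earlier in this review; `callToolDirect` and the `args` field on `CallToolInput` are hypothetical. Whether the real Strands client accepts a bare `{ name }` object is exactly the open question in the comment.

```typescript
type JSONValue = null | boolean | number | string | JSONValue[] | { [key: string]: JSONValue };

// Minimal structural view of the client (assumed, based on the test stub's signature).
interface McpClientLike {
  callTool(tool: { name: string }, args: JSONValue): Promise<JSONValue>;
}

// `toolName` appears in the diff; the `args` field name is an assumption.
interface CallToolInput {
  toolName: string;
  args: JSONValue;
}

// Calls the tool by name without listing tools first.
async function callToolDirect(client: McpClientLike, input: CallToolInput): Promise<JSONValue> {
  return client.callTool({ name: input.toolName }, input.args);
}
```

If the server does not know the tool, the error would then come from the MCP call itself rather than from a local lookup.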
Summary

Adds `@temporalio/strands-agents`, a Temporal plugin that runs Strands Agents inside Temporal Workflows. Model invocations, MCP tool calls, and `activityAsTool`/`activityAsHook` calls all dispatch through Temporal Activities for durable execution and Temporal-managed retries.

API

Mirrors `temporalio[strands-agents]` from sdk-python (`TemporalAgent`, `TemporalMCPClient`, `activity_as_tool`, `activity_as_hook`, `auto_heartbeater`).

Workflow-bundle plumbing

`@strands-agents/sdk@1.3.0`'s index transitively pulls `fs`, `node:*`, and MCP transport modules into any workflow that imports `Agent`/`McpClient`. `StrandsPlugin.configureBundler` handles this by:

- Ignoring `fs` (statically imported from `vended-plugins/skills` and `vended-tools/file-editor`, both unreachable from workflow code).
- Replacing `node:fs/promises`/`os`/`path`/`process`/`stream` and `@modelcontextprotocol/sdk/client/{stdio,sse}.js` with an empty stub via `NormalModuleReplacementPlugin` (dynamic imports inside `mcp-config.js`'s server-only code paths).
- Disabling async chunks so webpack's JSONP runtime (`self`/`document`) never ships in the bundle.

`load-polyfills.ts` installs `Headers`, `web-streams-polyfill`, `@ungap/structured-clone`, and a deterministic `crypto.randomUUID` backed by `workflow.uuid4()`. `@ungap/structured-clone` calls `crypto.randomUUID()` internally and Temporal's sink path uses `structuredClone` for log payloads, so without the polyfill the first `logger.warn` from inside an agent crashes the workflow.

`TemporalMCPClient.listTools()` returns lightweight `TemporalMCPTool` wrappers (lazy-required to break the cycle) because `McpTool` isn't re-exported from `@strands-agents/sdk`'s public index. If strands-agents/sdk-typescript#1108 merges, `temporal-mcp-tool.ts` can be deleted entirely and `TemporalMCPClient.listTools()` can return real `McpTool` instances bound to itself, picking up the built-in content mapping for images, embedded resources, and URL-elicitation errors that our wrapper currently elides.

Tests

9 ava integration tests in `contrib/strands/src/__tests__/test-strands.ts`:

- Model dispatch via the `invokeModel` activity
- `activityAsTool` end-to-end
- `TemporalMCPClient` listTools + callTool through per-server activities
- In-workflow `tool()` (Strands `FunctionTool`) running inside the sandbox
- `activityAsHook` on `AfterToolCallEvent`
- `structuredOutputSchema` via the `strands_structured_output` tool
- Activity-tool interrupt round-trip
- `streamingTopic` events consumed via `WorkflowStreamClient.subscribe`
- Deterministic-history replay (`crypto.randomUUID` polyfill stays replay-safe)

Test plan

- `pnpm build` clean
- `pnpm test` in `contrib/strands`: 9/9 passing

🤖 Generated with Claude Code