# Added Code Mode docs & cross-refs. #28343

---
title: Code Mode
pcx_content_type: concept
tags:
  - AI
  - MCP
sidebar:
  order: 19
---

import {
	TypeScriptExample,
	WranglerConfig,
	PackageManagers,
	LinkCard,
} from "~/components";

Code Mode is a more token-efficient way to use [MCP](/agents/model-context-protocol/) tools with AI agents, and it works with your existing MCP servers unchanged. Instead of presenting each MCP tool schema directly to the LLM, Code Mode converts the tool schemas into a single TypeScript API and asks the LLM to write code that calls that API, so tool definitions consume far less of the model's context window. The generated code runs in a secure, sandboxed [V8 isolate](/workers/reference/how-workers-works/) powered by the [Dynamic Worker Loader API](/workers/runtime-apis/bindings/worker-loader/).
|
||
| :::note[Dynamic Worker Loading is in closed beta] | ||
|
|
||
| Code Mode relies on the [Dynamic Worker Loader API](/workers/runtime-apis/bindings/worker-loader/), which is available in local development with Wrangler and workerd. To run dynamic Workers on Cloudflare in production, you must [sign up for the closed beta](https://forms.gle/MoeDxE9wNiqdf8ri9). | ||
|
|
||
| ::: | ||
|
|
||

## Why Code Mode?

Traditional MCP tool calling sends every tool schema — name, description, and full parameter definitions — as part of the LLM prompt on every request. When an MCP server exposes tens or hundreds of tools, those schemas consume a large share of the context window and drive up token usage. Each tool invocation also requires a full round-trip through the model, so multi-step workflows multiply that cost further.

Code Mode drastically reduces token consumption by replacing all of those individual tool schemas with a compact TypeScript API definition. Instead of repeating verbose JSON schemas on every request, the LLM receives a single typed interface and writes code against it. The result is fewer input tokens per request and fewer round-trips overall.

LLMs have extensive training data covering real-world TypeScript code, but far less exposure to tool-calling conventions. When tools are presented as a TypeScript API instead of tool schemas, LLMs can:

- **Use far fewer tokens** — a compact TypeScript interface replaces the full JSON schema for every tool, significantly shrinking prompt size. MCP servers can expose hundreds or thousands of tools without blowing up the context window
- **Handle more tools with higher accuracy** — a familiar TypeScript interface is easier for the model to reason about than abstract tool schemas
- **Chain multiple calls in a single execution** — instead of round-tripping through the model between each tool call, the LLM writes a single script that calls multiple tools in sequence, eliminating per-step token overhead
- **Return only final results** — intermediate values stay within the sandbox, and only the data the LLM needs is passed back, keeping response tokens minimal
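
As a rough illustration of the difference, compare what the prompt carries for a single tool under each approach. Both snippets below are simplified sketches, not the exact output of either system:

```ts
// Traditional tool calling: a JSON Schema like this is serialized into the
// prompt for every tool, on every request (simplified):
const toolSchema = {
	name: "search_documentation",
	description: "Semantically search within the fetched documentation.",
	inputSchema: {
		type: "object",
		properties: {
			query: {
				type: "string",
				description: "The search query to find relevant documentation",
			},
		},
		required: ["query"],
	},
};

// Code Mode: the same tool becomes one typed signature in the prompt:
declare const codemode: {
	/** Semantically search within the fetched documentation. */
	search_documentation: (input: { query: string }) => Promise<unknown>;
};
```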

## How it works

1. When you connect to an MCP server with Code Mode enabled, the Agents SDK fetches the server's tool schema and converts it into a TypeScript API with doc comments
2. Instead of exposing each MCP tool individually, the agent receives a single tool: **execute TypeScript code**
3. The LLM generates code that calls the TypeScript API to accomplish the task
4. The code runs in an isolated V8 sandbox with no direct Internet access — the only way for the code to interact with the outside world is through the provided API bindings
5. API calls from the sandbox are dispatched back to the agent via [RPC](/workers/runtime-apis/bindings/service-bindings/rpc/), which routes them to the appropriate MCP server
6. Results are collected via `console.log()` and returned to the agent when the script finishes
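
To make step 2 concrete, the model's single tool call carries a script rather than per-tool arguments. The shape below is a hypothetical sketch; the actual tool and parameter names are defined by `@cloudflare/codemode`:

```ts
// Hypothetical shape of the one tool call the LLM emits in Code Mode
const toolCall = {
	tool: "execute_code", // illustrative name
	arguments: {
		// The script runs in the sandbox; `codemode.*` are the generated bindings
		code: `
			const docs = await codemode.search_documentation({ query: "Worker Loader" });
			console.log(JSON.stringify(docs)); // step 6: output returns via console.log
		`,
	},
};
```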

## Usage

### Before: traditional tool calling

With traditional tool calling, each MCP tool is presented directly to the LLM as a separate tool:

<TypeScriptExample>

```ts
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

const result = await generateText({
	model: openai("gpt-4o"),
	system: "You are a helpful assistant",
	messages,
	tools: this.mcp.getAITools(), // Each MCP tool exposed individually
});
```

</TypeScriptExample>

### After: with Code Mode

With Code Mode, the `codemode()` wrapper transforms your tools and system prompt so the LLM writes code instead of making individual tool calls:

<TypeScriptExample>

```ts
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";
import { experimental_codemode as codemode } from "@cloudflare/codemode/ai";

// Wrap your existing tools and system prompt with codemode
const { prompt, tools: wrappedTools } = await codemode({
	model: openai("gpt-4.1"),
	prompt: "You are a helpful assistant",
	tools: this.mcp.getAITools(),
	loader: this.env.LOADER,
	proxy: this.ctx.exports.CodeModeProxy({
		props: {
			binding: "MyAgent",
			name: this.name,
			callback: "callTool",
		},
	}),
	globalOutbound: null, // Block all direct Internet access
});

const result = await generateText({
	model: openai("gpt-4.1"),
	system: prompt,
	messages,
	tools: wrappedTools, // Single "execute code" tool
});
```

</TypeScriptExample>

The `codemode()` function returns a modified `prompt` (with the generated TypeScript API documentation injected) and a `tools` object containing a single tool that executes the LLM-generated code in a sandbox.

## Configuration

### Install dependencies

<PackageManagers pkg="@cloudflare/codemode ai @ai-sdk/openai" />

### Add the Worker Loader binding

Code Mode requires a [Dynamic Worker Loader](/workers/runtime-apis/bindings/worker-loader/) binding to create sandboxed isolates:

<WranglerConfig>

```jsonc
{
	// ...
	"worker_loaders": [
		{
			"binding": "LOADER",
		},
	],
	// ...
}
```

</WranglerConfig>
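
If you maintain your `Env` types by hand rather than generating them, the binding also needs a type entry. The sketch below assumes a `WorkerLoader` type name; the actual name comes from your generated runtime types (for example, via `wrangler types`):

```ts
// Sketch only: the concrete binding type may differ in your setup
interface Env {
	LOADER: WorkerLoader; // assumption: name provided by generated runtime types
}
```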

### Export the CodeModeProxy

The `CodeModeProxy` entrypoint handles RPC calls from the sandbox back to your agent. Export it alongside your agent class:

<TypeScriptExample>

```ts
import { Agent } from "agents";
export { CodeModeProxy } from "@cloudflare/codemode/ai";

export class MyAgent extends Agent<Env> {
	// Your agent implementation
}
```

</TypeScriptExample>
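
Note that the `proxy` configuration in the usage example above names a `callback: "callTool"` method on the agent. A minimal sketch of such a method follows; the exact signature the proxy expects and the MCP client call used here are assumptions, not documented API:

```ts
export class MyAgent extends Agent<Env> {
	// Hypothetical: CodeModeProxy relays each sandbox API call here over RPC,
	// and this method routes it to the connected MCP server.
	async callTool(name: string, args: Record<string, unknown>) {
		// Assumption: forwarding via the MCP client; the call shape (for example,
		// whether a serverId is required) depends on your Agents SDK version.
		return await this.mcp.callTool({ name, arguments: args });
	}
}
```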

## Sandboxing and security

Code Mode runs LLM-generated code in a fully isolated [V8 isolate](/workers/reference/how-workers-works/). The sandbox provides strong security guarantees:

- **No Internet access** — setting `globalOutbound: null` blocks all `fetch()` and `connect()` calls from the sandbox. The only way for sandboxed code to communicate with the outside world is through the provided TypeScript API bindings
- **API keys are never exposed** — bindings provide already-authorized interfaces to MCP servers. Access tokens are held by the parent agent and injected into requests at the RPC layer, so the generated code never sees credentials
- **Disposable isolates** — each code execution gets a fresh isolate that is discarded after the script finishes. There is no shared state between executions
- **Controlled access** — the sandbox can only call the MCP tools you explicitly provide. You control what capabilities are available through the bindings you pass in
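
To illustrate the first guarantee, network calls inside the sandbox simply have nowhere to go. This sketch is illustrative; the exact error surfaced to the script may differ:

```ts
// Runs inside the sandbox when globalOutbound is null
try {
	await fetch("https://example.com"); // no outbound network is available
} catch (err) {
	// The request never leaves the isolate; only codemode.* bindings work
	console.log(`fetch blocked: ${String(err)}`);
}
```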

For more details on how isolate sandboxing works, refer to the [Dynamic Worker Loader API documentation](/workers/runtime-apis/bindings/worker-loader/).

## Generated TypeScript API

When Code Mode processes an MCP server's tool schema, it generates a TypeScript interface with doc comments derived from the tool descriptions. For example, an MCP server that provides documentation search tools might produce:

```ts
interface SearchDocumentationInput {
	/** The search query to find relevant documentation */
	query: string;
}

interface SearchDocumentationOutput {
	[key: string]: any;
}

declare const codemode: {
	/**
	 * Semantically search within the fetched documentation.
	 * Useful for specific queries.
	 */
	search_documentation: (
		input: SearchDocumentationInput,
	) => Promise<SearchDocumentationOutput>;
};
```

This generated API is injected into the system prompt so the LLM knows what functions are available and how to call them. The LLM then writes code using these typed functions rather than making raw tool calls.
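
For example, against the interface above the model might respond with a short script like this (illustrative):

```ts
// Illustrative LLM-generated code using the generated API
const results = await codemode.search_documentation({
	query: "how do I configure a Worker Loader binding?",
});
console.log(JSON.stringify(results, null, 2));
```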

## Example: multi-step MCP workflow

Code Mode is particularly useful when a task requires chaining multiple MCP operations. Instead of multiple round-trips through the LLM, the model generates a single script:

```js
// LLM-generated code that runs in the sandbox
const files = await codemode.list_files({ path: "/projects" });

const recentProject = files
	.filter((f) => f.type === "directory")
	.sort((a, b) => new Date(b.modified) - new Date(a.modified))[0];

const status = await codemode.get_project_status({
	name: recentProject.name,
});

if (status.state === "needs_review") {
	await codemode.create_task({
		title: `Review: ${recentProject.name}`,
		priority: "high",
	});
	console.log(`Created review task for ${recentProject.name}`);
} else {
	console.log(`${recentProject.name} is up to date`);
}
```

All MCP calls execute within a single sandbox invocation. The LLM reads back only the `console.log()` output.

## Current limitations

- **Experimental** — Code Mode is experimental and may have breaking changes in future releases
- **Closed beta for production** — The underlying [Dynamic Worker Loader API](/workers/runtime-apis/bindings/worker-loader/) is available locally but requires [closed beta access](https://forms.gle/MoeDxE9wNiqdf8ri9) for production deployment on Cloudflare
- **JavaScript only** — Sandbox execution is limited to JavaScript (Python support is planned)

## Next steps

<LinkCard
	title="Dynamic Worker Loader API"
	href="/workers/runtime-apis/bindings/worker-loader/"
	description="Learn about the underlying API that powers Code Mode sandboxing."
/>

<LinkCard
	title="McpClient"
	href="/agents/api-reference/mcp-client-api/"
	description="Connect your agent to MCP servers."
/>

<LinkCard
	title="MCP Tools"
	href="/agents/model-context-protocol/tools/"
	description="Design and add tools to your MCP server."
/>

<LinkCard
	title="Using AI Models"
	href="/agents/api-reference/using-ai-models/"
	description="Call AI models from your agent."
/>