RFC: [JS] Agent Primitive #4212
Open
pavelgj wants to merge 3 commits into main from pj/rfc-agent-primitive
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,225 @@ | ||
| # RFC: Agent Primitive | ||
|
|
||
| ## Summary | ||
|
|
||
| Introduces `defineAgent`, a high-level abstraction built on top of Bidi Actions designed to simplify the creation of stateful, multi-turn agents. It unifies state management, allowing both client-side state handling and server-side persistence via pluggable stores. | ||
|
|
||
| `defineAgent` would replace the current Chat API, since the two overlap significantly and the Agent primitive is more flexible. | ||
|
|
||
| ## Motivation | ||
|
|
||
| Building agents often involves repetitive boilerplate: | ||
| 1. **State Management**: Loading conversation history, updating it with new messages, and persisting it. | ||
| 2. **Session Handling**: Managing session IDs and context. | ||
| 3. **Multi-turn Loops**: Processing a stream of user inputs and generating responses. | ||
| 4. **Interrupts**: Pausing execution for human feedback or tool approval. | ||
|
|
||
| The `Agent` primitive encapsulates these patterns, providing a standard interface for building chatbots and autonomous agents that can run efficiently in both serverless (stateless) and stateful environments. | ||
|
|
||
| ## Design | ||
|
|
||
| ### 1. `defineAgent` | ||
|
|
||
| The `defineAgent` function wraps a Bidi Flow, adding built-in support for initialization, state loading/saving, and standardized input/output schemas. Unlike high-level configuration-based agents, `defineAgent` gives you full control over the execution loop. | ||
|
|
||
| ```typescript | ||
| export const myAgent = ai.defineAgent( | ||
| { | ||
| name: 'myAgent', | ||
| store: myPostgresStore, // Optional: enables server-side state | ||
| }, | ||
| async function* ({ inputStream, init, sendChunk }) { | ||
| // Manually manage the conversation loop | ||
| } | ||
| ); | ||
| ``` | ||
|
|
||
| ### 2. State Management Modes | ||
|
|
||
| The Agent abstraction supports two primary modes of operation, determined by the presence of a `store`. | ||
|
|
||
| #### A. Client-Managed State (Stateless Server) | ||
|
|
||
| In this mode, the server does not persist state. The client is responsible for maintaining the conversation history and passing it to the agent upon each invocation. | ||
|
|
||
| - **Init**: Client sends `messages`, `artifacts`, etc. | ||
| - **Execution**: Agent processes input, generates response. | ||
| - **Output**: Agent returns the *updated* state (new history). | ||
| - **Next Turn**: Client sends the updated history back in `init`. | ||
|
|
||
| **Pros**: Horizontal scalability (no shared server-side state), no database required, REST-friendly. | ||
|
|
||
| #### B. Server-Managed State (Stateful Server) | ||
|
|
||
| In this mode, a `SessionStore` is configured. The server persists the state. | ||
|
|
||
| - **Init**: Client sends `sessionId`. | ||
| - **Execution**: | ||
| 1. Framework loads state from `store` using `sessionId` (populating `init`). | ||
| 2. Agent processes input, generates response. | ||
| 3. Framework saves updated state to `store`. | ||
| - **Output**: Agent returns the result. | ||
| - **Next Turn**: Client sends `sessionId` again. | ||
|
|
||
| **Pros**: Thinner clients, secure context storage, background persistence. | ||
|
|
||
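
The two modes above can be sketched without Genkit at all. The following is a minimal simulation, not the RFC's implementation: `Message`, `AgentState`, `SessionStore`, `InMemoryStore`, `TurnInit`, and `runTurn` are hypothetical names invented for illustration, and the "model" is a placeholder that echoes its input. It only shows where the history comes from in each mode.

```typescript
// Hypothetical shapes for illustration; not the RFC's actual types.
type Message = { role: 'user' | 'model'; text: string };
type AgentState = { messages: Message[] };

// Assumed minimal store contract (load/save keyed by session ID).
interface SessionStore {
  load(sessionId: string): Promise<AgentState | undefined>;
  save(sessionId: string, state: AgentState): Promise<void>;
}

// In-memory stand-in for a real store such as a Postgres-backed one.
class InMemoryStore implements SessionStore {
  private sessions = new Map<string, AgentState>();
  async load(sessionId: string): Promise<AgentState | undefined> {
    return this.sessions.get(sessionId);
  }
  async save(sessionId: string, state: AgentState): Promise<void> {
    this.sessions.set(sessionId, state);
  }
}

type TurnInit = { sessionId?: string; messages?: Message[] };

// Runs one turn. With a store and a sessionId, history is loaded from and
// saved to the store; otherwise it comes from the client-supplied init.
async function runTurn(
  init: TurnInit,
  input: Message,
  store?: SessionStore,
): Promise<AgentState> {
  const loaded =
    store && init.sessionId ? await store.load(init.sessionId) : undefined;
  const history = loaded?.messages ?? init.messages ?? [];
  // Placeholder model: a real agent would call ai.generate here.
  const reply: Message = { role: 'model', text: `echo: ${input.text}` };
  const state: AgentState = { messages: [...history, input, reply] };
  if (store && init.sessionId) await store.save(init.sessionId, state);
  return state;
}

async function demo() {
  // Mode A: the client sends the full history back on every turn.
  const a1 = await runTurn({}, { role: 'user', text: 'hi' });
  const a2 = await runTurn(
    { messages: a1.messages },
    { role: 'user', text: 'again' },
  );

  // Mode B: the client sends only a sessionId; the store holds the history.
  const store = new InMemoryStore();
  await runTurn({ sessionId: 's1' }, { role: 'user', text: 'hi' }, store);
  const b2 = await runTurn(
    { sessionId: 's1' },
    { role: 'user', text: 'again' },
    store,
  );

  return {
    clientMessages: a2.messages.length,
    serverMessages: b2.messages.length,
  };
}
```

In both modes the second turn sees the first turn's messages; the only difference is which side carries them between calls.
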
| ### 3. Usage | ||
|
|
||
| #### Basic Example (Manual Loop) | ||
|
|
||
| This example demonstrates the core pattern: receiving input, calling `ai.generate`, and managing the messages array. | ||
|
|
||
| ```typescript | ||
| import { genkit } from 'genkit'; | ||
| import { googleAI } from '@genkit-ai/google-genai'; | ||
|
|
||
| const ai = genkit({ | ||
| plugins: [googleAI()], | ||
| }); | ||
|
|
||
| export const myAgent = ai.defineAgent( | ||
| { name: 'myAgent' }, | ||
| async function* ({ sendChunk, inputStream, init }) { | ||
| // 1. Initialize state from init payload (or empty) | ||
| let messages = init?.messages ?? []; | ||
|
|
||
| // 2. Process the input stream | ||
| for await (const input of inputStream) { | ||
| // 3. Generate response using a model | ||
| const response = await ai.generate({ | ||
| messages: [...messages, input], | ||
| model: googleAI.model('gemini-2.5-flash'), | ||
| onChunk: (chunk) => sendChunk({ sessionId: init?.sessionId, chunk }), | ||
| }); | ||
|
|
||
| messages = response.messages; | ||
|
|
||
| // 4. Handle interrupts (e.g. tool calls) | ||
| if (response.interrupts.length > 0) { | ||
| return { | ||
| sessionId: init?.sessionId, | ||
| messages, | ||
| }; | ||
| } | ||
| } | ||
|
|
||
| // 5. Return final state | ||
| return { | ||
| sessionId: init?.sessionId, | ||
| messages, | ||
| artifacts: [{ name: 'report', parts: [] }], | ||
| }; | ||
| } | ||
| ); | ||
| ``` | ||
|
|
||
| #### Example with Store (Server-Side Persistence) | ||
|
|
||
| Adding a `store` automatically handles state persistence. The implementation logic remains largely the same, but the state is preserved across network calls without the client sending it back. | ||
|
|
||
| ```typescript | ||
| export const persistentAgent = ai.defineAgent( | ||
| { | ||
| name: 'persistentAgent', | ||
| store: postgresSessionStore({ connectionString: '...' }) | ||
| }, | ||
| async function* ({ sendChunk, inputStream, init }) { | ||
| // init.messages is automatically populated from the store if sessionId exists | ||
| let messages = init?.messages ?? []; | ||
|
|
||
| for await (const input of inputStream) { | ||
| const response = await ai.generate({ | ||
| messages: [...messages, input], | ||
| model: googleAI.model('gemini-2.5-flash'), | ||
| onChunk: (chunk) => sendChunk({ sessionId: init?.sessionId, chunk }), | ||
| }); | ||
| messages = response.messages; | ||
|
|
||
| // ... handling interrupts | ||
| } | ||
|
|
||
| // State is automatically saved to the store upon return | ||
| return { | ||
| sessionId: init?.sessionId, | ||
| messages, | ||
| }; | ||
| } | ||
| ); | ||
| ``` | ||
|
|
||
| #### Example: Streaming State Updates | ||
|
|
||
| You can stream intermediate state updates to the client using `sendChunk`. This is useful for providing progress on long-running tasks or tool executions. | ||
|
|
||
| ```typescript | ||
| export const toolAgent = ai.defineAgent( | ||
| { name: 'toolAgent' }, | ||
| async function* ({ sendChunk, inputStream, init, session }) { | ||
| for await (const input of inputStream) { | ||
| // 1. Notify client that we are starting a tool | ||
| sendChunk({ | ||
| statusUpdate: { status: 'executing_tool', tool: 'weather' } | ||
| }); | ||
|
|
||
| // 2. Execute tool (simulated) | ||
| await new Promise(r => setTimeout(r, 1000)); | ||
|
|
||
| // 3. Notify client of completion | ||
| sendChunk({ | ||
| statusUpdate: { status: 'tool_complete', tool: 'weather' } | ||
| }); | ||
|
|
||
| // ... continue generation | ||
| await session.createSnapshot(); | ||
| } | ||
| } | ||
| ); | ||
| ``` | ||
|
|
||
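
On the consuming side, a client might fold the streamed chunks into a single view state. The sketch below assumes a simplified chunk shape loosely modeled on the stream fields used in this RFC (`chunk`, `statusUpdate`, `snapshotCreated`); `StreamChunk`, `ViewState`, and `applyChunk` are hypothetical names, and the `chunk` payload is reduced to plain text for brevity.

```typescript
// Simplified, hypothetical chunk shape; the real chunk payload is richer.
type StreamChunk = {
  chunk?: { text: string };
  statusUpdate?: { status: string; tool?: string };
  snapshotCreated?: string;
};

// What a client UI might track while a turn is streaming.
type ViewState = { text: string; status: string; lastSnapshotId?: string };

// Pure reducer: each chunk updates only the fields it carries.
function applyChunk(view: ViewState, c: StreamChunk): ViewState {
  return {
    text: view.text + (c.chunk?.text ?? ''),
    status: c.statusUpdate?.status ?? view.status,
    lastSnapshotId: c.snapshotCreated ?? view.lastSnapshotId,
  };
}

// Example stream: two status updates, two text chunks, one snapshot notice.
const chunks: StreamChunk[] = [
  { statusUpdate: { status: 'executing_tool', tool: 'weather' } },
  { statusUpdate: { status: 'tool_complete', tool: 'weather' } },
  { chunk: { text: 'Sunny' } },
  { chunk: { text: ' today' } },
  { snapshotCreated: 'snap-1' },
];

const initialView: ViewState = { text: '', status: 'idle' };
const finalView = chunks.reduce(applyChunk, initialView);
```

Keeping the reducer pure makes it easy to replay a recorded stream in tests or to drive UI state from it.
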
| ### 4. Schemas | ||
|
|
||
| The Agent primitive relies on strict Zod schemas to ensure type safety and compatibility. | ||
|
|
||
| #### Init Schema (`AgentInitSchema`) | ||
| ```typescript | ||
|
|
||
| const AgentSnapshotSchema = z.object({ | ||
| // Intended as a oneof: either `snapshotId` alone... | ||
| snapshotId: z.string().optional(), | ||
| // ...or the inline state fields below. | ||
| messages: z.array(MessageSchema).optional(), | ||
| state: z.any().optional(), | ||
| artifacts: z.array(AgentArtifactSchema).optional(), | ||
| }); | ||
|
|
||
| const AgentInitSchema = z.object({ | ||
| snapshot: AgentSnapshotSchema.optional(), | ||
| }); | ||
| ``` | ||
|
|
||
| #### Stream Schema (`AgentStreamSchema`) | ||
| ```typescript | ||
| const AgentStreamSchema = z.object({ | ||
| chunk: GenerateResponseChunkSchema.optional(), // Token generation | ||
| statusUpdate: z.any().optional(), | ||
| artifact: AgentArtifactSchema.optional(), // New artifacts | ||
| snapshotCreated: z.string().optional(), | ||
| }); | ||
| ``` | ||
|
|
||
| #### Output Schema (`AgentResponseSchema`) | ||
| ```typescript | ||
| const AgentResponseSchema = z.object({ | ||
| snapshot: AgentSnapshotSchema, | ||
| }); | ||
| ``` | ||
|
|
||
| #### Artifact Schema (`AgentArtifactSchema`) | ||
| ```typescript | ||
| const AgentArtifactSchema = z.object({ | ||
| name: z.string().optional(), | ||
| parts: z.array(PartSchema), // Media, text, etc. | ||
| metadata: z.record(z.any()).optional(), | ||
| }); | ||
| ``` | ||
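
The `oneof` comments in `AgentSnapshotSchema` suggest that a snapshot is either a reference (`snapshotId`) or inline state (`messages`, `state`, `artifacts`), but a Zod object whose fields are all optional does not enforce that exclusivity by itself. One way to check it, sketched here in plain TypeScript with hypothetical names (`Snapshot`, `SnapshotKind`, `classifySnapshot`) and simplified field types, is a small classifier that runs after parsing:

```typescript
// Simplified stand-ins for the RFC's schema types (assumptions for illustration).
type Part = { text?: string };
type Artifact = {
  name?: string;
  parts: Part[];
  metadata?: Record<string, unknown>;
};
type Snapshot = {
  snapshotId?: string;
  messages?: unknown[];
  state?: unknown;
  artifacts?: Artifact[];
};

type SnapshotKind = 'reference' | 'inline' | 'empty' | 'invalid';

// Classifies a snapshot under the assumed oneof rule:
// a snapshotId and inline fields must not appear together.
function classifySnapshot(s: Snapshot): SnapshotKind {
  const hasId = s.snapshotId !== undefined;
  const hasInline =
    s.messages !== undefined ||
    s.state !== undefined ||
    s.artifacts !== undefined;
  if (hasId && hasInline) return 'invalid';
  if (hasId) return 'reference';
  if (hasInline) return 'inline';
  return 'empty';
}
```

Whether an empty snapshot or a mixed one should be rejected is a design decision the RFC does not state; the classifier only makes the cases explicit.
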
The RFC states that `defineAgent` will replace the current Chat API. This is a significant change that will impact existing users. It would be beneficial to add a section discussing the migration strategy from the Chat API to `defineAgent`. This could cover:

Providing this information will help users plan for the transition.