Collaborator

@pavelgj pavelgj commented Jan 21, 2026

Introduces defineAgent, a high-level abstraction built on top of Bidi Actions designed to simplify the creation of stateful, multi-turn agents. It unifies state management, allowing both client-side state handling and server-side persistence via pluggable stores.

Related RFCs:

@gemini-code-assist
Contributor

Summary of Changes

Hello @pavelgj, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the defineAgent primitive, a significant abstraction aimed at streamlining the development of conversational AI agents. It provides a flexible framework for managing agent state, supporting both client-side and server-side persistence through configurable stores. This initiative is designed to reduce common boilerplate associated with building multi-turn agents and offers a more robust and flexible alternative to the existing Chat API.

Highlights

  • Introduction of defineAgent Primitive: A new high-level abstraction, defineAgent (renamed from defineSessionFlow), is introduced to simplify the creation of stateful, multi-turn agents by building on Bidi Actions.
  • Unified State Management: The primitive unifies state management, supporting both client-side state handling (stateless server) and server-side persistence via pluggable stores (stateful server).
  • Replacement for Chat API: defineAgent is intended to replace the existing Chat API due to significant overlap and its more flexible design.
  • Boilerplate Reduction: It encapsulates common agent building patterns such as state management, session handling, multi-turn loops, and interrupt handling, reducing repetitive code.
  • Standardized Schemas: The design includes standardized schemas (InitSchema, StreamSchema, OutputSchema) to ensure compatibility across various tools and UI components.
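
The two state-management modes named in the highlights can be sketched as follows. This is a minimal illustration, not the RFC's API: `SessionStore`, `InMemoryStore`, `SessionSnapshot`, and `runTurn` are hypothetical names, the store is synchronous for brevity, and the model call is replaced by an echo. Only the field names `sessionId`, `messages`, and `state` come from the schemas listed above.

```typescript
// Illustrative only: none of these names are confirmed Genkit APIs.
interface Message {
  role: "user" | "model";
  text: string;
}

interface SessionSnapshot<S> {
  sessionId: string;
  messages: Message[];
  state: S;
}

// Hypothetical pluggable store used for server-side persistence.
interface SessionStore<S> {
  load(sessionId: string): SessionSnapshot<S> | undefined;
  save(snapshot: SessionSnapshot<S>): void;
}

class InMemoryStore<S> implements SessionStore<S> {
  private sessions = new Map<string, SessionSnapshot<S>>();

  load(sessionId: string): SessionSnapshot<S> | undefined {
    return this.sessions.get(sessionId);
  }

  save(snapshot: SessionSnapshot<S>): void {
    // Store a deep copy so later caller mutations do not leak into the store.
    this.sessions.set(snapshot.sessionId, JSON.parse(JSON.stringify(snapshot)));
  }
}

// One turn of a toy agent. With a store, history is loaded and saved on the
// server; without one, the caller must send the prior messages and state back.
function runTurn<S>(
  init: { sessionId: string; messages?: Message[]; state?: S },
  userText: string,
  defaultState: S,
  store?: SessionStore<S>,
): SessionSnapshot<S> {
  const prior = store?.load(init.sessionId);
  const messages: Message[] = [...(prior?.messages ?? init.messages ?? [])];
  const state: S = prior?.state ?? init.state ?? defaultState;
  messages.push({ role: "user", text: userText });
  messages.push({ role: "model", text: `echo: ${userText}` }); // stand-in for a model call
  const snapshot: SessionSnapshot<S> = { sessionId: init.sessionId, messages, state };
  store?.save(snapshot);
  return snapshot;
}
```

In the stateful-server mode the client only needs to send `sessionId` on later turns. In the stateless-server mode the client passes the returned `messages` and `state` back in the next call.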


@github-actions github-actions bot added the docs label (Improvements or additions to documentation) on Jan 21, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces an RFC for a new defineAgent primitive. The document is well-structured and clearly outlines the motivation, design, and usage of the new API. The proposed agent primitive is a powerful abstraction for building stateful, multi-turn agents with flexible state management. My feedback focuses on improving the clarity and completeness of the RFC by suggesting additions to the documentation and examples, particularly around API migration, stream handling, and the usage of custom state and state updates.


Introduces `defineAgent` (renamed from `defineSessionFlow`), a high-level abstraction built on top of Bidi Actions designed to simplify the creation of stateful, multi-turn agents. It unifies state management, allowing both client-side state handling and server-side persistence via pluggable stores.

`defineAgent` would replace the current Chat API, as there is significant overlap and the Agent primitive is more flexible.

medium

The RFC states that defineAgent will replace the current Chat API. This is a significant change that will impact existing users. It would be beneficial to add a section discussing the migration strategy from the Chat API to defineAgent. This could cover:

  • Key differences in usage patterns.
  • A suggested path for refactoring existing chat implementations.
  • Whether there will be a deprecation period for the old API.

Providing this information will help users plan for the transition.


The Agent primitive uses standardized schemas to ensure compatibility across tools and UI.

- **InitSchema**: `sessionId`, `messages`, `state`, `artifacts`.

medium

The InitSchema and OutputSchema include a generic state property, which is a great feature for managing custom session data. However, the provided examples only manage the messages array.

To better showcase the capabilities of defineAgent, consider adding a small example that demonstrates managing custom state. For instance, an agent that keeps track of a turn counter or stores user preferences in the state object. This would provide a more complete picture of the state management capabilities.
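
As a rough idea of what such an example could look like, the sketch below keeps a turn counter and a language preference in `state`. It does not use the proposed `defineAgent` signature, which this thread does not show. `AgentInit`, `AgentOutput`, `PrefsState`, and `handleTurn` are hypothetical names, the `messages` field on the output is an assumption, and the preference "extraction" is a naive regular expression standing in for a model or tool call.

```typescript
// Illustrative only: a hand-rolled turn handler, not the RFC's defineAgent API.
interface Message {
  role: "user" | "model";
  text: string;
}

// Custom session state: a turn counter plus user preferences.
interface PrefsState {
  turnCount: number;
  preferences: { language?: string };
}

// Assumed shapes, modeled on the InitSchema/OutputSchema field names.
interface AgentInit<S> {
  sessionId?: string;
  messages?: Message[];
  state?: S;
}

interface AgentOutput<S> {
  messages: Message[];
  state: S;
}

function handleTurn(init: AgentInit<PrefsState>, userText: string): AgentOutput<PrefsState> {
  const prev: PrefsState = init.state ?? { turnCount: 0, preferences: {} };

  // Naive stand-in for preference detection by a model or tool.
  const match = /reply in (\w+)/i.exec(userText);

  const state: PrefsState = {
    turnCount: prev.turnCount + 1,
    preferences: match ? { ...prev.preferences, language: match[1] } : prev.preferences,
  };

  const language = state.preferences.language ?? "default";
  const reply = `[turn ${state.turnCount}, language=${language}] ${userText}`;

  return {
    messages: [
      ...(init.messages ?? []),
      { role: "user", text: userText },
      { role: "model", text: reply },
    ],
    state,
  };
}
```

The returned `state` is passed back in as `init.state` on the next turn, so the counter and preferences survive across turns regardless of where the state is stored.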

The Agent primitive uses standardized schemas to ensure compatibility across tools and UI.

- **InitSchema**: `sessionId`, `messages`, `state`, `artifacts`.
- **StreamSchema**: `chunk` (generation), `stateUpdate` (patches), `sessionId`.

medium

The StreamSchema includes a stateUpdate field for sending state patches, which is an excellent optimization for streaming. The current examples only demonstrate streaming generation chunks.

It would be very helpful to include an example that uses sendChunk to send a stateUpdate. This could show how to, for example, update a tool's status on the client-side incrementally without waiting for the full model response. This would clarify the intended use of stateUpdate and make the RFC more comprehensive.
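
A possible shape for that example is sketched below. The patch format of `stateUpdate` is not specified in this thread, so the sketch assumes a shallow-merge partial object. `StreamChunk`, `ToolState`, `runWithTool`, and `applyChunk` are hypothetical names, and `sendChunk` is modeled as a plain callback rather than the real streaming API.

```typescript
// Illustrative only: assumes stateUpdate is a shallow-merge patch.
interface ToolState {
  toolStatus: Record<string, "running" | "done">;
}

// Assumed chunk shape, modeled on the StreamSchema field names.
interface StreamChunk<S> {
  sessionId?: string;
  chunk?: { text: string };
  stateUpdate?: Partial<S>;
}

type SendChunk<S> = (c: StreamChunk<S>) => void;

// Server side: report tool progress before any generated text is available.
function runWithTool(sendChunk: SendChunk<ToolState>): void {
  sendChunk({ stateUpdate: { toolStatus: { search: "running" } } });
  // ... the tool would execute here ...
  sendChunk({ stateUpdate: { toolStatus: { search: "done" } } });
  sendChunk({ chunk: { text: "Here is what I found." } });
}

// Client side: fold each state patch into the local copy of the state.
function applyChunk<S extends object>(state: S, c: StreamChunk<S>): S {
  return c.stateUpdate ? { ...state, ...c.stateUpdate } : state;
}
```

A client would call `applyChunk` for every streamed chunk and re-render from the result, so the tool status changes from `running` to `done` before the final text arrives.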

@apascal07 apascal07 self-requested a review January 21, 2026 20:37
Contributor

xavidop commented Jan 22, 2026

Added the same comment in the Go RFC:

Overall, while this design introduces useful primitives for bidirectional flows and persistence, it does not yet align with the agentic patterns seen in modern frameworks like LangChain or LlamaIndex. The current approach feels closer to a single, persistent agent rather than a true agent system where agents can call, delegate to, or coordinate with other agents.

A key limitation is the lack of clear multi-agent orchestration. Modern agent frameworks treat agents as composable units that can collaborate, delegate tasks, or be orchestrated in workflows.

Additionally, the API surface feels too low-level for end users. Concepts like bidirectional flows and actions require developers to manage execution mechanics explicitly. Many of these concerns should be handled by the framework itself, exposing a cleaner and more minimal interface that is easier to read, reason about, and scale over time.

This contrasts with existing patterns in Genkit JS and ADK, where agent definitions are more declarative and focused on intent rather than wiring. For reference, Genkit JS chat agents provide a much simpler mental model: https://genkit.dev/docs/tutorials/chat-with-pdf/#6-implement-the-ui, and ADK’s Go agent definitions show a clearer agent-first abstraction: https://google.github.io/adk-docs/get-started/go/#define-the-agent-code.

Collaborator Author

pavelgj commented Jan 24, 2026

This RFC is one step in our overall vision, and yes, we need to find a way to communicate that overall vision as well. I mostly agree with you: this is not an agent framework, and it does not include multi-agent orchestration.

If you look at other "agent frameworks", they usually take opinionated stances on how agent orchestration must be done. They have hardcoded prompts and magical tools that get injected into the model call to facilitate "handoffs" or subagent calls. This is actually very inconsistent with how we've been doing things in Genkit. We've been avoiding hardcoded prompts and any unexpected or surprising alterations to LLM calls. The Chat API is no exception: it does not inject any hardcoded prompts (it injects prompts that the developer provides).

This RFC is all about the agent abstraction. One agent. That agent might have sub-agents, but this RFC does not concern itself with that. It's a simple but extremely flexible abstraction that can support agents of any complexity while also enabling Dev UI tooling for those agents.

As the next step, once we have a flexible way to represent one agent (or multiple independent agents), we can design an agent orchestration layer on top of the agent abstraction. The current thinking is that the orchestration layer is, by nature, too opinionated, so it will not be in the core Genkit framework. It will most likely be a plugin, or even multiple plugins, in separate package(s). It will allow users and the community to build custom orchestration layers while still taking advantage of Genkit tooling that already understands agent building blocks.
