
Python: Add GeminiChatClient#4847

Open
holtvogt wants to merge 8 commits into microsoft:main from holtvogt:feat/add-gemini-client
Conversation


@holtvogt holtvogt commented Mar 23, 2026

Motivation and Context

Google Gemini is one of the leading LLM providers and is not yet supported as a first-class integration in Agent Framework. This PR adds a dedicated agent-framework-gemini package that gives developers a native, idiomatic way to use Google Gemini models within Agent Framework.

Description

This PR introduces the agent-framework-gemini package: a new provider integration built on the official google-genai SDK (>=1.0.0,<2.0.0).

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Reference

Copilot AI review requested due to automatic review settings March 23, 2026 07:34
@markwallace-microsoft markwallace-microsoft added documentation Improvements or additions to documentation python labels Mar 23, 2026
Copilot AI (Contributor) left a comment
Pull request overview

Adds a first-class Google Gemini provider integration to the Python Agent Framework workspace via a new agent-framework-gemini package (built on google-genai), plus runnable samples and tests.

Changes:

  • Introduces agent_framework_gemini.GeminiChatClient with Gemini-specific GeminiChatOptions/ThinkingConfig and message/tool mapping.
  • Adds unit + integration tests for text, streaming, tools, thinking config, grounding, code execution, and structured output.
  • Adds Google Gemini sample scripts and wires the new package/dependency into the Python workspace (uv.lock, .env.example).

Reviewed changes

Copilot reviewed 15 out of 18 changed files in this pull request and generated 12 comments.

Summary per file:

python/uv.lock: Adds agent-framework-gemini workspace member and locks google-genai dependency.
python/.env.example: Documents GEMINI_API_KEY / GEMINI_CHAT_MODEL_ID env vars.
python/packages/gemini/pyproject.toml: Defines the new distributable package and its dependencies/tooling config.
python/packages/gemini/agent_framework_gemini/_chat_client.py: Implements the Gemini chat client (streaming, tools, config mapping, response parsing).
python/packages/gemini/agent_framework_gemini/__init__.py: Exposes the package public API + version.
python/packages/gemini/tests/test_chat_client.py: Adds unit tests for conversion/config/streaming behavior.
python/packages/gemini/tests/test_chat_client_integration.py: Adds integration tests gated by GEMINI_API_KEY.
python/samples/02-agents/providers/google/*.py: Adds runnable Gemini provider samples (basic, advanced thinking, grounding, code execution).
python/samples/02-agents/providers/google/README.md: Documents the new Gemini samples.
python/packages/gemini/README.md / LICENSE / AGENTS.md / py.typed: Packaging docs/metadata for the new provider.

Comment on lines +234 to +248
model_id = options.get("model_id") or self.model_id
if not model_id:
    raise ValueError(
        "Gemini model_id is required. Set via model_id parameter or GEMINI_CHAT_MODEL_ID environment variable."
    )

system_instruction, contents = self._prepare_gemini_messages(messages)

if call_instructions := options.get("instructions"):
    system_instruction = (
        f"{call_instructions}\n{system_instruction}" if system_instruction else call_instructions
    )

config = self._prepare_config(options, system_instruction)

Copilot AI Mar 23, 2026

_inner_get_response uses the raw options mapping directly. That means common option normalization/validation (including converting callable tools/MCP tools via validate_chat_options / _validate_options) is skipped, and Gemini-specific code may silently ignore tools that aren’t already FunctionTool instances. Consider awaiting self._validate_options(options) (like OllamaChatClient does) and then using the validated options throughout request preparation.

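To illustrate what the skipped normalization step buys, here is a minimal, self-contained sketch; FunctionTool and normalize_tools are simplified stand-ins for the framework's validation machinery, not its actual API:

```python
class FunctionTool:
    """Hypothetical stand-in for the framework's FunctionTool wrapper."""

    def __init__(self, func):
        self.name = func.__name__
        self.func = func


def normalize_tools(tools):
    # Wrap bare callables so downstream Gemini mapping sees FunctionTool
    # instances instead of silently ignoring them.
    normalized = []
    for tool in tools or []:
        if isinstance(tool, FunctionTool):
            normalized.append(tool)
        elif callable(tool):
            normalized.append(FunctionTool(tool))
        # Unknown tool shapes are dropped rather than passed through raw.
    return normalized


def get_weather(city: str) -> str:
    return f"Sunny in {city}"


tools = normalize_tools([get_weather])
print(tools[0].name)  # -> "get_weather"
```

Without this step, a bare callable passed in options["tools"] would never become a FunctionTool, which is exactly the silent-ignore case the comment describes.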
        ):
            yield self._process_chunk(chunk)

    return self._build_response_stream(_stream())
Copilot AI Mar 23, 2026

In streaming mode, the ResponseStream is built without passing through response_format from options. This prevents ChatResponse.value / structured output parsing from working for Gemini, even if callers pass a Pydantic model via options['response_format']. Pass response_format=options.get('response_format') to _build_response_stream(...) (consistent with other clients like Anthropic/Ollama).

Suggested change
- return self._build_response_stream(_stream())
+ return self._build_response_stream(_stream(), response_format=options.get("response_format"))

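A self-contained sketch of why threading response_format into the stream finalizer matters; Answer and finalize are hypothetical stand-ins (using a stdlib dataclass instead of the framework's Pydantic-based parsing):

```python
import json
from dataclasses import dataclass


@dataclass
class Answer:
    """Hypothetical model a caller might pass as response_format."""
    answer: str


def finalize(text, response_format=None):
    # Mimics what passing response_format through to the finalizer enables:
    # the accumulated stream text is parsed into a typed value. Without the
    # parameter, callers only ever get the raw text back.
    if response_format is None:
        return text
    return response_format(**json.loads(text))


value = finalize('{"answer": "42"}', response_format=Answer)
print(value.answer)  # -> "42"
```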
Comment on lines +338 to +347
case "function_call":
    call_id = content.call_id or self._generate_tool_call_id()
    if content.call_id and content.name:
        call_id_to_name[content.call_id] = content.name
    parts.append(
        types.Part(
            function_call=types.FunctionCall(
                id=call_id,
                name=content.name or "",
                args=content.parse_arguments() or {},
Copilot AI Mar 23, 2026

When converting function_call content, the call_id→name mapping is only stored when content.call_id is present. If a call ID is missing and you generate a fallback (call_id = ...), any later function_result with that generated ID won’t be resolvable and will get skipped. Record the mapping using the resolved call_id whenever a name is available.

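The fix the comment asks for can be sketched in isolation; generate_tool_call_id and record_function_call are hypothetical helpers, simplified from the client code above:

```python
import itertools

_counter = itertools.count(1)
call_id_to_name: dict[str, str] = {}


def generate_tool_call_id() -> str:
    # Hypothetical stand-in for self._generate_tool_call_id().
    return f"call_{next(_counter)}"


def record_function_call(content_call_id, content_name):
    # Resolve the ID first, then key the mapping by the *resolved* ID, so a
    # generated fallback ID is still matchable by a later function_result.
    call_id = content_call_id or generate_tool_call_id()
    if content_name:
        call_id_to_name[call_id] = content_name
    return call_id


generated = record_function_call(None, "get_weather")  # fallback-ID path
explicit = record_function_call("abc", "lookup")       # explicit-ID path
```

In the original code the mapping is keyed by content.call_id, so the fallback-ID path above would leave call_id_to_name empty and the later result unresolvable.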
Comment on lines +136 to +143
options: GeminiChatOptions = {
    "response_format": "json",
    "response_schema": {
        "type": "object",
        "properties": {"answer": {"type": "string"}},
        "required": ["answer"],
    },
}
Copilot AI Mar 23, 2026

response_format in core ChatOptions is intended to be a Pydantic model type or a mapping, not a string. Using "json" here makes the integration test rely on an input shape that isn’t consistent with the framework’s typed API. Prefer using response_schema alone for Gemini JSON mode, or use a Pydantic model type for response_format and validate response.value instead of manual json.loads.

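The "response_schema alone" shape the comment prefers can be sketched like this; build_generation_config is a hypothetical helper illustrating the suggested behavior, not the client's actual method:

```python
def build_generation_config(options):
    # Sketch of the suggested contract: a response_schema by itself enables
    # JSON mode (application/json MIME type); no string-valued
    # response_format such as "json" is needed.
    config = {}
    schema = options.get("response_schema")
    if schema is not None:
        config["response_mime_type"] = "application/json"
        config["response_schema"] = schema
    return config


config = build_generation_config({
    "response_schema": {
        "type": "object",
        "properties": {"answer": {"type": "string"}},
        "required": ["answer"],
    },
})
print(config["response_mime_type"])  # -> "application/json"
```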
@eavanvalkenburg eavanvalkenburg (Member) left a comment

Automated Code Review

Reviewers: 4 | Confidence: 89%

✓ Correctness

This PR adds a well-structured Google Gemini integration package following the established patterns of other providers (Anthropic, Ollama, Bedrock). The implementation correctly handles message conversion, tool calling, streaming, system instructions, and finish reason mapping. The code is defensively written with proper edge-case handling (missing function call IDs, unmatched function results, consecutive tool messages). No blocking correctness issues were found. One minor suggestion relates to passing response_format through to the stream finalizer for Pydantic-model-based structured output, consistent with how Anthropic and Ollama do it, though the Gemini integration primarily uses response_schema as a dict so this is low-impact.

✗ Security Reliability

This PR adds a new Google Gemini chat client integration. The implementation follows established patterns for settings, API key handling (SecretString), and message conversion. However, there is one reliability issue: the class inheritance (MRO) order for GeminiChatClient swaps ChatMiddlewareLayer and FunctionInvocationLayer compared to every other provider in the codebase, which could cause middleware to not properly wrap function invocations. No secrets are hardcoded, JSON parsing is safely wrapped, and input validation is present at key trust boundaries.

✓ Test Coverage

The new Gemini package has a solid unit test suite (879 lines) covering the main paths: text responses, streaming, tool calling, config mapping, message conversion, finish reasons, thinking config, and built-in tools. However, there are notable test coverage gaps: no test for empty/None candidates in API responses, no test for the streaming get_final_response() finalization path, no test for _coerce_to_dict with a JSON string that parses to a non-dict type (e.g., JSON array), and no test for response_schema set without response_format. These are edge cases that the implementation appears to handle correctly, but without tests they could regress silently.

✗ Design Approach

The Gemini integration is well-structured and follows established patterns from other providers. The main design concern is that _prepare_gemini_messages resolves the required name field for Gemini's FunctionResponse by scanning prior conversation history to build a call_id → name map. The root cause is that Content.from_function_result in the framework does not persist the function name (confirmed: the name field is always None on function_result Contents as created by _execute_function_call). The history-scanning workaround works for the common case but silently drops tool results whenever messages are trimmed — a realistic production scenario (context-window management, memory layers). A straightforward fix exists at the framework level: add a name parameter to Content.from_function_result and pass it from _execute_function_call, then consume content.name directly in the Gemini client. One secondary concern: setting response_schema alone (without response_format) silently enables application/json MIME type, which is undocumented behavior in the option's docstring.


Automated review by eavanvalkenburg's agents
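The trimming failure mode described above is easy to demonstrate in isolation; build_call_id_map is a simplified sketch of the history-scanning workaround, using plain dicts in place of the framework's Message/Content types:

```python
def build_call_id_map(messages):
    # Scan prior history for function_call entries (the workaround the
    # review describes) and return a call_id -> function-name map.
    mapping = {}
    for message in messages:
        for content in message.get("contents", []):
            if content.get("type") == "function_call" and content.get("name"):
                mapping[content["call_id"]] = content["name"]
    return mapping


history = [
    {"contents": [{"type": "function_call", "call_id": "c1", "name": "get_weather"}]},
    {"contents": [{"type": "function_result", "call_id": "c1", "result": "Sunny"}]},
]

full_map = build_call_id_map(history)        # name resolves: {"c1": "get_weather"}
trimmed_map = build_call_id_map(history[1:])  # call trimmed away: {} -> result dropped
```

Persisting the name on the function_result itself, as the review proposes, removes the dependency on untrimmed history entirely.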

Comment on lines +144 to +150
"STOP": "stop",
"MAX_TOKENS": "length",
"SAFETY": "content_filter",
"RECITATION": "content_filter",
"LANGUAGE": "content_filter",
"BLOCKLIST": "content_filter",
"PROHIBITED_CONTENT": "content_filter",
Member

The inheritance order swaps ChatMiddlewareLayer and FunctionInvocationLayer compared to every other provider in the codebase (Anthropic, OpenAI, Azure, Bedrock, Ollama, AG-UI, FoundryLocal all use ChatMiddlewareLayer first, then FunctionInvocationLayer). This changes the Python MRO and could cause middleware to not properly wrap function invocation. Swap these two to match the established pattern.

Suggested change
"STOP": "stop",
"MAX_TOKENS": "length",
"SAFETY": "content_filter",
"RECITATION": "content_filter",
"LANGUAGE": "content_filter",
"BLOCKLIST": "content_filter",
"PROHIBITED_CONTENT": "content_filter",
class GeminiChatClient(
    ChatMiddlewareLayer[GeminiChatOptionsT],
    FunctionInvocationLayer[GeminiChatOptionsT],
    ChatTelemetryLayer[GeminiChatOptionsT],
    BaseChatClient[GeminiChatOptionsT],
    Generic[GeminiChatOptionsT],
):


def test_service_url() -> None:
    client, _ = _make_gemini_client()
    assert client.service_url() == "https://generativelanguage.googleapis.com"
Member

Missing tests for _coerce_to_dict edge cases: a JSON string that parses to a non-dict (e.g., a JSON array '[1,2,3]') exercises the fallback branch where json.loads succeeds but isinstance(parsed, dict) is False.

Suggested change
assert client.service_url() == "https://generativelanguage.googleapis.com"


def test_coerce_to_dict_with_json_array_string() -> None:
    assert GeminiChatClient._coerce_to_dict('[1, 2, 3]') == {"result": "[1, 2, 3]"}


def test_coerce_to_dict_with_json_string_literal() -> None:
    assert GeminiChatClient._coerce_to_dict('"hello"') == {"result": '"hello"'}
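For context, an implementation consistent with both suggested tests could look like the following; this is a hypothetical reconstruction of _coerce_to_dict inferred from the expected values, not the PR's actual code:

```python
import json


def coerce_to_dict(value):
    # Dicts pass through; JSON strings that decode to a dict are returned
    # decoded; everything else (arrays, scalars, invalid JSON) is wrapped
    # under a "result" key so callers always receive a dict.
    if isinstance(value, dict):
        return value
    try:
        parsed = json.loads(value)
    except (TypeError, ValueError):
        return {"result": value}
    return parsed if isinstance(parsed, dict) else {"result": value}


print(coerce_to_dict('[1, 2, 3]'))  # -> {'result': '[1, 2, 3]'}
```

The JSON-array and string-literal cases exercise the branch where json.loads succeeds but the parsed value is not a dict, which is exactly the gap the comment points at.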

mock.aio.models.generate_content_stream = AsyncMock(return_value=_async_iter(chunks))

stream = client.get_response(
    messages=[Message(role="user", contents=[Content.from_text("Hi")])],
Member

The streaming tests verify iteration via async for update in stream but never call await stream.get_final_response() to verify the finalized ChatResponse is correctly assembled. Consider adding a streaming test that calls get_final_response() and asserts on the returned ChatResponse (messages, usage, finish_reason).
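The shape of such a test can be sketched with a toy stream; ResponseStream here is a minimal stand-in for the framework's stream type, assuming only that get_final_response() drains remaining updates and assembles a final value:

```python
import asyncio


class ResponseStream:
    """Hypothetical minimal stream: iterate updates, then finalize."""

    def __init__(self, chunks):
        self._pending = list(chunks)
        self._seen = []

    def __aiter__(self):
        return self

    async def __anext__(self):
        if not self._pending:
            raise StopAsyncIteration
        chunk = self._pending.pop(0)
        self._seen.append(chunk)
        return chunk

    async def get_final_response(self):
        async for _ in self:  # drain anything not yet consumed
            pass
        return "".join(self._seen)  # assemble the finalized response text


async def main():
    stream = ResponseStream(["Hel", "lo"])
    updates = [u async for u in stream]          # what current tests cover
    final = await stream.get_final_response()    # the untested finalization path
    return updates, final


updates, final = asyncio.run(main())
print(final)  # -> "Hello"
```

A real test would assert on the finalized ChatResponse's messages, usage, and finish_reason rather than a joined string.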


def test_service_url() -> None:
    client, _ = _make_gemini_client()
    assert client.service_url() == "https://generativelanguage.googleapis.com"
Member

Missing test: empty/None candidates in the API response. The implementation handles this defensively, but a regression test ensures it doesn't break silently.

@markwallace-microsoft
Member

Python Test Coverage

Python Test Coverage Report

File                                                     Stmts   Miss  Cover
packages/gemini/agent_framework_gemini/_chat_client.py     259      0   100%
TOTAL                                                    27554   3219    88%

Python Unit Test Overview

Tests: 5408 | Skipped: 20 | Failures: 0 | Errors: 0 | Time: 1m 29s

@eavanvalkenburg
Member

@holtvogt thanks for looking into this, I've added some comments
