Python: fix(python/google): filter thinking text parts from chat completion responses #13711
dariusFTOS wants to merge 2 commits into microsoft:main
Conversation
Gemini models with thinking enabled return text parts with part.thought=True. These thinking/reasoning parts were being included in ChatMessageContent alongside the actual response, causing thinking text to leak into responses. This adds a check to skip parts where part.thought is True in both _create_chat_message_content and _create_streaming_chat_message_content. Fixes microsoft#13710
Automated Code Review
Reviewers: 4 | Confidence: 89%
✓ Correctness
The diff adds filtering for Google AI 'thought' parts in both non-streaming and streaming chat completion paths. When the Gemini model returns parts with `thought=True` (internal chain-of-thought reasoning), these are now skipped before being converted to `TextContent` or other content types. The `Part.thought` attribute is present in google-genai ~1.51.0 (the pinned SDK version) and defaults to `None` for non-thought parts, so the truthiness check is safe. The guard is correctly placed before the `if part.text:` check, since thought parts carry text that should not be surfaced. The Vertex AI connector does not have the same filtering, but it uses a different SDK (vertexai) and may have different behavior for thought parts; this is outside the scope of this PR.
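The guard ordering described above can be sketched as follows. This is an illustrative stand-alone snippet, not the connector's actual code: `Part` here is a hypothetical stand-in for the google-genai type, and `visible_texts` stands in for the part-to-content loop inside `_create_chat_message_content`.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for google.genai's Part type, for illustration only.
@dataclass
class Part:
    text: Optional[str] = None
    thought: Optional[bool] = None  # defaults to None on non-thought parts

def visible_texts(parts):
    """Sketch of the guard: skip thought parts before the existing text check."""
    items = []
    for part in parts:
        if part.thought:  # None and False are both falsy, so truthiness is safe
            continue
        if part.text:
            items.append(part.text)
    return items

parts = [Part(text="internal reasoning", thought=True), Part(text="final answer")]
print(visible_texts(parts))  # ['final answer']
```

Because `thought` defaults to `None`, responses from models without thinking enabled pass through this loop unchanged.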
✓ Security Reliability
The diff adds filtering of 'thought' parts (thinking/reasoning tokens) from Google AI chat completion responses, in both streaming and non-streaming paths. The primary reliability concern is that `part.thought` is accessed via direct attribute access, which is inconsistent with the defensive `getattr(part, "thought_signature", None)` pattern already used in the same functions for SDK compatibility. If the google-genai SDK version constraint is ever relaxed, or the `thought` attribute is removed or renamed, direct access would raise an unhandled `AttributeError`, crashing response parsing. The existing test for SDK attribute guards (`test_create_chat_message_content_getattr_guard_on_missing_attribute`) doesn't cover this new attribute, since `MagicMock` auto-creates attributes on access.
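The failure mode the review describes can be demonstrated with a plain object. `LegacyPart` below is a hypothetical class simulating an SDK version whose `Part` lacks the `thought` attribute; it is not a real google-genai type.

```python
class LegacyPart:
    """Hypothetical Part from an SDK version without the `thought` attribute."""
    def __init__(self, text=None):
        self.text = text

part = LegacyPart(text="hello")

# Direct attribute access raises on such an object:
try:
    part.thought
    crashed = False
except AttributeError:
    crashed = True

# The defensive pattern already used for thought_signature degrades gracefully:
is_thought = getattr(part, "thought", False)

print(crashed, is_thought)  # True False
```

This is also why `MagicMock` cannot exercise the guard: accessing `mock.thought` silently creates the attribute instead of raising.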
✗ Test Coverage
The PR adds logic to skip 'thought' parts in both `_create_chat_message_content` and `_create_streaming_chat_message_content`, but no tests cover this new behavior. The existing test suite only uses parts with `thought=None`/`False` (via `Part.from_text()` and `Part.from_function_call()`), so the new `if part.thought: continue` branches have zero test coverage. Tests should verify that thought-only parts are filtered out, that mixed thought/non-thought responses retain only the non-thought items, and that the streaming path behaves identically.
✗ Design Approach
The PR silently discards `part.thought` (reasoning/thinking) content from Google AI responses by skipping those parts entirely. This is a symptom-level fix that treats thought content as noise to be suppressed, when Semantic Kernel already has a purpose-built `ReasoningContent` (and `StreamingReasoningContent`) type that is part of `CMC_ITEM_TYPES` and `STREAMING_ITEM_TYPES`. In Gemini's thinking API, parts with `part.thought == True` carry the actual reasoning text in `part.text`; the correct design is to surface these as `ReasoningContent` items rather than drop them. The Vertex AI connector has the same gap and would need the same fix. Silent discard also breaks any caller that wants to inspect model reasoning or pass it back in multi-turn conversations.
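The alternative the review proposes can be sketched as below. The `ReasoningContent`, `TextContent`, and `Part` classes here are hypothetical stand-ins (the real ones live in `semantic_kernel.contents` and `google.genai.types`); the point is the routing of thought parts into a typed item rather than discarding them.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-ins for SK's content item types and the SDK's Part.
@dataclass
class ReasoningContent:
    text: str

@dataclass
class TextContent:
    text: str

@dataclass
class Part:
    text: Optional[str] = None
    thought: Optional[bool] = None

def parts_to_items(parts):
    """Surface thought parts as typed reasoning items instead of dropping them."""
    items = []
    for part in parts:
        if not part.text:
            continue
        if getattr(part, "thought", False):
            items.append(ReasoningContent(text=part.text))
        else:
            items.append(TextContent(text=part.text))
    return items

items = parts_to_items([Part(text="let me think", thought=True), Part(text="42")])
print([type(i).__name__ for i in items])  # ['ReasoningContent', 'TextContent']
```

Callers that only want the answer can filter `items` by type, while callers that need the reasoning for display or multi-turn context still have it.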
Flagged Issues
- Thought parts are silently discarded instead of being surfaced as `ReasoningContent`/`StreamingReasoningContent`, which already exist in SK and are part of `CMC_ITEM_TYPES`. For Gemini thinking models, `part.thought == True` and `part.text` holds the reasoning text; the fix should wrap these as `ReasoningContent` (and the streaming equivalent) rather than dropping them. Silent discard breaks callers that need to inspect model reasoning for display, logging, or multi-turn context.
- No tests cover the new `part.thought` filtering logic in either `_create_chat_message_content` or `_create_streaming_chat_message_content`. Add tests that verify: (1) a `Part` with `thought=True` and text produces the correct content type, (2) a response mixing thought and non-thought parts returns both appropriately, and (3) the streaming path mirrors the same behavior.
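A self-contained sketch of the missing coverage. `FakePart` and `filter_thought_parts` are hypothetical stand-ins for the SDK `Part` and the connector's skip logic; the real tests would call `_create_chat_message_content` on the connector with mocked SDK responses.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FakePart:  # hypothetical stand-in for google.genai's Part
    text: Optional[str] = None
    thought: Optional[bool] = None

def filter_thought_parts(parts):
    """Stand-in for the connector's new skip logic."""
    return [p.text for p in parts if p.text and not getattr(p, "thought", False)]

def test_thought_only_parts_are_filtered():
    assert filter_thought_parts([FakePart(text="chain of thought", thought=True)]) == []

def test_mixed_parts_keep_only_the_answer():
    parts = [FakePart(text="reasoning", thought=True), FakePart(text="42")]
    assert filter_thought_parts(parts) == ["42"]

def test_backward_compatible_without_thought_parts():
    assert filter_thought_parts([FakePart(text="hello")]) == ["hello"]

for test in (test_thought_only_parts_are_filtered,
             test_mixed_parts_keep_only_the_answer,
             test_backward_compatible_without_thought_parts):
    test()
print("all tests passed")
```

The streaming test would mirror the same three cases against `_create_streaming_chat_message_content`.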
Suggestions
- Use `getattr(part, "thought", False)` instead of direct `part.thought` access for consistency with the existing `getattr(part, "thought_signature", None)` pattern, providing resilience against SDK version mismatches where the `thought` attribute may not exist on `Part`.
- Apply the same thought-part handling to the Vertex AI chat completion connector (`vertex_ai_chat_completion.py`), which has a parallel code structure and the same gap.
- Add an SDK guard test simulating a `Part` without the `thought` attribute, similar to the existing `test_create_chat_message_content_getattr_guard_on_missing_attribute`.
Automated review by dariusFTOS's agents
python/semantic_kernel/connectors/ai/google/google_ai/services/google_ai_chat_completion.py
- Use `ReasoningContent` for non-streaming and `StreamingReasoningContent` for streaming thought parts (`part.thought == True`)
- Use `getattr(part, "thought", False)` for SDK compatibility
- Thought parts are now properly typed rather than silently dropped
@microsoft-github-policy-service agree company="FintechOS"
Motivation and Context
Fixes #13710
When using Gemini 3 Pro (preview) with thinking enabled, the API returns text parts with `part.thought = True` containing the model's internal reasoning. These thinking parts are incorrectly included in `ChatMessageContent.items` alongside the actual response text, causing the model's chain-of-thought to leak into application-visible responses. This breaks downstream processing (e.g. JSON parsing of structured agent responses) because the response contains thinking text instead of the actual answer. The fix in #13609 correctly handled `thought_signature` on function call parts, but did not filter thinking text parts from the response content.
Description
Response parsing (filter thinking text parts):
- `google_ai_chat_completion.py`: In `_create_chat_message_content()`, skip parts where `part.thought is True` before adding them as `TextContent`
- `google_ai_chat_completion.py`: Same filter in `_create_streaming_chat_message_content()` for the streaming path

Backward compatible: When `part.thought` is `None` or `False` (thinking disabled or older models), behavior is identical to before. The raw `GenerateContentResponse` is still available via `inner_content` for consumers who need access to thinking parts.
Test Coverage
TODO: Add tests for:
- `test_create_chat_message_content_filters_thought_parts`: verifies thinking parts are excluded from response items
- `test_create_chat_message_content_without_thought_parts`: verifies backward compatibility when no thinking parts present
- `test_create_streaming_chat_message_content_filters_thought_parts`: same for streaming path
Contribution Checklist