## 🔴 Required Information

**Describe the Bug:**
The `num_invocations_to_keep` parameter in `ContextFilterPlugin` is misleading. According to the official `InvocationContext` definition, an invocation:

> "Starts with a user message and ends with a final response. Can contain one or multiple agent calls."

However, `ContextFilterPlugin` actually counts model turns (every content with `role == "model"`), not complete invocations. This means that if a single invocation contains multiple LLM calls (e.g., a function call followed by the final response), they are counted as multiple "invocations".
**Steps to Reproduce:**

- Create an agent with tools
- Configure `ContextFilterPlugin(num_invocations_to_keep=1)`
- Send a user message that triggers a tool call
- Observe the context filtering behavior

```python
from google.adk.agents import LlmAgent
from google.adk.apps import App
from google.adk.plugins import ContextFilterPlugin

# Assume we have a weather tool (get_weather) and a model instance
agent = LlmAgent(
    model=model,
    name="assistant",
    tools=[get_weather],
)

app = App(
    name="test_app",
    root_agent=agent,
    plugins=[
        ContextFilterPlugin(num_invocations_to_keep=1),
    ],
)

# User asks: "What's the weather in Beijing?"
# This creates ONE invocation but TWO model turns:
#   1. model: function_call(get_weather)
#   2. model: "The weather in Beijing is sunny, 25°C"
```

**Expected Behavior:**
With `num_invocations_to_keep=1`, the entire invocation should be preserved:

```
[user: "What's the weather?"]
[model: function_call]
[user: function_response]
[model: "It's sunny"]
```
**Observed Behavior:**

The implementation in `context_filter_plugin.py` (lines 98-112) counts contents with `role == "model"`:

```python
num_model_turns = sum(1 for c in contents if c.role == "model")
if num_model_turns >= self._num_invocations_to_keep:
  model_turns_to_find = self._num_invocations_to_keep
  ...
  if contents[i].role == "model":  # Counts every model response
    model_turns_to_find -= 1
```

This means `num_invocations_to_keep=1` actually keeps only the last model turn, potentially breaking the invocation.
Note: `_adjust_split_index_to_avoid_orphaned_function_responses` partially mitigates this by keeping function_call/function_response pairs together, but the fundamental counting logic still differs from the official invocation definition.
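To illustrate the effect, here is a minimal, self-contained sketch of the counting behavior described above (it mirrors the `role == "model"` counting with plain stand-in objects instead of real ADK `Content` objects, and it omits the orphaned-function-response adjustment):

```python
from dataclasses import dataclass

@dataclass
class FakeContent:
    """Stand-in for a request content; only the fields needed here."""
    role: str
    text: str

# One complete invocation: user question, tool call, tool result, final answer.
contents = [
    FakeContent("user", "What's the weather in Beijing?"),
    FakeContent("model", "function_call: get_weather"),
    FakeContent("user", "function_response: sunny, 25°C"),
    FakeContent("model", "The weather in Beijing is sunny, 25°C"),
]

def filter_by_model_turns(contents, num_to_keep):
    """Simplified model-turn counting: keep everything from the Nth-to-last
    model turn onward."""
    model_turns_to_find = num_to_keep
    for i in range(len(contents) - 1, -1, -1):
        if contents[i].role == "model":
            model_turns_to_find -= 1
            if model_turns_to_find == 0:
                return contents[i:]
    return contents

kept = filter_by_model_turns(contents, num_to_keep=1)
print([c.text for c in kept])
# -> ['The weather in Beijing is sunny, 25°C']
# The user question, the function_call, and the function_response are dropped,
# even though they all belong to the same invocation.
```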
**Evidence - Official Invocation Definition:**

From `invocation_context.py` (lines 98-136):
```python
class InvocationContext(BaseModel):
  """An invocation context represents the data of a single invocation of an agent.

  An invocation:
    1. Starts with a user message and ends with a final response.
    2. Can contain one or multiple agent calls.
    3. Is handled by runner.run_async().
  ...
  ┌─────────────────────── invocation ──────────────────────────┐
  ┌──────────── llm_agent_call_1 ────────────┐ ┌─ agent_call_2 ─┐
  ┌──── step_1 ────────┐ ┌───── step_2 ──────┐
  [call_llm] [call_tool] [call_llm] [transfer]
  """
```
The diagram clearly shows **one invocation can contain multiple `[call_llm]`** operations.
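To make the mismatch concrete, here is a short illustrative sketch (not code from the ADK; the role sequence simply flattens the quoted diagram into the contents the plugin sees):

```python
# One invocation -> step_1 (call_llm + call_tool) -> step_2 (call_llm).
invocation = [
    ("user", "user message"),                   # starts the invocation
    ("model", "call_llm -> function_call"),     # step_1: first LLM call
    ("user", "call_tool -> function_response"),
    ("model", "call_llm -> final response"),    # step_2: second LLM call
]

model_turns = sum(1 for role, _ in invocation if role == "model")
print(model_turns)  # 2 model turns, but still exactly 1 invocation
```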
**Environment Details:**
- ADK Library Version: 1.23.0
- Desktop OS: Windows 11
- Python Version: 3.13.5
**Model Information:**
- Are you using LiteLLM: Yes
- Which model is being used: deepseek/deepseek-chat (but this is model-agnostic)
---
## 🟡 Optional Information
**Suggested Fix:**

Option 1: **Rename the parameter** to accurately reflect its behavior:

```python
# Change from
num_invocations_to_keep: int
# To
num_model_turns_to_keep: int
```
Option 2: **Fix the implementation** to count actual invocations using `invocation_id`:

```python
# Count unique invocation_ids instead of model turns
invocation_ids = []
for c in contents:
    if hasattr(c, 'invocation_id') and c.invocation_id not in invocation_ids:
        invocation_ids.append(c.invocation_id)
num_invocations = len(invocation_ids)
```
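For illustration, here is a self-contained sketch of what an invocation-aware filter could look like, grouping history items by invocation and keeping the last N invocations intact. The per-item `invocation_id` attribute is an assumption for the example (in practice it would need to come from session events, since plain request contents may not carry it), and the helper name is hypothetical:

```python
from dataclasses import dataclass

@dataclass
class FakeEvent:
    """Stand-in for a history item that knows which invocation produced it."""
    invocation_id: str
    role: str
    text: str

def keep_last_invocations(events, num_invocations_to_keep):
    """Keep only the items belonging to the last N distinct invocation_ids."""
    ordered_ids = []  # invocation ids in order of first appearance
    for e in events:
        if e.invocation_id not in ordered_ids:
            ordered_ids.append(e.invocation_id)
    ids_to_keep = set(ordered_ids[-num_invocations_to_keep:])
    return [e for e in events if e.invocation_id in ids_to_keep]

history = [
    FakeEvent("inv-1", "user", "Hi"),
    FakeEvent("inv-1", "model", "Hello!"),
    FakeEvent("inv-2", "user", "What's the weather in Beijing?"),
    FakeEvent("inv-2", "model", "function_call: get_weather"),
    FakeEvent("inv-2", "user", "function_response: sunny, 25°C"),
    FakeEvent("inv-2", "model", "The weather in Beijing is sunny, 25°C"),
]

kept = keep_last_invocations(history, num_invocations_to_keep=1)
print([e.text for e in kept])
# -> all four items of inv-2 are kept, including the function_call/response pair
```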
**Related Issues:**

- #3829 - Smart Context Pruning with Token-Aware Filtering (mentions limitations of the current approach)
- #4027 - BUG - `ContextFilterPlugin` sends orphaned tool call message to LLM causing failed API requests (fixed, but related to this counting issue)
**How often has this issue occurred?:**

- Always (100%) - This is a code logic issue, reproducible every time