
Compaction breaks LongRunningFunctionTool resume #5602

@AntonDirani

🔴 Required Information

EventsCompactionConfig corrupts sessions that use LongRunningFunctionTool. The intermediate "pending" function_response that ADK emits when a long-running function returns shares the same id as the original function_call, so the compactor's pending-call guard treats the call as already resolved and folds the function_call event into a text summary. The eventual client-side resume FunctionResponse then fails contents validation with: No function call event found for function responses ids: {...}.

Steps to Reproduce:

  1. Build an App with both EventsCompactionConfig and ResumabilityConfig(is_resumable=True):
    App(
        name="repro",
        root_agent=agent,
        events_compaction_config=EventsCompactionConfig(
            compaction_interval=3,
            overlap_size=1,
            summarizer=LlmEventSummarizer(llm=Gemini(model="gemini-2.5-flash-lite")),
        ),
        resumability_config=ResumabilityConfig(is_resumable=True),
    )
  2. Register a LongRunningFunctionTool whose wrapped function returns {"status": "pending", ...} (the standard LRF intermediate-response pattern).
  3. Chat with the agent until the turn that triggers the LRF is the Nth invocation, where N ≥ compaction_interval. Compaction runs at the end of that turn, immediately after the LRF emits its intermediate pending response and the invocation pauses. It folds the still-open function_call event into a summary before the client has a chance to resume.
  4. Send the resume:
    Content(role="user", parts=[Part(function_response=FunctionResponse(
        id=function_call_id, name=function_name, response={"status": "confirmed", ...}
    ))])
  5. ADK raises during contents assembly: No function call event found for function responses ids: {'adk-<uuid>'}.

Expected Behavior:

The function_call event of a LongRunningFunctionTool should not be eligible for compaction until a final FunctionResponse (the client-supplied resume payload) has been appended to the session. Sending the resume should succeed and the agent should continue normally.

Observed Behavior:

The compactor folds the LRF function_call + intermediate function_response pair into a summary text event. When the resume FunctionResponse arrives, the contents request processor cannot find a matching raw function_call event in session history and raises:

No function call event found for function responses ids: {'adk-e5f8e450-3db7-4fef-8afa-d315774a593d'}

Environment Details:

  • ADK Library Version: google-adk 1.31.0
  • Desktop OS: Windows 11
  • Python Version: 3.11

Model Information:

  • Are you using LiteLLM: No
  • Which model is being used: gemini-2.5-pro for the agent, gemini-2.5-flash-lite for the compaction summarizer

🟡 Optional Information

Regression:

N/A — observed on 1.31.0.

Logs:

ValueError: No function call event found for function responses ids: {'adk-…'}

Additional Context:

Root cause is in google/adk/apps/compaction.py:

_pending_function_call_ids (compaction.py:280-294) computes pending calls as all_call_ids - all_response_ids. For a LongRunningFunctionTool, ADK emits an intermediate function_response event sharing the call's id as soon as the wrapped function returns its pending dict. That id ends up in all_response_ids, so the call is excluded from the pending set. _truncate_events_before_pending_function_call (compaction.py:303-310) consequently allows the call event to be compacted.
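The set arithmetic described above can be reproduced without ADK. The following is a minimal sketch, assuming a stand-in `FakeEvent` class (hypothetical, not an ADK type) that carries only the call and response ids; it mirrors the described `all_call_ids - all_response_ids` logic rather than copying the library source.

```python
from dataclasses import dataclass, field


@dataclass
class FakeEvent:
    """Hypothetical stand-in for an ADK Event; holds only the ids that matter here."""
    call_ids: set[str] = field(default_factory=set)
    response_ids: set[str] = field(default_factory=set)


def pending_call_ids(events: list[FakeEvent]) -> set[str]:
    """Pending calls computed as 'all call ids minus all response ids'."""
    all_call_ids: set[str] = set()
    all_response_ids: set[str] = set()
    for event in events:
        all_call_ids |= event.call_ids
        all_response_ids |= event.response_ids
    return all_call_ids - all_response_ids


# The model calls the long-running tool, then the intermediate
# {"status": "pending"} response is appended under the same id.
history = [
    FakeEvent(call_ids={"adk-1"}),
    FakeEvent(response_ids={"adk-1"}),
]

pending = pending_call_ids(history)
print(pending)  # set(): the call looks resolved although no resume has arrived
```

With only the call event in history the function returns `{"adk-1"}`; the intermediate response alone is enough to empty the pending set.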

The signal needed for the fix is already on the call event: Event.long_running_tool_ids: set[str]. The compactor just isn't consulting it.

Suggested fix — treat any call id present in long_running_tool_ids as pending until the framework can distinguish the intermediate response from the final one:

def _pending_function_call_ids(events: list[Event]) -> set[str]:
    all_call_ids: set[str] = set()
    all_response_ids: set[str] = set()
    long_running_ids: set[str] = set()
    for event in events:
        all_call_ids.update(_event_function_call_ids(event))
        all_response_ids.update(_event_function_response_ids(event))
        if event.long_running_tool_ids:
            long_running_ids.update(event.long_running_tool_ids)
    return (all_call_ids - all_response_ids) | long_running_ids

Caveat: with this change, once an LRF call exists in a session, every compaction window that contains it will keep treating it as pending, even after the resume is delivered, because long_running_tool_ids is permanently set on the call event. The cleaner long-term fix is to mark the intermediate function_response event with a flag (e.g. actions.is_intermediate_long_running_response) so _pending_function_call_ids can distinguish intermediate from final responses and unblock compaction once the resume is delivered. Happy to open a PR if the framing is right.
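A rough sketch of how the flag-based variant could behave, again using a stand-in event type rather than ADK's. The `is_intermediate_long_running_response` field is the hypothetical flag proposed above and does not exist in ADK today; `SketchEvent` and `pending_call_ids_with_flag` are illustrative names only.

```python
from dataclasses import dataclass, field


@dataclass
class SketchEvent:
    """Hypothetical stand-in for an ADK Event."""
    call_ids: set[str] = field(default_factory=set)
    response_ids: set[str] = field(default_factory=set)
    # Proposed flag (assumption): True only on the LRF's intermediate response.
    is_intermediate_long_running_response: bool = False


def pending_call_ids_with_flag(events: list[SketchEvent]) -> set[str]:
    """Only final (non-intermediate) responses resolve a call."""
    all_call_ids: set[str] = set()
    final_response_ids: set[str] = set()
    for event in events:
        all_call_ids |= event.call_ids
        if not event.is_intermediate_long_running_response:
            final_response_ids |= event.response_ids
    return all_call_ids - final_response_ids


paused = [
    SketchEvent(call_ids={"adk-1"}),
    SketchEvent(response_ids={"adk-1"},
                is_intermediate_long_running_response=True),
]
# The client's resume payload arrives as an ordinary (final) response.
resumed = paused + [SketchEvent(response_ids={"adk-1"})]

pending_while_paused = pending_call_ids_with_flag(paused)
pending_after_resume = pending_call_ids_with_flag(resumed)
```

Under these assumptions the call stays pending while the tool is paused and becomes eligible for compaction as soon as the resume is appended.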

Workaround: Subclass LlmEventSummarizer and override maybe_summarize_events to return None whenever the candidate window contains an event whose long_running_tool_ids are not all matched by a non-pending function_response in the same window. ADK then skips appending a compaction event for that round, and the next round retries on a wider window. This relies on the convention that LRF intermediates use {"status": "pending", ...} and final/resume payloads do not.
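The check at the heart of the workaround can be sketched in isolation. This is only an illustration with stand-in types (`FakeResponse`, `FakeEvent`, and `window_has_unresolved_lrf` are hypothetical names); in the real subclass the same information would be read from ADK's Event objects, and the overridden maybe_summarize_events would return None when the check is true and defer to the parent implementation otherwise.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class FakeResponse:
    """Hypothetical stand-in for a function_response part."""
    id: str
    response: dict[str, Any]


@dataclass
class FakeEvent:
    """Hypothetical stand-in for an ADK Event."""
    long_running_tool_ids: set[str] = field(default_factory=set)
    function_responses: list[FakeResponse] = field(default_factory=list)


def window_has_unresolved_lrf(events: list[FakeEvent]) -> bool:
    """True if some long-running call in the window lacks a non-pending response."""
    long_running_ids: set[str] = set()
    resolved_ids: set[str] = set()
    for event in events:
        long_running_ids |= event.long_running_tool_ids
        for fr in event.function_responses:
            # Convention assumed by the workaround: intermediates carry
            # {"status": "pending", ...}; final/resume payloads do not.
            if fr.response.get("status") != "pending":
                resolved_ids.add(fr.id)
    return bool(long_running_ids - resolved_ids)


paused_window = [
    FakeEvent(long_running_tool_ids={"adk-1"}),
    FakeEvent(function_responses=[FakeResponse("adk-1", {"status": "pending"})]),
]
resumed_window = paused_window + [
    FakeEvent(function_responses=[FakeResponse("adk-1", {"status": "confirmed"})]),
]

skip_paused = window_has_unresolved_lrf(paused_window)
skip_resumed = window_has_unresolved_lrf(resumed_window)
```

The paused window is skipped and the resumed one is not, so compaction is only deferred while the LRF is actually unresolved.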

How often has this issue occurred?:

  • Always (100%) — deterministic given a compaction interval that fires while an LRF is unresolved.

Metadata

Labels

  • core: [Component] This issue is related to the core interface and implementation
  • request clarification: [Status] The maintainer need clarification or more information from the author
