Description
Is your feature request related to a specific problem?
Yes.
google.adk.models.lite_llm.LiteLlm.generate_content_async() does not forward request-scoped headers set on llm_request.config.http_options.headers.
In ADK, before_model_callback plugins can attach per-request headers to llm_request.config.http_options.headers for tracing, routing, proxying, or request-scoped auth. GoogleLlm preserves request-scoped headers as part of the outbound request path, but LiteLlm currently ignores them.
As a result, any LiteLLM-backed model cannot reliably use request-scoped headers injected by plugins.
Describe the Solution You'd Like
LiteLlm.generate_content_async() should read llm_request.config.http_options.headers and merge those headers into the outgoing LiteLLM request for that call only.
Expected behavior:
- preserve static headers configured on the model
- merge request-scoped headers from http_options.headers
- let request-scoped headers override duplicate static headers on that request
- avoid mutating shared instance state
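The expected behavior above can be sketched as a small pure function. This is only an illustration: `merge_request_headers` is a hypothetical helper name, not an existing ADK or LiteLLM API, and the header values are made up.

```python
from typing import Dict, Mapping, Optional


def merge_request_headers(
    static_headers: Optional[Mapping[str, str]],
    request_headers: Optional[Mapping[str, str]],
) -> Dict[str, str]:
    """Hypothetical helper: return a NEW dict so neither input is mutated."""
    merged = dict(static_headers or {})   # start from the model's static headers
    merged.update(request_headers or {})  # request-scoped values win on duplicates
    return merged


# Illustrative values only.
static = {"Authorization": "Bearer static", "x-team": "platform"}
per_request = {"x-team": "search", "x-trace-id": "abc123"}

merged = merge_request_headers(static, per_request)
```

Note that this sketch compares header names case-sensitively, so `X-Team` and `x-team` would be treated as distinct keys.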
Impact on your work
We use ADK before_model_callback plugins to attach per-request tracing metadata to model calls. This works for models that preserve request headers, but not for LiteLlm.
That forces us to maintain a local LiteLlm subclass that overrides generate_content_async() only to forward http_options.headers. We would like to remove that forked behavior and rely on upstream ADK directly.
This impacts production tracing and request correlation for LiteLLM-backed agents.
In our flow:
1. A before_model_callback plugin sets headers on llm_request.config.http_options.headers.
2. For Portkey, those headers include things like:
   - x-portkey-trace-id
   - x-portkey-metadata
3. LiteLlm should forward those headers on the outbound request.
4. Portkey then reads them and groups the request under the provided trace id.
So the missing upstream behavior is not Portkey-specific, but Portkey is a concrete case that benefits immediately:
- use A2A/ADK session.id as x-portkey-trace-id
- attach agent/session metadata via x-portkey-metadata
- get grouped traces in Portkey observability for all LiteLLM-backed calls
Without this fix:
- the plugin can write the headers, but LiteLlm drops them, so Portkey never sees the trace id
With the fix:
- Portkey tracing works the same way it already does in paths that correctly forward request headers, like our PortkeyLlm wrapper
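The callback side of this flow might look roughly like the sketch below. It uses `SimpleNamespace` stand-ins instead of real ADK request objects so it runs on its own. `attach_trace_headers` is a hypothetical helper, and encoding the metadata as a JSON string is an assumption made for this example.

```python
import json
from types import SimpleNamespace


def attach_trace_headers(llm_request, session_id, metadata):
    """Hypothetical helper that a before_model_callback plugin could call.

    Assumes llm_request.config.http_options already exists; real code would
    need to create it when it is missing.
    """
    http_options = llm_request.config.http_options
    headers = dict(http_options.headers or {})  # copy, keep existing headers
    headers["x-portkey-trace-id"] = session_id
    # Assumption: metadata is sent as a JSON-encoded string.
    headers["x-portkey-metadata"] = json.dumps(metadata, sort_keys=True)
    http_options.headers = headers


# Stand-in for an ADK LlmRequest (not the real class).
request = SimpleNamespace(
    config=SimpleNamespace(http_options=SimpleNamespace(headers=None))
)
attach_trace_headers(request, "session-42", {"agent": "support_bot"})
```

With header forwarding in place, these are the values a LiteLLM-backed call would be expected to carry on the outbound request.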
Willingness to contribute
Yes.
Describe Alternatives You've Considered
We considered:
- mutating self._additional_args["extra_headers"] inside a LiteLlm subclass before calling super()
- using a lock around that mutation
- using different callback hooks
- relying on OpenTelemetry alone
These were not good long-term solutions:
- mutating shared model state is concurrency-sensitive
- locks serialize requests and are still a workaround
- other callbacks do not solve the transport-layer gap
- OpenTelemetry still depends on the client forwarding headers
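To illustrate why the shared-state mutation is concurrency-sensitive, here is a minimal asyncio sketch. `SharedStateModel` is a stand-in written for this example, not the real LiteLlm class, and `asyncio.sleep(0)` stands in for any point where the real call would yield to the event loop.

```python
import asyncio


class SharedStateModel:
    """Stand-in for a model wrapper that keeps headers on the instance."""

    def __init__(self):
        self._additional_args = {"extra_headers": {}}

    async def _send(self):
        # Yield to the event loop, as a real network call would.
        await asyncio.sleep(0)
        # Headers are read from shared state only after the yield.
        return dict(self._additional_args["extra_headers"])

    async def call_mutating(self, request_headers):
        # The rejected workaround: write request headers into shared state.
        self._additional_args["extra_headers"] = dict(request_headers)
        return await self._send()


async def main():
    model = SharedStateModel()
    return await asyncio.gather(
        model.call_mutating({"x-trace-id": "A"}),
        model.call_mutating({"x-trace-id": "B"}),
    )


seen = asyncio.run(main())
```

In this sketch the first call ends up sending the second call's trace id, because the second call overwrites the shared dict before the first one reads it.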
Our current workaround is a local LiteLlm subclass that builds request-local completion args and forwards http_options.headers into extra_headers.
Proposed API / Implementation
In src/google/adk/models/lite_llm.py, when building completion_args inside generate_content_async(), merge request-scoped headers into extra_headers:
```python
# Read request-scoped headers, tolerating a missing config or http_options.
request_headers = (
    llm_request.config.http_options.headers
    if llm_request.config
    and llm_request.config.http_options
    and llm_request.config.http_options.headers
    else {}
)
# Build a new dict for this call; request-scoped headers override static ones.
if completion_args.get("extra_headers") or request_headers:
    completion_args["extra_headers"] = {
        **(completion_args.get("extra_headers") or {}),
        **request_headers,
    }
```
This keeps the behavior request-local and avoids mutating shared instance state.
Suggested behavior:
- static extra_headers remain intact
- request-scoped headers override duplicates
- no behavior change when http_options.headers is absent
Tests that would help:
- forwards http_options.headers into the outgoing LiteLLM call
- merges static and request headers correctly
- request headers override duplicate static keys
- concurrent calls on the same model do not leak headers across requests
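A rough shape for such tests, using stand-ins only, is sketched below. `FakeCompletionClient`, `ModelUnderTest`, and `make_request` are hypothetical names invented for this example. The real tests would exercise LiteLlm with a mocked LiteLLM client instead.

```python
import asyncio
from types import SimpleNamespace


class FakeCompletionClient:
    """Records the kwargs of each call; stands in for the LiteLLM client."""

    def __init__(self):
        self.calls = []

    async def acompletion(self, **kwargs):
        await asyncio.sleep(0)  # let concurrent calls interleave
        self.calls.append(kwargs)


class ModelUnderTest:
    """Hypothetical model that applies the proposed request-local merge."""

    def __init__(self, client, extra_headers=None):
        self._client = client
        self._additional_args = {"extra_headers": dict(extra_headers or {})}

    async def generate(self, llm_request):
        completion_args = dict(self._additional_args)  # per-call copy
        config = llm_request.config
        request_headers = (
            config.http_options.headers
            if config and config.http_options and config.http_options.headers
            else {}
        )
        if completion_args.get("extra_headers") or request_headers:
            completion_args["extra_headers"] = {
                **(completion_args.get("extra_headers") or {}),
                **request_headers,
            }
        await self._client.acompletion(**completion_args)


def make_request(headers):
    # Stand-in for an ADK LlmRequest carrying http_options.headers.
    return SimpleNamespace(
        config=SimpleNamespace(http_options=SimpleNamespace(headers=headers))
    )


async def run_checks():
    client = FakeCompletionClient()
    model = ModelUnderTest(client, {"x-static": "1", "x-env": "static"})
    await asyncio.gather(
        model.generate(make_request({"x-env": "req-a", "x-trace-id": "A"})),
        model.generate(make_request({"x-trace-id": "B"})),
        model.generate(make_request(None)),
    )
    sent = sorted(
        (call["extra_headers"] for call in client.calls),
        key=lambda h: h.get("x-trace-id", ""),
    )
    return model, sent


model, sent = asyncio.run(run_checks())
```

Sorting the recorded calls by trace id keeps the assertions independent of completion order.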
Additional Context
There is already a similar conceptual precedent in GoogleLlm, which treats request-scoped headers as part of the outbound request path.