Affected Service(s)
controller (RemoteMCPServer reconcile / tool discovery)
Impact/Severity
High: the affected MCP server is unusable, zero tools are discovered, and Accepted=False persists permanently.
Bug Description
When reconciling a RemoteMCPServer with protocol: STREAMABLE_HTTP, the controller opens a GET request to the MCP endpoint (the server-to-client SSE "listening channel") and applies the request timeout/context deadline to it. If the remote server accepts that GET and keeps it open while idle (it returns 200 text/event-stream and sends nothing until there is a server-initiated message, which is spec-compliant behavior), the controller's GET hits the deadline and the whole discovery fails with context deadline exceeded, even though initialize + tools/list over POST succeed in tens of milliseconds.
Servers that don't support that GET channel (return 405/401 immediately) work fine, because the controller moves on and discovers tools via POST.
Steps To Reproduce
- Point a RemoteMCPServer (STREAMABLE_HTTP) at any MCP server that supports the GET SSE listening channel and keeps it open idle (i.e. a GET with Accept: text/event-stream returns 200 text/event-stream and blocks waiting for server-initiated messages):
apiVersion: kagent.dev/v1alpha2
kind: RemoteMCPServer
metadata:
  name: example-mcp
  namespace: kagent
spec:
  protocol: STREAMABLE_HTTP
  url: https://<your-mcp-server>/mcp
  timeout: 5s
  sseReadTimeout: 5m0s
  terminateOnClose: true
  headersFrom:
    - name: Authorization
      value: "Bearer ***REDACTED***"
- Run kubectl get remotemcpserver example-mcp -o yaml: the status shows Accepted=False and no discoveredTools.
Expected Behavior
Tool discovery should complete via POST (initialize + tools/list). The GET listening channel is optional per the Streamable HTTP transport, so it should be opened asynchronously/non-blocking and must not fail discovery if it stays idle. sseReadTimeout (not the request timeout) should govern that long-lived stream.
Actual Behavior
failed to fetch tools for toolServer kagent/example-mcp: failed to connect client for
toolServer kagent/example-mcp: Get "https://<your-mcp-server>/mcp": context deadline exceeded
(With the SSE protocol, the equivalent error is Client.Timeout exceeded while awaiting headers.)
Environment
kagent 0.9.4 (chart + controller cr.kagent.dev/kagent-dev/kagent/controller:0.9.4), EKS / Kubernetes 1.3x.
Evidence (isolated, network ruled out)
Raw GET (with Accept: application/json, text/event-stream) to several MCP endpoints, from a pod on the same node as the controller leader:
| MCP server | GET response | Discovery result |
| --- | --- | --- |
| Datadog MCP (mcp.datadoghq.com) | HTTP 405 in 0.28s | OK, 29 tools |
| GitHub Copilot MCP (api.githubcopilot.com/mcp) | HTTP 401 in 0.22s | OK, 43 tools |
| Affected server (spec-compliant) | HTTP 200 text/event-stream in 0.11s, then idle / read blocks until timeout | FAIL, context deadline exceeded, 0 tools |
Additional checks proving it's not network/auth/protocol/timeout:
- POST initialize to the affected server from the controller leader's node returns HTTP 200 in 0.04s with a valid serverInfo, so connectivity + token are fine.
- Raising spec.timeout from 5s to 60s still fails (the GET never receives data, so a longer deadline just delays the failure).
- Switching protocol to SSE gives the same failure (the server implements Streamable HTTP, not the legacy SSE transport; it never emits the endpoint event the SSE client waits for).
Additional context
The differentiator is purely how each server answers the GET listening channel: with 405/401, discovery proceeds; with 200 text/event-stream held open idle (spec-compliant), discovery blocks until the request deadline expires and then fails. Suggested fix: open the GET listening channel non-blocking and never let it block/fail tool discovery; govern it with sseReadTimeout rather than timeout.