RemoteMCPServer (STREAMABLE_HTTP) tool discovery fails with "context deadline exceeded" when the server keeps the GET SSE listening channel open idle #1955

@chiquiff

Description

🎯 Affected Service(s)

controller (RemoteMCPServer reconcile / tool discovery)

🚦 Impact/Severity

High: the affected MCP server is unusable. Zero tools are discovered and Accepted=False persists permanently.

🐛 Bug Description

When reconciling a RemoteMCPServer with protocol: STREAMABLE_HTTP, the controller opens a GET request to the MCP endpoint (the server→client SSE "listening channel") and applies the request timeout (context deadline) to it. If the remote server accepts that GET and keeps it open idle, the GET hits the deadline and the whole discovery fails with context deadline exceeded, even though initialize + tools/list over POST succeed in tens of milliseconds. Keeping the stream idle is spec-compliant behavior: the server returns 200 text/event-stream and sends nothing until it has a server-initiated message.

Servers that don't support that GET channel (return 405/401 immediately) work fine, because the controller moves on and discovers tools via POST.
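The failure mode can be simulated locally. The sketch below is not kagent's code: `IdleSSEHandler` and `run_demo` are hypothetical names, and the server is a stand-in that answers POST immediately but holds the GET stream open without sending anything. Python's `http.client` applies its timeout per socket read rather than as a whole-request deadline like a Go context, so this only approximates the controller's behavior.

```python
import http.client
import json
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class IdleSSEHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a server that keeps the GET channel idle."""

    def do_GET(self):
        # Accept the listening channel, send headers, then stay silent.
        self.send_response(200)
        self.send_header("Content-Type", "text/event-stream")
        self.end_headers()
        self.server.release.wait(timeout=10)  # hold the stream open, idle

    def do_POST(self):
        # Answer JSON-RPC calls immediately with a canned result.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        body = json.dumps({"jsonrpc": "2.0", "id": 1, "result": {"tools": []}}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo(deadline=0.5):
    """Return (post_status, get_status, get_timed_out) using one shared deadline."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), IdleSSEHandler)
    server.release = threading.Event()  # set at shutdown to free the idle GET
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        # POST completes well inside the deadline.
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=deadline)
        conn.request(
            "POST", "/mcp",
            body=json.dumps({"jsonrpc": "2.0", "id": 1, "method": "tools/list"}),
            headers={"Content-Type": "application/json",
                     "Accept": "application/json, text/event-stream"},
        )
        post_status = conn.getresponse().status
        conn.close()

        # GET returns headers at once, but the first body read never completes.
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=deadline)
        conn.request("GET", "/mcp", headers={"Accept": "text/event-stream"})
        resp = conn.getresponse()
        get_status = resp.status
        try:
            resp.read(1)
            get_timed_out = False
        except (socket.timeout, TimeoutError):
            get_timed_out = True
        conn.close()
        return post_status, get_status, get_timed_out
    finally:
        server.release.set()
        server.shutdown()
        server.server_close()
```

Calling `run_demo()` exercises both requests against the same stand-in server: the POST returns a status straight away, while the GET's headers arrive and the body read then blocks until the timeout.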

🔄 Steps To Reproduce

  1. Point a RemoteMCPServer (STREAMABLE_HTTP) at any MCP server that supports the GET SSE listening channel and keeps it open idle (i.e. a GET with Accept: text/event-stream returns 200 text/event-stream and blocks waiting for server-initiated messages).
apiVersion: kagent.dev/v1alpha2
kind: RemoteMCPServer
metadata:
  name: example-mcp
  namespace: kagent
spec:
  protocol: STREAMABLE_HTTP
  url: https://<your-mcp-server>/mcp
  timeout: 5s
  sseReadTimeout: 5m0s
  terminateOnClose: true
  headersFrom:
    - name: Authorization
      value: "Bearer ***REDACTED***"
  2. kubectl get remotemcpserver example-mcp -o yaml → Accepted=False, no discoveredTools.

🤔 Expected Behavior

Tool discovery should complete via POST (initialize + tools/list). The GET listening channel is optional per the Streamable HTTP transport, so it should be opened asynchronously (non-blocking) and must not fail discovery if it stays idle. sseReadTimeout (not the request timeout) should govern that long-lived stream.

📱 Actual Behavior

failed to fetch tools for toolServer kagent/example-mcp: failed to connect client for
toolServer kagent/example-mcp: Get "https://<your-mcp-server>/mcp": context deadline exceeded

(With the SSE protocol, the equivalent error is Client.Timeout exceeded while awaiting headers.)

💻 Environment

kagent 0.9.4 (chart + controller cr.kagent.dev/kagent-dev/kagent/controller:0.9.4), EKS / Kubernetes 1.3x.

📋 Evidence (isolated, network ruled out)

Raw GET (with Accept: application/json, text/event-stream) to several MCP endpoints, from a pod on the same node as the controller leader:

| MCP server | GET response | Discovery result |
| --- | --- | --- |
| Datadog MCP (mcp.datadoghq.com) | HTTP 405 in 0.28s | OK, 29 tools |
| GitHub Copilot MCP (api.githubcopilot.com/mcp) | HTTP 401 in 0.22s | OK, 43 tools |
| Affected server (spec-compliant) | HTTP 200 text/event-stream in 0.11s, then idle / read blocks until timeout | FAIL: context deadline exceeded, 0 tools |
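The three rows above reduce to one distinction: whether the GET is rejected outright or accepted and left idle. A minimal sketch of that classification follows. The function name, the category labels, and the `OBSERVED_OUTCOME` mapping are illustrative assumptions. Only the two cases actually observed in this report are mapped to a discovery result.

```python
def classify_get_channel(status, content_type, first_byte_arrived):
    """Classify how a server answered the GET listening channel.

    status             -- HTTP status code of the GET response
    content_type       -- value of the Content-Type response header ("" if absent)
    first_byte_arrived -- True if any body data arrived before the deadline
    """
    if status != 200:
        # e.g. 405 or 401: the server declines the listening channel.
        return "rejected"
    if not content_type.lower().startswith("text/event-stream"):
        # A 200 that is not an SSE stream is outside what this report covers.
        return "unexpected"
    return "active-stream" if first_byte_arrived else "idle-stream"


# Discovery results as observed in this report (kagent 0.9.4).
# "active-stream" and "unexpected" were not observed, so they are not listed.
OBSERVED_OUTCOME = {
    "rejected": "OK (tools discovered via POST)",
    "idle-stream": "FAIL (context deadline exceeded, 0 tools)",
}
```

For example, `classify_get_channel(405, "", False)` falls in the "rejected" category, matching the Datadog row.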

Additional checks proving it's not network/auth/protocol/timeout:

  • POST initialize to the affected server from the controller-leader's node → HTTP 200 in 0.04s with a valid serverInfo, so connectivity + token are fine.
  • Raising spec.timeout from 5s → 60s → still fails (the GET never receives data, so a longer deadline just delays the failure).
  • Switching protocol to SSE → same failure (the server implements Streamable HTTP, not the legacy SSE transport; it never emits the endpoint event the SSE client waits for).
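For anyone repeating the POST initialize check from the first bullet, the request can be built roughly as below. This is a sketch under assumptions: `build_initialize_request` is a hypothetical helper, the `clientInfo` values are placeholders, and the protocol version string assumes the 2025-03-26 MCP revision. Adjust these to whatever the target server negotiates.

```python
import json
import urllib.request


def build_initialize_request(url, token, protocol_version="2025-03-26"):
    """Build (but do not send) a JSON-RPC initialize POST for an MCP endpoint."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": protocol_version,  # assumed revision, see lead-in
            "capabilities": {},
            "clientInfo": {"name": "probe", "version": "0.0.1"},  # placeholder values
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients list both media types on POST.
            "Accept": "application/json, text/event-stream",
            "Authorization": f"Bearer {token}",
        },
    )
```

Sending it is one more call, for example `urllib.request.urlopen(req, timeout=5)`. The server may reply with plain JSON or with an SSE-framed body, so the response's Content-Type should be checked before parsing.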

Additional context

The differentiator is purely how each server answers the GET listening channel: 405/401 → discovery proceeds; 200 text/event-stream held open idle (spec-compliant) → discovery deadlocks on the request deadline. Suggested fix: open the GET listening channel non-blocking and never let it block/fail tool discovery; govern it with sseReadTimeout rather than timeout.
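The shape of the suggested fix can be sketched independently of the controller's real code, which is not shown in this report. In the sketch below, `discover_tools`, `post_rpc`, and `open_listener` are hypothetical names: `post_rpc(method)` stands for a POST round trip and `open_listener()` for the blocking GET stream. The listener runs in a background thread, so an idle or failing stream cannot block or fail discovery.

```python
import threading


def discover_tools(post_rpc, open_listener, listener_errors=None):
    """Discover tools via POST while the GET listening channel runs in the background.

    post_rpc(method) -> dict   performs one JSON-RPC call over POST
    open_listener()            blocks for as long as the GET stream stays open
    listener_errors            optional list that collects listener exceptions
    """
    def _listen():
        try:
            open_listener()
        except Exception as exc:  # a listener failure must not fail discovery
            if listener_errors is not None:
                listener_errors.append(exc)

    # Initialize first, then start the optional listening channel.
    post_rpc("initialize")
    listener = threading.Thread(target=_listen, daemon=True)
    listener.start()

    # Discovery depends only on the POST result, never on the listener.
    result = post_rpc("tools/list")
    return result.get("tools", []), listener
```

In a real client the listener would carry its own read timeout (the role the report proposes for sseReadTimeout), while `post_rpc` keeps the shorter request timeout.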
