fix: MCP server hanging, stale context, and pagination issues #49

Open
Donach wants to merge 1 commit into cfx/agor from fix/mcp-server-reliability-17280555621895635601
@Donach Donach commented May 18, 2026

What

Resolves several critical reliability issues with the Agor MCP server implementation, specifically around request hanging, stale session contexts, and empty data lists returned to external agents.

Why

  • Hanging requests: The existing POST /mcp stateless fallback block created an uninitialized McpServer and directly forwarded requests. The official SDK drops tools/call messages if the server isn't initialized, leaving the HTTP response hanging indefinitely.
  • Stale contexts: Agents reusing a stateful transport connection for different sessions could change their target x-agor-session-id, but the SDK caches the original McpContext at initialization, so later requests kept running against the first session's context.
  • Empty results: Tool limits were parsed using args.limit ?? 50. If an agent requested unlimited items by sending limit: 0, the nullish check kept the 0 (only null and undefined trigger the fallback) and passed limit: 0 directly to Drizzle, which returned 0 rows.
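
The difference between the two operators can be reproduced in isolation. The following is a minimal sketch, not the code from this PR: `ListArgs`, `DEFAULT_LIMIT`, and both function names are hypothetical and only illustrate how `??` and `||` treat an explicit 0.

```typescript
// Hypothetical argument shape; the real tool schemas are not shown in this PR.
interface ListArgs {
  limit?: number;
}

// The default of 50 is taken from the PR description.
const DEFAULT_LIMIT = 50;

// Old behaviour: `??` falls back only on null/undefined, so 0 passes through.
function parseLimitNullish(args: ListArgs): number {
  return args.limit ?? DEFAULT_LIMIT;
}

// New behaviour: `||` falls back on any falsy value, including 0.
function parseLimitFalsy(args: ListArgs): number {
  return args.limit || DEFAULT_LIMIT;
}

console.log(parseLimitNullish({ limit: 0 })); // 0
console.log(parseLimitFalsy({ limit: 0 })); // 50
console.log(parseLimitFalsy({})); // 50
console.log(parseLimitFalsy({ limit: 10 })); // 10
```

With `||`, a request for limit: 0 receives the default page size rather than an unbounded list, which matches the intent stated in this PR.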

Fix

  • Replaced the stateless fallback with a direct HTTP 400 (-32000) JSON-RPC error.
  • Wrapped transport.handleRequest to intercept and dynamically update McpContext.sessionId on every request.
  • Extracted mcpSessionId from both headers and query strings (req.query.sessionId).
  • Replaced ?? with || for limit parsing so that an explicit 0 falls back to the default limit.
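
The first and third items can be sketched as two small pure functions. This is an illustration under stated assumptions, not the code in server.ts: `RequestLike`, `extractMcpSessionId`, `statelessFallbackError`, the `mcp-session-id` header name, and the error message text are all hypothetical. Only the 400 status, the -32000 code, and the `req.query.sessionId` fallback come from this PR.

```typescript
// Hypothetical minimal request shape, loosely modelled on an Express request.
interface RequestLike {
  headers: Record<string, string | string[] | undefined>;
  query: Record<string, unknown>;
}

// Prefer the header, then fall back to ?sessionId=... in the query string.
// The header name "mcp-session-id" is an assumption for this sketch.
function extractMcpSessionId(req: RequestLike): string | undefined {
  const header = req.headers["mcp-session-id"];
  const fromHeader = Array.isArray(header) ? header[0] : header;
  if (fromHeader) {
    return fromHeader;
  }
  const fromQuery = req.query.sessionId;
  return typeof fromQuery === "string" && fromQuery.length > 0
    ? fromQuery
    : undefined;
}

interface JsonRpcErrorBody {
  jsonrpc: "2.0";
  error: { code: number; message: string };
  id: null;
}

// Builds the explicit error that replaces the stateless fallback.
// The message wording is illustrative only.
function statelessFallbackError(): { status: number; body: JsonRpcErrorBody } {
  return {
    status: 400,
    body: {
      jsonrpc: "2.0",
      error: { code: -32000, message: "Bad Request: no valid session ID provided" },
      id: null,
    },
  };
}
```

A handler might call `extractMcpSessionId(req)` first and, when no known transport matches the result, send `statelessFallbackError()` instead of forwarding the request to an uninitialized server.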

Impact

Agents now receive an explicit error instead of waiting on a request that never completes. Reused transport connections pick up the current session ID on every request, so rapid context switching is safe. List tools treat a limit of 0 as "use the default" and return actual results.

Verification

  • Deployed locally and tested POST /mcp errors.
  • Confirmed via the unit test suites that undefined limits and fallback limits behave as expected.
  • Removed debugging and temporary testing scripts prior to commit.

PR created automatically by Jules for task 17280555621895635601 started by @Donach

* Remove stateless fallback mode in server.ts to prevent tools/call messages from hanging on uninitialized servers. Replaced with explicit HTTP 400 error.
* Update session ID extraction to check req.query.sessionId as a fallback for the transport connection.
* Fix stale session context on stateful transport reuse by injecting the latest sessionId into the cached McpContext before transport.handleRequest().
* Replace nullish coalescing `??` with `||` logic in MCP tools list parameter extraction to correctly handle agents passing a 0 limit, preventing empty list results.
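
The stale-context fix in the third item can be sketched as a thin wrapper around the transport. This is a sketch under assumptions, not the implementation in this commit: `TransportLike`, `withFreshSessionId`, and the single-field `McpContext` shape are hypothetical stand-ins, and the real SDK transport has a richer interface.

```typescript
// Hypothetical, simplified context; the real McpContext likely has more fields.
interface McpContext {
  sessionId?: string;
}

// Hypothetical stand-in for the SDK transport; only the method we wrap.
interface TransportLike<Req, Res> {
  handleRequest(req: Req, res: Res, body?: unknown): Promise<void>;
}

// Returns a transport whose handleRequest refreshes the cached context's
// sessionId from the incoming request before delegating to the original.
function withFreshSessionId<Req, Res>(
  transport: TransportLike<Req, Res>,
  context: McpContext,
  readSessionId: (req: Req) => string | undefined,
): TransportLike<Req, Res> {
  return {
    handleRequest(req, res, body) {
      const latest = readSessionId(req);
      // Keep the previous value when the request carries no session ID.
      if (latest !== undefined) {
        context.sessionId = latest;
      }
      return transport.handleRequest(req, res, body);
    },
  };
}
```

Here `readSessionId` would read the x-agor-session-id value from the request. Because the context object is mutated in place, tool handlers that captured it at initialization see the updated session without re-initializing the server.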

Co-authored-by: Donach <39565367+Donach@users.noreply.github.com>