Conversation
Add response write timeout middleware that destroys sockets when HTTP response writes stall due to dead TCP peers. When a client disconnects mid-flight (laptop sleep, network change), the TCP connection stays half-open and res.write()/res.end() blocks for ~125s until TCP gives up. The middleware intercepts writeHead() and starts a timer (default 10s). If the response doesn't finish within the timeout, the socket is destroyed. SSE streams are excluded (handled by existing heartbeat). - Add GITLAB_RESPONSE_WRITE_TIMEOUT_MS config (default 10s) - Add responseWriteTimeoutMiddleware to Express pipeline - Add 'write_timeout' to ConnectionCloseReason for logging - Set res.locals.writeTimedOut flag for close handler reason detection Closes #391
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdds a configurable response-write timeout (env Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client (MCP)
participant Server as Express Server
participant Logger as Logger
Client->>Server: send HTTP request
Server->>Server: handler runs\nheaders written (res.writeHead)
Server->>Server: start write timeout timer (N ms)
alt response finishes before timeout
Server->>Server: res.finish / res.close\nclear timer
else timeout fires
Server->>Logger: logWarn({ reason: "write_timeout", timeoutMs, method, path, session })
Server->>Server: set res.locals.writeTimedOut = true
Server->>Client: destroy socket (res.destroy)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Test Coverage ReportOverall Coverage: 96.49%
|
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Pull request overview
Adds an Express middleware to detect stalled HTTP response writes (zombie TCP peers) and proactively destroy the connection after a configurable timeout, improving server resilience and observability.
Changes:
- Introduce
responseWriteTimeoutMiddleware()withGITLAB_RESPONSE_WRITE_TIMEOUT_MSconfiguration (default 10s, 0 disables). - Wire middleware into server startup and enrich disconnect reason logging with
write_timeout. - Add unit tests and document the new environment variable.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
src/middleware/response-write-timeout.ts |
New middleware that starts a timer on writeHead and destroys stalled responses. |
src/middleware/index.ts |
Exports the new middleware from the middleware barrel. |
src/server.ts |
Registers the middleware early and logs write_timeout as a disconnect reason. |
src/logging/types.ts |
Extends ConnectionCloseReason union with write_timeout. |
src/config.ts |
Adds RESPONSE_WRITE_TIMEOUT_MS config parsed from GITLAB_RESPONSE_WRITE_TIMEOUT_MS. |
tests/unit/middleware/response-write-timeout.test.ts |
New unit tests validating timeout behavior, cleanup, SSE skip, and disable flag. |
tests/unit/server.test.ts |
Updates config mocks and middleware selection logic to account for the new middleware. |
README.md.in |
Documents GITLAB_RESPONSE_WRITE_TIMEOUT_MS in the config table. |
README.md |
Documents GITLAB_RESPONSE_WRITE_TIMEOUT_MS in the config table. |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/config.ts`:
- Around line 449-452: RESPONSE_WRITE_TIMEOUT_MS currently uses parseTimerMs
which rejects zero, preventing setting GITLAB_RESPONSE_WRITE_TIMEOUT_MS=0 to
disable the timeout; update parseTimerMs to accept an allowZero boolean (default
false) that forwards to parseStrictInt, then call
parseTimerMs(process.env.GITLAB_RESPONSE_WRITE_TIMEOUT_MS, 10000, true) when
setting RESPONSE_WRITE_TIMEOUT_MS so zero is preserved; reference parseTimerMs,
parseStrictInt, and RESPONSE_WRITE_TIMEOUT_MS (and confirm
response-write-timeout.ts checks for <= 0 remain effective).
In `@src/middleware/response-write-timeout.ts`:
- Around line 51-54: Replace the unsafe any[] rest parameter for the
monkey-patched writeHead with a proper typed tuple: declare a local alias type
using Parameters<Response['writeHead']> (e.g., WriteHeadArgs) and change the
patched signature in the writeHead assignment from ...args: any[] to ...args:
WriteHeadArgs while keeping the return type Response and preserving the existing
logic inside the function.
- Around line 55-83: The middleware's patched writeHead handler uses a loose
any[] signature and checks res.getHeader('content-type') before calling
originalWriteHead, so Content-Type passed as an argument can be missed; change
the handler signature to match Node's writeHead types instead of ...args: any[]
(use the proper IncomingHttpHeaders/number/string overload types), call
originalWriteHead(...args) first (or compute the effective headers by merging
res.getHeader with headers passed in args) and only then determine isSSE by
inspecting the final Content-Type before starting writeTimer, and keep the
existing logic that sets res.locals.writeTimedOut and calls res.destroy() on
timeout (referencing originalWriteHead, writeTimer, res.getHeader, args,
RESPONSE_WRITE_TIMEOUT_MS, res.locals.writeTimedOut).
In `@tests/unit/middleware/response-write-timeout.test.ts`:
- Around line 7-10: Update the header comment in the response-write-timeout test
to reflect the actual default timeout: change the text fragment "default 30s" to
"default 10s" (matching the PR behavior of 10000ms) so the comment accurately
documents the middleware's default timeout.
- Around line 201-217: The test for responseWriteTimeoutMiddleware should also
assert that only one timer is registered when writeHead is called multiple
times: spy on global.setTimeout (e.g., jest.spyOn(global, 'setTimeout')) before
invoking middleware(req, res, next), call res.writeHead twice, then expect
setTimeout toHaveBeenCalledTimes(1) in addition to the existing
expect(res.destroy).toHaveBeenCalledTimes(1); restore the spy after the test to
avoid cross-test pollution. This references responseWriteTimeoutMiddleware and
res.writeHead to locate the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1a9414c7-1cd0-4392-b0cb-f9de424473d9
📒 Files selected for processing (9)
README.mdREADME.md.insrc/config.tssrc/logging/types.tssrc/middleware/index.tssrc/middleware/response-write-timeout.tssrc/server.tstests/unit/middleware/response-write-timeout.test.tstests/unit/server.test.ts
- Call originalWriteHead() before reading Content-Type so headers
passed via writeHead(200, {headers}) are visible to SSE detection
- Replace any[] with Parameters<typeof res.writeHead> for type safety
- Add allowZero to parseTimerMs so GITLAB_RESPONSE_WRITE_TIMEOUT_MS=0
actually disables the timeout
- Correct stale default value in test docblock (30s -> 10s)
- Add setTimeout spy assertion in double-writeHead test
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/middleware/response-write-timeout.ts (1)
61-63:⚠️ Potential issue | 🟠 MajorNormalize
Content-Typebefore SSE detection.Line 62 compares media type case-sensitively. A valid header like
Text/Event-Streamis then treated as non-SSE, so the middleware can incorrectly destroy long-lived SSE responses.Suggested fix
- const contentType = res.getHeader('content-type'); - const isSSE = typeof contentType === 'string' && contentType.includes('text/event-stream'); + const contentType = res.getHeader('content-type'); + const normalizedContentType = + typeof contentType === 'string' + ? contentType.toLowerCase() + : Array.isArray(contentType) + ? contentType.join(',').toLowerCase() + : ''; + const isSSE = normalizedContentType.includes('text/event-stream');🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/middleware/response-write-timeout.ts` around lines 61 - 63, The SSE detection is case-sensitive; update the content-type check so it normalizes the header before testing: get the header into contentType (as currently) but compute isSSE by coercing/normalizing the value to a lowercase string (e.g. if Array.isArray(contentType) join it, or use String(contentType).toLowerCase()) and then check .includes('text/event-stream') so headers like "Text/Event-Stream" are recognized; change the isSSE expression accordingly (referencing the contentType and isSSE symbols and the 'text/event-stream' literal).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/unit/middleware/response-write-timeout.test.ts`:
- Around line 140-145: Add a regression test to response-write-timeout.test.ts
that exercises SSE delivered via res.writeHead(status, headers) so the
middleware sees the 'text/event-stream' header when headers are passed in the
writeHead call (not just via getHeader()); specifically, create a test that uses
responseWriteTimeoutMiddleware() with a mock res whose writeHead is invoked with
a headers object containing 'content-type': 'text/event-stream', and assert the
SSE path is skipped (mirroring the existing GET SSE test); update or add a
second test covering the same scenario for the lines noted (149-155) to ensure
the fix for originalWriteHead/header visibility is covered.
---
Duplicate comments:
In `@src/middleware/response-write-timeout.ts`:
- Around line 61-63: The SSE detection is case-sensitive; update the
content-type check so it normalizes the header before testing: get the header
into contentType (as currently) but compute isSSE by coercing/normalizing the
value to a lowercase string (e.g. if Array.isArray(contentType) join it, or use
String(contentType).toLowerCase()) and then check .includes('text/event-stream')
so headers like "Text/Event-Stream" are recognized; change the isSSE expression
accordingly (referencing the contentType and isSSE symbols and the
'text/event-stream' literal).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 09a96849-9844-45a5-96d0-317c21891978
📒 Files selected for processing (3)
src/config.tssrc/middleware/response-write-timeout.tstests/unit/middleware/response-write-timeout.test.ts
- Use toLowerCase() before checking for text/event-stream so
mixed-case headers like Text/Event-Stream are recognized
- Handle array Content-Type values (Node.js allows string[])
- Add regression test for SSE via writeHead(200, {headers})
- Add test for mixed-case Content-Type header
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/middleware/response-write-timeout.ts`:
- Around line 78-79: The code directly casts req.headers['mcp-session-id'] to
string | undefined and passes it to logWarn; instead create a Zod schema (e.g.,
z.string().optional()) to validate and normalize the header (handle string[] by
selecting the first element or rejecting), use schema.safeParse or parse to
obtain a validated sessionId, and then pass that validated value into the
logWarn call in response-write-timeout.ts (update the variable referenced as
sessionId and the logWarn invocation accordingly).
In `@tests/unit/middleware/response-write-timeout.test.ts`:
- Around line 188-202: Add a new Jest test to cover the array-valued
Content-Type branch: call responseWriteTimeoutMiddleware() with a GET request
via createMockReq and a response from createMockRes whose
getHeader('content-type') is mocked to return ['text/event-stream']; invoke
middleware(req, res, next), call res.writeHead(200), advance timers
(jest.advanceTimersByTime) past the timeout, and assert res.destroy was not
called to ensure SSE is correctly detected when Content-Type is an array; place
this alongside the existing SSE tests that use mixed-case/string headers and
mirror their structure (reuse middleware, req, res, next names).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 09594f6f-abb3-43a7-a250-2cba09fb90f7
📒 Files selected for processing (2)
src/middleware/response-write-timeout.tstests/unit/middleware/response-write-timeout.test.ts
- Replace unsafe 'as string' cast on mcp-session-id header with runtime typeof check handling string | string[] | undefined - Add test for array-valued Content-Type SSE detection
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/middleware/response-write-timeout.ts`:
- Around line 63-68: The nested ternary that computes normalizedContentType from
contentType is hard to read; extract that logic into a small helper function
(e.g., normalizeContentType or getNormalizedContentType) that accepts
contentType and returns the lowercase string (handle string, Array, or other ->
''). Replace the ternary expression that assigns normalizedContentType with a
call to the new helper and add a short unit test or inline comment to document
behavior for string, array, and non-string inputs.
In `@tests/unit/middleware/response-write-timeout.test.ts`:
- Line 265: The test currently spies on the Node-only global using
jest.spyOn(global, 'setTimeout') which SonarCloud flags; replace the reference
to use the standard ES2020 globalThis (i.e., change the spy target from global
to globalThis) so setTimeoutSpy is created via jest.spyOn(globalThis,
'setTimeout') for cross-environment compatibility while keeping the existing spy
name and assertions intact.
- Line 12: The import of the built-in EventEmitter should use the Node.js
built-in module protocol; locate the import that brings in EventEmitter and
change its module specifier from 'events' to use the node:events protocol so the
test imports the built-in module via node:events rather than the bare 'events'
specifier.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5a85342a-cd04-407c-86f9-3046059d147f
📒 Files selected for processing (2)
src/middleware/response-write-timeout.tstests/unit/middleware/response-write-timeout.test.ts
- Extract normalizeContentType() from nested ternary in middleware - Extract setupMiddleware() helper in tests to eliminate repeated middleware/req/res/next setup across 11 test cases - Consolidate 4 SSE detection tests into it.each + 1 writeHead test - Use node: protocol for built-in module import - Use globalThis instead of global for setTimeout spy
Move responseWriteTimeoutMiddleware() before express.json() so that parse/size error responses (4xx) are also protected against stalled writes to dead TCP peers.
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@README.md.in`:
- Line 102: The documentation for GITLAB_RESPONSE_WRITE_TIMEOUT_MS should
explicitly state that setting it to 0 disables the timeout; update the README
entry for GITLAB_RESPONSE_WRITE_TIMEOUT_MS to mention "set to 0 to disable
timeout" (or equivalent), so operators know 0 turns off enforcement. Reference
the environment variable name GITLAB_RESPONSE_WRITE_TIMEOUT_MS in the wording.
In `@src/server.ts`:
- Around line 621-627: Extract the duplicated nested ternary that resolves
disconnect reasons into a single shared helper (e.g., getDisconnectReason(res))
and replace both occurrences with calls to that helper; implement
getDisconnectReason to accept the response/connection object and return one of
the strings 'write_timeout', 'heartbeat_failed', 'destroyed', or
'client_disconnect' by checking res.locals?.writeTimedOut,
res.locals?.heartbeatFailed, and res.destroyed in that priority order, and
update both call sites to use the helper to improve readability and
maintainability.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 787b073f-3e8b-4bc0-a3dc-47dcb12dad59
📒 Files selected for processing (9)
README.mdREADME.md.insrc/config.tssrc/logging/types.tssrc/middleware/index.tssrc/middleware/response-write-timeout.tssrc/server.tstests/unit/middleware/response-write-timeout.test.tstests/unit/server.test.ts
- Extract duplicated nested ternary for disconnect reason into resolveCloseReason() shared helper (fixes SonarCloud warning) - Clear mockSkipFn after probe in access log middleware test - Document that GITLAB_RESPONSE_WRITE_TIMEOUT_MS=0 disables timeout
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@README.md.in`:
- Line 102: Update the README row for GITLAB_RESPONSE_WRITE_TIMEOUT_MS to note
that SSE streams are excluded: change the description for
GITLAB_RESPONSE_WRITE_TIMEOUT_MS (currently "Max time to flush response to
client before destroying zombie connection (`0` to disable)") to explicitly
state that responses with Content-Type "text/event-stream" (SSE) are skipped and
remain on the heartbeat path, so the timeout does not apply to SSE connections.
In `@tests/unit/server.test.ts`:
- Around line 369-375: The test uses explicit any casts for mockJsonFn and
middleware function handling which violates strict TypeScript; replace the any
casts by typing mockExpress.json() as the Express JSON middleware type and type
middlewareCalls elements as tuples matching middleware signatures, then update
the .filter/.map/.find pipeline to use properly typed predicates and callbacks
(referencing mockJsonFn, middlewareCalls, accessLogMiddleware, and
shouldSkipAccessLogRequest) so the probe finds the access log middleware without
using any casts.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0b3e9784-ac3a-400c-9d9b-6f0438a90ee9
📒 Files selected for processing (4)
README.mdREADME.md.insrc/server.tstests/unit/server.test.ts
- Document that RESPONSE_WRITE_TIMEOUT_MS applies to non-SSE responses only (SSE uses heartbeat-based detection) - Replace any casts in server test middleware probe with typed aliases
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/middleware/response-write-timeout.ts (1)
80-81:⚠️ Potential issue | 🟡 MinorUse a Zod schema for
mcp-session-idheader normalization.Line 80-Line 81 narrows untrusted header data with
typeof/array indexing, but this still bypasses the project rule requiring Zod-based validation for external input.Suggested fix
import type { Request, Response, NextFunction } from 'express'; +import { z } from 'zod'; import { RESPONSE_WRITE_TIMEOUT_MS } from '../config'; import { logWarn } from '../logger'; + +const mcpSessionIdSchema = z + .union([z.string(), z.array(z.string())]) + .optional() + .transform((value) => (Array.isArray(value) ? value[0] : value)); @@ - const rawSessionId = req.headers['mcp-session-id']; - const sessionId = typeof rawSessionId === 'string' ? rawSessionId : rawSessionId?.[0]; + const parsedSessionId = mcpSessionIdSchema.safeParse(req.headers['mcp-session-id']); + const sessionId = parsedSessionId.success ? parsedSessionId.data : undefined;#!/bin/bash # Verify whether external header extraction is currently validated via Zod in this middleware. fd -t f response-write-timeout.ts src | while read -r f; do echo "== $f ==" rg -n -C2 "mcp-session-id|safeParse|parse\\(|from 'zod'|from \"zod\"" "$f" doneAs per coding guidelines, “All external data must be validated with Zod schemas”.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/middleware/response-write-timeout.ts` around lines 80 - 81, Replace the ad-hoc header normalization for rawSessionId/sessionId with a Zod-based validation: introduce a Zod schema (e.g., using z.preprocess to accept string or string[] and return the first string, then z.string().min(1)) and use schema.safeParse or schema.parse on req.headers['mcp-session-id'] to produce a validated sessionId; update the code where rawSessionId and sessionId are referenced to use the parsed value and handle validation failure (log/return) per existing middleware error handling. Ensure you import Zod (from 'zod') and reference the schema name (e.g., mcpSessionIdSchema) so reviewers can locate the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/middleware/response-write-timeout.ts`:
- Around line 80-81: Replace the ad-hoc header normalization for
rawSessionId/sessionId with a Zod-based validation: introduce a Zod schema
(e.g., using z.preprocess to accept string or string[] and return the first
string, then z.string().min(1)) and use schema.safeParse or schema.parse on
req.headers['mcp-session-id'] to produce a validated sessionId; update the code
where rawSessionId and sessionId are referenced to use the parsed value and
handle validation failure (log/return) per existing middleware error handling.
Ensure you import Zod (from 'zod') and reference the schema name (e.g.,
mcpSessionIdSchema) so reviewers can locate the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 57559e88-f077-4859-880d-8e80c9226ce9
📒 Files selected for processing (9)
README.mdREADME.md.insrc/config.tssrc/logging/types.tssrc/middleware/index.tssrc/middleware/response-write-timeout.tssrc/server.tstests/unit/middleware/response-write-timeout.test.tstests/unit/server.test.ts
…ess log - Validate mcp-session-id header via Zod schema instead of typeof cast - Log write-timed-out responses as 'write_timeout' error in access log instead of treating as normal completion (observability gap) - Add code note explaining why timer starts at writeHead (intentional)
|



Summary
GITLAB_RESPONSE_WRITE_TIMEOUT_MS(default 10s, set 0 to disable)Problem
POST responses hang for ~125s (TCP retransmit timeout) when the downstream client disconnects mid-flight. GitLab responds in <200ms but
res.write()/res.end()blocks until TCP gives up retransmitting to the dead peer. The server accumulates hanging requests, each holding a Node event loop callback for 125s.Solution
Express middleware intercepts
writeHead()and starts a timer. If the responsefinishevent doesn't fire within 10s, the socket is destroyed. Thewrite_timeoutclose reason is logged for observability.Test plan
Closes #391