fix(httpserver): quiet client-disconnect log path, return 499 by sufubao · Pull Request #1288 · ModelTC/LightLLM

sufubao · 2026-05-06T03:03:07Z

Summary

Add ClientDisconnected exception (lightllm/utils/error_utils.py) and raise it from the polling loops in both httpserver/manager.py and httpserver_for_pd_master/manager.py instead of generic Exception("... disconnected").
Catch ClientDisconnected ahead of the generic handler in HttpServerManager.generate: log a single WARNING line and re-raise without calling self.abort again (the disconnect path already aborted).
Catch ClientDisconnected in the FastAPI route handlers (/generate, /generate_stream, /v1/chat/completions, /v1/completions, /v1/messages) and return HTTP 499 instead of falling through to the generic Exception handler that emits 417 + a traceback.
Catch ClientDisconnected in the SSE wrapper (api_openai._safe_stream_wrapper) and exit quietly so a closed client doesn't trigger a stream-tail traceback.

Context

The underlying root cause — BaseHTTPMiddleware's access-log decorator swallowing http.disconnect so Request.is_disconnected() never flipped — was already fixed upstream in #1282 (the ASGI-class _AccessLogMiddleware). Once disconnect detection actually fires, the existing code path raises a generic Exception that falls through every layer's logger; this PR is the cleanup that follows from that fix.

Server logs (before vs after)

Before

WARNING 05-06 02:51:38 [manager.py:767] aborted group_request_id 3912
ERROR 05-06 02:51:38 [manager.py:449] group_request_id: 3912 has exception req_id 3912 disconnected
WARNING 05-06 02:51:38 [manager.py:767] aborted group_request_id 3912
ERROR 05-06 02:51:38 [api_http.py:248] An error occurred: req_id 3912 disconnected
ERROR 05-06 02:51:38 [api_http.py:248] Traceback (most recent call last):
ERROR 05-06 02:51:38 [api_http.py:248]   File "/lightllm/lightllm/server/api_http.py", line 241, in generate
ERROR 05-06 02:51:38 [api_http.py:248]     return await g_objs.g_generate_func(request, g_objs.httpserver_manager)
ERROR 05-06 02:51:38 [api_http.py:248]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 05-06 02:51:38 [api_http.py:248]   File "/lightllm/lightllm/server/api_lightllm.py", line 56, in lightllm_generate
ERROR 05-06 02:51:38 [api_http.py:248]     async for sub_req_id, request_output, metadata, finish_status in results_generator:
ERROR 05-06 02:51:38 [api_http.py:248]   File "/lightllm/lightllm/server/httpserver/manager.py", line 457, in generate
ERROR 05-06 02:51:38 [api_http.py:248]     raise e
ERROR 05-06 02:51:38 [api_http.py:248]   File "/lightllm/lightllm/server/httpserver/manager.py", line 440, in generate
ERROR 05-06 02:51:38 [api_http.py:248]     async for sub_req_id, request_output, metadata, finish_status in results_generator:
ERROR 05-06 02:51:38 [api_http.py:248]   File "/lightllm/lightllm/server/httpserver/manager.py", line 667, in _wait_to_token_package
ERROR 05-06 02:51:38 [api_http.py:248]     raise Exception(f"req_id {group_request_id} disconnected")
ERROR 05-06 02:51:38 [api_http.py:248] Exception: req_id 3912 disconnected

After

WARNING 05-06 02:57:24 [manager.py:774] aborted group_request_id 1792
WARNING 05-06 02:57:24 [manager.py:453] group_request_id: 1792 client disconnected
INFO 05-06 02:57:24 [api_http.py:149] POST /generate 499

Test plan

python -c "import ast; [ast.parse(open(f).read()) for f in (...)]" passes on all five touched files.
Manual: hit /generate with a long-running prompt, close the client mid-stream, confirm server logs match the "After" block above and that no traceback is emitted.
Manual: same drill against a streaming endpoint (/v1/chat/completions with stream=true) — confirm SSE wrapper exits without dumping a traceback.
Manual: PD-master mode — disconnect during prefill/decode and confirm pd-master manager raises ClientDisconnected instead of Exception.

Introduce ClientDisconnected exception class and use it instead of a generic Exception when the polling loop sees the client gone or another module aborted the request. Special-case it in HttpServerManager.generate (log a one-liner, skip the duplicate self.abort) and in the FastAPI route handlers + SSE wrapper (return 499, swallow at the stream tail). This collapses the prior aborted-request log from a multi-line Python traceback to a single WARNING line.

gemini-code-assist

Code Review

This pull request introduces a custom ClientDisconnected exception to handle client disconnects gracefully across the server. It updates the HTTP API handlers to return a 499 status code and modifies the server managers to catch this exception for quiet logging, replacing generic exceptions previously used for disconnects. The review feedback correctly identifies that in HttpServerManagerForPDMaster, the abort() method should be explicitly called before raising ClientDisconnected to ensure immediate resource release on remote nodes, maintaining consistency with the HttpServerManager implementation.

… debug Address review feedback on ModelTC#1288: - HttpServerManagerForPDMaster: call self.abort(p_node, d_node) before raising ClientDisconnected at every detection site so remote P/D nodes release resources immediately, matching HttpServerManager's behavior. - Add except ClientDisconnected branch in HttpServerManagerForPDMaster.generate so disconnects no longer fall through the generic BaseException handler that logs ERROR + traceback. - Add logger.debug(..., exc_info=True) in both managers' ClientDisconnected handlers so ops can pinpoint which phase tripped the disconnect when DEBUG logging is enabled.

…vent waits Address further review feedback on ModelTC#1288: - api_http.get_score: add except ClientDisconnected branch before the generic handler so a client disconnect returns 499 instead of 417, matching the /generate behavior. lightllm_get_score also drives httpserver_manager.generate so it can raise ClientDisconnected the same way. - HttpServerManagerForPDMaster: add _wait_event_with_disconnect_check helper that waits on an asyncio.Event with the original 60s budget but polls request.is_disconnected() every 1s, aborting P/D nodes immediately on disconnect. Use it at the three asyncio.wait_for(event.wait(), timeout=60) sites (fetch_stream upkv wait, fetch_nixl_stream np-prompt-ids wait, and fetch_nixl_stream upkv wait). Previously a disconnect during these waits could leave PD nodes holding work for up to 60s.

…sconnects req_status.aborted can be set by internal module failures (e.g. the visual proxy marks shm_req.is_aborted = True on exceptions, which the recycle loop converts to req_status.aborted), not just by client disconnects. The earlier quiet-disconnect refactor mapped that branch to ClientDisconnected, which caused real server-side failures to be swallowed as 499 in the API handlers. Restore the pre-refactor behavior for that branch: raise a generic Exception so it falls through HttpServerManager.generate's existing handler (logs ERROR + releases multimodal resources) and surfaces to the client as 417 instead of 499. ClientDisconnected is now reserved for request.is_disconnected() paths only; tighten its docstring to reflect that.

… preload Previously fetch_resource raised a generic Exception on client disconnect during URL download, which AudioItem.preload / ImageItem.preload then wrapped into ValueError. That happened before HttpServerManager.generate's polling loop, so a real request.is_disconnected() during multimodal download took the noisy error path (logger.error + 400/417) instead of the new quiet ClientDisconnected/499 contract. - multimodal_utils.fetch_resource: raise ClientDisconnected on disconnect. - multimodal_params.{AudioItem,ImageItem}.preload: re-raise ClientDisconnected before the generic except branch so it isn't wrapped as ValueError. - api_http /tokens: add except ClientDisconnected -> 499 ahead of the generic handler (verify_and_preload runs there too). - error_utils.ClientDisconnected: make group_request_id optional (default None) so call sites without that context, like fetch_resource, can raise it.

The exception message and PR commit history (39a4e80) carry the same rationale; the comment was redundant per review feedback.

针对 review 中 Q4/Q5 的疑问加注释，把两处修复的旧 bug 写清楚： - _wait_event_with_disconnect_check: 解释原先 60s wait_for 的盲区，以及现在按 poll_interval 切片轮询断连后立刻 abort + raise ClientDisconnected 的语义，最坏占资源时间从 60s 降到 ~1s。 - fetch_stream 第一处 abort + raise 站点（作为同模式的代表性入口）：解释原先 raise Exception 只让 master 退出、P/D 节点继续占资源的 bug，以及为什么必须先 self.abort(p_node, d_node) 发 ABORT_REQ 再 raise ClientDisconnected。

gemini-code-assist Bot reviewed May 6, 2026

View reviewed changes

ModelTC deleted a comment from gemini-code-assist Bot May 6, 2026

sufubao added 5 commits May 6, 2026 17:22

style(httpserver): drop verbose NOTE comment on internal-abort branch

dec672b

The exception message and PR commit history (39a4e80) carry the same rationale; the comment was redundant per review feedback.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(httpserver): quiet client-disconnect log path, return 499#1288

fix(httpserver): quiet client-disconnect log path, return 499#1288
sufubao wants to merge 7 commits intoModelTC:mainfrom
sufubao:httpserver-abort-on-disconnect

sufubao commented May 6, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

sufubao commented May 6, 2026

Summary

Context

Server logs (before vs after)

Before

After

Test plan

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant