Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 19 additions & 2 deletions lmdeploy/serve/core/async_engine.py
Original file line number Diff line number Diff line change
Expand Up @@ -269,9 +269,26 @@ async def safe_run(self, handle, session, **kwargs):
yield generator
except (Exception, asyncio.CancelledError, GeneratorExit) as e: # noqa
logger.exception(f'[safe_run] session {session.session_id} exception caught: {e}')
await session.async_abort()
# Use asyncio.shield to protect cleanup coroutines from being cancelled.
# When a task is in cancelling state, bare `await` raises CancelledError
# immediately. shield ensures the inner coroutine runs to completion.
# The outer `except (asyncio.CancelledError, Exception)` catches the
# CancelledError that shield itself re-raises at the await point.
Comment on lines +272 to +276
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explanatory comment around asyncio.shield is misleading: shield prevents the inner awaitable from being cancelled, but it does not ensure the outer task waits for it to finish once the outer task is cancelled (it can still raise CancelledError immediately). Please adjust the comment to match actual asyncio.shield behavior so future changes don’t rely on an incorrect guarantee.

Suggested change
# Use asyncio.shield to protect cleanup coroutines from being cancelled.
# When a task is in cancelling state, bare `await` raises CancelledError
# immediately. shield ensures the inner coroutine runs to completion.
# The outer `except (asyncio.CancelledError, Exception)` catches the
# CancelledError that shield itself re-raises at the await point.
# Use asyncio.shield to prevent the inner cleanup coroutine itself from
# being cancelled when this task is cancelled. Note that awaiting a
# shielded coroutine can still raise CancelledError immediately if the
# *outer* task is already in a cancelling state; shield only protects
# the wrapped awaitable from receiving the cancellation.

Copilot uses AI. Check for mistakes.
try:
await asyncio.shield(handle.async_cancel(session.session_id))
except (asyncio.CancelledError, Exception) as cancel_e:
logger.debug(f'[safe_run] session {session.session_id} async_cancel exception caught: {cancel_e}')
if self.backend == 'pytorch':
await handle.async_end(session.session_id)
logger.info(f'[safe_run] session {session.session_id} ending session')
try:
await asyncio.shield(handle.async_end(session.session_id))
except (asyncio.CancelledError, Exception) as end_e:
logger.debug(f'[safe_run] session {session.session_id} async_end exception caught: {end_e}')
Comment on lines +277 to +286
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exceptions from handle.async_cancel() / handle.async_end() are now swallowed and only logged at DEBUG level. If these cleanup operations fail (e.g., RPC failure in mp engine), the session may remain alive/leak resources with little visibility. Consider logging these as warning/error (ideally with stack trace) and/or recording a failure metric so operational issues are detectable.

Copilot uses AI. Check for mistakes.
# Wrap as SafeRunException so that the outer `request_handle` context
# manager in `session_manager.py` can distinguish a handled cancellation (caught by
# `except SafeRunException: pass`) from an unexpected CancelledError.
# Without this, the suppressed exception leaves the task in cancelling
# state, causing a second CancelledError at the next await point.
raise SafeRunException(f'Safe run exception for session {session.session_id}') from e
finally:
await generator.aclose()
Expand Down
Loading