fix: replace pickle deserialization with RestrictedUnpickler in PD WebSocket endpoints (CVE-2026-26220) #1306
…ints (CVE-2026-26220)
The PD disaggregation WebSocket endpoints (/pd_register, /kv_move_status) and the config-server HTTP response handler called bare pickle.loads() on untrusted network data, enabling unauthenticated remote code execution.
Introduce lightllm/utils/safe_pickle.py with a RestrictedUnpickler that whitelists only the internal LightLLM dataclass modules that legitimately flow through these channels. All four vulnerable callsites in api_http.py and httpserver/pd_loop.py are replaced with safe_loads().
No protocol changes: the pickle wire format is preserved; only class instantiation is restricted.
Fixes CVE-2026-26220 (CVSS 9.3 Critical, CWE-502).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
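The approach described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the contents of lightllm/utils/safe_pickle.py: the names RestrictedUnpickler and safe_loads come from the PR text, but the bodies, the `_ALLOWED` table, and the use of `fractions.Fraction` as a stand-in for the LightLLM dataclasses are assumptions made so the example runs on its own.

```python
import io
import os
import pickle
from fractions import Fraction
from typing import Any, Dict, Set

# Assumed allowlist for illustration only: a stdlib class stands in for the
# LightLLM dataclass modules named in the PR.
_ALLOWED: Dict[str, Set[str]] = {"fractions": {"Fraction"}}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that resolves only globals listed in _ALLOWED."""

    def find_class(self, module: str, name: str) -> Any:
        # find_class() is called whenever the stream references a global
        # (a class or a function), so rejecting here blocks the lookup
        # before anything can be called.
        if name in _ALLOWED.get(module, set()):
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def safe_loads(data: bytes) -> Any:
    """Drop-in replacement for pickle.loads() using the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Exploit:
    """Hypothetical attacker object whose pickle references os.system."""

    def __reduce__(self):
        return (os.system, ("echo pwned",))


# Allowlisted classes and plain containers still round-trip.
ok = safe_loads(pickle.dumps({"ratio": Fraction(1, 3), "ids": [1, 2, 3]}))

# The malicious stream is rejected at the global lookup.
try:
    safe_loads(pickle.dumps(Exploit()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Because only the loading side changes, the bytes produced by pickle.dumps() on the sender are untouched, which is consistent with the "no protocol changes" claim above.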
Code Review
This pull request implements a restricted unpickling mechanism to mitigate CVE-2026-26220, replacing insecure pickle.loads calls with an allowlist-based approach across WebSocket and HTTP handlers. Feedback indicates that ImageItem and AudioItem should be added to the allowlist to support multimodal data and suggests using typing module generics for compatibility with Python versions prior to 3.9.
    "lightllm.server.core.objs.sampling_params": {"SamplingParams"},
    "lightllm.server.core.objs.start_args_type": {"StartArgs"},
    # Multimodal params can accompany REQ messages
    "lightllm.server.multimodal_params": {"MultimodalParams"},
The MultimodalParams class (defined in lightllm/server/multimodal_params.py) contains lists of ImageItem and AudioItem objects. When a MultimodalParams instance is pickled, these nested objects are recorded with their respective classes. To allow successful deserialization of multimodal requests in PD mode, ImageItem and AudioItem must be added to the allowlist for the lightllm.server.multimodal_params module.
Suggested change:
    - "lightllm.server.multimodal_params": {"MultimodalParams"},
    + "lightllm.server.multimodal_params": {"MultimodalParams", "ImageItem", "AudioItem"},
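The reviewer's point, that nested objects are recorded with their own classes, can be checked with a small experiment. The sketch below is an assumption-laden illustration: `RecordingUnpickler` is a hypothetical helper, and `OrderedDict` and `Fraction` are stdlib stand-ins for MultimodalParams and its nested ImageItem/AudioItem objects.

```python
import io
import pickle
from collections import OrderedDict
from fractions import Fraction


class RecordingUnpickler(pickle.Unpickler):
    """Hypothetical helper that logs every global the stream asks for."""

    def __init__(self, file):
        super().__init__(file)
        self.seen = []

    def find_class(self, module, name):
        self.seen.append((module, name))
        return super().find_class(module, name)


# Outer container (stand-in for MultimodalParams) holding a list of
# nested objects (stand-ins for ImageItem / AudioItem).
payload = pickle.dumps(OrderedDict(items=[Fraction(1, 2), Fraction(3, 4)]))

unpickler = RecordingUnpickler(io.BytesIO(payload))
restored = unpickler.load()
requested = set(unpickler.seen)
print(sorted(requested))
```

Both the outer and the nested class show up in `requested`, so an allowlist that names only the outer class would reject the message when it reaches the first nested item.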
    import io
    import pickle


    # ---------------------------------------------------------------------------
    # Allowlist: (module, name) pairs permitted to be deserialized.
    # Keep this list minimal — add entries only when a new type is deliberately
    # added to a PD WebSocket protocol message.
    # ---------------------------------------------------------------------------
    _ALLOWED_MODULES: dict[str, set[str]] = {
To maintain compatibility with Python versions earlier than 3.9 and ensure consistency with the rest of the codebase (e.g., lightllm/server/pd_io_struct.py and lightllm/server/httpserver/pd_loop.py), please use typing.Dict and typing.Set for type annotations instead of the built-in generic types.
Suggested change:
    - import io
    - import pickle
    - # ---------------------------------------------------------------------------
    - # Allowlist: (module, name) pairs permitted to be deserialized.
    - # Keep this list minimal — add entries only when a new type is deliberately
    - # added to a PD WebSocket protocol message.
    - # ---------------------------------------------------------------------------
    - _ALLOWED_MODULES: dict[str, set[str]] = {
    + import io
    + import pickle
    + from typing import Dict, Set
    + # ---------------------------------------------------------------------------
    + # Allowlist: (module, name) pairs permitted to be deserialized.
    + # Keep this list minimal — add entries only when a new type is deliberately
    + # added to a PD WebSocket protocol message.
    + # ---------------------------------------------------------------------------
    + _ALLOWED_MODULES: Dict[str, Set[str]] = {
Security Fix: CVE-2026-26220 — Unauthenticated RCE via pickle.loads() in PD WebSocket Endpoints
Vulnerability
LightLLM's PD (prefill-decode) disaggregation mode exposes WebSocket endpoints that called pickle.loads() directly on binary frames received from the network, with no authentication. Any attacker who can reach the PD master network port can send a crafted pickle payload and achieve arbitrary code execution on the host.
CVSS 4.0: 9.3 Critical (AV:N/AC:L/AT:N/PR:N/UI:N)
CWE: CWE-502 — Deserialization of Untrusted Data
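To see why a bare pickle.loads() on network data amounts to code execution, it helps to inspect what a crafted stream contains. The following is an illustrative sketch only: the `Payload` class and the `echo` command are hypothetical, and the stream is disassembled with pickletools rather than loaded, so nothing is executed.

```python
import os
import pickle
import pickletools


class Payload:
    """Hypothetical attacker object: __reduce__ tells pickle what to call on load."""

    def __reduce__(self):
        return (os.system, ("echo pwned",))


data = pickle.dumps(Payload())

# Walk the opcode stream without executing it.
opcodes = [op.name for op, _arg, _pos in pickletools.genops(data)]

# GLOBAL / STACK_GLOBAL imports an arbitrary callable by name;
# REDUCE then calls it with attacker-chosen arguments.
imports_callable = any(name in ("GLOBAL", "STACK_GLOBAL") for name in opcodes)
calls_callable = "REDUCE" in opcodes
print(imports_callable, calls_callable)
```

The stream never mentions the `Payload` class itself, only the function to import and the arguments to pass, which is why restricting the global lookup in find_class() is an effective choke point.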
Vulnerable callsites (all replaced by this PR)
lightllm/server/api_http.py, /pd_register WebSocket: pickle.loads(data)
lightllm/server/api_http.py, /kv_move_status WebSocket: pickle.loads(data)
lightllm/server/httpserver/pd_loop.py: pickle.loads(recv_bytes)
lightllm/server/httpserver/pd_loop.py: pickle.loads(base64.b64decode(...))
Note: PR #979 (July 2025) only patched embed_cache/manager.py and left these WebSocket callsites untouched.
Fix
Introduces lightllm/utils/safe_pickle.py, containing a RestrictedUnpickler (subclass of pickle.Unpickler) that overrides find_class() to raise UnpicklingError for any module/class not in an explicit allowlist. The allowlist contains only the internal LightLLM dataclass modules that legitimately flow through these channels (pd_io_struct, py_sampling_params, multimodal_params, and safe builtins).
All four vulnerable pickle.loads() calls are replaced with safe_loads() from this module. The pickle wire format is unchanged: no protocol migration is required.
No breaking changes
pickle.dumps() calls are not touched (they are not a vulnerability).
Related