security: remove allow_pickle=True from NumpyLayout (CVE-2019-6446 bypass) #3013

Open
SnailSploit wants to merge 1 commit into google:main from SnailSploit:fix/numpy-layout-pickle-rce

@SnailSploit

Summary

Remove allow_pickle=True from np.load() in NumpyLayout to prevent arbitrary code execution via malicious .npz checkpoint files.

Security Issue

_load_numpy() calls np.load(path, allow_pickle=True), which overrides the safe default (allow_pickle=False) that NumPy adopted in response to CVE-2019-6446. With pickling enabled, reading an object-dtype array from the archive unpickles its contents, and _reconstruct_npz_contents() then calls .item() on such arrays to hand the resulting object to the caller. Together this creates a remote code execution path: an attacker who controls a checkpoint file can embed a pickled Python object that executes arbitrary code when the checkpoint is loaded.

Attack scenario: A user downloads a model checkpoint from an untrusted source (e.g., a public model hub). The .npz file contains a crafted object-dtype array. Because allow_pickle=True is set, NumPy unpickles the array when it is read from the archive, so attacker-controlled Python code runs before .item() is even called.
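The effect of the flag can be illustrated with a benign object array. This is a minimal sketch, not code from the PR: make_npz and load_entry are hypothetical helpers, and the archive holds a harmless dict rather than a malicious payload.

```python
import io

import numpy as np


def make_npz(**arrays) -> bytes:
    """Hypothetical helper: serialize arrays into in-memory .npz bytes."""
    buf = io.BytesIO()
    np.savez(buf, **arrays)
    return buf.getvalue()


def load_entry(data: bytes, key: str, allow_pickle: bool = False):
    """Hypothetical helper: read one entry from .npz bytes."""
    with np.load(io.BytesIO(data), allow_pickle=allow_pickle) as npz:
        # Entries are read lazily, so any unpickling happens on this access.
        return npz[key]


weights = make_npz(w=np.arange(4, dtype=np.float32))
with_object = make_npz(meta=np.array({"step": 1}, dtype=object))

# Numeric arrays load under the safe default.
print(load_entry(weights, "w"))

# Object arrays are refused unless the caller opts in to pickle.
try:
    load_entry(with_object, "meta")
except ValueError as exc:
    print("rejected:", exc)
```

With allow_pickle=True the second call would succeed, which is exactly the opt-in that hands control to whoever produced the file.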

Changes

  1. _load_numpy(): Removed allow_pickle=True from np.load() call, restoring NumPy's safe default (allow_pickle=False).
  2. _reconstruct_npz_contents(): Replaced the dtype == object / .item() deserialization path with a ValueError that clearly explains why object arrays are rejected.
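The shape of the two changes can be sketched as follows. The actual bodies of _load_numpy() and _reconstruct_npz_contents() are not shown in this conversation, so load_npz_without_pickle and reject_object_arrays below are hypothetical stand-ins, and the error message wording is an assumption.

```python
import io

import numpy as np


def load_npz_without_pickle(file):
    """Hypothetical stand-in for the patched _load_numpy()."""
    # allow_pickle is left at NumPy's default (False), so any object-dtype
    # entry raises ValueError when it is read instead of being unpickled.
    with np.load(file) as npz:
        return {name: npz[name] for name in npz.files}


def reject_object_arrays(contents):
    """Hypothetical stand-in for the check in _reconstruct_npz_contents()."""
    for name, array in contents.items():
        if array.dtype.hasobject:
            raise ValueError(
                f"Entry {name!r} has an object dtype; object arrays are "
                "rejected because restoring them requires unpickling."
            )
    return contents


# Usage: a checkpoint holding only numeric arrays passes both steps.
buf = io.BytesIO()
np.savez(buf, kernel=np.zeros((2, 3), dtype=np.float32))
buf.seek(0)
restored = reject_object_arrays(load_npz_without_pickle(buf))
```

In this sketch the second check is defense in depth: with pickling disabled, np.load already refuses object arrays, and the explicit ValueError mainly gives callers a clearer message.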

Backward Compatibility

Checkpoints containing only numeric or string-typed arrays (the standard case for ML model weights) are unaffected. Checkpoints that relied on pickling arbitrary Python objects into .npz files will now raise a clear error. This is the intended behavior: loading arbitrary pickled objects from untrusted files is the vulnerability being fixed.
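For checkpoints that previously stored Python objects, one possible migration is to encode the object as a JSON string and save it as a string-typed array, which loads without pickle. This is only a sketch under that assumption: the metadata_json key and the metadata contents are hypothetical and not part of this PR.

```python
import io
import json

import numpy as np

# Hypothetical metadata that might previously have been pickled.
metadata = {"step": 1000, "optimizer": "adam"}

# Save weights plus the metadata encoded as a 0-d unicode array.
buf = io.BytesIO()
np.savez(
    buf,
    w=np.ones((2, 2), dtype=np.float32),
    metadata_json=np.array(json.dumps(metadata)),
)

# Load with NumPy's default allow_pickle=False; no object arrays involved.
with np.load(io.BytesIO(buf.getvalue())) as npz:
    restored_w = npz["w"]
    restored_meta = json.loads(npz["metadata_json"].item())
```

JSON restricts what can be stored to plain data (numbers, strings, lists, dicts), which is the point: nothing in the file can execute on load.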

…pass)

np.load(path, allow_pickle=True) overrides NumPy's safety default
(CVE-2019-6446), enabling arbitrary code execution via malicious .npz
checkpoint files. With pickling enabled, object-dtype arrays are
unpickled from attacker-controlled data when read, and the
_reconstruct_npz_contents() function then calls .item() on them.

Changes:
- Remove allow_pickle=True from np.load() call in _load_numpy()
- Replace .item() deserialization path with ValueError for object arrays
- Add security note to _reconstruct_npz_contents docstring

Backward compatible: checkpoints with numeric/string arrays (standard
for ML model weights) are unaffected.

google-cla bot commented Mar 23, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.
