security: add class-loading allowlist to PyTorch unpicklers (CWE-502) by SnailSploit · Pull Request #3014 · google/orbax

SnailSploit · 2026-03-23T20:47:49Z

Summary

Add class-loading allowlist to MetadataUnpickler and CustomTorchUnpickler in PyTorchLayout to prevent arbitrary code execution via malicious .pt/.pth checkpoint files.

Security Issue

Both MetadataUnpickler and CustomTorchUnpickler extend pickle.Unpickler. While MetadataUnpickler.find_class() intercepts specific torch/numpy reconstruction functions, unrecognized classes fall through to super().find_class(), which resolves and returns any class from any importable module. CustomTorchUnpickler does not override find_class() at all.

This allows a crafted .pt checkpoint file to embed pickle opcodes that resolve and invoke dangerous callables (e.g., os.system, subprocess.Popen, builtins.eval), achieving arbitrary code execution when the checkpoint is loaded.

Attack scenario: A user loads a PyTorch checkpoint from an untrusted source. The data.pkl inside the .pt zip archive contains pickle opcodes referencing os.system or subprocess.Popen. The unpickler resolves these names and calls them, executing attacker-controlled commands.
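To illustrate the mechanism with a harmless payload, here is a minimal sketch using only the standard library. The `Marker` and `LoggingUnpickler` names are hypothetical and are not part of orbax; the payload calls `len` rather than anything dangerous. It shows that the default `find_class()` resolves whatever global the pickle stream names, and that the unpickler then calls it.

```python
import io
import pickle


class Marker:
    """Hypothetical stand-in for an attacker-controlled object (harmless here)."""

    def __reduce__(self):
        # On load, the unpickler calls len("payload") instead of rebuilding Marker.
        return (len, ("payload",))


class LoggingUnpickler(pickle.Unpickler):
    """Records every (module, name) pair the pickle stream asks to resolve."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.resolved = []

    def find_class(self, module, name):
        self.resolved.append((module, name))
        # Default behaviour: import the module and return the named attribute.
        return super().find_class(module, name)


data = pickle.dumps(Marker())
unpickler = LoggingUnpickler(io.BytesIO(data))
result = unpickler.load()

print(unpickler.resolved)  # the globals requested by the stream
print(result)              # the return value of the call made during load
```

Swapping `len` for a callable such as `os.system` is what turns this behaviour into code execution, which is why the fallthrough to `super().find_class()` matters.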

Changes

Added _SAFE_UNPICKLE_CLASSES allowlist: An explicit set of (module, name) pairs covering torch storage types, tensor reconstruction functions, numpy array reconstruction, and standard container and codec helpers (OrderedDict, _codecs.encode).
MetadataUnpickler.find_class(): After checking intercepted classes, falls back to the allowlist instead of unrestricted super().find_class(). Raises pickle.UnpicklingError for any class not in the allowlist.
CustomTorchUnpickler.find_class() (new override): Restricts class loading to the same allowlist.
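The allowlist approach described above can be sketched roughly as follows. This is not the PR's code: the allowlist here is an illustrative two-entry subset (the real one also covers torch and numpy entries), and `AllowlistUnpickler`, `safe_loads`, and `Blocked` are hypothetical names chosen for the example.

```python
import io
import pickle
from collections import OrderedDict

# Illustrative subset only (assumption); the actual allowlist in the PR is larger.
_SAFE_UNPICKLE_CLASSES = frozenset({
    ("collections", "OrderedDict"),
    ("_codecs", "encode"),
})


class AllowlistUnpickler(pickle.Unpickler):
    """Resolves only globals that appear in the allowlist."""

    def find_class(self, module, name):
        if (module, name) in _SAFE_UNPICKLE_CLASSES:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            f"Blocked class {module}.{name}: not in the unpickle allowlist"
        )


def safe_loads(data: bytes):
    """Hypothetical helper: unpickle bytes through the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


# An allowlisted container round-trips normally.
allowed = safe_loads(pickle.dumps(OrderedDict(a=1)))


class Blocked:
    """Harmless stand-in for a payload that names a non-allowlisted callable."""

    def __reduce__(self):
        return (len, ("payload",))


try:
    safe_loads(pickle.dumps(Blocked()))
    blocked_error = None
except pickle.UnpicklingError as exc:
    blocked_error = str(exc)

print(allowed)
print(blocked_error)
```

Raising from `find_class()` stops the load before the named callable is ever returned, so nothing outside the allowlist can be invoked by the stream.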

Backward Compatibility

Legitimate PyTorch checkpoints contain only tensor reconstruction functions, storage types, and standard containers, all of which are included in the allowlist. Checkpoints embedding arbitrary Python classes will now raise a clear UnpicklingError with guidance to file an issue if a legitimate class is missing from the allowlist.

MetadataUnpickler.find_class() falls through to super().find_class() for unrecognized classes, allowing arbitrary class instantiation from malicious .pt checkpoint files. CustomTorchUnpickler has no find_class override at all, making it equally vulnerable.

Changes:
- Add _SAFE_UNPICKLE_CLASSES allowlist covering torch storage types, tensor reconstruction functions, numpy array reconstruction, and standard container types (OrderedDict, _codecs.encode)
- Add find_class() override to CustomTorchUnpickler with allowlist
- Replace MetadataUnpickler's super().find_class() fallthrough with allowlist check and clear UnpicklingError for blocked classes

Backward compatible: legitimate PyTorch checkpoints only use tensor reconstruction, storage types, and standard containers.

google-cla · 2026-03-23T20:47:54Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

SnailSploit wants to merge 1 commit into google:main from SnailSploit:fix/pytorch-layout-unpickler-rce
