fix(picklescanner): handle GET/BINGET to missing memo without aborting scan by palios-taey · Pull Request #348 · protectai/modelscan

palios-taey · 2026-05-24T19:07:58Z

Summary

When PickleUnsafeOpScan walks a pickle stream and encounters a STACK_GLOBAL whose operands resolve via GET / BINGET / LONG_BINGET into a memo index that was never PUT, the unhandled KeyError propagates and the scanner returns no issues for the whole file — even when unsafe globals were already seen earlier in the byte stream.

This PR adds the missing handling: when a memo reference is missing, treat the resolved value as "unknown" (the same convention already used at the next branch of the same conditional for unrecognized opcodes) and continue scanning. A regression test reproduces the case.

Refs #338 (Invalid Memo Reference category).

Why

This pattern is one instance of the EDP (Exception-Directed Programming) class described in arXiv:2508.19774 and credited via GHSA-9gvj-pp9x-gcfr: a malformed-but-parseable pickle whose sole purpose is to crash the scanner before the malicious payload is reported. Without this fix, a CI gate using modelscan lets affected files through silently — the JSON output shows "total_issues": 0, and the only signal is a one-character errors[].description field (the missing memo key as a bare string) which most consumers ignore.

The fix is the minimal, locally-correct change: a single conditional inside the existing STACK_GLOBAL operand-resolution loop. It does not alter behavior for well-formed pickles. It does not bias toward any particular architectural direction (denylist expansion vs allowlist) — the test asserts only that unsafe globals appearing before the malformed reference are still reported.

Testing

Added tests/test_modelscan.py::test_scan_picklescanner_resilient_to_invalid_memo_binget. The test generator (edp_invalid_memo_binget_gen) constructs a minimal pickle:

GLOBAL + REDUCE invoking os.system("echo pwned") (benign demo command)
Two BINGET opcodes referencing memo indices 7 and 8 (never PUT)
STACK_GLOBAL (would consume those references)
STOP

Verified locally:

Before fix: len(ms.issues.all_issues) == 0 — the os.system global is silently dropped.
After fix: len(ms.issues.all_issues) == 2 — the os.system CRITICAL is reported, plus an unknown/unknown CRITICAL from the malformed STACK_GLOBAL itself.

The fixture is self-contained, runs offline (no network fetch), and contains no exploitation primitives beyond what is already public via the Hide-and-Seek paper, GHSA-9gvj-pp9x-gcfr, and the corresponding artifact repo.

…g scan PickleUnsafeOpScan currently raises KeyError when STACK_GLOBAL analysis encounters a GET/BINGET/LONG_BINGET opcode referencing a memo index that was never PUT. The unhandled exception propagates up and the scanner returns zero issues for the whole file, even when unsafe globals appear earlier in the byte stream. This is one instance of the EDP (Exception-Directed Programming) class discussed in arXiv:2508.19774 ("Hide and Seek"; GHSA-9gvj-pp9x-gcfr) and the "Invalid Memo Reference" category in protectai#338: a malformed-but-parseable pickle whose only function is to crash the scanner before the malicious payload is reported. Fix: when GET/BINGET references a missing memo index, treat the resolved value as "unknown" (matching the existing convention at the next branch of the same conditional for unrecognized opcodes) and continue scanning the remainder of the stream. Test: tests/test_modelscan.py::test_scan_picklescanner_resilient_to_invalid_memo_binget generates a minimal pickle that calls os.system via GLOBAL+REDUCE and follows it with STACK_GLOBAL referencing memo indices 7 and 8 (never PUT). Before this fix, modelscan returns 0 issues. After, it reports the os.system CRITICAL. Refs protectai#338.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(picklescanner): handle GET/BINGET to missing memo without aborting scan#348

fix(picklescanner): handle GET/BINGET to missing memo without aborting scan#348
palios-taey wants to merge 1 commit into
protectai:mainfrom
palios-taey:fix-edp-invalid-memo-binget

palios-taey commented May 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

palios-taey commented May 24, 2026

Summary

Why

Testing

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant