fix(flat): handle multi-byte advance in consClose decoder#7683
fix(flat): handle multi-byte advance in consClose decoder#7683
Conversation
consClose only advanced currPtr by at most 1 byte, so when constructor_bits + usedBits >= 16 (e.g., 9-bit tag after 7 used bits), usedBits overflowed to 8+. This corrupted the decoder state, causing the Filler decoder to loop infinitely and consume all memory. Use divMod to correctly advance by any number of bytes, matching the pattern already used by dropBits_. Fixes #7542
929cfd0 to
59688d5
Compare
|
Fixing dead code instead of removing it is moronic. |
| else return $ GetResult (s {currPtr = currPtr s `plusPtr` 1, usedBits = u' - 8}) () | ||
| let (bytes, bits) = (n + usedBits s) `divMod` 8 | ||
| s' = s {currPtr = currPtr s `plusPtr` bytes, usedBits = bits} | ||
| if currPtr s' > endPtr |
There was a problem hiding this comment.
This doesn't seem correct. I don't know whether endPtr points to end of buffer, or end of buffer + 1, but either way it doesn't seem correct.
In the former case, currPtr s' > endPtr is not an error if usedBits = 0.
In the latter case, currPtr s' == endPtr is an error if usedBits /= 0.
There was a problem hiding this comment.
Good catch, you're right. currPtr > endPtr misses the case where currPtr == endPtr with usedBits > 0, since that state points past the buffer but claims bits remain.
Rewrote consClose to use ensureBits + dropBits_, same pattern dropBits already uses. ensureBits checks (endPtr - currPtr) * 8 - usedBits < n, so the pointer and bit offset are both accounted for.
The previous divMod fix handled multi-byte advances correctly but the bounds check (currPtr > endPtr) missed the case where currPtr == endPtr with usedBits > 0. Rewrite consClose to use ensureBits + dropBits_, matching the pattern dropBits already uses. This ensures consClose itself rejects overruns rather than relying on downstream decoders to catch the corrupted state.
There was a problem hiding this comment.
Also added a test for this: decoding a 9-bit constructor tag from a 1-byte buffer. It asserts the error is NotEnoughSpace (from consClose via ensureBits), not TooMuchSpace (from strictDecoder catching the bad state after the fact). See 1fed628.
Context
Fix
consClosein the Flat decoder so it correctly advances the stream pointer when constructor tag bits + already-used bits span 2+ bytes. Without this, Generic-derived Flat instances of large enums (>256 constructors) corrupt the decoder state during deserialization, consuming all available memory.Fixes #7542
What went wrong
The Generic-derived decoder uses a
ConsStatemachine for constructor tags. After decoding,consCloseadvances the byte stream by the number of bits consumed. The problem: it only handled advancing by at most 1 byte. Whenconstructor_bits + usedBits >= 16(say, a 9-bit constructor tag after 7 bits already consumed in the current byte),usedBitsoverflowed to 8+, outside the valid range of[0, 7].Once corrupted,
dBoolcomputes128 >> usedBitsas 0, so it always returnsFalsewithout moving the stream pointer. TheFillerdecoder reads 0-bits asFillerBitand a 1-bit asFillerEnd, so it just keeps building an infinite chain ofFillerBitconstructors until memory runs out.The fix swaps the hand-rolled
if/elsefordivMod, the same patterndropBits_already uses 80 lines below in the same file.Changes
Bug fix:
consCloseinDecoder/Prim.hs: usedivModinstead of single-byte advance logic, so any number of bytes is handled correctlyReproduction test (two new cases in
testLargeEnuminSpec.hs):E258_256(8-bit tag), 8+7=15 < 16, works fineE258_258(9-bit tag), 9+7=16, previously caused infinite loop; protected with 5-second timeout