feat: Add Snappy support#466

Open
Erouan50 wants to merge 13 commits into Nullus157:main from Erouan50:feat/snappy

Conversation

@Erouan50

Hello there!

This pull request introduces support for a framed Snappy decoder and encoder. I'm not an expert in compression/decompression, but I feel Snappy is quite different from the other codecs. My work was inspired by the gzip decoder (for the state machine), the framing reader and writer in the snap crate, and of course the actual spec. One big difference is that the whole frame must be buffered before compressing or decompressing it.

I have a few questions/notes:

  • The "buffering" can be skipped if the input and/or output buffers are the right size. Should I implement that in this pull request?
  • One test is failing (snappy::tokio::bufread::decompress::trailer), but I don't know how to fix it. If I understand the test correctly, it checks that the decoder can stop decoding with trailing data left in the buffer. With the framing format, I don't know how to handle that. Should I stop decoding when the chunk type is invalid, or should I stop and fail (the current behavior)?
  • Should I add more tests? Because the test data ([1, 2, 3...]) is not compressible with Snappy, the encoder and decoder never actually compress or decompress anything. Or is it the job of the proptests to create random test data?
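For context on the framing format discussed above: per the Snappy framing spec, every chunk starts with a 1-byte chunk type followed by a 3-byte little-endian length. A minimal sketch of a header parser (`parse_chunk_header` is a hypothetical helper for illustration, not code from this PR):

```rust
// Parse one Snappy framing-format chunk header: a 1-byte chunk type
// followed by a 3-byte little-endian chunk length (per the framing spec).
// Hypothetical helper, shown only to illustrate the byte layout.
fn parse_chunk_header(header: [u8; 4]) -> (u8, usize) {
    let chunk_type = header[0];
    let len = usize::from(header[1])
        | usize::from(header[2]) << 8
        | usize::from(header[3]) << 16;
    (chunk_type, len)
}

fn main() {
    // A stream identifier chunk header: type 0xFF, length 6.
    let (ty, len) = parse_chunk_header([0xFF, 0x06, 0x00, 0x00]);
    println!("{ty} {len}"); // prints: 255 6
}
```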

Erouan50 and others added 4 commits March 27, 2026 09:53
Implement the Snappy framing format as a new compression algorithm,
following the same pattern as the existing codecs (zstd, lz4, etc.).

The encoder produces Snappy frames with stream identifiers,
compressed/uncompressed chunks (falling back to uncompressed when
compression ratio is poor), and masked CRC32C checksums. The decoder
is a state machine that parses frame headers, buffers chunk data, and
verifies checksums before emitting decompressed output.

New dependencies: `snap` for raw Snappy compression and `crc32c` for
checksum computation. Both are optional behind the `snappy` feature
flag.
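As a concrete illustration of the stream identifier mentioned above (a sketch based on the Snappy framing spec, not code lifted from the PR), the first chunk the encoder emits is a fixed 10-byte sequence: chunk type 0xFF, a 3-byte little-endian length of 6, then the magic payload "sNaPpY":

```rust
// Snappy framing stream identifier, per the framing spec:
// chunk type 0xFF, 3-byte little-endian length (6), payload "sNaPpY".
const STREAM_IDENTIFIER: [u8; 10] =
    [0xFF, 0x06, 0x00, 0x00, b's', b'N', b'a', b'P', b'p', b'Y'];

fn main() {
    assert_eq!(STREAM_IDENTIFIER[0], 0xFF); // stream identifier chunk type
    assert_eq!(&STREAM_IDENTIFIER[4..], b"sNaPpY"); // magic bytes
    println!("{:02X?}", STREAM_IDENTIFIER);
}
```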
@NobodyXu (Collaborator)

> The "buffering" can be skipped if the input and/or output buffers are the right size. Should I implement that in this pull request?

I'm ok with either, though based on the size of the PR and that its CI is failing, I think getting a baseline implementation merged is a more sensible decision.

@Erouan50 (Author)

> > The "buffering" can be skipped if the input and/or output buffers are the right size. Should I implement that in this pull request?
>
> I'm ok with either, though based on the size of the PR and that its CI is failing, I think getting a baseline implementation merged is a more sensible decision.

That's fair. Let me fix the proptests; I forgot them...

@Erouan50 (Author)

Erouan50 commented Mar 27, 2026

> > The "buffering" can be skipped if the input and/or output buffers are the right size. Should I implement that in this pull request?
>
> I'm ok with either, though based on the size of the PR and that its CI is failing, I think getting a baseline implementation merged is a more sensible decision.

@NobodyXu I agree to implement that in another PR. I've fixed the tests, except the one I mentioned in the description:

     Summary [  41.303s] 884 tests run: 882 passed, 2 failed, 0 skipped
        FAIL [   0.013s] async-compression::snappy snappy::futures::bufread::decompress::trailer
        FAIL [   0.008s] async-compression::snappy snappy::tokio::bufread::decompress::trailer

Do you have any opinion on how I should fix it? Or should I just skip this test for Snappy?

@NobodyXu (Collaborator) left a comment


Thank you! I'm really unhappy with all the repeated Vec creation, which results in repeated copying and heap allocation.

Comment on lines +30 to +35

```rust
0xFF => Self::Stream,
0x00 => Self::Compressed,
0x01 => Self::Uncompressed,
0xFE => Self::Padding,
0x02..=0x7f => Self::ReservedUnskippable(value),
0x80..=0xFD => Self::ReservedSkippable(value),
```
@NobodyXu (Collaborator)

Do we have to hardcode this here?

Ideally, ChunkType should live in snap.

@Erouan50 (Author)

It is, but the enum is not public.

@NobodyXu (Collaborator)

I think it'd be better if snap made it public under the raw module.

@Erouan50 (Author)

I agree. I'll open a pull request to the repository, but I can't guarantee they'll accept the changes.

@NobodyXu (Collaborator)

It's ok! We should at least try, and I believe the motivation for exposing this is reasonable.

```rust
}

fn mask_crc(crc: u32) -> u32 {
    (crc.wrapping_shr(15) | crc.wrapping_shl(17)).wrapping_add(0xA282EAD8)
}
```
@NobodyXu (Collaborator)

Another magic value that would be better in snap; I assume this is duplicated code?

@Erouan50 (Author)

Yes, it's duplicated code. The function is not public.

@NobodyXu (Collaborator)

Can we have it public under the snap::raw module?

Hardcoding it here really feels wrong.

@Erouan50 (Author)

I'll open a pull request.
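For reference, the masking scheme in the snippet above (rotate the CRC right by 15 bits, then add a magic constant, per the Snappy framing spec) is reversible. A sketch pairing mask_crc with a hypothetical unmask_crc, shown only to demonstrate the round trip:

```rust
// CRC masking from the Snappy framing spec: rotate right by 15 bits,
// then add the magic constant 0xA282EAD8 (wrapping arithmetic).
fn mask_crc(crc: u32) -> u32 {
    (crc.wrapping_shr(15) | crc.wrapping_shl(17)).wrapping_add(0xA282EAD8)
}

// Inverse of mask_crc (hypothetical helper, for illustration only):
// subtract the constant, then rotate left by 15 bits.
fn unmask_crc(masked: u32) -> u32 {
    let rotated = masked.wrapping_sub(0xA282EAD8);
    rotated.wrapping_shl(15) | rotated.wrapping_shr(17)
}

fn main() {
    for crc in [0u32, 1, 0xDEADBEEF, u32::MAX] {
        assert_eq!(unmask_crc(mask_crc(crc)), crc);
    }
    println!("round trip ok");
}
```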

@NobodyXu (Collaborator) left a comment

Question regarding why the encoder is designed with extra buffering.

I can understand that the decoder needs it, as we cannot control how the chunks are created.

@NobodyXu (Collaborator)

> Do you have any opinion on how I should fix it? Or should I just skip this test for Snappy?

Does Snappy have a "file header" that records the size of the entire stream?

If it does, we can configure the decoder to stop decoding once that limit is reached; the only way to continue would be to reinitialise it.

Otherwise we cannot fix it.

@Erouan50 (Author)

> > Do you have any opinion on how I should fix it? Or should I just skip this test for Snappy?
>
> Does Snappy have a "file header" that records the size of the entire stream?
>
> If it does, we can configure the decoder to stop decoding once that limit is reached; the only way to continue would be to reinitialise it.
>
> Otherwise we cannot fix it.

No, the first stream identifier chunk just marks the start of the file; it doesn't include the length of the whole stream.

@Erouan50 (Author)

> Thank you! I'm really unhappy with all the repeated Vec creation, which results in repeated copying and heap allocation.

You're right, I apologize for that... I've tried to apply your suggestions. It's not perfect (I don't like the reset() + clear() + resize() sequence in the decoder, for example), but I hope I'm going in the right direction?

@NobodyXu

This comment has been minimized.

@NobodyXu (Collaborator) left a comment

Thanks!

My feedback is that copying the uncompressed sections around is unnecessary; they should be copied directly into the output.

Also, snap::raw should expose more public items to prevent duplication, and it needs to support uninitialised buffers for maximum efficiency.

Comment on lines +43 to +45

```rust
out_buf.resize(uncompress_length, 0);
let mut decoder = snap::raw::Decoder::new();
decoder.decompress(data, out_buf)?;
```
@NobodyXu (Collaborator)

It's a shame that snap does not support writing to an uninitialised buffer; zeroing the whole Vec does take quite some time.

@Erouan50 (Author)

I'll try to support it. It could be a good challenge to tackle :).

Comment on lines +70 to +72

```rust
Skipping(usize),
Buffering {
    remaining: usize,
```
@NobodyXu (Collaborator)

Suggested change

```diff
-Skipping(usize),
-Buffering {
-    remaining: usize,
+Skipping(NonZeroUsize),
+Buffering {
+    remaining: NonZeroUsize,
```

Using NonZeroUsize would make the invariant of the state clear.

@Erouan50 (Author)

I don't think I can do that, because remaining and the skip count can be 0.

@NobodyXu (Collaborator)

That's the neat part: if it's 0, then we just skip that stage and go to the next stage

@Erouan50 (Author)

Indeed! That's better that way.
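The exchange above can be sketched like this (state names are illustrative, not the PR's exact enum): NonZeroUsize makes the "something left to skip" invariant explicit, and a zero count simply bypasses the stage:

```rust
use std::num::NonZeroUsize;

// Illustrative state enum: NonZeroUsize encodes the invariant that we
// never sit in Skipping with nothing left to skip.
#[derive(Debug, PartialEq)]
enum State {
    Skipping(NonZeroUsize),
    ReadingHeader,
}

// Enter the skipping stage, or jump straight past it if len == 0.
fn start_skip(len: usize) -> State {
    match NonZeroUsize::new(len) {
        Some(n) => State::Skipping(n),
        None => State::ReadingHeader, // nothing to skip: go to next stage
    }
}

fn main() {
    assert_eq!(start_skip(0), State::ReadingHeader);
    assert!(matches!(start_skip(8), State::Skipping(_)));
    println!("ok");
}
```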

```rust
out_buf.resize(max_compress_size + 8, 0);

let mut encoder = snap::raw::Encoder::new();
let compress_data = encoder.compress(in_buffer, &mut out_buf[8..])?;
```
@NobodyXu (Collaborator)

It's a shame snap::raw doesn't support uninitialised buffer

@Erouan50 (Author)

I'll try to support it as well.

@NobodyXu (Collaborator)

> It's not perfect (I don't like the reset() + clear() + resize() sequence in the decoder, for example), but I hope I'm going in the right direction?

Yeah, it's the right direction. It looks ugly because snap::raw does not support uninitialised buffers; speaking of which, {de,en}code_vec is also very inefficient because of that.

@Erouan50 (Author)

Erouan50 commented Mar 30, 2026

> > It's not perfect (I don't like the reset() + clear() + resize() sequence in the decoder, for example), but I hope I'm going in the right direction?
>
> Yeah, it's the right direction. It looks ugly because snap::raw does not support uninitialised buffers; speaking of which, {de,en}code_vec is also very inefficient because of that.

I agree. Even the whole encoding/decoding of the frames should arguably live in the snap crate; I only did it here because the crate supports only io::Read and io::Write.

If you want, I can take a shot at contributing all the changes you mentioned to that crate, and we pause/close this pull request?

@NobodyXu (Collaborator)

NobodyXu commented Mar 30, 2026

> If you want, I can take a shot at contributing all the changes you mentioned to that crate, and we pause/close this pull request?

Yeah, it'd be great to get them upstreamed, thank you!

If you can share the PR/tracking issue so that I can stay in the loop, that'd be even better.

@NobodyXu (Collaborator) left a comment

Thanks, just some minor nits.

@NobodyXu (Collaborator) left a comment

I think it looks good to me now, except for the NonZeroUsize change and the required upstream changes (more pub items and uninitialised buffer support).

cc @robjtede
