chore(cpp): reorganize codec structure and improve error handling#75
chore(cpp): reorganize codec structure and improve error handling#75nyurik merged 1 commit intofast-pack:mainfrom
Conversation
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
First step of moving towards unified C++ and Rust model of codecs.
There was a problem hiding this comment.
Pull request overview
This PR reorganizes codec APIs to better align the C++ and Rust codec models by introducing shared codec traits and centralizing error handling at the crate level, while updating tests/benchmarks/fuzzing harnesses to use the new any-length codec interface.
Changes:
- Introduces crate-level
AnyLenCodec/BlockCodectraits (plusBlockCodec64for select C++ codecs) and shared helpers for bounds/length validation. - Refactors C++ codec wrappers to a Vec-based any-length encode/decode API with improved error types.
- Updates compatibility tests, benchmarks, and fuzz targets to use the new codec interface and adjusts CI/just recipes accordingly.
Reviewed changes
Copilot reviewed 28 out of 28 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/cpp_compat_tests.rs | Migrates C++/Rust compatibility tests to the new AnyLenCodec Vec-based API. |
| tests/benchmark_smoke.rs | Updates C++ smoke test to use mutable codec + new decode helper signature. |
| src/rust/mod.rs | Re-exports error types from crate root instead of a Rust-only module. |
| src/rust/integer_compression/variable_byte.rs | Switches AsUsize import to new crate-level helpers module. |
| src/rust/integer_compression/mod.rs | Stops exposing the old integer_compression helpers module. |
| src/rust/integer_compression/helpers.rs | Removes the old helpers module (moved to src/helpers.rs). |
| src/rust/integer_compression/fastpfor.rs | Uses crate-level helpers (bits, greatest_multiple, GetWithErr) and crate-level errors. |
| src/rust/integer_compression/bitpacking.rs | Adds pack/unpack roundtrip tests. |
| src/rust/error.rs | Removes Rust-only error definition (moved to crate root). |
| src/lib.rs | Moves error types to crate root; exposes new codec traits and helpers module. |
| src/helpers.rs | New shared helper utilities (AsUsize, GetWithErr, bits, greatest_multiple). |
| src/error.rs | New crate-level FastPForError with additional C++/length-validation variants. |
| src/cpp/wrappers.rs | Adds FFI helpers for Vec-based encode/decode with expected-length validation. |
| src/cpp/tests.rs | Adds broad C++ codec roundtrip + Default smoke tests. |
| src/cpp/README.md | Removes module-level README (replaced with module docs + crate README usage). |
| src/cpp/mod.rs | Replaces README include with module docs; restructures module into submodules. |
| src/cpp/codecs.rs | New macro-generated C++ codec wrapper types implementing AnyLenCodec (+ BlockCodec64 for some). |
| src/codec.rs | New shared codec trait definitions and slice_to_blocks helper. |
| README.md | Updates public C++ usage example to the new AnyLenCodec API. |
| justfile | Extends checks to fuzz crate; changes CI recipe composition and docs recipe args. |
| fuzz/justfile | Reworks fuzz just recipes (target selection, CI smoke test loop, etc.). |
| fuzz/fuzz_targets/rust_decompress_oracle.rs | Updates C++ oracle compression to use AnyLenCodec encode. |
| fuzz/fuzz_targets/rust_compress_oracle.rs | Updates C++ oracle compression to use AnyLenCodec encode. |
| fuzz/fuzz_targets/cpp_roundtrip.rs | Migrates fuzz roundtrip to Vec-based AnyLenCodec API. |
| fuzz/fuzz_targets/common.rs | Switches boxed C++ codec trait object from old Codec32 to AnyLenCodec. |
| Cargo.toml | Makes thiserror and bytemuck non-optional; adjusts feature wiring accordingly. |
| benches/fastpfor_benchmark.rs | Updates C++ benchmark harness to pass mutable codec into helpers. |
| benches/bench_utils.rs | Refactors C++ bench helpers to use AnyLenCodec Vec-based encode/decode. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// fn encode_blocks(&self, blocks: &[[u32; 256]], out: &mut Vec<u32>) | ||
| /// -> Result<(), FastPForError> { ... } | ||
| /// fn decode_blocks(&self, input: &[u32], expected_len: Option<u32>, | ||
| /// out: &mut Vec<u32>) -> Result<usize, FastPForError> { ... } |
There was a problem hiding this comment.
The BlockCodec doc example doesn’t match the actual trait signature (e.g., encode_blocks takes &mut self in the trait, and decode_blocks’ parameters/return type differ). This is likely to confuse users copying the example; please update the snippet to reflect the current method signatures.
| /// fn encode_blocks(&self, blocks: &[[u32; 256]], out: &mut Vec<u32>) | |
| /// -> Result<(), FastPForError> { ... } | |
| /// fn decode_blocks(&self, input: &[u32], expected_len: Option<u32>, | |
| /// out: &mut Vec<u32>) -> Result<usize, FastPForError> { ... } | |
| /// | |
| /// fn encode_blocks( | |
| /// &mut self, | |
| /// blocks: &[Self::Block], | |
| /// out: &mut Vec<u32>, | |
| /// ) -> Result<(), FastPForError> { | |
| /// // implementation here | |
| /// } | |
| /// | |
| /// fn decode_blocks( | |
| /// &mut self, | |
| /// input: &[u32], | |
| /// expected_len: Option<u32>, | |
| /// out: &mut Vec<u32>, | |
| /// ) -> Result<usize, FastPForError> { | |
| /// // implementation here | |
| /// } |
| /// Codec that supports compressing 64-bit integers into a 32-bit word stream. | ||
| /// | ||
| /// Only three C++ codecs implement this trait: `CppFastPFor128`, | ||
| /// `CppFastPFor256`, and `CppVarInt`. For simple use, call | ||
| /// `encode64` / `decode64` directly on the struct — no trait import required. | ||
| /// | ||
| /// Import `BlockCodec64` only when writing generic code over multiple codecs | ||
| /// that support 64-bit compression. |
There was a problem hiding this comment.
BlockCodec64 docs reference codec names that don’t exist in this crate (CppFastPFor128, CppFastPFor256, CppVarInt). This can also trigger broken-intra-doc-link warnings under -D warnings. Update the text to the actual exported type names (e.g. cpp::FastPFor128Codec, cpp::FastPFor256Codec, cpp::VarIntCodec).
| //! All C++ codecs are composite (any-length) and implement [`AnyLenCodec`] only. | ||
| //! Codecs marked with `@ 64` also implement [`BlockCodec64`] for 64-bit integers. |
There was a problem hiding this comment.
These intra-doc links ([AnyLenCodec], [BlockCodec64]) are unqualified in this module and will likely resolve to nothing, producing broken_intra_doc_links warnings (CI uses -D warnings). Use fully-qualified links like [crate::AnyLenCodec] / [crate::BlockCodec64] (or import them into scope for the module docs). Also, the note about codecs being “marked with @ 64” looks outdated now that the @ 64 marker isn’t present in the current generation macro.
| //! All C++ codecs are composite (any-length) and implement [`AnyLenCodec`] only. | |
| //! Codecs marked with `@ 64` also implement [`BlockCodec64`] for 64-bit integers. | |
| //! All C++ codecs are composite (any-length) and implement [`crate::AnyLenCodec`] only. | |
| //! Some codecs also implement [`crate::BlockCodec64`] for 64-bit integers. |
| // for 64-bit loads/stores and will SIGSEGV on unaligned access). The codec is | ||
| // otherwise correct on x86/x86_64 where unaligned access is handled in hardware. | ||
| // Tracked upstream; skip on aarch64 until fixed in the submodule. | ||
| // #[cfg(not(target_arch = "aarch64"))] |
There was a problem hiding this comment.
The comment says Simple8bRleCodec must be skipped on aarch64 due to alignment UB/SIGSEGV, but the #[cfg(not(target_arch = "aarch64"))] guard is commented out. As written, this test will run on ARM64 and can crash the test process. Re-enable the cfg gate (or otherwise exclude this test on aarch64).
| // #[cfg(not(target_arch = "aarch64"))] | |
| #[cfg(not(target_arch = "aarch64"))] |
First step of moving towards unified C++ and Rust model of codecs done in #73