chore(cpp): reorganize codec structure and improve error handling by nyurik · Pull Request #75 · fast-pack/FastPFOR-rs

nyurik · 2026-03-22T09:00:35Z

First step of moving towards unified C++ and Rust model of codecs done in #73

codecov · 2026-03-22T09:02:25Z

Codecov Report

❌ Patch coverage is 87.93774% with 31 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/codec.rs	12.00%	22 Missing ⚠️
src/cpp/wrappers.rs	89.85%	0 Missing and 7 partials ⚠️
src/helpers.rs	94.11%	2 Missing ⚠️

📢 Thoughts on this report? Let us know!

First step of moving towards unified C++ and Rust model of codecs.

Copilot

Pull request overview

This PR reorganizes codec APIs to better align the C++ and Rust codec models by introducing shared codec traits and centralizing error handling at the crate level, while updating tests/benchmarks/fuzzing harnesses to use the new any-length codec interface.

Changes:

Introduces crate-level AnyLenCodec / BlockCodec traits (plus BlockCodec64 for select C++ codecs) and shared helpers for bounds/length validation.
Refactors C++ codec wrappers to a Vec-based any-length encode/decode API with improved error types.
Updates compatibility tests, benchmarks, and fuzz targets to use the new codec interface and adjusts CI/just recipes accordingly.

Reviewed changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
tests/cpp_compat_tests.rs	Migrates C++/Rust compatibility tests to the new `AnyLenCodec` Vec-based API.
tests/benchmark_smoke.rs	Updates C++ smoke test to use mutable codec + new decode helper signature.
src/rust/mod.rs	Re-exports error types from crate root instead of a Rust-only module.
src/rust/integer_compression/variable_byte.rs	Switches `AsUsize` import to new crate-level helpers module.
src/rust/integer_compression/mod.rs	Stops exposing the old integer_compression helpers module.
src/rust/integer_compression/helpers.rs	Removes the old helpers module (moved to `src/helpers.rs`).
src/rust/integer_compression/fastpfor.rs	Uses crate-level helpers (`bits`, `greatest_multiple`, `GetWithErr`) and crate-level errors.
src/rust/integer_compression/bitpacking.rs	Adds pack/unpack roundtrip tests.
src/rust/error.rs	Removes Rust-only error definition (moved to crate root).
src/lib.rs	Moves error types to crate root; exposes new codec traits and helpers module.
src/helpers.rs	New shared helper utilities (`AsUsize`, `GetWithErr`, `bits`, `greatest_multiple`).
src/error.rs	New crate-level `FastPForError` with additional C++/length-validation variants.
src/cpp/wrappers.rs	Adds FFI helpers for Vec-based encode/decode with expected-length validation.
src/cpp/tests.rs	Adds broad C++ codec roundtrip + Default smoke tests.
src/cpp/README.md	Removes module-level README (replaced with module docs + crate README usage).
src/cpp/mod.rs	Replaces README include with module docs; restructures module into submodules.
src/cpp/codecs.rs	New macro-generated C++ codec wrapper types implementing `AnyLenCodec` (+ `BlockCodec64` for some).
src/codec.rs	New shared codec trait definitions and `slice_to_blocks` helper.
README.md	Updates public C++ usage example to the new `AnyLenCodec` API.
justfile	Extends checks to fuzz crate; changes CI recipe composition and docs recipe args.
fuzz/justfile	Reworks fuzz just recipes (target selection, CI smoke test loop, etc.).
fuzz/fuzz_targets/rust_decompress_oracle.rs	Updates C++ oracle compression to use `AnyLenCodec` encode.
fuzz/fuzz_targets/rust_compress_oracle.rs	Updates C++ oracle compression to use `AnyLenCodec` encode.
fuzz/fuzz_targets/cpp_roundtrip.rs	Migrates fuzz roundtrip to Vec-based `AnyLenCodec` API.
fuzz/fuzz_targets/common.rs	Switches boxed C++ codec trait object from old `Codec32` to `AnyLenCodec`.
Cargo.toml	Makes `thiserror` and `bytemuck` non-optional; adjusts feature wiring accordingly.
benches/fastpfor_benchmark.rs	Updates C++ benchmark harness to pass mutable codec into helpers.
benches/bench_utils.rs	Refactors C++ bench helpers to use `AnyLenCodec` Vec-based encode/decode.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

src/helpers.rs

src/codec.rs

Copilot · 2026-03-22T09:19:50Z

src/codec.rs

+///     fn encode_blocks(&self, blocks: &[[u32; 256]], out: &mut Vec<u32>)
+///         -> Result<(), FastPForError> { ... }
+///     fn decode_blocks(&self, input: &[u32], expected_len: Option<u32>,
+///         out: &mut Vec<u32>) -> Result<usize, FastPForError> { ... }


The BlockCodec doc example doesn’t match the actual trait signature (e.g., encode_blocks takes &mut self in the trait, and decode_blocks’ parameters/return type differ). This is likely to confuse users copying the example; please update the snippet to reflect the current method signatures.

Suggested change

/// fn encode_blocks(&self, blocks: &[[u32; 256]], out: &mut Vec<u32>)

/// -> Result<(), FastPForError> { ... }

/// fn decode_blocks(&self, input: &[u32], expected_len: Option<u32>,

/// out: &mut Vec<u32>) -> Result<usize, FastPForError> { ... }

///

/// fn encode_blocks(

/// &mut self,

/// blocks: &[Self::Block],

/// out: &mut Vec<u32>,

/// ) -> Result<(), FastPForError> {

/// // implementation here

/// }

///

/// fn decode_blocks(

/// &mut self,

/// input: &[u32],

/// expected_len: Option<u32>,

/// out: &mut Vec<u32>,

/// ) -> Result<usize, FastPForError> {

/// // implementation here

/// }

Copilot · 2026-03-22T09:19:50Z

src/codec.rs

+/// Codec that supports compressing 64-bit integers into a 32-bit word stream.
+///
+/// Only three C++ codecs implement this trait: `CppFastPFor128`,
+/// `CppFastPFor256`, and `CppVarInt`. For simple use, call
+/// `encode64` / `decode64` directly on the struct — no trait import required.
+///
+/// Import `BlockCodec64` only when writing generic code over multiple codecs
+/// that support 64-bit compression.


BlockCodec64 docs reference codec names that don’t exist in this crate (CppFastPFor128, CppFastPFor256, CppVarInt). This can also trigger broken-intra-doc-link warnings under -D warnings. Update the text to the actual exported type names (e.g. cpp::FastPFor128Codec, cpp::FastPFor256Codec, cpp::VarIntCodec).

Copilot · 2026-03-22T09:19:50Z

src/cpp/mod.rs

+//! All C++ codecs are composite (any-length) and implement [`AnyLenCodec`] only.
+//! Codecs marked with `@ 64` also implement [`BlockCodec64`] for 64-bit integers.


These intra-doc links ([AnyLenCodec], [BlockCodec64]) are unqualified in this module and will likely resolve to nothing, producing broken_intra_doc_links warnings (CI uses -D warnings). Use fully-qualified links like [crate::AnyLenCodec] / [crate::BlockCodec64] (or import them into scope for the module docs). Also, the note about codecs being “marked with @ 64” looks outdated now that the @ 64 marker isn’t present in the current generation macro.

Suggested change

//! All C++ codecs are composite (any-length) and implement [`AnyLenCodec`] only.

//! Codecs marked with `@ 64` also implement [`BlockCodec64`] for 64-bit integers.

//! All C++ codecs are composite (any-length) and implement [`crate::AnyLenCodec`] only.

//! Some codecs also implement [`crate::BlockCodec64`] for 64-bit integers.

Copilot · 2026-03-22T09:19:51Z

src/cpp/tests.rs

+// for 64-bit loads/stores and will SIGSEGV on unaligned access). The codec is
+// otherwise correct on x86/x86_64 where unaligned access is handled in hardware.
+// Tracked upstream; skip on aarch64 until fixed in the submodule.
+// #[cfg(not(target_arch = "aarch64"))]


The comment says Simple8bRleCodec must be skipped on aarch64 due to alignment UB/SIGSEGV, but the #[cfg(not(target_arch = "aarch64"))] guard is commented out. As written, this test will run on ARM64 and can crash the test process. Re-enable the cfg gate (or otherwise exclude this test on aarch64).

Suggested change

// #[cfg(not(target_arch = "aarch64"))]

#[cfg(not(target_arch = "aarch64"))]

justfile

nyurik requested review from CommanderStorm and Copilot and removed request for Copilot March 22, 2026 09:00

Copilot started reviewing on behalf of nyurik March 22, 2026 09:01 View session

chore(cpp): reorganize codec structure and improve error handling

c41d357

First step of moving towards unified C++ and Rust model of codecs.

nyurik force-pushed the block-size-cpp branch from 4a0ddb7 to c41d357 Compare March 22, 2026 09:13

Copilot AI review requested due to automatic review settings March 22, 2026 09:13

Copilot started reviewing on behalf of nyurik March 22, 2026 09:14 View session

Copilot AI reviewed Mar 22, 2026

View reviewed changes

CommanderStorm approved these changes Mar 22, 2026

View reviewed changes

nyurik merged commit 6ba6e0b into fast-pack:main Mar 22, 2026
19 checks passed

nyurik deleted the block-size-cpp branch March 22, 2026 15:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore(cpp): reorganize codec structure and improve error handling#75

chore(cpp): reorganize codec structure and improve error handling#75
nyurik merged 1 commit intofast-pack:mainfrom
nyurik:block-size-cpp

nyurik commented Mar 22, 2026

Uh oh!

codecov bot commented Mar 22, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI Mar 22, 2026

Uh oh!

Copilot AI Mar 22, 2026

Uh oh!

Copilot AI Mar 22, 2026

Uh oh!

Copilot AI Mar 22, 2026

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

-///     fn encode_blocks(&self, blocks: &[[u32; 256]], out: &mut Vec<u32>)
-///         -> Result<(), FastPForError> { ... }
-///     fn decode_blocks(&self, input: &[u32], expected_len: Option<u32>,
-///         out: &mut Vec<u32>) -> Result<usize, FastPForError> { ... }
+///
+///     fn encode_blocks(
+///         &mut self,
+///         blocks: &[Self::Block],
+///         out: &mut Vec<u32>,
+///     ) -> Result<(), FastPForError> {
+///         // implementation here
+///     }
+///
+///     fn decode_blocks(
+///         &mut self,
+///         input: &[u32],
+///         expected_len: Option<u32>,
+///         out: &mut Vec<u32>,
+///     ) -> Result<usize, FastPForError> {
+///         // implementation here
+///     }

		//! All C++ codecs are composite (any-length) and implement [`AnyLenCodec`] only.
		//! Codecs marked with `@ 64` also implement [`BlockCodec64`] for 64-bit integers.

	// #[cfg(not(target_arch = "aarch64"))]
	#[cfg(not(target_arch = "aarch64"))]

Conversation

nyurik commented Mar 22, 2026

Uh oh!

codecov bot commented Mar 22, 2026

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants