⚠️ Deferred — post-Phase 6 drop-in parity (2026-05-19)
Project priority sequence per #28: complete encoder rewrite (#111 incl. #23) → speed/ratio optimizations (#178, #180) → params API (#27) → magicless (#26) → Phase 6 C-ABI / CLI drop-in (#126/#127/#128/#130/#131/#132) → THEN this track. lsm-tree bilateral coordination accepted 2026-05-18 — commitment preserved, but execution defers until drop-in parity ships. Pre-Phase-6 work on this issue will not be scheduled.
⚠️ Feature gate (mandatory): all Rust code added by this issue is compiled only when the lsm Cargo feature is enabled (#[cfg(feature = "lsm")] on every new public item — module, struct, enum variant, impl block, function). Feature is default off, opt-in for downstream consumers. Without lsm: build is byte-identical to today, no new public symbols, cdylib from Phase 6 stays strict drop-in for donor libzstd v1.5.7. C FFI surface is unaffected regardless of feature state.
Status
Investigation / spike. Largest of the structural-vocabulary task set, lowest priority. Lands only after lsm-tree's ECC layer (LSM-T5) is in production and demonstrates an operational need for "decode-as-much-as-possible-past-corruption" beyond the targeted block repair from #174.
Acceptable outcome: "ECC repair already covers the cases we care about, partial-decode is over-engineered, close without landing." That's a valid result — the analysis itself is the deliverable.
Context
When the lsm-tree ECC layer (LSM-T5 + LSM-T6) has exhausted its parity budget on an SSTable block and a zstd block is still unrecoverable, downstream callers asking for a range query may still benefit from "give me whatever decoded successfully before the corruption." Today FrameDecoder::decode_blocks treats any block-decode failure as terminal: it returns the error and discards any decompressed bytes already in the buffer.
A best-effort partial-decode mode would let the caller extract bytes_decoded worth of plaintext + a positional error pointing at the unrecoverable block, then proceed with degraded results.
Proposed scope
impl FrameDecoder {
    /// Decode blocks from `src`, emitting decoded bytes via the existing
    /// read interface, stopping at the first block-decode failure. Returns
    /// what was decoded plus the error position. Unlike `decode_blocks`,
    /// the decoded prefix is preserved and accessible via `read()`.
    pub fn decode_blocks_partial(
        &mut self,
        src: &mut impl Read,
    ) -> Result<PartialDecode, FrameDecoderError>;
}

pub struct PartialDecode {
    pub bytes_decoded: u64,
    pub blocks_decoded: u32,
    pub stopped_at: Option<(u32 /* block_index */, FrameDecoderError)>,
}
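To make the intended caller-side contract concrete, here is a minimal, self-contained model of the proposed behaviour. It is a sketch under assumptions, not the crate's implementation: ToyDecoder, Block, DecodeError, and drain() are hypothetical stand-ins, blocks are modelled as already-decoded results rather than compressed input, and the outer Result of the proposed signature is omitted for brevity.

```rust
// Illustrative model only: `ToyDecoder`, `Block`, `DecodeError`, and `drain`
// are hypothetical names, not part of the real FrameDecoder API.

#[derive(Debug, Clone, PartialEq)]
pub struct DecodeError(pub String);

#[derive(Debug, PartialEq)]
pub struct PartialDecode {
    pub bytes_decoded: u64,
    pub blocks_decoded: u32,
    pub stopped_at: Option<(u32 /* block_index */, DecodeError)>,
}

/// `Ok(bytes)` models a block that decodes cleanly; `Err` models corruption.
pub type Block = Result<Vec<u8>, DecodeError>;

#[derive(Default)]
pub struct ToyDecoder {
    out: Vec<u8>,
}

impl ToyDecoder {
    /// Decode blocks in order, stopping at the first failure while keeping
    /// everything decoded before it.
    pub fn decode_blocks_partial(&mut self, blocks: &[Block]) -> PartialDecode {
        let mut result = PartialDecode {
            bytes_decoded: 0,
            blocks_decoded: 0,
            stopped_at: None,
        };
        for (index, block) in blocks.iter().enumerate() {
            match block {
                Ok(bytes) => {
                    self.out.extend_from_slice(bytes);
                    result.bytes_decoded += bytes.len() as u64;
                    result.blocks_decoded += 1;
                }
                Err(e) => {
                    // Record where decoding stopped; later blocks are not touched.
                    result.stopped_at = Some((index as u32, e.clone()));
                    break;
                }
            }
        }
        result
    }

    /// Stand-in for the existing `read()` interface: hand out the decoded prefix.
    pub fn drain(&mut self) -> Vec<u8> {
        std::mem::take(&mut self.out)
    }
}
```

With two good blocks followed by a corrupt one, a caller in this model would see blocks_decoded == 2, stopped_at pointing at index 2, and could still drain the bytes of blocks 0 and 1.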
Why this is hard
The current decode state machine treats a block failure as fatal and may leave internal state mid-update (block buffer half-written, window not advanced, FSE state mid-reload). Resumable abort needs careful unwinding so the caller can still drain bytes_decoded from prior blocks without corrupted state poisoning the read interface. Estimated ~300-500 LoC + nontrivial test surface (need to validate buffered-output integrity after each terminal abort variant).
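One possible shape for the unwinding, sketched under assumptions: record a checkpoint of the output length before each block and truncate back to it on failure, so a half-written block never becomes visible to the reader. DecoderState and its toy block format (0xFF simulating corruption detected mid-block) are hypothetical; the real decoder would also have to restore window and FSE state, which this model does not attempt.

```rust
// Hypothetical model: `DecoderState` and the 0xFF "corruption" marker are
// illustrative only and do not mirror the crate's internals.

#[derive(Default)]
pub struct DecoderState {
    /// Decoded output; after any call it holds only fully decoded blocks.
    pub out: Vec<u8>,
    /// Number of blocks whose output has been committed.
    pub blocks_committed: u32,
}

impl DecoderState {
    /// Decode one toy block directly into `out`. On failure, roll `out`
    /// back to its length before the block started.
    pub fn decode_block(&mut self, block: &[u8]) -> Result<(), String> {
        let checkpoint = self.out.len();
        for &byte in block {
            if byte == 0xFF {
                // Discard the partially written block so readers never see it.
                self.out.truncate(checkpoint);
                return Err("corrupt symbol".to_string());
            }
            self.out.push(byte);
        }
        self.blocks_committed += 1;
        Ok(())
    }
}
```

A checkpoint-and-truncate scheme adds only a length read on the happy path, which matters given the benchmark budget in the kill-switch criteria; a scratch-buffer design would be simpler to reason about but costs an extra copy per block.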
Kill-switch criteria
Close without landing if any of:
- lsm-tree's ECC layer + LSM-T6 lazy repair handles every operationally observed case → partial decode never triggered in practice.
- Adding the resumable-abort plumbing visibly slows the happy-path decode benches by > 1%.
- The state-machine cleanup turns out to require touching encoding/blocks internals — would expand scope beyond a single decode-path PR.
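The benchmark criterion above reduces to a relative-delta check. A minimal sketch, assuming timings are available as nanosecond means; relative_slowdown and exceeds_budget are hypothetical helper names, not part of any existing bench harness:

```rust
/// Relative slowdown of `candidate_ns` versus `baseline_ns`, as a fraction
/// (0.01 means 1% slower). Negative values mean the candidate is faster.
pub fn relative_slowdown(baseline_ns: f64, candidate_ns: f64) -> f64 {
    (candidate_ns - baseline_ns) / baseline_ns
}

/// True when the candidate is slower than the baseline by more than `budget`.
pub fn exceeds_budget(baseline_ns: f64, candidate_ns: f64, budget: f64) -> bool {
    relative_slowdown(baseline_ns, candidate_ns) > budget
}
```

For example, a baseline of 1000 ns against a candidate of 1020 ns is a 2% slowdown and would trip a 1% budget, while 1005 ns would not.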
ADDENDUM (2026-05-18): feature gating
If this lands, decode_blocks_partial and PartialDecode are gated behind the lsm Cargo feature (default off) — same gate as #171/#172/#173/#174.
#[cfg(feature = "lsm")]
impl FrameDecoder {
    pub fn decode_blocks_partial(...) -> Result<PartialDecode, FrameDecoderError>;
}

#[cfg(feature = "lsm")]
pub struct PartialDecode { ... }
Default-build cdylib from #126/#127 remains strict drop-in for donor v1.5.7 — donor has no partial-decode primitive, so its absence in the no-feature build is correct.
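The effect of the gate can be illustrated with a small compilable sketch. It assumes the feature is declared in Cargo.toml with an empty default set (for example `lsm = []` under `[features]`); the FrameDecoder below is a stand-in unit struct, the method body is a placeholder, and lsm_enabled is a hypothetical helper added only to make the gate observable.

```rust
// Sketch only: stand-in types, not the crate's real definitions.
pub struct FrameDecoder;

// Compiled only when the crate is built with `--features lsm`.
#[cfg(feature = "lsm")]
pub struct PartialDecode {
    pub bytes_decoded: u64,
    pub blocks_decoded: u32,
}

// The gated impl block disappears entirely from a default build, so no
// new public symbol is exported without the feature.
#[cfg(feature = "lsm")]
impl FrameDecoder {
    pub fn decode_blocks_partial(&mut self) -> PartialDecode {
        // Placeholder body for illustration.
        PartialDecode { bytes_decoded: 0, blocks_decoded: 0 }
    }
}

/// Hypothetical helper: reports whether this build includes the `lsm` feature.
pub fn lsm_enabled() -> bool {
    cfg!(feature = "lsm")
}
```

In a default build, lsm_enabled() returns false and any reference to PartialDecode fails to compile, which is the property the drop-in cdylib depends on.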
Bilateral cross-reference
Acceptance criteria (if it lands)
- […] decode_blocks output.
- […] PartialDecode { bytes_decoded: <sum of bodies 0..N>, blocks_decoded: N, stopped_at: Some((N, _)) } and FrameDecoder::read() returns exactly the decompressed bytes of blocks 0..N.
- […] compare_ffi decompress benches (< 1% delta).
Related
- […] stopped_at field reuses the block_index plumbing.