Context
The current compare_ffi bench suite covers compress and decompress against FFI across multiple corpora and levels, but it does not exercise the dictionary-builder path (fastcover / cover) or compress/decompress with a pre-trained dictionary. Dictionary-driven workflows are a major real-world use case (database column compression, log shipping, RPC payloads), and our Rust implementation must hold up against libzstd on both:
- Dictionary training throughput: ZSTD_trainFromBuffer_fastCover vs our dict_builder feature
- Compress/decompress with dictionary: every level in the matrix, vs FFI's ZSTD_compress_usingCDict / ZSTD_decompress_usingDDict
Problem
Without coverage, regressions in either path land silently. The bench-dashboard tracks compress/decompress deltas but the dictionary track is invisible — a 30% regression in dict training would never be flagged.
Proposed scope
Bench additions
bench_dict_training in zstd/benches/compare_ffi.rs:
- Train dict on each corpus (sample budget = realistic ~1 MiB pool, dict size = 32 KiB)
- Two sides: pure Rust dict_builder::fastcover vs FFI ZSTD_trainFromBuffer_fastCover
- Throughput metric: input bytes per second (training is bound by sample size, not output size)
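The training bench described above can be sketched without committing to a particular bench harness. This is a minimal illustration only: build_sample_pool, input_throughput, time_training and the sample_len parameter are hypothetical names, and the trainer is passed in as a closure so that either side (the pure Rust dict_builder path or the FFI call) could be plugged in. The two constants mirror the budget and dictionary size named in the list.

```rust
use std::time::{Duration, Instant};

/// Budget values taken from the proposal: ~1 MiB of samples, 32 KiB dictionary.
const SAMPLE_BUDGET_BYTES: usize = 1 << 20;
const DICT_SIZE_BYTES: usize = 32 * 1024;

/// Split the front of a corpus into fixed-size samples until the budget is used up.
/// `sample_len` is an illustrative parameter; the proposal does not specify one.
fn build_sample_pool(corpus: &[u8], sample_len: usize, budget: usize) -> Vec<&[u8]> {
    let usable = corpus.len().min(budget);
    corpus[..usable].chunks(sample_len.max(1)).collect()
}

/// Throughput in input bytes per second: training cost scales with the
/// sample pool, not with the size of the dictionary that comes out.
fn input_throughput(input_bytes: usize, elapsed: Duration) -> f64 {
    input_bytes as f64 / elapsed.as_secs_f64()
}

/// Time one training run. `train` stands in for either side of the comparison.
/// Returns the trained dictionary and the measured input throughput.
fn time_training<F>(pool: &[&[u8]], dict_size: usize, train: F) -> (Vec<u8>, f64)
where
    F: FnOnce(&[&[u8]], usize) -> Vec<u8>,
{
    let input_bytes: usize = pool.iter().map(|s| s.len()).sum();
    let start = Instant::now();
    let dict = train(pool, dict_size);
    let elapsed = start.elapsed();
    (dict, input_throughput(input_bytes, elapsed))
}
```

Calling time_training once with each side's trainer over the same pool yields two comparable throughput numbers. A real bench would repeat the measurement many times rather than rely on a single run.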
bench_compress_with_dict and bench_decompress_with_dict:
- For every level in the existing compress matrix
- Train dict once per corpus (re-use across levels)
- Compress and decompress with that dict, both sides
- Report results in the existing per-stage report-line shape
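The "train once per corpus, re-use across levels" structure might look roughly like the sketch below. Everything here is hypothetical scaffolding: DictCodec is an invented trait standing in for one side of the comparison, PassthroughCodec is a toy stand-in that does no real compression, and timing is left out to keep the control flow visible.

```rust
use std::cell::Cell;

/// Hypothetical abstraction over one side of the comparison (pure Rust or FFI).
trait DictCodec {
    fn train(&self, samples: &[&[u8]], dict_size: usize) -> Vec<u8>;
    fn compress(&self, level: i32, dict: &[u8], input: &[u8]) -> Vec<u8>;
    fn decompress(&self, dict: &[u8], compressed: &[u8]) -> Vec<u8>;
}

#[derive(Debug, PartialEq)]
struct LevelResult {
    level: i32,
    compressed_len: usize,
    round_trip_ok: bool,
}

/// Train a dictionary once for the corpus, then reuse it for every level.
fn run_levels_with_shared_dict<C: DictCodec>(
    codec: &C,
    samples: &[&[u8]],
    dict_size: usize,
    corpus: &[u8],
    levels: &[i32],
) -> Vec<LevelResult> {
    let dict = codec.train(samples, dict_size); // outside the level loop
    levels
        .iter()
        .map(|&level| {
            let compressed = codec.compress(level, &dict, corpus);
            let restored = codec.decompress(&dict, &compressed);
            LevelResult {
                level,
                compressed_len: compressed.len(),
                round_trip_ok: restored == corpus,
            }
        })
        .collect()
}

/// Toy stand-in codec (NOT real compression): prefixes the level byte and
/// copies the input. It also counts how often `train` is called.
#[derive(Default)]
struct PassthroughCodec {
    trainings: Cell<usize>,
}

impl DictCodec for PassthroughCodec {
    fn train(&self, samples: &[&[u8]], dict_size: usize) -> Vec<u8> {
        self.trainings.set(self.trainings.get() + 1);
        samples.concat().into_iter().take(dict_size).collect()
    }
    fn compress(&self, level: i32, _dict: &[u8], input: &[u8]) -> Vec<u8> {
        let mut out = vec![level as u8];
        out.extend_from_slice(input);
        out
    }
    fn decompress(&self, _dict: &[u8], compressed: &[u8]) -> Vec<u8> {
        compressed[1..].to_vec()
    }
}
```

Keeping training outside the level loop means the dictionary cost is paid once per corpus and does not leak into the per-level compress and decompress timings.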
Dashboard wiring
- New stage labels: dict_train, compress_with_dict, decompress_with_dict
- benchmark-relative.json records the same (rust_value, ffi_value, delta_ratio) triple per (level, scenario, target) for these stages
- Dashboard's Level Profile + Aggregate chart pick them up via existing profileLogicalMetric after extending its switch
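To make the row shape concrete, here is a rough sketch of the three new stage labels and one (rust_value, ffi_value, delta_ratio) record. The types DictStage and RelativeRow, the JSON field layout, and especially the delta_ratio formula (rust_value / ffi_value) are assumptions made for illustration; the actual definitions in run-benchmarks.sh and benchmark-relative.json may differ.

```rust
/// The three new stage labels from the proposal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DictStage {
    DictTrain,
    CompressWithDict,
    DecompressWithDict,
}

impl DictStage {
    fn label(self) -> &'static str {
        match self {
            DictStage::DictTrain => "dict_train",
            DictStage::CompressWithDict => "compress_with_dict",
            DictStage::DecompressWithDict => "decompress_with_dict",
        }
    }

    /// Reverse mapping; returns None for labels that are not dictionary stages.
    fn from_label(label: &str) -> Option<DictStage> {
        match label {
            "dict_train" => Some(DictStage::DictTrain),
            "compress_with_dict" => Some(DictStage::CompressWithDict),
            "decompress_with_dict" => Some(DictStage::DecompressWithDict),
            _ => None,
        }
    }
}

/// One row keyed by (level, scenario, target). Hypothetical layout.
struct RelativeRow {
    stage: DictStage,
    level: i32,
    scenario: String,
    target: String,
    rust_value: f64,
    ffi_value: f64,
}

impl RelativeRow {
    /// ASSUMPTION: delta_ratio = rust_value / ffi_value. The source does not
    /// define the formula, so treat this as a placeholder.
    fn delta_ratio(&self) -> f64 {
        self.rust_value / self.ffi_value
    }

    /// Hand-rolled JSON; assumes scenario and target need no escaping.
    fn to_json(&self) -> String {
        format!(
            "{{\"stage\":\"{}\",\"level\":{},\"scenario\":\"{}\",\"target\":\"{}\",\"rust_value\":{},\"ffi_value\":{},\"delta_ratio\":{}}}",
            self.stage.label(),
            self.level,
            self.scenario,
            self.target,
            self.rust_value,
            self.ffi_value,
            self.delta_ratio()
        )
    }
}
```

An exhaustive match on the stage enum has one useful property: adding a fourth stage later fails to compile until every mapping is updated, which is the Rust-side analogue of remembering to extend the dashboard switch.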
Acceptance criteria
- benchmark-relative.json carries the new stages on every snapshot
Files involved
- zstd/benches/compare_ffi.rs: add the three new bench groups
- .github/scripts/run-benchmarks.sh: emit JSON rows for the new stages
- .github/bench-dashboard/index.html: extend profileLogicalMetric to map the new stages
Out of scope
- Custom cover algorithm bench (only fastcover for now, since that is what we ship)
- EE-only dict policies (this is CE-side parity work)