test(bench): add dictionary training + dict-driven compress/decompress benches vs FFI #230

@polaz

Context

Current compare_ffi bench suite covers compress + decompress vs FFI across multiple corpora and levels, but does NOT exercise the dictionary-builder path (fastcover / cover) nor compress/decompress with a pre-trained dictionary. Dictionary-driven workflows are a major real-world use case (database column compression, log shipping, RPC payloads) where our Rust implementation must hold up against libzstd on both:

  1. Dictionary training throughput: ZDICT_trainFromBuffer_fastCover vs our dict_builder feature
  2. Compress/decompress with dictionary: across the full level matrix, vs FFI's ZSTD_compress_usingCDict / ZSTD_decompress_usingDDict

Problem

Without coverage, regressions in either path land silently. The bench-dashboard tracks compress/decompress deltas but the dictionary track is invisible — a 30% regression in dict training would never be flagged.

Proposed scope

Bench additions

  • bench_dict_training in zstd/benches/compare_ffi.rs:
    • Train dict on each corpus (sample budget = realistic ~1 MiB pool, dict size = 32 KiB)
    • Two sides: pure Rust dict_builder::fastcover vs FFI ZDICT_trainFromBuffer_fastCover
    • Throughput metric: input bytes per second (training is bound by sample size, not output size)
  • bench_compress_with_dict and bench_decompress_with_dict:
    • For every level in the existing compress matrix
    • Train dict once per corpus (re-use across levels)
    • Compress and decompress with that dict, both sides
    • Existing per-stage report-line shape
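
The throughput metric for the training bench can be sketched as follows. This is a minimal, std-only illustration, not the actual bench code: `input_throughput` and `time_training` are hypothetical helper names, and the `train` closure stands in for either side (the Rust dict_builder or the FFI trainer), whose real signatures are not shown in this issue.

```rust
use std::time::{Duration, Instant};

/// Throughput in input bytes per second. Training cost scales with the
/// sample pool, so the numerator is the total sample size, not the
/// size of the dictionary that comes out.
fn input_throughput(total_sample_bytes: usize, elapsed: Duration) -> f64 {
    total_sample_bytes as f64 / elapsed.as_secs_f64()
}

/// Times a single training call and returns the dictionary together with
/// the measured throughput. `train` is a placeholder for whichever trainer
/// is under test (assumption: it takes the samples and a target dict size).
fn time_training<F>(samples: &[Vec<u8>], dict_size: usize, train: F) -> (Vec<u8>, f64)
where
    F: FnOnce(&[Vec<u8>], usize) -> Vec<u8>,
{
    let total: usize = samples.iter().map(|s| s.len()).sum();
    let start = Instant::now();
    let dict = train(samples, dict_size);
    let elapsed = start.elapsed();
    (dict, input_throughput(total, elapsed))
}
```

In the compress/decompress groups, the dictionary returned by one such call per corpus would be kept outside the per-level loop and passed by reference to every level, so training cost does not leak into the per-level numbers.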

Dashboard wiring

  • New stage labels: dict_train, compress_with_dict, decompress_with_dict
  • benchmark-relative.json records the same (rust_value, ffi_value, delta_ratio) triple per (level, scenario, target) for these stages
  • Dashboard's Level Profile + Aggregate chart pick them up via existing profileLogicalMetric after extending its switch
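
To make the row shape concrete, here is a hedged sketch of one record for the new stages. The triple and the (level, scenario, target) key come from this issue; the `stage` field name, the struct itself, and the formula used for `delta_ratio` (relative difference against the FFI value) are assumptions for illustration only. The existing definition in run-benchmarks.sh takes precedence.

```rust
/// The three stage labels proposed in this issue.
const NEW_STAGES: [&str; 3] = ["dict_train", "compress_with_dict", "decompress_with_dict"];

/// One row of benchmark-relative.json (hypothetical in-memory shape).
struct RelativeRow {
    stage: &'static str,
    level: String,
    scenario: String,
    target: String,
    rust_value: f64,
    ffi_value: f64,
}

impl RelativeRow {
    /// Assumed definition: relative difference of the Rust value vs FFI.
    fn delta_ratio(&self) -> f64 {
        (self.rust_value - self.ffi_value) / self.ffi_value
    }

    /// Hand-rolled JSON for the sketch; assumes the string fields need no
    /// escaping. A real emitter would use a JSON library.
    fn to_json(&self) -> String {
        format!(
            "{{\"stage\":\"{}\",\"level\":\"{}\",\"scenario\":\"{}\",\"target\":\"{}\",\"rust_value\":{},\"ffi_value\":{},\"delta_ratio\":{}}}",
            self.stage, self.level, self.scenario, self.target,
            self.rust_value, self.ffi_value, self.delta_ratio()
        )
    }
}
```

Keeping the stage label as a plain string field means the dashboard only has to learn three new values rather than a new record layout.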

Acceptance criteria

  • CI bench job runs the new benches without regression in compute time (cap dict-training measurement at ≤ 30 s per scenario)
  • benchmark-relative.json carries the new stages on every snapshot
  • Dashboard Aggregate chart shows dict-train ratio and dict-driven compress/decompress alongside the existing series
  • Sanity: Rust dict-train delta vs FFI lands within a reasonable parity band (initial measurement, no SLO yet)
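
One possible way to enforce the per-scenario time cap and the parity sanity check is sketched below. Both helpers are hypothetical, std-only illustrations: the bench harness in use may already provide its own measurement-time setting, and the band width is a free parameter because no SLO has been set yet.

```rust
use std::time::{Duration, Instant};

/// Runs `op` repeatedly, stopping at `max_iters` iterations or once the
/// wall-clock `budget` is spent, whichever comes first. At least one
/// iteration runs when `max_iters > 0`. The budget is checked between
/// iterations, so a single slow iteration can overrun it.
fn measure_capped<F: FnMut()>(mut op: F, max_iters: usize, budget: Duration) -> Vec<Duration> {
    let started = Instant::now();
    let mut samples = Vec::new();
    while samples.len() < max_iters && (samples.is_empty() || started.elapsed() < budget) {
        let t = Instant::now();
        op();
        samples.push(t.elapsed());
    }
    samples
}

/// True when the Rust/FFI ratio lies within `1.0 ± band`.
/// `band` is a placeholder value chosen by the caller, not an agreed SLO.
fn within_parity_band(rust_value: f64, ffi_value: f64, band: f64) -> bool {
    let ratio = rust_value / ffi_value;
    ratio >= 1.0 - band && ratio <= 1.0 + band
}
```

For the criterion above, the dict-training scenario would call something like `measure_capped(train_once, iters, Duration::from_secs(30))`, where `train_once` and `iters` are stand-ins for the real closure and iteration count.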

Files involved

  • zstd/benches/compare_ffi.rs — add the three new bench groups
  • .github/scripts/run-benchmarks.sh — emit JSON rows for the new stages
  • .github/bench-dashboard/index.html — extend profileLogicalMetric to map the new stages

Out of scope

  • Custom cover algorithm bench (only fastcover for now — that's what we ship)
  • EE-only dict policies (this is CE-side parity work)

Labels: P2-medium, enhancement, performance
