Krunch is a neural codec for text. It runs on any NVIDIA GPU and, on natural-language text (chat, prose, code), produces output 33-42% smaller than a strong traditional baseline (zstd -22 --long).
Run it on one machine or parallelize across a cluster with any batch system you already use.
Run on any host with an NVIDIA GPU + Docker:
# 1. Install (~5-10 min one-time — downloads CLI + pulls 3.5 GB image)
curl -fsSL https://raw.githubusercontent.com/dmatth1/krunch/main/install.sh | sudo bash
# For a pinned, reproducible install:
# curl -fsSL https://raw.githubusercontent.com/dmatth1/krunch/main/install.sh | sudo KRUNCH_VERSION=v0.1.1 bash
# 2. Use it (instant — image is cached)
krunch compress data.jsonl -o data.krunch
krunch decompress data.krunch -o data.jsonl
# Or pipe-style (Unix idiom)
krunch compress < data.jsonl > data.krunch
krunch decompress < data.krunch > data.jsonl
For large files / archival workloads, run krunch as parallel tasks on
whatever batch system you already use. krunch plan emits a
ready-to-run artifact for the target you pick.
# Compress
krunch plan --target aws-batch --mode compress \
--source s3://… --dest s3://… --workers 16 > compress.json
# Decompress
krunch plan --target aws-batch --mode decompress \
--source s3://… --dest s3://… --workers 16 > decompress.json
# Planned targets — same flag shape, not yet implemented
krunch plan --target k8s --mode compress --source … --dest … --workers 16 > job.yaml
krunch plan --target modal --mode compress --source … --dest … --workers 16 > run.py
krunch plan --target ray --mode compress --source … --dest … --workers 16 > run.py
krunch plan --target slurm --mode compress --source … --dest … --workers 16 > run.sbatch
krunch plan --target gcp-batch --mode compress --source … --dest … --workers 16 > job.json
Then submit with your own tooling and credentials:
aws batch submit-job --cli-input-json file://compress.json,
kubectl apply -f job.yaml, modal run run.py, etc.
Only --target aws-batch works today; the rest are illustrative of the intended UX. Contributions welcome: see CONTRIBUTING.md.
See deploy/aws-cdk/ for a working AWS Batch
reference stack you can cdk deploy as-is.
Measured on AWS Batch (A10G g5.xlarge, 100 MB WildChat-English) —
real-work elapsed inside compress_all / decompress_all, excluding
cold-start container init:
Note: cold-start container init can lengthen the first job, but the overhead amortizes toward zero on warm fleets and on large jobs.
Compressed-size ratio (compressed bytes / original bytes; smaller = better) on a single A10G g5.xlarge, 1 MB chunks.
| corpus | krunch | zstd -22 --long | krunch vs zstd |
|---|---|---|---|
| Chat — WildChat-English (100 MB) | 0.114 | 0.170 | −33% |
| Wikipedia — enwik8 (100 MB) | 0.146 | 0.253 | −42% |
| Python code — CodeParrot (100 MB) | 0.097 | 0.154 | −37% |
| Support tickets — Bitext (19 MB) | 0.099 | 0.083 | +20% |
| HTTP logs — NASA Apache (100 MB) | 0.157 | 0.061 | +158% |
krunch wins decisively on natural-language text (chat, prose, code) and loses to zstd-22 on highly-repetitive structured text (templated logs, intent labels).
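The last column of the table can be recomputed from the two ratio columns. The sketch below shows the arithmetic for the three natural-language rows; the function name `relative_size_change` is ours, not part of krunch.

```python
def relative_size_change(krunch_ratio: float, zstd_ratio: float) -> float:
    """Percent change in output size when using krunch instead of zstd.

    Negative means krunch's output is smaller.
    """
    return (krunch_ratio / zstd_ratio - 1.0) * 100.0


# (krunch ratio, zstd ratio) copied from the table above.
rows = {
    "WildChat-English": (0.114, 0.170),
    "enwik8": (0.146, 0.253),
    "CodeParrot": (0.097, 0.154),
}

for name, (krunch_ratio, zstd_ratio) in rows.items():
    print(f"{name}: {relative_size_change(krunch_ratio, zstd_ratio):+.0f}%")
```

Because the published ratios are rounded to three decimals, recomputing the last two rows this way can land a point away from the table's figure.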
- RWKV-4-Pile-169M pretrained language model (Apache-2.0, BlinkDL): the next-token predictor.
- Custom WKV CUDA kernel: fused recurrence op, ~1000× faster than HF transformers' eval-mode fallback.
- constriction arithmetic coder: turns the model's next-token distribution into a bitstream.
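Why a better predictor means a smaller file: an arithmetic coder spends close to -log2(p) bits on a symbol the model assigned probability p. The sketch below illustrates that relationship with made-up probabilities. It does not call RWKV or constriction, and `ideal_code_length_bits` is a hypothetical helper, not krunch code.

```python
import math


def ideal_code_length_bits(probabilities):
    """Sum of -log2(p) over the probabilities the model assigned
    to the symbols that actually occurred."""
    return sum(-math.log2(p) for p in probabilities)


# Illustrative numbers only: a model that predicts the text well
# versus one that is mostly surprised by it.
confident = [0.9, 0.8, 0.95, 0.85]
unsure = [0.1, 0.05, 0.2, 0.1]

print(f"confident model: {ideal_code_length_bits(confident):.2f} bits")
print(f"unsure model:    {ideal_code_length_bits(unsure):.2f} bits")
```

The same four symbols cost far fewer bits under the confident model, which is the gain a language model offers on text it predicts well.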
The artifact krunch plan emits contains both the worker tasks (each
computes its byte range from a framework-injected index) and a
finalize task that stitches partial blobs into the final output. The
container contract (KRUNCH_INPUT_URL, KRUNCH_PART_INDEX,
KRUNCH_PART_COUNT, …) is documented and stable — you can wire krunch
into a batch system we don't have a template for in ~30 lines.
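A minimal sketch of what a worker and the finalize step do with that contract. KRUNCH_PART_INDEX and KRUNCH_PART_COUNT are the documented variables. The even byte split, the `part_byte_range` and `stitch` helpers, and the 100 MB input size are assumptions for illustration, not krunch's actual implementation, which may for instance align ranges to chunk boundaries or add framing around each partial blob.

```python
import os


def part_byte_range(total_size: int, part_index: int, part_count: int):
    """Return the half-open byte range [start, end) for one worker."""
    if not 0 <= part_index < part_count:
        raise ValueError("part_index must be in [0, part_count)")
    base, extra = divmod(total_size, part_count)
    # The first `extra` parts take one additional byte, so the ranges
    # tile the input exactly with no gaps or overlaps.
    start = part_index * base + min(part_index, extra)
    end = start + base + (1 if part_index < extra else 0)
    return start, end


def stitch(parts):
    """Concatenate partial blobs in part-index order (assumed finalize behaviour).

    `parts` maps part index -> bytes and must contain indices 0..n-1.
    """
    return b"".join(parts[i] for i in range(len(parts)))


# The defaults below are placeholders so the sketch runs outside a batch system.
index = int(os.environ.get("KRUNCH_PART_INDEX", "0"))
count = int(os.environ.get("KRUNCH_PART_COUNT", "16"))
print(part_byte_range(100_000_000, index, count))
```

Deriving each range from the index alone means workers need no coordination: every task can find its slice from the two injected values and the input size.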
Krunch is a neural compressor for text. Avoid it when:
- Your data is highly repetitive structured text (templated logs, intent labels, repeating timestamps). zstd-22's 128 MB dictionary window catches that pattern far more cheaply than a 169 M-parameter language model — see the ratio table above.
- Your data is arbitrary binary, mixed media, or an already-compressed payload. A 169 M-parameter language model has no advantage predicting randomness; krunch will produce larger output than the input.
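One way to apply these two rules before spending GPU time is a cheap check on a sample of the input. The sketch below is a hypothetical heuristic, not part of krunch: `preflight`, the 0.10 threshold, and the use of zlib as a stand-in for zstd are all assumptions chosen so the example runs with the standard library alone.

```python
import zlib


def preflight(sample: bytes, repetitive_threshold: float = 0.10) -> str:
    """Classify a sample as 'binary', 'repetitive', or 'text'."""
    if not sample:
        raise ValueError("sample is empty")
    try:
        sample.decode("utf-8")
    except UnicodeDecodeError as exc:
        # A sample cut mid-character fails only in its last few bytes;
        # a failure anywhere earlier means this is not UTF-8 text.
        if exc.start < len(sample) - 3:
            return "binary"
    # If a general-purpose compressor already shrinks the sample below the
    # (arbitrary) threshold, the data is probably templated or repetitive.
    ratio = len(zlib.compress(sample, 9)) / len(sample)
    return "repetitive" if ratio < repetitive_threshold else "text"


print(preflight(b"GET /index.html HTTP/1.0 200\n" * 2000))
print(preflight(bytes(range(256)) * 16))
print(preflight(b"Neural codecs trade GPU time for smaller files, which "
                b"pays off mostly on chat transcripts, prose, and source code."))
```

Only a result of 'text' suggests the input is a candidate for krunch; the threshold should be tuned against your own data before relying on it.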
See CONTRIBUTING.md.
Apache-2.0. See NOTICE for upstream attributions.
