npow/kompact

Kompact

CI PyPI Python 3.10+

Context compression proxy for LLM agents. Reduces token usage by 40-70% with zero code changes.

uv run kompact proxy --port 7878
export ANTHROPIC_BASE_URL=http://localhost:7878
# That's it. Your agent now uses fewer tokens.

Results

Evaluated on BFCL (1,431 real API schemas), the standard benchmark for tool-calling agents. Runs end-to-end through Claude via claude-relay and is scored with context-bench.

Does compression hurt quality?

Quality impact vs no compression (positive = better, negative = worse):

| Model  | Kompact | Headroom | LLMLingua-2 |
|--------|---------|----------|-------------|
| Haiku  | -2.6%   | -3.0%    | -23.4%      |
| Sonnet | -3.9%   | -3.5%    | -20.6%      |
| Opus   | -0.5%   | -0.5%    | -27.3%      |

Kompact and Headroom both preserve quality within ~3% of baseline, a negligible degradation. LLMLingua-2 destroys tool schemas regardless of model (-20% to -27%). Headroom was tested with its recommended SmartCrusher production defaults.

How much does it save?

Offline compression measured on 12,795 examples across 3 datasets:

| Dataset               | Examples | Kompact | Headroom | LLMLingua-2 |
|-----------------------|----------|---------|----------|-------------|
| BFCL (tool schemas)   | 1,431    | 55.3%   | ~0%      | 55.4%       |
| Glaive (tool calling) | 3,959    | 56.6%   | ~0%      | ~50%        |
| HotpotQA (prose QA)   | 7,405    | 17.9%   | ~0%      | 49.9%       |

Headroom's SmartCrusher doesn't compress JSON; it's designed for prose. LLMLingua-2 compresses aggressively but destroys information (see the quality table above).
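Why JSON compresses so well is easy to see even with a purely lossless step. A minimal sketch (the schema is a hypothetical example, and this is not Kompact's actual transform, which goes further than whitespace removal):

```python
import json

# Hypothetical tool schema, pretty-printed the way agents often emit them.
pretty = json.dumps(
    {"name": "get_weather",
     "parameters": {"location": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}}},
    indent=2,
)

# Lossless re-serialization: drop indentation and separator spaces.
compact = json.dumps(json.loads(pretty), separators=(",", ":"))
print(len(pretty), len(compact))  # compact is markedly shorter, same content
```

Prose has no equivalent free win, which is consistent with the smaller HotpotQA savings above.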

What does that mean in dollars?

For a team running 1,000 agentic requests/day with ~10K token contexts:

| Model            | Without Kompact | With Kompact | Monthly savings |
|------------------|-----------------|--------------|-----------------|
| Sonnet ($3/M)    | $900/mo         | $405/mo      | $495/mo         |
| Opus ($15/M)     | $4,500/mo       | $2,025/mo    | $2,475/mo       |
| GPT-4o ($2.50/M) | $750/mo         | $338/mo      | $412/mo         |

Savings scale linearly: 10K requests/day means 10x the numbers above.
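The table's figures follow from a simple linear cost model. A sketch, assuming a flat 55% token reduction (roughly the rate measured on BFCL/Glaive above); the function name is illustrative:

```python
# Illustrative cost model: 30 days/month, flat compression rate.
def monthly_cost(requests_per_day, tokens_per_request, price_per_million_usd,
                 compression=0.0):
    tokens = requests_per_day * tokens_per_request * 30 * (1 - compression)
    return tokens / 1_000_000 * price_per_million_usd

baseline = monthly_cost(1_000, 10_000, 3.00)          # Sonnet, uncompressed
compressed = monthly_cost(1_000, 10_000, 3.00, 0.55)  # through Kompact
print(f"${baseline:.0f}/mo -> ${compressed:.0f}/mo")  # $900/mo -> $405/mo
```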

How it works

Kompact is a transparent HTTP proxy. It intercepts LLM API requests, compresses the context, then forwards to the provider. No agent code changes needed.

        ┌──────────────────────────────────────────────┐
        │           Kompact Proxy (:7878)              │
        │                                              │
Agent ─>│  1. Schema Optimizer    (TF-IDF selection)   │─> LLM Provider
        │  2. Content Compressors (TOON, JSON, code)   │
        │  3. Extractive Compress (TF-IDF sentences)   │
        │  4. Observation Masker  (history mgmt)       │
        │  5. Cache Aligner       (prefix caching)     │
        │                                              │
        └──────────────────────────────────────────────┘

Eight transforms, each targeting a different content type. The pipeline adapts automatically: short contexts get light compression, long contexts get aggressive optimization. Overhead is sub-millisecond.
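The extractive stage can be sketched in a few lines. This illustrates the general TF-IDF idea only, not Kompact's implementation; the function name and `keep` parameter are assumptions:

```python
import math
import re
from collections import Counter

# Sketch of extractive compression (not Kompact's actual code):
# score each sentence by the mean TF-IDF weight of its terms,
# keep the highest-scoring fraction, and preserve original order.
def extractive_compress(text, keep=0.5):
    sents = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    docs = [Counter(re.findall(r"\w+", s.lower())) for s in sents]
    n = len(sents)
    df = Counter(w for d in docs for w in d)  # document frequency per term
    def score(d):
        if not d:
            return 0.0
        return sum(tf * math.log(n / df[w]) for w, tf in d.items()) / sum(d.values())
    ranked = sorted(range(n), key=lambda i: score(docs[i]), reverse=True)
    kept = set(ranked[: max(1, int(n * keep))])
    return " ".join(s for i, s in enumerate(sents) if i in kept)
```

A real pipeline would additionally protect "needle" content (IDs, numbers, tool names) from ever being dropped, which is what the needle-preservation metric in the offline benchmark measures.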

Running benchmarks

# Offline compression (no LLM calls, measures compression + needle preservation)
uv run python benchmarks/run_dataset_eval.py --dataset bfcl

# End-to-end quality (sends through proxy chain, measures LLM answer quality)
# Requires: claude-relay running on :8084, kompact on :7878
uv run python benchmarks/run_e2e_eval.py --dataset bfcl --model haiku --workers 20

See benchmarks/README.md for full methodology.

Development

uv sync --extra dev
uv run pytest          # 48 tests
uv run ruff check src/ tests/

License

MIT