A unified framework for running zero-knowledge circuit security analysis tools. zkHydra orchestrates multiple analysis tools (circomspect, circom_civer, Picus, EcneProject, zkFuzz) to detect vulnerabilities in Circom circuits.
# Pull the image
docker pull ghcr.io/zksecurity/zkhydra:latest
docker-compose run --rm zkhydra uv run python -m zkhydra.main analyze \
--input examples/test_bug_2/circuits/circuit.circom \
--tools circomspect,circom_civer,picus,zkfuzz
# or
docker-compose run --rm zkhydra /bin/bash# Inside the container, analyze the example circuit
uv run python -m zkhydra.main analyze \
--input examples/test_bug/circuits/circuit.circom \
--tools circomspectEdit docker-compose.yml to mount your circuit directory:
volumes:
- ./output:/zkhydra/output
- ./examples:/zkhydra/examples
- ./my-circuits:/zkhydra/my-circuits # Add this lineThen analyze:
docker-compose run --rm zkhydra uv run python -m zkhydra.main analyze \
--input my-circuits/circuit.circom \
--tools circomspect,circom_civer,picusThe zkbugs dataset contains real-world Circom vulnerabilities. zkhydra does NOT vendor it as a submodule — clone it yourself outside the zkhydra tree and point --dataset at it. Each bug's entrypoint, input JSON, ptau, codebase path, and -l link flags are resolved from the runner contract (scripts/print_bug_vars.sh inside the zkbugs repo).
The full flow is: clone → download sources → run tools → triage Undecided → print results. Commands below assume zkhydra is your cwd and zkbugs is cloned alongside at ../zkbugs.
git clone https://github.com/zksecurity/zkbugs.git ../zkbugs
# Required for --zkbugs-mode original and for any direct-mode bug whose
# wrapper pulls files from the project codebase via -l. Run once; rerun
# with --force to refresh.
(cd ../zkbugs && ./scripts/download_sources.sh)A small number of bugs reference private project codebases that
download_sources.sh cannot pull. Those bugs will stay skipped in every
run with reason "codebase not available locally (run scripts/download_sources.sh, or source is private)" — expected behavior;
nothing to fix on the zkhydra side.
Pick one of the three patterns below. All of them write per-bug outputs
under <output>/<bug_name>/ plus a dataset-level summary.json.
# Docker (recommended): mount both your zkhydra source and the zkbugs clone.
docker-compose run --rm \
-v $(pwd)/zkhydra:/zkhydra/zkhydra \
-v $(pwd)/../zkbugs:/zkhydra/zkbugs \
zkhydra uv run python -m zkhydra.main zkbugs \
--dataset zkbugs/dataset/circom \
--zkbugs-mode direct \
--tools all \
--jobs 4 \
--timeout 600 \
--log-file \
--output output/zkbugs-run
# Local (no Docker, tools installed on the host):
uv run python -m zkhydra.main zkbugs \
--dataset ../zkbugs/dataset/circom \
--zkbugs-mode direct \
--tools all \
--jobs 4 \
--timeout 600 \
--output output/zkbugs-run
# Quick smoke — 6 random bugs, reproducible via --random-seed:
uv run python -m zkhydra.main zkbugs \
--dataset ../zkbugs/dataset/circom \
--tools circomspect,circom_civer \
--jobs 4 --random-bugs 6 --random-seed 42 \
--timeout 120 --output output/zkbugs-smoke
# Run both direct AND original for every bug (original is skipped when
# it would be identical to direct). Writes output/zkbugs-run/{direct,original}/.
uv run python -m zkhydra.main zkbugs \
--dataset ../zkbugs/dataset/circom \
--zkbugs-mode both \
--tools all \
--jobs 4 \
--timeout 600 \
--output output/zkbugs-runWhat this does:
- Walks
--datasetforzkbugs_config.jsonfiles (excludingdataset/codebases/anddataset/*/dependencies/). - Builds each bug's
Inputviascripts/print_bug_vars.sh(located by walking up from--dataset). - Skips bugs whose
Compiled Direct=false(orCompiled Original=falsein original mode), and bugs whose codebase is missing. Skipped rows land insummary.jsonwith a reason. - Runs the requested tools with per-bug precompilation; per-worker detail logs land at
<output>/<bug>/run.log.
evaluate_zkbugs_ground_truth is conservative — anything that isn't a trivial match is reported Undecided. The triage-zkbugs-finding skill + driver script promote those to TruePositive / FalseNegative (or keep Undecided with a reason when evidence is genuinely thin).
# Dry-run: collect every Undecided case into triage_queue.json for inspection.
python3 scripts/triage_zkbugs_run.py output/zkbugs-run \
--dataset ../zkbugs/dataset/circom
# Automated: invoke Claude headlessly per case, write triage.json alongside
# each evaluation.json, rewrite evaluation.json with the triaged verdict
# (preserving the original at evaluation.original.json), and patch
# summary.json with an evaluation_counts rollup.
python3 scripts/triage_zkbugs_run.py output/zkbugs-run \
--dataset ../zkbugs/dataset/circom \
--auto --jobs 4 \
--update-evaluation --update-summaryFlags in brief:
--auto— shell out toclaude -pper case; write<bug>/<tool>/triage.jsonand a top-leveltriage_summary.json.--update-evaluation— additionally rewrite eachevaluation.jsonwith the triaged verdict, preserving the original atevaluation.original.json(written once; re-runs do not clobber it).--update-summary— patch the datasetsummary.jsonwith anevaluation_countsrollup and a per-toolevaluationsubsection. Implies--update-evaluation.--tool <name>— triage only one tool's verdicts (e.g.--tool picus).
Prerequisites: the claude CLI must be on PATH for --auto. Dry-run (default) needs no CLI.
# Tabular summary: header, TP/FN/Undecided rollup, per-tool counts, and
# a per-(bug, tool) table sorted by verdict.
python3 scripts/print_zkbugs_summary.py output/zkbugs-run
# Drill down by verdict or tool:
python3 scripts/print_zkbugs_summary.py output/zkbugs-run --filter FalseNegative
python3 scripts/print_zkbugs_summary.py output/zkbugs-run --tool picus
python3 scripts/print_zkbugs_summary.py output/zkbugs-run --no-rowsRaw JSON is always available too:
jq '.evaluation_counts' output/zkbugs-run/summary.json
jq '.bugs[] | select(.status=="skipped") | {bug_name, reason}' \
output/zkbugs-run/summary.json
jq '.' output/zkbugs-run/<some_bug>/circomspect/evaluation.jsonEach bug dir has the per-tool detail:
output/zkbugs-run/<bug_name>/
├── ground_truth.json # expected vulnerability, location, refs
├── run.log # per-bug worker log (parallel runs only)
├── scratch/ # precompile artifacts (.r1cs / .sym)
└── <tool>/
├── raw.txt # tool stdout/stderr
├── parsed.json # tool-specific structured output
├── results.json # unified findings
├── evaluation.json # final verdict (triaged or not)
├── evaluation.original.json # pre-triage verdict (only after --update-evaluation)
└── triage.json # skill response (only after --auto)
--zkbugs-mode direct(default) — run against each bug's isolated wrappercircuit.circom. Every bug supports this mode and it's the fastest path. circom link flags are still needed because the wrapper typicallyincludes files from the codebase (e.g.include "circuits/...").--zkbugs-mode original— run against the project's real entrypoint (Original Entrypointinzkbugs_config.json). Requiresdataset/codebases/to be populated.--zkbugs-mode both— run direct for every bug, then original only for bugs whoseOriginal Entrypointis non-empty (i.e. distinct from direct). Writes<output>/direct/+<output>/original/with their own per-modesummary.json, plus a combined<output>/summary.jsonthat aggregates both. Scripts liketriage_zkbugs_run.pyandprint_zkbugs_summary.pyoperate on each sub-dir independently.
--bugs <sel1>,<sel2>,...— comma-separated substrings matched against each bug's directory name or its--dataset-relative path.--bugs-file <path>— one selector per line (lines starting with#are ignored).
Both flags combine as a union. Missing-match exits with an error.
# Single bug
uv run python -m zkhydra.main zkbugs \
--dataset zkbugs/dataset/circom --tools circomspect \
--bugs veridise_decoder_accepting_bogus_output_signal
# Path fragment (matches all bugs under darkforest-v0.3/)
uv run python -m zkhydra.main zkbugs \
--dataset zkbugs/dataset/circom --tools all \
--bugs darkforest-eth/darkforest-v0.3
# From a file
uv run python -m zkhydra.main zkbugs \
--dataset zkbugs/dataset/circom --tools all \
--bugs-file my-selectors.txt--jobs N(default1) — dispatch one bug per worker process. Tools within a bug still run sequentially. Each worker writes its detailed log to<output>/<bug_name>/run.log; the top-level log stays a concise index.--random-bugs N— after selector filtering, randomly pick N bugs. Handy for quick parallel smoke tests. Ignored if N exceeds the runnable set.--random-seed <int>— make--random-bugsreproducible.
# 6 random bugs across 4 workers, reproducible
uv run python -m zkhydra.main zkbugs \
--dataset zkbugs/dataset/circom --tools all \
--jobs 4 --random-bugs 6 --random-seed 42 \
--timeout 600 --output output/parallel-smoke--jobs 1 is byte-identical to a serial run. summary.json adds an errors field and a jobs field; rows are sorted by (status, bug_name) so diffs between serial and parallel runs stay clean.
- circomspect - Static analyzer and linter
- circom_civer - SMT-based verification with CVC5
- Picus - Symbolic execution via Rosette
- EcneProject - Julia-based circuit analysis
- zkFuzz - Fuzzing-based bug detection
Run tools on a single circuit without ground truth.
uv run python -m zkhydra.main analyze \
--input circuit.circom \
--tools circomspect,circom_civer \
--timeout 600 \
--output results/Output: Raw findings from each tool in results/
Compare tool results against known vulnerabilities (requires zkbugs format config).
uv run python -m zkhydra.main evaluate \
--input bug/zkbugs_config.json \
--tools allOutput: Ground truth comparison, True Positives, False Negatives
Run tools against the refactored zkbugs dataset.
uv run python -m zkhydra.main zkbugs \
--dataset zkbugs/dataset/circom \
--zkbugs-mode direct \
--tools all \
--timeout 600Output: Per-bug analysis results, per-bug ground_truth.json (with the full refactored config: codebase, direct_entrypoint, original_entrypoint, input, executed, compiled_direct, compiled_original), and a dataset-level summary.json that records processed and skipped bugs with reasons.
# analyze / evaluate
--input, -i Circuit file (.circom) for analyze mode
# zkbugs
--dataset, -d Path to <zkbugs>/dataset/circom
--zkbugs-mode direct (default) | original
--bugs Comma-separated bug selectors (substring match)
--bugs-file File with one bug selector per line (# comments allowed)
--jobs, -j Parallel workers (one bug per worker; default 1)
--random-bugs, -n Randomly pick N bugs after selector filtering
--random-seed Seed for --random-bugs
# shared
--tools, -t Tools to run (comma-separated or 'all')
--output, -o Output directory (default: output/)
--timeout Timeout per tool in seconds (default: 1800)
--log-file Enable file logging
--log-level Logging verbosity (default: INFO)
--vanilla Re-process existing raw output instead of running tools| Script | Purpose |
|---|---|
scripts/triage_zkbugs_run.py <run> |
Walk a run dir and collect every Undecided verdict; with --auto invoke the triage-zkbugs-finding skill per case; with --update-evaluation / --update-summary write verdicts back into evaluation.json and summary.json. |
scripts/print_zkbugs_summary.py <run> |
Pretty-print a run's header, per-tool TP/FN/Undecided rollup, skipped/errored bugs, and the full per-(bug, tool) verdict table. Filters via --tool / --filter. |
docker-compose run --rm zkhydra uv run python -m zkhydra.main analyze \
--input examples/test_bug/circuits/circuit.circom \
--tools circomspectexamples/zkbugs_new_format/ ships a self-contained toy bug that mirrors the refactored layout:
uv run python -m zkhydra.main zkbugs \
--dataset examples/zkbugs_new_format/dataset/circom \
--zkbugs-mode direct \
--tools circomspect \
--timeout 30 \
--output output/zkbugs-new-format-smokedocker-compose run --rm zkhydra uv run python -m zkhydra.main analyze \
--input examples/test_bug/circuits/circuit.circom \
--tools circomspect,circom_civer,zkfuzz \
--timeout 300# From host machine with zkbugs cloned locally
docker-compose run --rm zkhydra uv run python -m zkhydra.main zkbugs \
--dataset zkbugs/dataset/circom \
--zkbugs-mode direct \
--tools all \
--timeout 600 \
--log-file \
--output output/zkbugs-rundocker-compose run --rm zkhydra uv run python -m zkhydra.main zkbugs \
--dataset zkbugs/dataset/circom \
--tools circomspect,circom_civer \
--bugs veridise_decoder_accepting_bogus_output_signal \
--timeout 120output/
└── analyze_YYYYMMDD_HHMMSS/
├── circomspect/
│ ├── raw.txt # Raw tool output
│ ├── tool_output.json # Execution metadata
│ ├── parsed.json # Structured findings
│ └── results.json # Standardized format
├── circom_civer/
│ └── ...
└── summary.json # Aggregated results
For zkbugs mode:
output/zkbugs-run/
├── <bug_name>/
│ ├── ground_truth.json # includes new keys (codebase, entrypoints, compile flags, mode)
│ ├── scratch/ # precompile artifacts (.r1cs / .sym / compile.log)
│ ├── circomspect/
│ │ ├── raw.txt
│ │ ├── results.json
│ │ └── evaluation.json # TP/FN/Undecided
│ └── ...
└── summary.json # processed + skipped rows with reasons, per-mode totals
You can find a report of the zkbugs analysis in output/zkbugs-report.pdf and you can download a tar with all the results here: https://drive.google.com/file/d/1zTIrrVqy0MXMxC4tRiPoFMEfXv-2TLYY/view?usp=sharing.
- Ubuntu 24.04
- Python 3.12+
- Rust toolchain
- Julia, Node.js, Racket
# Clone repository with submodules
git clone --recurse-submodules https://github.com/zksecurity/zkhydra.git
cd zkhydra
# Run setup script (installs all dependencies and builds tools)
./setup.sh
# Run zkHydra
uv run python -m zkhydra.main --help# Build image locally (takes 30-60 minutes)
docker build -t zkhydra:latest .
# Or use docker-compose
docker-compose build# Format and lint (requires uv)
make all
# Or manually
uv run black zkhydra/
uv run isort zkhydra/ --profile black
uv run ruff check zkhydra/ --fixzkhydra/
├── zkhydra/ # Python package
│ ├── tools/ # Tool wrappers (circomspect.py, etc.)
│ ├── cli.py # CLI argument parsing
│ ├── core.py # Execution orchestration
│ └── main.py # Entry point
├── tools/ # Tool source repos (git submodules)
├── examples/ # Example circuits for testing
├── docker-compose.yml # Docker configuration
└── Dockerfile # Multi-stage build with all tools
Increase timeout for slow tools:
--timeout 1800 # 30 minutesFor large circuits, run tools individually:
--tools circomspect # Run one at a time# Pull latest image
docker pull ghcr.io/zksecurity/zkhydra:latest
# Rebuild locally if needed
docker-compose build --no-cacheThis project was partially funded by an Ethereum Foundation grant.
See LICENSE file.
- zkbugs Dataset: https://github.com/zksecurity/zkbugs
- Issues: https://github.com/zksecurity/zkhydra/issues