Requires Node.js 22+. The recipe below uses pnpm; npm and yarn work the same way.
deepsec lives in a .deepsec/ directory at the root of your codebase
— checked into the same git repo, so config, project context, and
custom matchers travel with the code. Generated scan output (findings,
runs, reports) stays gitignored.
From the root of the codebase you want to scan:
npx deepsec init # creates .deepsec/ + registers this repo
cd .deepsec
pnpm install # installs deepsec
init lays down a minimal scaffold inside .deepsec/: package.json,
deepsec.config.ts (one projects[] entry pointing at .., id
derived from your repo dir's basename), data/<id>/INFO.md (template
with section placeholders), data/<id>/SETUP.md (per-project agent
prompt), workspace-level AGENTS.md, .env.local, .gitignore. No
custom matchers in the scaffold — add those later, only when a real
finding shapes one for you.
Open .env.local and fill in AI_GATEWAY_API_KEY. Get one from
Vercel AI Gateway — one token covers
both Claude and Codex. Prefer Anthropic directly? Set
ANTHROPIC_AUTH_TOKEN=sk-ant-… and ANTHROPIC_BASE_URL=https://api.anthropic.com
instead. If claude or codex is already logged in on this machine,
non-sandbox runs (process / revalidate / triage) skip the token
and reuse the subscription. See vercel-setup.md.
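As a quick pre-flight check, the three credential options above can be sketched as a small classifier over the contents of .env.local. This is not deepsec's own resolution logic: the function names and the precedence order (gateway key first, then direct Anthropic settings) are assumptions made for illustration.

```typescript
// Sketch only: precedence is an assumption, not deepsec's documented behavior.
type CredentialPath = "ai-gateway" | "anthropic-direct" | "subscription-only";

/** Parse KEY=VALUE lines from .env-style text, skipping blanks and # comments. */
function parseEnv(text: string): Record<string, string> {
  const env: Record<string, string> = {};
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "=", ignore the line
    env[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return env;
}

/** Report which of the credential setups described above is filled in. */
function credentialPath(env: Record<string, string>): CredentialPath {
  if (env["AI_GATEWAY_API_KEY"]) return "ai-gateway";
  if (env["ANTHROPIC_AUTH_TOKEN"] && env["ANTHROPIC_BASE_URL"]) {
    return "anthropic-direct";
  }
  // No token set: only non-sandbox runs with a logged-in claude/codex can work.
  return "subscription-only";
}
```

A result of "subscription-only" means no token was found, so only the non-sandbox commands can be expected to run, and only if claude or codex is logged in on the machine.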
To scan a different codebase from the same .deepsec/, run
pnpm deepsec init-project <path> — relative paths resolve against
.deepsec/'s parent.
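The path rule is easy to get wrong by one directory level, so here is a minimal sketch of it. The helper names are hypothetical and deepsec's real implementation may differ. The sketch handles POSIX-style paths only, and it assumes init-project derives the id from the directory basename the same way init does.

```typescript
/** Collapse "." and ".." segments in a POSIX-style absolute path. */
function normalize(path: string): string {
  const out: string[] = [];
  for (const seg of path.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") out.pop();
    else out.push(seg);
  }
  return "/" + out.join("/");
}

/** Resolve a project path relative to the parent of .deepsec/, as described above. */
function resolveProjectRoot(deepsecDir: string, projectPath: string): string {
  if (projectPath.charAt(0) === "/") return normalize(projectPath);
  return normalize(`${deepsecDir}/../${projectPath}`);
}

/** Project id taken from the last path segment (assumed to match init's behavior). */
function deriveProjectId(root: string): string {
  const segs = normalize(root).split("/").filter((s) => s !== "");
  return segs[segs.length - 1] ?? "";
}
```

With .deepsec/ at /home/me/app/.deepsec, the argument ../other-repo resolves to /home/me/other-repo, not /home/me/app/other-repo.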
INFO.md is what makes deepsec project-aware. It's injected into the
AI prompt for every batch — vague content here means vague findings.
Open the parent repo (the codebase you scanned, not .deepsec/) in
your coding agent (Claude Code, Codex, Cursor, …) and paste the prompt
that deepsec init printed. It walks the agent through:
- Read .deepsec/node_modules/deepsec/SKILL.md to understand the tool.
- Open .deepsec/data/<id>/SETUP.md for project-specific instructions.
- Skim the codebase, then replace each section of .deepsec/data/<id>/INFO.md.
The same prompt is shown in the project root README and is what init
prints to stdout after scaffolding.
The processor auto-loads data/<id>/INFO.md from the workspace's data
dir. Edit it directly — no extra wiring needed in
deepsec.config.ts. INFO.md is optional but worth keeping; even a
paragraph noticeably improves the AI's output.
Before the first command: deepsec writes per-project state to
./data/<project-id>/ next to your config — files/ (one JSON per
scanned source file), runs/, plus project.json and the optional
INFO.md / config.json. The directory is gitignored by default; see
data-layout.md for the full schema.
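For scripts that need to find these files, the layout just described can be written down as a small path helper. This is an illustrative sketch: the function and field names are hypothetical, and data-layout.md remains the authority on what each location contains.

```typescript
interface ProjectDataPaths {
  root: string;        // data/<project-id>/
  files: string;       // one JSON per scanned source file
  runs: string;
  projectJson: string;
  info: string;        // optional INFO.md
  config: string;      // optional config.json
}

/** Build the per-project paths relative to the directory holding deepsec.config.ts. */
function projectDataPaths(workspaceDir: string, projectId: string): ProjectDataPaths {
  const root = `${workspaceDir}/data/${projectId}`;
  return {
    root,
    files: `${root}/files`,
    runs: `${root}/runs`,
    projectJson: `${root}/project.json`,
    info: `${root}/INFO.md`,
    config: `${root}/config.json`,
  };
}
```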
pnpm deepsec scan
--project-id is auto-resolved when the config has a single project
(the common case). Pass --project-id <id> once you've registered
more than one project. Pass --root <path> to override the resolved
path — useful for one-off scans against a different checkout.
scan runs ~110 regex matchers across the codebase. There are no AI calls at this stage.
On a 2,000-file project it takes ~15s. Output goes to
data/<id>/files/ as one JSON file per scanned source file (called a
FileRecord).
pnpm deepsec status shows the current state: how many files were scanned, how many are pending AI investigation, etc.
pnpm deepsec process --concurrency 5
Defaults: Claude Opus, 5 files per batch,
5 batches in parallel = 25 files in flight at once. You can lower the
parallelism (--concurrency 1) or set --limit 50 to budget-cap.
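The batching arithmetic can be sketched as follows, to show how --concurrency and --limit change the shape of a run. Two assumptions are baked in: --limit is treated as a cap on the number of files, and batches are modeled as running in fixed waves, which is simpler than whatever scheduler deepsec actually uses. The names are hypothetical.

```typescript
interface RunPlan {
  files: number;       // files actually processed after the cap
  batches: number;     // total batches needed
  maxInFlight: number; // most files being processed at once
  waves: number;       // rounds of parallel batches (simplified model)
}

/** Plan a run from the file count, batch size, parallelism, and optional file cap. */
function planRun(
  fileCount: number,
  batchSize: number,
  concurrency: number,
  limit?: number,
): RunPlan {
  const files = limit === undefined ? fileCount : Math.min(fileCount, limit);
  const batches = Math.ceil(files / batchSize);
  return {
    files,
    batches,
    maxInFlight: Math.min(files, batchSize * concurrency),
    waves: Math.ceil(batches / concurrency),
  };
}
```

With the defaults, planRun(2000, 5, 5) gives 400 batches and 25 files in flight. Adding a limit of 50 shrinks that to 10 batches in 2 waves.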
A rough cost guide (Opus, default settings):
| Files | Approx cost | Approx wall time |
|---|---|---|
| 100 | $25–60 | 5–15 min |
| 500 | $130–300 | 25–60 min |
| 2,000 | $500–1,200 | 1.5–4 hr |
Costs swing 2–3x based on file complexity. Run --limit 50 first to
calibrate before committing to the full pass.
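The table works out to roughly $0.25 to $0.60 per file, and a calibration run can be extrapolated the same way. The sketch below shows both calculations. The per-file constants are read off the table rows rather than taken from any deepsec output, and the linear scaling is an assumption that the 2–3x swing noted above can easily break.

```typescript
// Per-file rates implied by the 100-file and 2,000-file rows of the table.
const LOW_USD_PER_FILE = 0.25;
const HIGH_USD_PER_FILE = 0.6;

/** Rough cost range in USD for a file count, using the table's implied rates. */
function tableEstimate(files: number): { low: number; high: number } {
  return {
    low: Math.round(files * LOW_USD_PER_FILE),
    high: Math.round(files * HIGH_USD_PER_FILE),
  };
}

/** Scale the observed cost of a small calibration run up to the full file count. */
function extrapolate(sampleCostUsd: number, sampleFiles: number, totalFiles: number): number {
  if (sampleFiles <= 0) throw new Error("sampleFiles must be positive");
  return Math.round((sampleCostUsd / sampleFiles) * totalFiles);
}
```

If a --limit 50 run cost $20, extrapolate(20, 50, 2000) suggests about $800 for 2,000 files, which sits inside the table's range for that size.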
For a cheaper backend:
pnpm deepsec process --agent codex --model gpt-5.5
Codex is the OpenAI-flavored backend. Same prompt, same JSON output, different agent loop. Try both on a small sample to see which catches shapes you care about. See models.md for the full backend / model matrix, refusal handling, and how to swap in newer models.
pnpm deepsec triage --severity HIGH
pnpm deepsec revalidate --min-severity HIGH
- triage: classifies findings P0/P1/P2 without re-reading the code. ~1¢/finding.
- revalidate: re-reads the code and the git history, then emits a
TP/FP/Fixed/Uncertain verdict. Comparable cost to
process. Cuts FP rate by 50%+ on most repos.
Both are optional, but worth running on the HIGH/CRITICAL set.
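To budget these two passes, it helps to count how many findings a severity floor selects. The sketch below assumes --min-severity HIGH means "HIGH and above", and it uses a hypothetical Finding shape. Only the severity names that appear in this document are listed, so extend the ordering with any other levels your deepsec version emits.

```typescript
// Lowest to highest. Only severities named in this document are included.
const SEVERITY_ORDER = ["MEDIUM", "HIGH", "CRITICAL"] as const;
type Severity = (typeof SEVERITY_ORDER)[number];

// Hypothetical finding shape, for illustration only.
interface Finding {
  id: string;
  severity: Severity;
}

/** Keep findings at or above the given severity (assumed --min-severity semantics). */
function atLeast(findings: Finding[], min: Severity): Finding[] {
  const floor = SEVERITY_ORDER.indexOf(min);
  return findings.filter((f) => SEVERITY_ORDER.indexOf(f.severity) >= floor);
}

/** Triage budget in cents, using the ~1 cent per finding figure quoted above. */
function triageCostCents(findingCount: number): number {
  return findingCount * 1;
}
```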
pnpm deepsec export --format md-dir --out ./findings
pnpm deepsec export --format json --out findings.json
md-dir writes one markdown file per finding under
./findings/{CRITICAL,HIGH,MEDIUM,…}/. json writes a single array
suitable for piping to a downstream issue tracker.
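As an illustration of the downstream step, the sketch below turns the exported array into issue drafts. The severity, title, and file fields are hypothetical, as is the IssueDraft shape. Inspect a real findings.json for the actual field names before adapting this.

```typescript
// Hypothetical fields: check a real findings.json before relying on these names.
interface ExportedFinding {
  severity: string;
  title: string;
  file: string;
}

// Hypothetical payload for a generic issue tracker.
interface IssueDraft {
  title: string;
  labels: string[];
  body: string;
}

/** Convert the JSON export (a single array) into one issue draft per finding. */
function toIssueDrafts(exportJson: string): IssueDraft[] {
  const parsed: unknown = JSON.parse(exportJson);
  if (!Array.isArray(parsed)) {
    throw new Error("expected a JSON array of findings");
  }
  return (parsed as ExportedFinding[]).map((f) => ({
    title: `[${f.severity}] ${f.title}`,
    labels: ["security", f.severity.toLowerCase()],
    body: `Reported in ${f.file}`,
  }));
}
```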
For a quick aggregate look:
pnpm deepsec metrics
(Each of these commands accepts --project-id <id> if your config has
multiple projects; the auto-resolution only kicks in when there's
exactly one.)
- docs/writing-matchers.md: prompt your coding agent to compare data/matches against the target repo and write matchers that close the entry-point coverage gaps.
- docs/configuration.md: every field on deepsec.config.ts and data/<id>/config.json.
- docs/models.md: defaults, --agent / --model, refusal handling, future models.
- docs/plugins.md: for org-specific patterns that don't belong in the public matcher set.