Local memory, search, and code intelligence — integrated with Claude Code and Codex via CLI, lifecycle hooks, and MCP.
mdkb indexes your project's docs, source code, and persistent knowledge into a local hybrid search engine — then exposes it to Claude Code, Codex, or any MCP client so the AI finds what it needs instead of guessing.
No cloud APIs. No token-heavy context dumps. Just fast, local, relevant retrieval.
- Hybrid search: BM25 + semantic vectors over your markdown docs
- Code intelligence: tree-sitter parsing for 14 languages, call graphs, symbol search
- Persistent memory: AI-created knowledge entries that survive across sessions, including time-bound `reminder` entries with due-date surfacing and `prior` entries for behavioral patterns (30-day TTL default)
- Lifecycle hooks: proactive context injection and reindex enqueue via Claude Code / Codex CLI hooks (no tool call required)
- Markdown-native memory: export/import memory entries as a folder of `.md` files for review, git tracking, or bulk edit
- Unified diagnostics: `mdkb stats` renders a static ASCII dashboard (index health, collections, memory, code, sessions, hooks)
- Zero config serving: auto-indexes on startup, watches for file changes, auto-VACUUMs on drift
Full details in CHANGES.md.
- 3.0.0 (breaking): Hook dispatch via daemon IPC (Unix socket JSON-RPC instead of in-process execution); `reindex-queue.jsonl` removed (PostToolUse sends paths directly to the daemon watcher channel); hook event logging to `hook-events.jsonl`; per-event configurable latency thresholds; `spawn_blocking` for CPU-bound hook work.
- 2.2.0: `prior` entry type for behavioral patterns (30d TTL default, excluded from searches); `mdkb cheatsheet` AI-friendly command reference; `--entry-type` filter on `mdkb search`; PreToolUse Grep hook suggests CLI commands (works without MCP); optimized injected text (~185 fewer tokens per turn).
- 2.0.0 (breaking): `mdkb status` removed (use `mdkb stats`); `mdkb memory export/import` round-trip entries as `.md` files with YAML frontmatter; unified ASCII stats dashboard with `--format json` and `--no-color`.
- 1.4.0: `reminder` entry type with `due_in` (surfaced in session warmup once due); schema migration v9 → v10; input hardening (reject control chars in titles/tags).
# Homebrew
brew install sstraus/tap/mdkb
# From source, inside a checkout of the repository
cargo install --path .
Prebuilt binaries: download from Releases for macOS (arm64/x64), Linux (arm64/x64), or Windows (x64).
cd your-project
mdkb init
mdkb collection add docs ./docs
mdkb update

# Project-scoped (recommended)
mdkb setup mcp claude --scope local
# Or user-scoped (global)
mdkb setup mcp claude --scope user

Restart Claude Code after setup. The MCP server auto-indexes on startup and watches for file changes.
MCP gives the assistant tools; hooks make it use them. Hooks also work standalone without MCP — the PreToolUse Grep interceptor suggests CLI commands via current_exe(), and SessionStart points to mdkb cheatsheet for the full command reference.
Register the lifecycle dispatcher so Claude gets a memory warmup at session start, relevant context on every prompt, and Grep-to-mdkb suggestions — without having to call search first:
# Claude Code, project-scoped (writes .claude/settings.local.json)
mdkb setup hooks claude --scope local
# Claude Code, user-scoped / global (writes ~/.claude/settings.json)
mdkb setup hooks claude --scope user
# Codex CLI (writes ~/.codex/hooks.json)
mdkb setup hooks codex
# Preview the merged settings JSON without writing
mdkb setup hooks claude --scope local --dry-run
# Disable specific events at install time
mdkb setup hooks claude --disable post-tool-use
mdkb setup hooks claude --disable user-prompt-submit,post-tool-use

Restart the host CLI after setup. Re-running is idempotent: existing hook entries are replaced and unrelated settings are preserved. Events: session-start, user-prompt-submit, pre-tool-use (Grep interceptor), post-tool-use. The full contract, config, and opt-out are described in docs/hooks.md.
mdkb setup mcp … and mdkb setup hooks … hard-code the absolute path of the binary that ran the setup. If you later move or rebuild the binary, the recorded command breaks. For stable global installs, first run cargo install --path . (binary lands in ~/.cargo/bin/mdkb), then run setup from that binary.
# Remove all Claude Code registrations (MCP + hooks)
mdkb setup remove claude --scope local # per-project
mdkb setup remove claude --scope user # global
# Remove individually
mdkb setup remove mcp claude --scope local
mdkb setup remove mcp codex
mdkb setup remove hooks claude --scope local
mdkb setup remove hooks codex

Soft alternatives before uninstalling: create an empty `.mdkbignore-hooks` marker at the repo root to silence hooks for that working tree, or toggle `session_start_enabled` / `user_prompt_submit_enabled` / `post_tool_use_enabled` in `.mdkb/config.toml`.
Add to your Claude Code MCP config (.claude/mcp.json or ~/.claude/mcp.json):
{
  "mcpServers": {
    "mdkb": {
      "type": "stdio",
      "command": "/path/to/mdkb",
      "args": ["serve"],
      "cwd": "/path/to/your/project"
    }
  }
}

The `cwd` must point to a directory with `.mdkb/` initialized.
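The manual configuration above can also be generated programmatically. The following is a minimal sketch, not part of mdkb: `build_mcp_entry` is a hypothetical helper that reproduces the JSON shape shown above and enforces the one stated requirement, that `cwd` contains an initialized `.mdkb/` directory.

```python
import json
from pathlib import Path


def build_mcp_entry(binary: str, project_dir: str) -> dict:
    """Return an mcpServers entry for mdkb (hypothetical helper, not an mdkb API)."""
    project = Path(project_dir)
    # The cwd must point at a directory where `mdkb init` has already been run.
    if not (project / ".mdkb").is_dir():
        raise FileNotFoundError(
            f"{project} has no .mdkb/ directory; run 'mdkb init' there first"
        )
    return {
        "mcpServers": {
            "mdkb": {
                "type": "stdio",
                "command": binary,
                "args": ["serve"],
                "cwd": str(project),
            }
        }
    }


def render_mcp_json(binary: str, project_dir: str) -> str:
    """Serialize the entry as indented JSON, ready to paste into the MCP config."""
    return json.dumps(build_mcp_entry(binary, project_dir), indent=2)
```

Failing early on a missing `.mdkb/` directory surfaces the misconfiguration at setup time instead of when the MCP client first launches the server.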
| Tool | Description |
|---|---|
| `search` | Hybrid search across docs+memory (default), or scoped to docs, memory, code, symbols. `scope="memory"` accepts `min_confidence` to filter decayed entries |
| `get` | Retrieve by ID, path, memory slug, glob pattern, or comma-separated list |
| `code_graph` | Call graph queries: calls, callers, or impact (transitive) |
| `status` | Index health, collections, and code index stats |
| `update` | Differential reindex of all collections and source code |
| `memory_write` | Create or update a memory entry (supports `ttl`, `due_in` for reminders, near-duplicate rejection) |
| `memory_write_batch` | Create or update multiple memory entries at once (max 20) |
| `memory_confirm` | Atomic Bayesian signal: `outcome="confirmed"` / `"refuted"` bumps confirmations and `last_confirmed_at` without rewriting content |
| `memory_delete` | Delete a memory entry |
| `memory_list` | List memory entries sorted by recency, popularity, or creation date |
| `usage` | Session and lifetime token ledger (per-tool call counts, token totals, truncation stats) |
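To make the `code_graph` impact query concrete, here is a small sketch of a depth-limited transitive caller search. It assumes that "impact" means "every function that can reach the target through the call graph", which the table implies but does not spell out. The `impact` function and the dictionary representation are illustrative and do not reflect how mdkb stores its graph.

```python
from collections import deque


def impact(calls: dict[str, set[str]], target: str, depth: int) -> set[str]:
    """Return every function that reaches `target` within `depth` call hops.

    `calls` maps a caller name to the set of functions it calls
    (an assumed in-memory representation, for illustration only).
    """
    # Invert caller -> callees into callee -> callers.
    callers: dict[str, set[str]] = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    seen: set[str] = set()
    queue = deque([(target, 0)])
    while queue:
        name, hops = queue.popleft()
        if hops == depth:
            continue  # depth budget exhausted on this path
        for caller in callers.get(name, ()):
            if caller != target and caller not in seen:
                seen.add(caller)
                queue.append((caller, hops + 1))
    return seen
```

With `depth=1` this reduces to a direct callers query. Larger depths widen the radius one hop at a time, and the `seen` set keeps recursive call cycles from looping forever.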
| Scope | What it searches |
|---|---|
| (omit) | Docs + memory combined (default) |
| `docs` | Hybrid BM25 + semantic over markdown documents |
| `memory` | Full-text over memory entries |
| `symbols` | Exact symbol lookup by name, filterable by kind and file |
| `code` | Semantic code search across indexed symbols |
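The `docs` scope combines a BM25 ranking with a semantic ranking, but this document does not say how the two are merged. One common way to merge two ranked lists is reciprocal rank fusion (RRF). The sketch below shows that technique purely as an illustration and is not a description of mdkb's actual scoring. The function name and the constant `k = 60` are assumptions.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in, so items
    ranked well by both retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]      # keyword ranking (made-up IDs)
semantic_hits = ["b", "d", "a"]  # embedding ranking (made-up IDs)
fused = reciprocal_rank_fusion([bm25_hits, semantic_hits])
```

Rank-based fusion avoids having to normalize BM25 scores and cosine similarities onto a common scale, which is why it is a frequent default for hybrid retrieval.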
Persistent AI knowledge that survives across sessions — decisions, patterns, solved problems:
- Confidence scoring: entries decay over time unless re-confirmed (0-1 score based on age, access count, source type)
- Duplicate detection: near-duplicate entries are rejected before writing
- Revision tracking: manual entries track up to 3 revision diffs
- TTL (time-to-live): pass `ttl` (seconds) to `memory_write` for auto-expiring entries. Expired entries are filtered from searches and listings but remain accessible via `get(id)` with an `[EXPIRED]` marker, so they can be inspected or renewed. Omit `ttl` for permanent entries.
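The TTL behavior described in the list above can be modeled in a few lines. This is an in-memory sketch under stated assumptions, not mdkb's storage code: the `Entry` class, the `created_at` field, and the exact placement of the `[EXPIRED]` marker in front of the title are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    id: str
    title: str
    created_at: float            # seconds since the epoch (assumed field)
    ttl: Optional[float] = None  # seconds; None means a permanent entry

    def expired(self, now: float) -> bool:
        return self.ttl is not None and now >= self.created_at + self.ttl


def list_entries(entries: list[Entry], now: float) -> list[Entry]:
    """Searches and listings skip expired entries."""
    return [entry for entry in entries if not entry.expired(now)]


def get_entry(entries: list[Entry], entry_id: str, now: float) -> Optional[str]:
    """Direct lookup still returns expired entries, flagged with a marker."""
    for entry in entries:
        if entry.id == entry_id:
            return f"[EXPIRED] {entry.title}" if entry.expired(now) else entry.title
    return None
```

Keeping expired entries reachable by ID is what makes renewal possible: the entry can be inspected and then rewritten with a fresh `ttl`.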
Entry types: topic (concepts), problem (solutions), decision (architectural choices), reminder (time-bound — see below), prior (behavioral patterns — 30-day TTL default, excluded from default searches), handoff (session handover — no default TTL).
Reminders are time-bound entries. Create one with `memory_write(id, title, content, entry_type="reminder", due_in=<seconds>)` (or `mdkb memory add --entry-type reminder --due-in N`). While `due_at > now`, the reminder is hidden from searches and listings. Once due, it appears in the session warmup index as `[reminder:DUE] {id}: {title}`, so the MCP client sees it on the next turn. The AI is instructed to ask for confirmation before deleting a reminder, and to snooze it by calling `memory_write` with a new `due_in` (reusing the same `id` updates the record).
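The reminder lifecycle can be sketched as follows. The `[reminder:DUE] {id}: {title}` line format comes from the description above. The `Reminder` class and the function names are hypothetical, and the sketch assumes `due_at` is simply the write time plus `due_in`.

```python
from dataclasses import dataclass


@dataclass
class Reminder:
    id: str
    title: str
    due_at: float  # absolute time in seconds (assumed: write time + due_in)


def is_hidden(reminder: Reminder, now: float) -> bool:
    """A reminder stays out of searches and listings while it is not yet due."""
    return reminder.due_at > now


def warmup_lines(reminders: list[Reminder], now: float) -> list[str]:
    """Lines surfaced in the session warmup index for reminders that are due."""
    return [
        f"[reminder:DUE] {r.id}: {r.title}"
        for r in reminders
        if not is_hidden(r, now)
    ]


def snooze(reminder: Reminder, now: float, due_in: float) -> Reminder:
    """Snoozing rewrites the same id with a new due time."""
    return Reminder(reminder.id, reminder.title, now + due_in)
```

Because a snooze reuses the same `id`, there is only ever one record per reminder, and it drops out of the warmup index again until the new due time passes.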
Priors are behavioral pattern entries written by external analyzers (e.g., HUD stop hooks). Create one with `memory_write(id, title, content, entry_type="prior")` or `mdkb memory add <id> --entry-type prior`. Priors default to a 30-day TTL and are excluded from all default searches. Query them explicitly with `mdkb search --scope memory --entry-type prior "query"`, or with `search(query, scope="memory", entry_type="prior")` via MCP.
Handoffs are session context transfer entries. Create one with `memory_write(id, title, content, entry_type="handoff")` or `mdkb memory add <id> --entry-type handoff`. Use `--file <path>` (CLI) or `source_file` (MCP) to read the content from a file, which saves tokens when agents write handoffs to the filesystem. The file path is persisted as `source_path` metadata. Handoffs have no default TTL; confidence decay handles relevance naturally.
Source types control confidence weighting:
| Source Type | Multiplier | Use Case |
|---|---|---|
| `official_docs` | 1.0 | Verified documentation |
| `user_statement` | 0.85 | Human-stated facts (default) |
| `auto_extracted` | 0.70 | Automated knowledge capture |
| `inference` | 0.65 | AI-inferred knowledge |
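As a rough illustration of how these multipliers weight a score, the sketch below scales a base value by the source type. The multipliers are taken from the table. The `weighted_confidence` function and the idea of a separate `base_score` (standing in for the age and access-count components) are assumptions, since the exact formula is not given here.

```python
SOURCE_MULTIPLIERS = {
    "official_docs": 1.0,
    "user_statement": 0.85,  # default source type
    "auto_extracted": 0.70,
    "inference": 0.65,
}


def weighted_confidence(base_score: float, source_type: str = "user_statement") -> float:
    """Scale a 0-1 base score by the source-type multiplier (illustrative only)."""
    if not 0.0 <= base_score <= 1.0:
        raise ValueError("base_score must be within [0, 1]")
    return base_score * SOURCE_MULTIPLIERS[source_type]
```

Under this model an AI-inferred entry can never score above 0.65, so a `min_confidence` filter set above that value would exclude every inferred entry regardless of how recent it is.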
Tree-sitter parsing for 14 languages: Rust, Go, TypeScript, JavaScript, Python, Java, Kotlin, C, C++, C#, PHP, Swift, Lua, and GDScript.
- Substring search — find symbols by partial name (FTS5 trigram, works from 3 characters)
- Semantic code search — find conceptually similar code using embeddings
- Persistent call graph — function calls, callers, and transitive impact radius survive restarts
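The three-character minimum for substring search follows from how trigram indexes work: a query shorter than three characters produces no trigram to look up. The pure-Python sketch below illustrates the idea. It is not mdkb's implementation, which is described above as using FTS5 trigram indexing. The `TrigramIndex` class and the case-insensitive matching are assumptions.

```python
from collections import defaultdict


def trigrams(text: str) -> set[str]:
    """Return every 3-character window of the lowercased text."""
    lowered = text.lower()
    return {lowered[i:i + 3] for i in range(len(lowered) - 2)}


class TrigramIndex:
    """Toy in-memory trigram index over symbol names."""

    def __init__(self) -> None:
        self._postings: dict[str, set[str]] = defaultdict(set)

    def add(self, name: str) -> None:
        for gram in trigrams(name):
            self._postings[gram].add(name)

    def search(self, query: str) -> list[str]:
        grams = trigrams(query)
        if not grams:
            return []  # fewer than 3 characters: nothing to look up
        # Candidates must contain every trigram of the query.
        candidates = set.intersection(
            *(self._postings.get(gram, set()) for gram in grams)
        )
        # Verify, because shared trigrams do not guarantee a contiguous match.
        needle = query.lower()
        return sorted(name for name in candidates if needle in name.lower())
```

The final verification step matters: a name can contain all of a query's trigrams in a different order without containing the query itself.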
Generate semantic embeddings (downloads ~30MB ONNX model on first run):
mdkb embed

mdkb search "authentication flow"
mdkb search "handler" --scope symbols --kind function
mdkb search "auth handler" --scope code

mdkb collection add <name> <path> [--pattern <glob>]
mdkb collection remove <name>
mdkb collection rename <old> <new>

mdkb get <id|path|slug>
mdkb get 42 --lines 10:50
mdkb get "docs/*.md"

mdkb code index
mdkb code search "handler" --kind fn
mdkb code calls main
mdkb code callers handle_get
mdkb code impact init --depth 5

mdkb memory add auth-patterns -t "OAuth2 PKCE Flow" -T topic --tags auth,security \
  -c "Always use PKCE for public clients..."
mdkb memory add pay-bill -t "Pay electricity bill" -T reminder --due-in 86400 \
-c "Monthly utility payment"
mdkb memory list
mdkb memory search "authentication"
mdkb memory history auth-patterns
# Export all entries to .mdkb/memory/entries/ (one .md file per entry)
mdkb memory export
mdkb memory export --dir ./memories --include-expired --overwrite
# Import from a markdown folder (auto-detected) or legacy JSON file
mdkb memory import .mdkb/memory/entries --skip-duplicates
mdkb memory import entries.json --dry-run --skip-duplicates

`mdkb stats` is the unified diagnostic dashboard introduced in 2.0.0. It replaces the former `mdkb status`, which was removed without an alias.
# Unified ASCII diagnostic dashboard
mdkb stats
# Machine-readable JSON output (safe for pipes and scripts)
mdkb stats --format json
# Plain text (no ANSI color, no Unicode box-drawing)
mdkb stats --no-color

The report is stacked: header (repo, version, db size, last update) → index health → collections → memory (by entry type, reminders DUE / upcoming 7d) → code (by language, top files by tokens) → sessions (totals, top tools) → hooks (slow events last 7d, reindex queue pending). Output auto-detects whether stdout is a TTY; the JSON format is stable for scripting.
Configuration lives in .mdkb/config.toml:
[search]
default_limit = 10
[indexing]
debounce_ms = 100
# When true, the doc/collection walker honors .gitignore.
# When false (default), it reads .mdkbignore instead.
respect_gitignore = false
[code.indexing]
# When true (default), the code walker honors .gitignore.
# When false, it reads .mdkbignore instead.
respect_gitignore = true
[mcp]
max_response_tokens = 50000
max_document_tokens = 10000

Environment overrides: `MDKB_SEARCH_DEFAULT_LIMIT=20`, `MDKB_INDEXING_DEBOUNCE_MS=200`.
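The two override examples suggest a naming pattern of `MDKB_<SECTION>_<KEY>` in upper case. The sketch below applies that inferred pattern to a parsed config. It is a guess at the convention, not mdkb's loader: `apply_env_overrides` is hypothetical, only flat sections such as `[search]` and `[indexing]` are handled, and how dotted sections like `[code.indexing]` map to variable names is left open.

```python
import os
from typing import Mapping, Optional


def apply_env_overrides(
    config: dict[str, dict],
    environ: Optional[Mapping[str, str]] = None,
    prefix: str = "MDKB_",
) -> dict[str, dict]:
    """Return a copy of `config` with MDKB_<SECTION>_<KEY> values applied."""
    environ = os.environ if environ is None else environ
    merged = {section: dict(values) for section, values in config.items()}
    for section, values in merged.items():
        for key, current in values.items():
            name = f"{prefix}{section}_{key}".upper()
            if name not in environ:
                continue
            raw = environ[name]
            if isinstance(current, bool):
                # bool("false") would be True, so parse booleans explicitly.
                values[key] = raw.strip().lower() in ("1", "true", "yes")
            else:
                # Coerce to the type already used in the config file.
                values[key] = type(current)(raw)
    return merged
```

Coercing to the existing value's type keeps `default_limit` an integer even though environment variables are always strings.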
Both the document walker (mdkb update) and the code walker (mdkb code index) share a unified ignore system:
| Mode | Files honored | Use when |
|---|---|---|
| `respect_gitignore = true` | `.gitignore` (+ `# mdkb:index` force-include) | Your ignore rules are already correct for indexing. |
| `respect_gitignore = false` | `.mdkbignore` only | You want to index content that `.gitignore` excludes (e.g. `stories/`, generated sources), or you need a different ignore scope from git. |
Defaults:
- Code indexing: `respect_gitignore = true`. Source trees usually want `.gitignore` honored (skip `target/`, `node_modules/`, etc.).
- Document indexing: `respect_gitignore = false`. Project knowledge often lives in gitignored folders (plans, stories, drafts).
# mdkb:index annotation (only active when respect_gitignore = true):
Force-include a gitignored path by prefixing it with a # mdkb:index comment line in .gitignore:
# mdkb:index
generated/
# mdkb:index
docs/api/*.md

Blank lines between the annotation and the pattern are tolerated. The annotation is case-insensitive.
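The annotation rules can be illustrated with a short parser that extracts the force-included patterns from a `.gitignore` file. This sketch follows the behavior described above (annotation on the preceding comment line, blank lines tolerated, case-insensitive). The function name is hypothetical. Two details are assumptions: an unrelated comment between the annotation and the pattern cancels the annotation, and whitespace after `#` is optional.

```python
def force_included_patterns(gitignore_text: str) -> list[str]:
    """Return the patterns that follow a `# mdkb:index` annotation line."""
    patterns: list[str] = []
    armed = False  # True once an annotation is waiting for its pattern
    for raw in gitignore_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines between annotation and pattern are tolerated
        if line.startswith("#"):
            # Case-insensitive match; any other comment cancels a pending annotation.
            armed = line[1:].strip().lower() == "mdkb:index"
            continue
        if armed:
            patterns.append(line)
        armed = False  # an annotation applies to the next pattern only
    return patterns
```

Each annotation covers exactly one pattern, so every force-included path needs its own `# mdkb:index` line, as in the example above.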
.mdkbignore (only active when respect_gitignore = false):
Uses the same syntax as .gitignore, including !pattern for re-inclusion. Place one at the repo root.
All data stays local in .mdkb/:
.mdkb/
├── config.toml
├── index.sqlite # FTS5 + document metadata
├── code.sqlite # Source code symbols + call graph
└── memory/ # Memory entries (markdown files)
The embedding model (AllMiniLML6V2, ~30MB ONNX) is downloaded on first use and cached locally.
Add .mdkb/ to .gitignore — it can be regenerated with mdkb update && mdkb embed.
MIT