71 changes: 28 additions & 43 deletions README.md
@@ -50,24 +50,22 @@ One command sets up everything. `install` detects which AI coding tools you have

To target a specific platform:

```bash

code-review-graph install --platform codex # configure only Codex
code-review-graph install --platform cursor # configure only Cursor
code-review-graph install --platform claude-code # configure only Claude Code
code-review-graph install --platform gemini-cli # configure only Gemini CLI
code-review-graph install --platform kiro # configure only Kiro
code-review-graph install --platform copilot # configure only GitHub Copilot (VS Code)
code-review-graph install --platform copilot-cli # configure only GitHub Copilot CLI
```

Requires Python 3.10+. For the best experience, install [uv](https://docs.astral.sh/uv/): the MCP config uses `uvx` if it is available and otherwise falls back to calling the `code-review-graph` command directly.
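
The `uvx` fallback can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not the installer's actual code: the function name `mcp_server_command`, the shape of the returned dictionary, and the `uvx code-review-graph serve` invocation are all hypothetical.

```python
import shutil


def mcp_server_command(which=shutil.which):
    """Pick a launcher for the MCP server entry (illustrative only)."""
    if which("uvx"):
        # Assumed invocation: run the package through uvx when it is on PATH.
        return {"command": "uvx", "args": ["code-review-graph", "serve"]}
    # Fallback: call the installed console script directly.
    return {"command": "code-review-graph", "args": ["serve"]}


print(mcp_server_command())
```

Passing `which` as a parameter keeps the lookup easy to stub out, which is a design choice of this sketch rather than something the README describes.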

Then open your project and ask your AI assistant:

```
Build the code review graph for this project
```

The initial build takes ~10 seconds for a 500-file project. After that, watch mode and supported hooks can keep the graph updated automatically.


@@ -129,20 +127,21 @@ All numbers come from the automated evaluation runner against 6 real open-source
<summary><strong>Token efficiency: 38x – 528x fewer tokens per question (whole-corpus vs graph query)</strong></summary>
<br>

For a typical agent question (`"how does authentication work"`, `"what is the main entry point"`, etc.), the graph returns ~2,000–3,500 tokens of targeted search hits + neighbor edges instead of forcing the agent to read every source file. The table below averages over the 5 sample questions defined in `code_review_graph/token_benchmark.py`.

| Repo | Snapshot SHA | naive_corpus_tokens | avg graph_tokens | Reduction |
|------|---|-----------------:|----------------:|----------:|
| fastapi | `0227991a` | 951,071 | 2,169 | **528.4x** |
| code-review-graph | `84bde354` | 208,821 | 2,495 | **93.0x** |
| gin | `5c00df8a` | 166,868 | 1,990 | **91.8x** |
| flask | `a29f88ce` | 125,022 | 1,986 | **71.4x** |
| express | `b4ab7d65` | 135,955 | 3,465 | **40.6x** |
| httpx | `b55d4635` | 89,492 | 2,438 | **38.0x** |

Range across the 6 repos: **38x – 528x** (median per-question reduction ~82x).
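
Note that the Reduction column is not simply `naive_corpus_tokens / avg graph_tokens`: for fastapi that quotient is 951,071 / 2,169 ≈ 438.5x, not 528.4x. The reference to a per-question median suggests the ratio is computed per question and then aggregated. The sketch below shows how the two aggregations diverge. It assumes that reading, and the per-question token counts are made-up illustrative values, not benchmark data.

```python
from statistics import mean


def reduction_stats(corpus_tokens, graph_tokens_per_question):
    """Compare two ways of aggregating a per-question token reduction."""
    ratios = [corpus_tokens / g for g in graph_tokens_per_question]
    return {
        # Corpus size divided by the average graph response size.
        "ratio_of_means": corpus_tokens / mean(graph_tokens_per_question),
        # Average of the individual per-question reductions.
        "mean_of_ratios": mean(ratios),
    }


# Hypothetical numbers: a 100,000-token corpus and 5 graph responses.
stats = reduction_stats(100_000, [1_000, 2_000, 4_000, 5_000, 8_000])
print(stats)  # {'ratio_of_means': 25.0, 'mean_of_ratios': 41.5}
```

Averaging per-question ratios is never smaller than the ratio of averages, so a few very compact graph responses pull the headline figure upward.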

The formal `eval/benchmarks/token_efficiency.py` benchmark measures a different scenario — full `get_review_context()` JSON versus just the changed-file content of a commit — and reports ratios below 1 for small commits, because the review-context response carries impact-radius edges plus source snippets that exceed a tiny single-file diff. That is not a bug; the two benchmarks answer different questions. See [`docs/REPRODUCING.md`](docs/REPRODUCING.md) for the full methodology.

Since v2.3.4, review and impact tools attach a compact `context_savings` estimate so MCP clients can see the approximate context saved per call. In v2.3.5 the CLI surfaces this as the boxed `Token Savings` panel shown above (see "Token Savings panel" in the Usage section) and adds `--verify` to cross-check against OpenAI's `cl100k_base` tokenizer. Calibration data in [`docs/REPRODUCING.md`](docs/REPRODUCING.md) shows the estimate is within ~1% of real GPT-4 tokens in aggregate across 222 sample files.

@@ -248,7 +247,7 @@ Blast-radius analysis reaches 100% recall on every one of the 13 evaluation comm
<summary><strong>CLI reference</strong></summary>
<br>

```bash

code-review-graph install # Auto-detect and configure all platforms
code-review-graph install --platform <name> # Target a specific platform
code-review-graph build # Parse entire codebase
@@ -272,11 +271,8 @@ code-review-graph daemon stop # Stop the daemon
code-review-graph daemon status # Show daemon status and repos
code-review-graph eval # Run evaluation benchmarks
code-review-graph serve # Start MCP server
```

</details>

<details>
<summary><strong>Token Savings panel: <code>detect-changes --brief</code> vs <code>update --brief</code></strong></summary>
<br>

@@ -319,7 +315,6 @@ The daemon is included with `code-review-graph` — no separate install required

**Quick setup:**

```bash
# 1. Register the repos you want to watch
crg-daemon add ~/project-a --alias proj-a
crg-daemon add ~/project-b
@@ -331,22 +326,21 @@ crg-daemon start
crg-daemon status # check daemon and per-repo watcher status
crg-daemon logs --repo proj-a -f # tail logs for a specific repo
crg-daemon stop # stop daemon and all watcher processes
```

Also available as `code-review-graph daemon start|stop|status|...`.

Under the hood, `crg-daemon add` writes to a TOML config file at
`~/.code-review-graph/watch.toml`. You can also edit this file directly:

```toml

[[repos]]
path = "/home/user/project-a"
alias = "proj-a"

[[repos]]
path = "/home/user/project-b"
alias = "project-b"
```


The daemon monitors this config file for changes and automatically starts/stops
watcher processes as repos are added or removed. Health checks every 30 seconds
@@ -407,26 +401,24 @@ Your AI assistant uses these automatically once the graph is built.

To exclude paths from indexing, create a `.code-review-graphignore` file in your repository root:

```

generated/**
*.generated.ts
vendor/**
node_modules/**
```


Note: in git repos, only tracked files are indexed (`git ls-files`), so gitignored files are skipped automatically. Use `.code-review-graphignore` to exclude tracked files or when git isn't available.
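
To get a feel for which paths the example patterns would catch, here is a rough Python sketch using the standard `fnmatch` module. The tool's real matching rules may differ, for example for directories nested deeper in the tree. `is_ignored` is a hypothetical helper, not part of `code-review-graph`.

```python
from fnmatch import fnmatchcase

# Patterns copied from the example ignore file above.
IGNORE_PATTERNS = ["generated/**", "*.generated.ts", "vendor/**", "node_modules/**"]


def is_ignored(rel_path, patterns=IGNORE_PATTERNS):
    """Approximate check: does a repo-relative path match any pattern?"""
    posix_path = rel_path.replace("\\", "/")  # normalise Windows separators
    # fnmatchcase treats '*' as "any characters", including '/'.
    return any(fnmatchcase(posix_path, pattern) for pattern in patterns)


for candidate in ["generated/api/client.py", "src/models/user.generated.ts", "src/main.py"]:
    print(candidate, is_ignored(candidate))
```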

Optional dependency groups:

```bash
pip install code-review-graph[embeddings] # Local vector embeddings (sentence-transformers)
pip install code-review-graph[google-embeddings] # Google Gemini embeddings
pip install code-review-graph[communities] # Community detection (igraph)
pip install code-review-graph[enrichment] # Python call-resolution enrichment (Jedi)
pip install code-review-graph[eval] # Evaluation benchmarks (matplotlib)
pip install code-review-graph[wiki] # Wiki generation with LLM summaries (ollama)
pip install code-review-graph[all] # All optional dependencies
```

### Environment Variables

@@ -457,16 +449,13 @@ pip install code-review-graph[all] # All optional dependencies
OpenAI-compatible embeddings (real OpenAI, Azure, or any self-hosted gateway like
new-api / LiteLLM / vLLM / LocalAI / Ollama in openai mode) need no extra install —
just set the environment variables and pass `provider="openai"` to `embed_graph`:

```bash
export CRG_OPENAI_BASE_URL=http://127.0.0.1:3000/v1 # or https://api.openai.com/v1
export CRG_OPENAI_API_KEY=sk-...
export CRG_OPENAI_MODEL=text-embedding-3-small # whatever your gateway serves
# optional:
export CRG_OPENAI_DIMENSION=1536 # pin dim (v3 models support reduction)
export CRG_OPENAI_BATCH_SIZE=100 # lower for gateways with tight limits
# (e.g. Qwen text-embedding-v4 caps at 10)
```

The cloud-egress warning is auto-skipped when the base URL points to localhost
(`127.0.0.1`, `localhost`, `0.0.0.0`, `::1`).
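
The localhost rule can be sketched in a few lines of Python. This is an illustration of the behaviour described above, not the project's code; `is_local_base_url` and `LOCAL_HOSTS` are hypothetical names.

```python
from urllib.parse import urlparse

# Hosts the README lists as local.
LOCAL_HOSTS = {"127.0.0.1", "localhost", "0.0.0.0", "::1"}


def is_local_base_url(base_url):
    """Return True when the URL's host is one of the local addresses."""
    # urlparse lowercases the host and strips the brackets around IPv6 literals.
    host = urlparse(base_url).hostname
    return host in LOCAL_HOSTS


print(is_local_base_url("http://127.0.0.1:3000/v1"))
print(is_local_base_url("https://api.openai.com/v1"))
```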
@@ -493,18 +482,18 @@ CRG exposes 30 MCP tools by default. In token-constrained environments, you can
limit the server to a subset of tools using `--tools` or the `CRG_TOOLS`
environment variable:

```bash
# Via CLI flag
code-review-graph serve --tools query_graph_tool,semantic_search_nodes_tool,detect_changes_tool

# Via environment variable
CRG_TOOLS=query_graph_tool,semantic_search_nodes_tool code-review-graph serve
```


The CLI flag takes precedence over the environment variable. When neither is set,
all tools are available. This is especially useful for MCP client configurations:

```json

{
"mcpServers": {
"code-review-graph": {
@@ -513,11 +502,8 @@ all tools are available. This is especially useful for MCP client configurations
}
}
}
```
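
The precedence rule described above (CLI flag over `CRG_TOOLS`, all tools when neither is set) can be sketched in Python as follows. This is a simplified model, not the server's implementation: `resolve_enabled_tools` is a hypothetical helper, and returning `None` to mean "all tools" is a convention of this sketch.

```python
import os


def resolve_enabled_tools(cli_tools=None, environ=os.environ):
    """Return the selected tool names, or None when every tool is enabled."""
    # The --tools value wins; CRG_TOOLS is consulted only when it is absent.
    raw = cli_tools if cli_tools else environ.get("CRG_TOOLS")
    if not raw:
        return None  # neither source set: expose all tools
    return [name.strip() for name in raw.split(",") if name.strip()]


print(resolve_enabled_tools("query_graph_tool", {"CRG_TOOLS": "detect_changes_tool"}))
```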

</details>

---

## Troubleshooting

@@ -530,7 +516,7 @@ Installing from a **source tree** (for example `pipx install .`) needs build dep
1. Run the same command from **macOS Terminal.app** (or iTerm) instead of the IDE’s terminal, then retry `pipx install .` or `pipx install "git+https://..."`.
2. Use **[uv](https://docs.astral.sh/uv/)** to install the CLI from a checkout (uses different download machinery than `pip` in many cases):

```bash

cd /path/to/code-review-graph
uv tool install . --force
```
@@ -544,23 +530,22 @@ If you are using Windows and encounter `Invalid JSON: EOF while parsing` or `MCP

Ensure `fastmcp` is updated to at least `3.2.4`. Then configure your `~/.claude.json` to execute the `.exe` directly and pass the UTF-8 environment variable via the config:

```json
"code-review-graph": {
"command": "C:\\path\\to\\your\\venv\\Scripts\\code-review-graph.exe",
"args": ["serve", "--repo", "C:\\path\\to\\your\\project"],
"env": { "PYTHONUTF8": "1" }
}
```

## Contributing

```bash

git clone https://github.com/tirth8205/code-review-graph.git
cd code-review-graph
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest
```


<details>
<summary><strong>Adding a new language</strong></summary>
@@ -572,8 +557,8 @@ Edit `code_review_graph/parser.py` and add your extension to `EXTENSION_TO_LANGU

## Licence

MIT. See [LICENSE](LICENSE).

MIT 2026-3020 See [LICENSE](LICENSE).
© JOYANNDAUBA
<p align="center">
<br>
<a href="https://code-review-graph.com">code-review-graph.com</a><br><br>