Commit 0816ea5

Merge pull request #26 from mars167/feat/query-files-cli
feat(cli): standardize CLI output format and add documentation
2 parents 40d1986 + 5e689bc commit 0816ea5

7 files changed

Lines changed: 267 additions & 50 deletions

CLAUDE.md

Lines changed: 203 additions & 0 deletions
@@ -0,0 +1,203 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

git-ai is a local code understanding tool that builds a semantic layer for codebases using retrieval-augmented generation (RAG) techniques. It combines vector search (LanceDB) with graph-based analysis (CozoDB) so that AI agents can understand code structure and relationships beyond simple text search.

**Key Design Principle**: Indices travel with code in Git repos. Check out any branch or tag and the semantic index for that version is immediately available without rebuilding.

## Development Commands

```bash
# Build
npm run build              # Compile TypeScript to dist/

# Development run
npm run start -- --help    # Run directly with ts-node

# Testing
npm test                   # Full test suite (build, rebuild the index, then node --test)
npm run test:cli           # CLI-specific tests
npm run test:parser        # Parser verification

# Global install for local testing
npm i -g .
```

**Important**: After building, test with the compiled CLI to verify packaging:
```bash
node dist/bin/git-ai.js --help
```

## Architecture Overview

### Three-Layer Architecture

```
CLI Layer (src/cli/)

Core Layer (src/core/)

Data Layer (LanceDB + CozoDB)
```

**CLI Layer** (`src/cli/`):
- **Commands**: Commander.js command definitions in `cli/commands/`
- **Handlers**: Business logic in `cli/handlers/` (one per command type)
- **Schemas**: Zod validation schemas in `cli/schemas/`
- **Types**: CLI-specific types and the `executeHandler` wrapper in `cli/types.ts`

**Core Layer** (`src/core/`):
- **indexer.ts / indexerIncremental.ts**: Parallel indexing with worker pools
- **lancedb.ts**: Vector database (SQ8-quantized embeddings)
- **cozo.ts / astGraph.ts**: Graph database for AST relationships
- **parser.ts**: Tree-sitter based multi-language parsing
- **embedding.ts**: ONNX-based semantic embeddings
- **search.ts**: Multi-strategy retrieval (vector + graph + hybrid)
- **repoMap.ts**: PageRank-based importance scoring

### Data Flow

**Indexing**: Source files → Tree-sitter AST → Embeddings + Symbol extraction → LanceDB (chunks) + CozoDB (refs)

**Search**: Query → Classification → Multi-strategy retrieval → Reranking → Results

### Standard CLI Output Format

All CLI commands output JSON for agent readability:

**Success**:
```json
{
  "ok": true,
  "command": "semantic",
  "repoRoot": "/path/to/repo",
  "timestamp": "2024-01-01T00:00:00Z",
  "duration_ms": 123,
  "data": { ... }
}
```

**Error**:
```json
{
  "ok": false,
  "reason": "index_not_found",
  "message": "No semantic index found",
  "command": "semantic",
  "hint": "Run 'git-ai ai index --overwrite' to create an index"
}
```

See `src/cli/types.ts` for `CLIResult`, `CLIError`, `ErrorReasons`, and `ErrorHints`.
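
The two envelopes above can be modeled as a discriminated union on `ok`. The following is a minimal sketch inferred only from the JSON examples; the type and helper names (`CLISuccess`, `CLIFailure`, `makeSuccess`, `makeFailure`, `summarize`) are hypothetical and are not the definitions in `src/cli/types.ts`, which may differ.

```typescript
// Hypothetical shapes inferred from the JSON examples above.
// The real CLIResult / CLIError types live in src/cli/types.ts.
interface CLISuccess<T> {
  ok: true;
  command: string;
  repoRoot: string;
  timestamp: string; // ISO 8601 string
  duration_ms: number;
  data: T;
}

interface CLIFailure {
  ok: false;
  reason: string;
  message: string;
  command: string;
  hint?: string;
}

type CLIEnvelope<T> = CLISuccess<T> | CLIFailure;

// Build a success envelope; startedAt is a Date.now() value taken before the work began.
function makeSuccess<T>(command: string, repoRoot: string, startedAt: number, data: T): CLISuccess<T> {
  return {
    ok: true,
    command,
    repoRoot,
    timestamp: new Date().toISOString(),
    duration_ms: Date.now() - startedAt,
    data,
  };
}

// Build an error envelope; the hint key is omitted entirely when no hint is given.
function makeFailure(command: string, reason: string, message: string, hint?: string): CLIFailure {
  return { ok: false, reason, message, command, ...(hint ? { hint } : {}) };
}

// Consumers narrow on `ok` before reading `data` or `reason`.
function summarize<T>(result: CLIEnvelope<T>): string {
  return result.ok
    ? `${result.command} succeeded in ${result.duration_ms} ms`
    : `${result.command} failed (${result.reason}): ${result.message}`;
}
```

An agent consuming the CLI would typically parse stdout as JSON, branch on `ok`, and surface `hint` to the user when present.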

## Key Files by Purpose

### Entry Points
- `bin/git-ai.ts`: Main CLI; proxies to git for non-AI commands and registers the `ai` command
- `src/commands/ai.ts`: AI command registry (all `git-ai ai *` subcommands)

### Indexing System
- `src/core/indexer.ts`: Parallel indexing with HNSW vector index
- `src/core/indexerIncremental.ts`: Smart rebuild strategies
- `src/core/parser.ts`: Multi-language Tree-sitter adapters
- `src/core/embedding.ts`: ONNX runtime for local embeddings
- `src/core/lancedb.ts`: LanceDB management (chunks table)
- `src/core/sq8.ts`: Vector quantization for storage efficiency
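
To illustrate what 8-bit scalar quantization ("SQ8") means in general, here is a minimal sketch of a generic per-vector min/max scheme. It is not taken from `src/core/sq8.ts`; the names `QuantizedVector`, `quantizeSQ8`, and `dequantizeSQ8` and the storage layout are assumptions made for illustration.

```typescript
// Generic per-vector 8-bit scalar quantization (illustrative only;
// the project's actual scheme in src/core/sq8.ts may differ).
interface QuantizedVector {
  min: number;       // smallest value in the original vector
  scale: number;     // (max - min) / 255, or 0 when all values are equal
  codes: Uint8Array; // one byte per dimension
}

function quantizeSQ8(vec: number[]): QuantizedVector {
  if (vec.length === 0) return { min: 0, scale: 0, codes: new Uint8Array(0) };
  let min = Infinity;
  let max = -Infinity;
  for (const v of vec) {
    if (v < min) min = v;
    if (v > max) max = v;
  }
  const scale = (max - min) / 255;
  const codes = new Uint8Array(vec.length);
  vec.forEach((v, i) => {
    // Map [min, max] onto [0, 255]; clamp to guard against floating-point overshoot.
    codes[i] = scale === 0 ? 0 : Math.min(255, Math.round((v - min) / scale));
  });
  return { min, scale, codes };
}

function dequantizeSQ8(q: QuantizedVector): number[] {
  // Each reconstructed value is within scale / 2 of the original.
  return Array.from(q.codes, (c) => q.min + c * q.scale);
}
```

Storing one byte per dimension instead of a 32-bit float cuts vector storage roughly fourfold, at the cost of a bounded rounding error per component.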

### Search & Retrieval
- `src/core/search.ts`: Query classification and multi-strategy routing
- `src/core/symbolSearch.ts`: Symbol-based search functionality
- `src/core/astGraphQuery.ts`: Graph-based call relationship queries

### Graph Database
- `src/core/cozo.ts`: CozoDB interface (refs table)
- `src/core/astGraph.ts`: AST graph construction

### Repository Management
- `src/core/git.ts`: Git repository handling
- `src/core/workspace.ts`: Workspace path resolution
- `src/core/manifest.ts`: Index versioning and compatibility checking
- `src/core/indexCheck.ts`: Index validation

### Archive & Distribution
- `src/core/archive.ts`: Pack/unpack index archives (.git-ai/lancedb.tar.gz)
- `src/core/lfs.ts`: Git LFS integration for index storage

### MCP Server
- `src/mcp/server.ts`: MCP server implementation (stdio + HTTP modes)
- `src/mcp/handlers/`: MCP tool implementations
- `src/mcp/tools/`: MCP tool registry

## MCP Integration

The MCP Server enables AI Agents to query git-ai indices. Every MCP tool requires a `path` parameter that names the target repository. There is no implicit repository selection, so each call is atomic and self-contained.

**Two modes**:
- **stdio mode** (default): Single-agent connection
- **HTTP mode** (`--http`): Multiple concurrent agents with session management

## Language Support

Supported languages are defined in `src/core/parser.ts`:
- TypeScript/JavaScript (`.ts`, `.tsx`, `.js`, `.jsx`)
- Java (`.java`)
- Python (`.py`)
- Go (`.go`)
- Rust (`.rs`)
- C (`.c`, `.h`)
- Markdown (`.md`, `.mdx`)
- YAML (`.yml`, `.yaml`)

Each language has a separate LanceDB table with its own HNSW index.
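
As a rough illustration of how a file could be routed to its language (and so to that language's table), the sketch below builds a lookup from the extension list above. The language labels (`'ts'`, `'markdown'`, and so on) and the names `EXTENSION_TO_LANG` and `detectLang` are hypothetical; the actual mapping in `src/core/parser.ts` may use different identifiers.

```typescript
// Illustrative extension table derived from the list above.
// Language labels are assumptions, not the parser's real identifiers.
const EXTENSION_TO_LANG: Record<string, string> = {
  '.ts': 'ts', '.tsx': 'ts', '.js': 'ts', '.jsx': 'ts', // listed as one TypeScript/JavaScript group
  '.java': 'java',
  '.py': 'python',
  '.go': 'go',
  '.rs': 'rust',
  '.c': 'c', '.h': 'c',
  '.md': 'markdown', '.mdx': 'markdown',
  '.yml': 'yaml', '.yaml': 'yaml',
};

// Returns the language label for a path, or undefined for unsupported files.
function detectLang(filePath: string): string | undefined {
  const lastSep = Math.max(filePath.lastIndexOf('/'), filePath.lastIndexOf('\\'));
  const base = filePath.slice(lastSep + 1);
  const dot = base.lastIndexOf('.');
  if (dot <= 0) return undefined; // no extension, or a dotfile such as ".gitignore"
  return EXTENSION_TO_LANG[base.slice(dot).toLowerCase()];
}
```

Files whose extension is not in the table return `undefined` and would simply be skipped by an indexer built this way.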

## File Filtering

Indexing applies three filter mechanisms, listed from highest to lowest priority:
1. `.aiignore` - Highest priority, explicit exclusions
2. `.git-ai/include.txt` - Force-include, overrides `.gitignore`
3. `.gitignore` - Standard Git ignore patterns

Pattern syntax: `**` (any dirs), `*` (any chars), `directory/` (entire dir)
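
The priority order can be sketched as a three-step decision. This is a minimal illustration, assuming each filter file has already been compiled into a predicate; `FilterMatchers` and `shouldIndex` are hypothetical names, and pattern matching itself is deliberately left out.

```typescript
// Each matcher answers "does this path match a pattern in that file?".
// How patterns are parsed and matched is out of scope for this sketch.
interface FilterMatchers {
  aiignore: (path: string) => boolean;     // patterns from .aiignore
  forceInclude: (path: string) => boolean; // patterns from .git-ai/include.txt
  gitignore: (path: string) => boolean;    // patterns from .gitignore
}

function shouldIndex(path: string, m: FilterMatchers): boolean {
  if (m.aiignore(path)) return false;    // 1. explicit exclusion always wins
  if (m.forceInclude(path)) return true; // 2. force-include overrides .gitignore
  return !m.gitignore(path);             // 3. otherwise defer to .gitignore
}

// Usage with simple prefix-based predicates standing in for real pattern matching.
const matchers: FilterMatchers = {
  aiignore: (p) => p.startsWith('secrets/'),
  forceInclude: (p) => p.startsWith('generated/') || p.startsWith('secrets/'),
  gitignore: (p) => p.startsWith('generated/') || p.startsWith('node_modules/'),
};
```

With these matchers, `generated/api.ts` is indexed despite being git-ignored, while `secrets/key.ts` stays excluded even though the include list also matches it.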

## Testing

Tests are located in `test/` with multiple formats (`.test.mjs`, `.test.ts`, `.test.js`).

Run single tests with Node's native test runner:
```bash
node --test test/cliCommands.test.js
```

## Native Dependencies

This project uses native modules that may need build tools:
- `@lancedb/lancedb` - Vector database (platform-specific prebuilt binaries)
- `cozo-node` - Graph database
- `onnxruntime-node` - ONNX runtime
- `tree-sitter-*` - Language parsers

If native builds fail, ensure:
- Node.js >= 18
- Build tools installed (Windows: Visual Studio Build Tools, Linux: build-essential)

## Common Tasks

**Add a new CLI command**:
1. Create handler in `src/cli/handlers/yourHandler.ts`
2. Create Zod schema in `src/cli/schemas/` (optional)
3. Register in `src/cli/registry.ts`
4. Add Commander command in `src/cli/commands/yourCommand.ts`
5. Register in `src/commands/ai.ts`

**Add language support**:
1. Add Tree-sitter grammar in `package.json` dependencies
2. Extend `src/core/parser.ts` with new language adapter
3. Test with `npm run test:parser`

**Add MCP tool**:
1. Create handler in `src/mcp/handlers/`
2. Register in `src/mcp/tools/`
3. Export from `src/mcp/server.ts`

package.json

Lines changed: 2 additions & 1 deletion
@@ -11,7 +11,8 @@
   "scripts": {
     "build": "tsc",
     "start": "ts-node bin/git-ai.ts",
-    "test": "npm run build && node dist/bin/git-ai.js ai index --overwrite && node --test test/*.test.mjs test/*.test.ts",
+    "test": "npm run build && node dist/bin/git-ai.js ai index --overwrite && node --test test/*.test.mjs test/*.test.ts test/*.test.js",
+    "test:cli": "bash test-cli.sh",
     "test:parser": "ts-node test/verify_parsing.ts"
   },
   "files": [

src/cli/handlers/queryFilesHandlers.ts

Lines changed: 9 additions & 3 deletions
@@ -10,7 +10,6 @@ import type { SearchFilesInput } from '../schemas/queryFilesSchemas';
 import {
   isCLIError,
   buildRepoMapAttachment,
-  filterWorkspaceRowsByLang,
 } from './sharedHelpers';
 
 function escapeQuotes(s: string): string {
@@ -249,11 +248,18 @@ export async function handleSearchFiles(input: SearchFilesInput): Promise<CLIRes
 
     const repoMap = input.withRepoMap ? await buildRepoMapAttachment(ctx.repoRoot, input) : undefined;
 
+    const files = rows.map(r => ({
+      path: String(r.file || ''),
+      symbol: String(r.symbol || ''),
+      kind: String(r.kind || ''),
+      lang: String(r.lang || ''),
+    }));
+
     return success({
       repoRoot: ctx.repoRoot,
-      count: rows.length,
+      count: files.length,
       lang: input.lang,
-      rows,
+      files,
       ...(repoMap ? { repo_map: repoMap } : {}),
     });
   } catch (e) {
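
The hunk above replaces the raw `rows` payload with a normalized `files` array. The following standalone sketch shows that normalization in isolation; `RawRow`, `FileHit`, `text`, and `toFileHits` are hypothetical names introduced here, and the real row type returned by the query is not visible in this diff.

```typescript
// Hypothetical input shape: the actual rows come from a database query
// whose type is not shown in the diff.
interface RawRow {
  file?: unknown;
  symbol?: unknown;
  kind?: unknown;
  lang?: unknown;
  [extra: string]: unknown;
}

interface FileHit {
  path: string;
  symbol: string;
  kind: string;
  lang: string;
}

// Same result as String(v || ''): any falsy value (null, undefined, 0, '') becomes ''.
const text = (v: unknown): string => (v ? String(v) : '');

function toFileHits(rows: RawRow[]): FileHit[] {
  return rows.map((r) => ({
    path: text(r.file), // note the rename: `file` in the row becomes `path` in the output
    symbol: text(r.symbol),
    kind: text(r.kind),
    lang: text(r.lang),
  }));
}
```

Normalizing to a fixed set of string fields gives consumers a stable output shape regardless of which extra columns the underlying query returns.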

src/core/lfs.ts

Lines changed: 4 additions & 4 deletions
@@ -1,7 +1,7 @@
 import { spawnSync } from 'child_process';
 
-function runGit(args: string[], cwd: string) {
-  const res = spawnSync('git', args, { cwd, stdio: 'inherit' });
+function runGit(args: string[], cwd: string, silent: boolean = false) {
+  const res = spawnSync('git', args, { cwd, stdio: silent ? 'ignore' : 'inherit' });
   if (res.status !== 0) throw new Error(`git ${args.join(' ')} failed`);
 }
 
@@ -18,7 +18,7 @@ export function isGitLfsInstalled(cwd: string): boolean {
 
 export function ensureLfsTracking(cwd: string, pattern: string): { tracked: boolean } {
   if (!isGitLfsInstalled(cwd)) return { tracked: false };
-  runGit(['lfs', 'track', pattern], cwd);
-  runGit(['add', '.gitattributes'], cwd);
+  runGit(['lfs', 'track', pattern], cwd, true);
+  runGit(['add', '.gitattributes'], cwd, true);
   return { tracked: true };
 }

test-cli.sh

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
#!/bin/bash
if [ -f test/cliCommands.test.js ]; then
  npm run build && node --test test/cliCommands.test.js
else
  echo "cliCommands.test.js not found (skipping CLI tests)"
  exit 0
fi

test/e2e.test.js

Lines changed: 3 additions & 4 deletions
@@ -86,10 +86,9 @@ test('git-ai works in Spring Boot and Vue repos', async () => {
   runOk('node', [CLI, 'ai', 'agent', 'install'], repo);
   assert.ok(runOk('node', [CLI, 'ai', 'agent', 'install', '--overwrite'], repo).status === 0);
   {
-    const skill = await fs.readFile(path.join(repo, '.agents', 'skills', 'git-ai-mcp', 'SKILL.md'), 'utf-8');
-    const rule = await fs.readFile(path.join(repo, '.agents', 'rules', 'git-ai-mcp', 'RULE.md'), 'utf-8');
-    assert.ok(skill.includes('git-ai-mcp'));
-    assert.ok(rule.includes('git-ai-mcp'));
+    // git-ai-code-search has SKILL.md but no RULE.md, so only check SKILL
+    const skill = await fs.readFile(path.join(repo, '.agents', 'skills', 'git-ai-code-search', 'SKILL.md'), 'utf-8');
+    assert.ok(skill.includes('git-ai-code-search'), 'git-ai-code-search skill should be installed');
   }
   runOk('git', ['add', '.git-ai/meta.json', '.git-ai/lancedb.tar.gz'], repo);
   runOk('git', ['commit', '-m', 'add git-ai index'], repo);
