ForgeQL

Declarative code transformation for the era of AI-assisted development.

What Is ForgeQL?

ForgeQL is a declarative, code-aware transformation tool. You describe what you want to find or change in a codebase and ForgeQL executes it precisely — leaving the strategy and file selection to the agent or developer driving it.

Think of it as SQL for source code: a small, expressive query language backed by real syntax trees (tree-sitter), not fragile regular expressions.

It works in two modes:

  • MCP server — connects directly to AI coding agents (GitHub Copilot, Claude, etc.) inside VS Code or any MCP-capable editor.
  • Interpreter — pipe a FQL statement into the binary from a terminal or script.

Real-World Example: Finding Bugs in 60 Seconds

ForgeQL indexes code quality metrics at parse time — magic numbers, complex conditions, missing defaults, dead code, naming violations, and more. Here's what a single session looks like on a real embedded C++ project (14,797 symbols indexed):

USE pisco-code.main

-- 1. Where are the likely bugs hiding?
FIND symbols WHERE has_assignment_in_condition = 'true'
-- Result: 3 locations where = appears inside if() instead of ==

-- 2. Which conditions are too complex to reason about?
FIND symbols WHERE condition_tests >= 4 ORDER BY condition_tests DESC
-- Result: 5 functions with 4+ boolean sub-expressions in a single condition

-- 3. Any switch statements missing a default handler?
FIND symbols WHERE fql_kind = 'switch' WHERE has_catch_all = 'false'
-- Result: 2 switches that silently fall through on unexpected values

-- 4. Mixed && / || without grouping — operator precedence bugs?
FIND symbols WHERE mixed_logic = 'true'
-- Result: 4 conditions mixing AND/OR without parentheses

-- 5. Dead code — functions nobody calls?
FIND symbols WHERE fql_kind = 'function' WHERE usages = 0
  EXCLUDE 'tests/**' EXCLUDE 'vendor/**' IN 'src/**' ORDER BY path ASC
-- Result: 11 functions with no recorded usages (candidates for removal)

-- 6. Risk heat-map — which functions have the most dependents?
FIND symbols WHERE fql_kind = 'function'
  ORDER BY usages DESC LIMIT 5
-- Result: top 5 hotspots — a bug here breaks everything

-- 7. Zoom into one of those hotspots: look up its location, then read only those lines
FIND symbols WHERE name = 'PiscoCode::process'
-- Result: path=src/PiscoCode.cpp, line=87
SHOW LINES 87-103 OF 'src/PiscoCode.cpp'
-- Exactly 17 lines, exactly the function, zero waste

Total cost: 7 queries, ~800 tokens of output. A grep-based approach would need to read every file, parse the results manually, and still miss the semantic issues (mixed logic, assignment-in-condition, missing defaults). ForgeQL finds them because it operates on syntax trees, not text.


Two Core Goals

1. Small Command Surface

ForgeQL is intentionally minimal. Everything is built from four command families:

| Family | Commands |
| --- | --- |
| Session | CREATE SOURCE · REFRESH SOURCE · USE · SHOW SOURCES · SHOW BRANCHES · DISCONNECT |
| Queries | FIND symbols · FIND usages OF · FIND callees OF · FIND files |
| Content | SHOW body · SHOW signature · SHOW outline · SHOW members · SHOW context · SHOW LINES |
| Mutations | CHANGE FILE / CHANGE FILES (with MATCHING, LINES, WITH, or WITH NOTHING) · COPY LINES n-m OF src TO dst [AT LINE k] · MOVE LINES n-m OF src TO dst [AT LINE k] |

Complex workflows — renaming a symbol, applying a coding standard, migrating a pattern — are composed by the agent from these primitives. ForgeQL provides the precision tools; the agent decides the strategy.

2. Small Token Footprint

Every command accepts a universal clause set that shapes the output before it reaches the agent's context window:

WHERE field operator value   -- filter rows
HAVING field operator value  -- filter after GROUP BY
IN 'glob'                    -- restrict to files matching a glob
EXCLUDE 'glob'               -- exclude files matching a glob
ORDER BY field ASC|DESC      -- sort
GROUP BY field               -- aggregate
LIMIT N                      -- cap row count
OFFSET N                     -- paginate
DEPTH N                      -- collapse tree depth

These clauses work identically on every command. Instead of returning thousands of rows for the agent to sift through, a single precise query returns exactly what is needed:

FIND symbols
  WHERE fql_kind = 'function'
  IN 'src/**'
  ORDER BY usages DESC
  LIMIT 10
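
For scripts or agents that assemble queries programmatically, the clause set can be treated as a small builder. The following is a rough Python sketch, not part of ForgeQL: `build_find` and its parameter names are hypothetical, it emits only clauses listed above, and it assumes the clause order used in the example query is acceptable to the parser.

```python
def build_find(target, where=(), in_glob=None, exclude=(), order_by=None,
               limit=None):
    """Compose a FIND statement from the clauses listed above.

    Hypothetical helper: values are passed through verbatim, so callers
    must quote string literals themselves (for example "'function'").
    """
    lines = [f"FIND {target}"]
    # Each condition is a (field, operator, value) triple, one WHERE per row.
    for field, op, value in where:
        lines.append(f"  WHERE {field} {op} {value}")
    if in_glob is not None:
        lines.append(f"  IN '{in_glob}'")
    for glob in exclude:
        lines.append(f"  EXCLUDE '{glob}'")
    if order_by is not None:
        field, direction = order_by
        lines.append(f"  ORDER BY {field} {direction}")
    if limit is not None:
        lines.append(f"  LIMIT {int(limit)}")
    return "\n".join(lines)


query = build_find(
    "symbols",
    where=[("fql_kind", "=", "'function'")],
    in_glob="src/**",
    order_by=("usages", "DESC"),
    limit=10,
)
print(query)  # prints the same five-line query shown above
```

Keeping the builder this thin means the statement text stays visible in logs, which matters when you want to see exactly what was sent.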

Build and Install

Prerequisites

| Tool | Minimum version |
| --- | --- |
| Rust / Cargo | 1.78 |
| Git | 2.x |
| VS Code | 1.90 (for MCP integration) |

tree-sitter grammars are compiled into the binary — no separate install needed.

Clone and Build

git clone https://github.com/andreviegas/ForgeQL.git
cd ForgeQL
cargo build --release

The binary lands at target/release/forgeql (Linux) or target\release\forgeql.exe (Windows).


Usage: MCP Server (VS Code)

This is the primary mode for AI agent use. ForgeQL speaks MCP over stdio; VS Code connects to it automatically once configured.

Linux

Create .vscode/mcp.json in your workspace (or ~/.config/Code/User/mcp.json for a global setup):

{
  "servers": {
    "forgeql": {
      "command": "/home/<your-user>/ForgeQL/target/release/forgeql",
      "args": ["--mcp", "--data-dir", "/your/data-dir"]
    }
  }
}

Windows

Create .vscode/mcp.json in your workspace:

{
  "servers": {
    "forgeql": {
      "command": "C:\\Users\\<YourUser>\\ForgeQL\\target\\release\\forgeql.exe",
      "args": ["--mcp", "--data-dir", "C:\\your\\data-dir"]
    }
  }
}

You can also add "--log-queries" to the args array to write every FQL statement to a log file — useful for debugging what the agent is sending.

After saving, open the Command Palette (Ctrl+Shift+P) and run MCP: Refresh Servers. The ForgeQL tools appear in the Copilot Chat tool list and can be called by any MCP-aware extension.
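
If you set up several workspaces, the file can be generated instead of hand-edited. This is a minimal Python sketch under stated assumptions: the helper names `mcp_config` and `write_mcp_config` are hypothetical, appending `--log-queries` at the end of the args array is a guess at a workable position, and any existing mcp.json in the workspace is overwritten.

```python
import json
from pathlib import Path


def mcp_config(binary, data_dir, log_queries=False):
    """Return the mcp.json structure shown above as a Python dict."""
    args = ["--mcp", "--data-dir", str(data_dir)]
    if log_queries:
        # Assumption: the flag may simply be appended to the args array.
        args.append("--log-queries")
    return {"servers": {"forgeql": {"command": str(binary), "args": args}}}


def write_mcp_config(workspace, binary, data_dir, log_queries=False):
    """Write <workspace>/.vscode/mcp.json (overwriting it) and return its path."""
    target = Path(workspace) / ".vscode" / "mcp.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    config = mcp_config(binary, data_dir, log_queries)
    # json.dumps takes care of escaping backslashes in Windows paths.
    target.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return target
```

A call such as `write_mcp_config(".", "/home/<your-user>/ForgeQL/target/release/forgeql", "/your/data-dir")` would reproduce the Linux example; you still need to run MCP: Refresh Servers afterwards.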


Usage: Interpreter Mode

You can also pipe any FQL statement directly to the binary. This is useful for scripting, quick lookups, and testing without an editor.

echo "SHOW SOURCES" | forgeql --data-dir /tmp/forgeql-lab

echo "FIND symbols WHERE fql_kind = 'function' LIMIT 5" \
  | forgeql --data-dir /tmp/forgeql-lab
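
The same pipe can be driven from a script. The Python sketch below mirrors the shell examples using only the `--data-dir` flag shown above; the function names are hypothetical, it assumes `forgeql` is on your PATH, and the README does not say whether failures are reported through the exit status, so `check=True` only surfaces a non-zero status if there is one.

```python
import subprocess


def forgeql_command(data_dir, binary="forgeql"):
    """Build the argv used in the shell examples above."""
    return [binary, "--data-dir", str(data_dir)]


def run_fql(statement, data_dir, binary="forgeql", timeout=60):
    """Pipe one FQL statement to the binary and return its stdout as text.

    Raises subprocess.CalledProcessError on a non-zero exit status and
    FileNotFoundError if the binary cannot be found.
    """
    result = subprocess.run(
        forgeql_command(data_dir, binary),
        input=statement + "\n",  # echo appends a trailing newline as well
        capture_output=True,
        text=True,
        check=True,
        timeout=timeout,
    )
    return result.stdout
```

For example, `run_fql("SHOW SOURCES", "/tmp/forgeql-lab")` corresponds to the first shell command above.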

Quick Start: Pisco Code v1.3.0

The examples below walk through exploring and modifying Pisco Code, an embedded C++ library, pinned at tag v1.3.0.

All commands work identically whether typed in Copilot Chat (MCP mode) or piped to the binary (interpreter mode).

Register and index the repository

CREATE SOURCE 'pisco' FROM 'https://github.com/pisco-de-luz/Pisco-Code.git'
USE pisco.v1.3.0

ForgeQL clones the repository, builds the tree-sitter index, and caches it on disk. Every subsequent query is served from the in-memory index — no re-reading files.

Explore the structure

-- Top-level file tree
FIND files DEPTH 2

-- Structural outline of a header
SHOW outline OF 'include/PiscoCode.h'

-- All classes defined in the library
FIND symbols
  WHERE fql_kind = 'class'
  ORDER BY name ASC

Find specific symbols

-- All getter methods (names starting with 'get')
FIND symbols
  WHERE fql_kind = 'function'
  WHERE name LIKE 'get%'
  ORDER BY name ASC

-- All #define macros in headers
FIND symbols
  WHERE fql_kind = 'macro'
  IN 'include/**'

Note for power users: fql_kind maps raw tree-sitter node kinds to universal names. When you need exact tree-sitter precision, filter on the node_kind field instead: WHERE node_kind = ... works alongside fql_kind conditions in any query.

Inspect a function

SHOW body OF 'PiscoCode::process'

Every SHOW response includes start_line and end_line. Those values feed directly into a CHANGE FILE ... LINES n-m command, with no round-trip to re-read the file:

{
  "symbol": "PiscoCode::process",
  "file": "src/PiscoCode.cpp",
  "start_line": 87,
  "end_line": 103,
  "content": "void PiscoCode::process(...) { ... }"
}
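
To illustrate how those fields carry over, here is a small Python sketch that turns a SHOW response into a CHANGE FILE ... LINES ... WITH statement of the form used later in this README. The helper name is hypothetical, the response shape is copied from the example above, and quoting rules for text inside WITH '...' are not described here, so the sketch refuses text containing a single quote rather than guess at an escape syntax.

```python
import json


def change_lines_statement(show_response, new_text):
    """Build a CHANGE FILE ... LINES ... WITH statement from a SHOW response.

    Assumes the "file", "start_line" and "end_line" fields shown above.
    """
    if "'" in new_text:
        raise ValueError("new_text contains a single quote; escaping rules unknown")
    return (
        f"CHANGE FILE '{show_response['file']}'\n"
        f"  LINES {show_response['start_line']}-{show_response['end_line']}\n"
        f"  WITH '{new_text}'"
    )


response = json.loads("""
{
  "symbol": "PiscoCode::process",
  "file": "src/PiscoCode.cpp",
  "start_line": 87,
  "end_line": 103,
  "content": "void PiscoCode::process(...) { ... }"
}
""")

print(change_lines_statement(response, "// new implementation"))
```

The line range is taken from the response as-is, so it is only valid until the next edit to that file shifts the lines.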

Audit dead code

-- Functions that are never called
FIND symbols
  WHERE fql_kind = 'function'
  WHERE usages = 0
  IN 'src/**'
  EXCLUDE 'src/tests/**'

-- Usage count per file for a given symbol
FIND usages OF 'PiscoCode::process'
  GROUP BY file
  ORDER BY count DESC

Make changes inside a transaction

Transactions group multiple commands atomically. If VERIFY fails, every modified file is restored automatically.

BEGIN TRANSACTION 'rename-process'
  CHANGE FILES 'src/**/*.cpp', 'include/**/*.h'
    MATCHING 'PiscoCode::process' WITH 'PiscoCode::run'
  VERIFY build 'test'
COMMIT MESSAGE 'rename PiscoCode::process to PiscoCode::run'
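
The all-or-nothing behaviour can be pictured with a short Python model. This is a conceptual sketch only, not how ForgeQL is implemented: `run_transaction` is a hypothetical function that snapshots each file, applies the edits, and restores the snapshots when a caller-supplied verify step fails.

```python
import tempfile
from pathlib import Path


def run_transaction(edits, verify):
    """Apply {path: new_text} edits and keep them only if verify() is truthy.

    Every file is snapshotted first; all snapshots are restored if writing
    or verification raises, or if verify() returns a falsy value.
    """
    snapshots = {Path(p): Path(p).read_text(encoding="utf-8") for p in edits}
    try:
        for path, new_text in edits.items():
            Path(path).write_text(new_text, encoding="utf-8")
        ok = bool(verify())
    except Exception:
        ok = False  # treat any error as a failed transaction
    if not ok:
        for path, original in snapshots.items():
            path.write_text(original, encoding="utf-8")
    return ok


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "PiscoCode.cpp"
    source.write_text("PiscoCode::process();\n", encoding="utf-8")

    # A failing verify step rolls the rename back.
    committed = run_transaction({source: "PiscoCode::run();\n"}, verify=lambda: False)
    restored = source.read_text(encoding="utf-8")

    # A passing verify step keeps it.
    kept = run_transaction({source: "PiscoCode::run();\n"}, verify=lambda: True)
    final = source.read_text(encoding="utf-8")

print(committed, repr(restored), kept, repr(final))
```

In the real tool the verify step is a build or test command, and a successful transaction ends with a commit rather than a return value.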

Run a verify step on demand

VERIFY build can also be used as a standalone command — outside a transaction — to check the current state of the worktree against any step in .forgeql.yaml.

VERIFY build 'test'

The step name refers to an entry under verify_steps in .forgeql.yaml:

# .forgeql.yaml
verify_steps:
  - name: test
    command: "cmake --build build && ctest --test-dir build -R unit"

Edit a specific function body

-- Step 1: get the exact line range (the function renamed from process above; here start_line=87, end_line=103)
SHOW body OF 'PiscoCode::run'

-- Step 2: replace those lines with the new implementation
CHANGE FILE 'src/PiscoCode.cpp'
  LINES 87-103
  WITH 'void PiscoCode::run(Buffer& buffer) {
    for (auto& sample : buffer) {
        sample = this->pipeline.apply(sample);
    }
}'

Remove a deprecated function

-- After SHOW body returns start_line=200, end_line=214
BEGIN TRANSACTION 'remove-legacyHelper'
  CHANGE FILE 'src/PiscoCode.cpp'
    LINES 200-214
    WITH NOTHING
  VERIFY build 'test'
COMMIT MESSAGE 'remove deprecated legacyHelper'

About This Project

ForgeQL was conceived, designed, and validated by Andre Viegas — a C/C++ developer exploring Rust for the first time through this project.

Full transparency: 100% of the Rust code in this repository was initially generated by AI (GitHub Copilot / Claude). The architecture, the ForgeQL language design, the test strategy, and every design decision were mine; the AI translated those decisions into working Rust. This started as a proof of concept to answer a simple question: can a declarative, AST-aware transformation language make AI-assisted coding safer and more efficient?

Early results suggest it can. If you find the idea useful, I'd love help from experienced Rust developers to take it further — improving idiomatic Rust patterns, performance, multi-language support, and anything else that makes ForgeQL a better tool. See CONTRIBUTING.md for how to get involved.


Further Reading

  • doc/syntax.md — complete command and clause reference.
  • doc/architecture.md — internal design: index model, clause pipeline, MCP layer, agent guardrails.
  • doc/agents/ — AI agent integration: Custom Agent files for VS Code Copilot, Claude Code, and Cursor.

AI Agent Integration

ForgeQL ships with distributable agent configuration files that teach AI agents how to use it correctly — preventing drift to local filesystem tools (grep/find/cat) and enforcing precision query patterns.

Three layers of defense against agent drift:

  1. Tool restriction — the VS Code Custom Agent locks the agent to forgeql/* tools only. It literally cannot call grep, find, or cat.
  2. Behavioral instructions: every platform adapter includes the two-step workflow, FIND symbols WHERE ... → SHOW LINES n-m, with no brute-force reading.
  3. MCP server guardrails — SHOW commands returning more than 40 lines without an explicit LIMIT clause are blocked. The agent gets zero lines and a guidance message redirecting it to precision queries. This teaches the right pattern on first contact, even without any agent files installed.

| Platform | File | Tool Lock |
| --- | --- | --- |
| VS Code Copilot | forgeql.agent.md | Yes (tools: [forgeql/*]) |
| Claude Code | CLAUDE.md | No (behavioral + MCP guardrails) |
| Cursor | .cursorrules | No (behavioral + MCP guardrails) |

See doc/agents/README.md for installation instructions.
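
The server-side guardrail (layer 3) amounts to a simple decision rule. The Python sketch below is an illustrative model, not ForgeQL's actual code: the function name and the wording of the guidance message are hypothetical, and it assumes an explicit LIMIT caps the number of lines returned.

```python
MAX_UNBOUNDED_LINES = 40  # threshold quoted in the list above


def guard_show(lines, limit=None):
    """Return (lines_to_send, guidance) for a SHOW result.

    A result longer than 40 lines is withheld unless the statement carried
    an explicit LIMIT, in which case it is capped at that limit.
    """
    if limit is not None:
        return lines[:limit], None
    if len(lines) > MAX_UNBOUNDED_LINES:
        guidance = (
            f"{len(lines)} lines withheld: add LIMIT, or use "
            "FIND symbols and then SHOW LINES n-m"
        )
        return [], guidance
    return lines, None
```

Returning zero lines together with a message, instead of a truncated result, is what pushes the agent toward a narrower query on its next call.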


License

Apache License 2.0 — see LICENSE.
