Commit 131d7bb

committed
fix
1 parent 5d4c8dd commit 131d7bb

6 files changed

+282
-38
lines changed

docs/PR_DATABASE_ENGINE_PLUGINS.md

Lines changed: 258 additions & 0 deletions
@@ -0,0 +1,258 @@
1+
# Add database engine plugins (external engines)
2+
3+
This PR adds support for **database engine plugins**: external processes that implement a single `Parse` RPC and allow sqlc to work with databases that are not built-in (e.g. CockroachDB, TiDB, or custom SQL dialects). The plugin contract is deliberately minimal: no AST, no compiler in the middle, and a straight path from plugin output to codegen.
4+
5+
---
6+
7+
## Pipeline: built-in engine vs external plugin
8+
9+
### Built-in engine (PostgreSQL, MySQL, SQLite)
10+
11+
```mermaid
12+
flowchart LR
13+
subgraph input
14+
schema[schema.sql]
15+
queries[queries.sql]
16+
end
17+
18+
subgraph sqlc_core
19+
parser[Parser]
20+
ast[(AST)]
21+
compiler[Compiler]
22+
catalog[(Catalog)]
23+
codegen_input[Queries + types]
24+
end
25+
26+
subgraph output
27+
codegen[Codegen plugin]
28+
end
29+
30+
schema --> parser
31+
queries --> parser
32+
parser --> ast
33+
ast --> compiler
34+
schema --> catalog
35+
catalog --> compiler
36+
compiler --> codegen_input
37+
codegen_input --> codegen
38+
```
39+
40+
- Parser produces an **intermediate AST**.
41+
- **Compiler** resolves types, expands `*`, validates against catalog, produces Queries.
42+
- Codegen receives already-compiled queries and types.
43+
44+
### External engine plugin
45+
46+
```mermaid
47+
flowchart LR
48+
subgraph input
49+
schema[schema.sql or connection]
50+
queries[queries.sql]
51+
end
52+
53+
subgraph sqlc
54+
adapter[engine process runner]
55+
end
56+
57+
subgraph plugin["Engine plugin (external process)"]
58+
parse[Parse]
59+
end
60+
61+
subgraph plugin_output["Plugin returns"]
62+
sql[SQL text]
63+
params[parameters]
64+
cols[columns]
65+
end
66+
67+
subgraph codegen_path["To codegen"]
68+
codegen_input[SQL + params + columns]
69+
codegen[Codegen plugin]
70+
end
71+
72+
schema --> adapter
73+
queries --> adapter
74+
adapter -->|"ParseRequest{sql, schema_sql | connection_params}"| parse
75+
parse -->|"ParseResponse{sql, parameters, columns}"| plugin_output
76+
plugin_output --> codegen_input
77+
codegen_input --> codegen
78+
```
79+
80+
- **No intermediate AST**: the plugin returns already “resolved” data (SQL text, parameters, columns).
81+
- **No compiler** for the plugin path: type resolution, `*` expansion, and validation are the plugin’s job. sqlc does not run the built-in compiler on plugin output.
82+
- Data from the plugin is passed through to the **codegen plugin** as-is (or after a thin adapter that today still produces a synthetic `[]ast.Statement` for compatibility; the useful payload is `sql` + `parameters` + `columns`).
83+
84+
So: for external engines, the pipeline is effectively **schema + queries → engine plugin (Parse) → (sql, parameters, columns) → codegen**, with no AST and no compiler in between.
85+
86+
### Where the branch is taken (generate only)
87+
88+
The choice between “built-in engine” and “external plugin” is made **once per `sql[]` block**, before any compiler is created for that block. In the current implementation the branch is taken in **`internal/cmd/process.go`**: built-in engines use parse → compiler; plugin engines use **`runPluginQuerySet`** in **`plugin_engine_path.go`** (engine process runner, no compiler). Vet has no plugin-specific logic; for plugin-engine blocks it fails with the compiler error "unknown engine".
89+
90+
```mermaid
91+
flowchart TB
92+
process["processQuerySets()"] --> branch{"engine for this sql[]"}
93+
branch -->|"sqlite / mysql / postgresql"| parse_path["parse() → NewCompiler() → Result"]
94+
branch -->|"name in engines"| plugin_path["runPluginQuerySet()"]
95+
plugin_path --> runner["engine process runner → external process"]
96+
runner --> to_result["pluginResponseToCompilerQuery → compiler.Result"]
97+
parse_path --> result["ProcessResult → codegen"]
98+
to_result --> result
99+
```
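
The branch in the diagram above can be sketched roughly as follows. This is a minimal illustration, not sqlc code: `pathFor` and `enginePlugin` are hypothetical names, and the real decision lives in `processQuerySets()`.

```go
package main

import "fmt"

// enginePlugin is a hypothetical stand-in for one entry under `engines:`.
type enginePlugin struct {
	Name string
	Cmd  string
}

// pathFor reports which generate path a sql[] block would take:
// "builtin" for the in-tree engines, "plugin" for a name declared under
// `engines:`, and an error for anything else.
func pathFor(engine string, engines []enginePlugin) (string, error) {
	switch engine {
	case "sqlite", "mysql", "postgresql":
		return "builtin", nil
	}
	for _, e := range engines {
		if e.Name == engine {
			return "plugin", nil
		}
	}
	return "", fmt.Errorf("unknown engine: %s", engine)
}

func main() {
	engines := []enginePlugin{{Name: "mydb", Cmd: "sqlc-engine-mydb"}}
	for _, name := range []string{"postgresql", "mydb", "oracle"} {
		path, err := pathFor(name, engines)
		fmt.Println(name, path, err)
	}
}
```

Checking the built-in names first keeps a plugin from shadowing `postgresql`, `mysql`, or `sqlite`; whether sqlc enforces that ordering is an assumption of this sketch.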
100+
101+
**Call flow (built-in path)**
102+
103+
1. **`internal/cmd/generate.go`**
104+
For each entry in `sql[]`, `parse()` is called with that block’s `config.SQL` (which includes `conf.Engine` = value of `engine: ...`).
105+
106+
2. **`parse()`** calls **`compiler.NewCompiler(sql, combo, parserOpts)`**
107+
So every SQL block gets its own compiler, and the engine is selected inside `NewCompiler`.
108+
109+
3. **`internal/compiler/engine.go`**, **`NewCompiler(conf config.SQL, combo config.CombinedSettings, ...)`**
110+
**Current code**: branch is in **`process.go`**, not here. **`NewCompiler`** only has sqlite/mysql/postgresql cases; `default` returns "unknown engine". Legacy snippet (branch used to be here):
111+
112+
```go
113+
switch conf.Engine {
114+
case config.EngineSQLite:
115+
// built-in: c.parser = sqlite.NewParser(), c.catalog = sqlite.NewCatalog(), ...
116+
case config.EngineMySQL:
117+
// built-in: dolphin parser + catalog
118+
case config.EnginePostgreSQL:
119+
// built-in: postgresql parser + catalog
120+
default:
121+
// “Other” engine name → treat as plugin
122+
if enginePlugin, found := config.FindEnginePlugin(&combo.Global, string(conf.Engine)); found {
123+
eng, _ := createPluginEngine(enginePlugin, combo.Dir) // plugin.NewPluginEngine or WASM
124+
c.parser = eng.Parser() // ProcessRunner, which calls the external process
125+
c.catalog = eng.Catalog()
126+
// ...
127+
} else {
128+
return nil, fmt.Errorf("unknown engine: %s ... add it to the 'engines' section ...")
129+
}
130+
}
131+
```
132+
133+
- **Built-in path**: `conf.Engine` is `"sqlite"`, `"mysql"`, or `"postgresql"` → the switch hits one of the first three cases; parser and catalog are the in-tree implementations.
134+
- **Plugin engines**: the compiler does *not* load plugin engines. For `engine: myplugin` (a name declared under `engines:`), **generate** uses the plugin path in cmd (`runPluginQuerySet` → engine process runner in **`plugin_engine_path.go`**); **vet** has no plugin-specific code and fails with the compiler error "unknown engine".
135+
136+
**Summary:** Built-in path = **`internal/compiler/engine.go`**; plugin path = **`internal/cmd/plugin_engine_path.go`**.
137+
138+
---
139+
140+
## No intermediate AST for external plugins
141+
142+
The plugin does **not** return an AST or “statements + AST”:
143+
144+
- **Request**: query text + schema (or connection).
145+
- **Response**: `sql` (possibly with `*` expanded), `parameters`, `columns`.
146+
147+
The plugin is the single place that defines how the query is interpreted. sqlc does not parse or analyze that SQL again; it forwards the plugin’s `ParseResponse` toward codegen. Any internal use of `[]ast.Statement` for the plugin path is a compatibility shim; the semantics are driven by the plugin’s `sql` / `parameters` / `columns`.
148+
149+
---
150+
151+
## No compiler for external plugins
152+
153+
The built-in **compiler** (catalog, type resolution, validation, expansion of `*`) is **not** used for external engine plugins:
154+
155+
- The plugin is responsible for:
156+
- Resolving parameter and column types (using schema or DB).
157+
- Expanding `SELECT *` if desired.
158+
- Emitting whatever shape of `parameters` and `columns` the codegen expects.
159+
- sqlc does not run the compiler on plugin output; it passes that output through to codegen. So “compiler” is only in the built-in-engine path.
160+
161+
---
162+
163+
## What is sent to and returned from the plugin
164+
165+
**Invocation**: one RPC, `Parse`, over stdin/stdout (protobuf).
166+
Example: `sqlc-engine-mydb parse` with `ParseRequest` on stdin and `ParseResponse` on stdout.
167+
168+
### Sent to the plugin (`ParseRequest`)
169+
170+
| Field | Description |
171+
|-------------------|-------------|
172+
| `sql` | Query text to parse (from `queries.sql` or the current batch). |
173+
| `schema_sql` | *(optional)* Contents of the schema file(s), e.g. concatenated `schema.sql`. |
174+
| `connection_params` | *(optional)* DSN + options for “database-only” mode when schema is taken from the DB. |
175+
176+
At most one of `schema_sql` or `connection_params` is set per request (both belong to the `schema_source` oneof), depending on how the project is configured (see below).
177+
178+
### Returned from the plugin (`ParseResponse`)
179+
180+
| Field | Description |
181+
|-------------|-------------|
182+
| `sql` | Processed SQL. Can be the same as input, or e.g. `SELECT *` expanded to explicit columns. |
183+
| `parameters`| List of parameters: name, position, `data_type`, nullable, is_array, array_dims. |
184+
| `columns` | List of result columns: name, `data_type`, nullable, is_array, array_dims, optional table/schema. |
185+
186+
These three are enough for codegen to generate type-safe code without an AST or compiler step.
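
To make the shape of the response concrete, here is a sketch that mirrors the two tables with plain Go structs. The real types are generated from protobuf in `github.com/sqlc-dev/sqlc/pkg/engine`; the struct and field names below, the `?` placeholder, and the `data_type` strings are all assumptions made for illustration.

```go
package main

import "fmt"

// Parameter mirrors the fields listed for `parameters` (names are illustrative).
type Parameter struct {
	Name      string
	Position  int
	DataType  string
	Nullable  bool
	IsArray   bool
	ArrayDims int
}

// Column mirrors the fields listed for `columns`; Table and Schema are optional.
type Column struct {
	Name      string
	DataType  string
	Nullable  bool
	IsArray   bool
	ArrayDims int
	Table     string
	Schema    string
}

// ParseResponse is a plain-struct stand-in for the protobuf message.
type ParseResponse struct {
	SQL        string
	Parameters []Parameter
	Columns    []Column
}

// exampleResponse shows what a plugin might return for
// `SELECT * FROM users WHERE id = ?` after expanding `*`.
func exampleResponse() ParseResponse {
	return ParseResponse{
		SQL: "SELECT id, name FROM users WHERE id = ?",
		Parameters: []Parameter{
			{Name: "id", Position: 1, DataType: "integer"},
		},
		Columns: []Column{
			{Name: "id", DataType: "integer", Table: "users"},
			{Name: "name", DataType: "text", Nullable: true, Table: "users"},
		},
	}
}

func main() {
	resp := exampleResponse()
	fmt.Println(resp.SQL)
	for _, c := range resp.Columns {
		fmt.Printf("column %s %s nullable=%v\n", c.Name, c.DataType, c.Nullable)
	}
}
```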
187+
188+
---
189+
190+
## How the schema is passed into the plugin
191+
192+
Schema is provided to the plugin in one of two ways, via `ParseRequest.schema_source`:
193+
194+
1. **Schema-based (files)**
195+
- sqlc reads the configured schema files (e.g. `schema: "schema.sql"`) and passes their contents as **`schema_sql`** (a string) in `ParseRequest`.
196+
- The plugin parses this SQL (e.g. `CREATE TABLE ...`) and uses it to resolve types, expand `*`, etc.
197+
198+
2. **Database-only**
199+
- When schema is not from files, sqlc can pass **`connection_params`** (DSN + optional extra options) in `ParseRequest`.
200+
- The plugin connects to the DB and uses live metadata (e.g. `INFORMATION_SCHEMA` / `pg_catalog`) to resolve types and columns.
201+
202+
So: **schema** is either “schema.sql as text” or “connection params to the database”; the plugin chooses how to use it.
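
The two modes can be sketched with a oneof-style interface, similar in spirit to how generated protobuf Go code models `schema_source`. All type and function names here are hypothetical. Note that the `runPluginQuerySet` code in this commit only shows the `schema_sql` branch; the DSN branch below is an assumption about how database-only mode could be wired.

```go
package main

import "fmt"

// schemaSource models the `schema_source` oneof: at most one variant is set.
type schemaSource interface{ isSchemaSource() }

// schemaSQL carries the concatenated schema file contents.
type schemaSQL struct{ SQL string }

// connectionParams carries a DSN plus optional extra options.
type connectionParams struct {
	DSN     string
	Options map[string]string
}

func (schemaSQL) isSchemaSource()        {}
func (connectionParams) isSchemaSource() {}

// parseRequest is a plain-struct stand-in for the protobuf ParseRequest.
type parseRequest struct {
	SQL          string
	SchemaSource schemaSource
}

// buildRequest prefers schema text when present, falls back to a DSN,
// and leaves SchemaSource nil when neither is configured.
func buildRequest(query, schema, dsn string) parseRequest {
	req := parseRequest{SQL: query}
	switch {
	case schema != "":
		req.SchemaSource = schemaSQL{SQL: schema}
	case dsn != "":
		req.SchemaSource = connectionParams{DSN: dsn}
	}
	return req
}

func main() {
	a := buildRequest("SELECT * FROM users", "CREATE TABLE users (id int);", "")
	b := buildRequest("SELECT * FROM users", "", "postgres://localhost/app")
	fmt.Printf("%T\n%T\n", a.SchemaSource, b.SchemaSource)
}
```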
203+
204+
---
205+
206+
## Changes in `sqlc.yaml`
207+
208+
### New top-level `engines`
209+
210+
Plugins are declared under `engines` and referenced by name in `sql[].engine`:
211+
212+
```yaml
213+
version: "2"
214+
215+
engines:
216+
- name: mydb
217+
process:
218+
cmd: sqlc-engine-mydb
219+
env:
220+
- MYDB_DSN
221+
222+
sql:
223+
- engine: mydb
224+
schema: "schema.sql"
225+
queries: "queries.sql"
226+
codegen:
227+
- plugin: go
228+
out: db
229+
```
230+
231+
- **`engines`**: list of named engines. Each has `name` and either `process.cmd` (and optionally `env`) or a WASM config.
232+
- **`sql[].engine`**: for that SQL block, use the engine named `mydb` (which triggers the plugin) instead of `postgresql` / `mysql` / `sqlite`.
233+
234+
So the only new concept in config is “define engines (including plugins) by name, then point `sql[].engine` at them.” Schema and queries are still configured per `sql[]` block as today.
235+
236+
---
237+
238+
## Who handles sqlc placeholders in queries
239+
240+
Support for sqlc-style placeholders (`sqlc.arg()`, `sqlc.narg()`, `sqlc.slice()`, `sqlc.embed()`, etc.) is **entirely up to the plugin**:
241+
242+
- The plugin receives the raw query text (including those macros) in `ParseRequest.sql`.
243+
- It can parse and interpret them and reflect the result in `parameters` (and, if needed, in `sql` or in how it uses schema). There is no separate “sqlc placeholder” pass in the core for the plugin path.
244+
- If the plugin does not handle a placeholder, that placeholder will not be turned into proper parameters/columns by sqlc; the pipeline does not add a generic placeholder expander for external engines.
245+
246+
So: **the database engine plugin is responsible for understanding and handling sqlc placeholders** for its engine.
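
As an illustration of what "handling placeholders" might involve, the sketch below rewrites `sqlc.arg(name)` and `sqlc.narg(name)` into positional placeholders and collects the matching parameter list. It is deliberately naive: the `$n` placeholder style is an assumption, `expandMacros` and `param` are hypothetical names, and a regular expression will also match macros inside string literals or comments, so a real plugin would want a proper tokenizer.

```go
package main

import (
	"fmt"
	"regexp"
)

// macroRe matches sqlc.arg(name) and sqlc.narg(name), with optional single quotes.
var macroRe = regexp.MustCompile(`sqlc\.(arg|narg)\(\s*'?([A-Za-z_][A-Za-z0-9_]*)'?\s*\)`)

// param is a simplified parameter record (name, 1-based position, nullability).
type param struct {
	Name     string
	Position int
	Nullable bool
}

// expandMacros replaces each macro with $n and returns the rewritten SQL
// together with the parameters in order of first appearance. A name that
// appears more than once reuses its original position.
func expandMacros(sql string) (string, []param) {
	var params []param
	seen := map[string]int{}
	out := macroRe.ReplaceAllStringFunc(sql, func(m string) string {
		sub := macroRe.FindStringSubmatch(m)
		kind, name := sub[1], sub[2]
		pos, ok := seen[name]
		if !ok {
			pos = len(params) + 1
			seen[name] = pos
			params = append(params, param{Name: name, Position: pos, Nullable: kind == "narg"})
		}
		return fmt.Sprintf("$%d", pos)
	})
	return out, params
}

func main() {
	sql, params := expandMacros("SELECT id FROM users WHERE id = sqlc.arg(id) AND email = sqlc.narg(email)")
	fmt.Println(sql)
	fmt.Println(params)
}
```

Types are left out on purpose: resolving `data_type` for each parameter needs the schema, which is the plugin's job as described above.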
247+
248+
---
249+
250+
## Summary for maintainers
251+
252+
- **One RPC**: `Parse(sql, schema_sql | connection_params) → (sql, parameters, columns)`.
253+
- **No AST, no compiler** on the plugin path; data flows from plugin to codegen.
254+
- **Schema** is passed either as `schema_sql` (file contents) or as `connection_params` (DSN) in `ParseRequest`.
255+
- **Config**: `engines[]` + `sql[].engine: <name>`; existing `schema` / `queries` / `codegen` stay as-is.
256+
- **Placeholders**: handled inside the plugin; core does not add a generic layer for external engines.
257+
258+
This keeps the plugin API small and leaves type resolution and dialect behavior inside the plugin, while still allowing sqlc to drive generation from a single, well-defined contract.

docs/howto/engine-plugins.md

Lines changed: 5 additions & 1 deletion
@@ -10,6 +10,8 @@ Engine plugins let you use sqlc with databases that are not built-in. You can ad
1010

1111
Data returned by the engine plugin (SQL text, parameters, columns) is passed through to [codegen plugins](../guides/plugins.md) without an extra compiler/AST step. The plugin is the single place that defines how queries are interpreted for that engine.
1212

13+
**Limitation:** `sqlc vet` does not support plugin engines. Use vet only with built-in engines (postgresql, mysql, sqlite).
14+
1315
## Overview
1416

1517
An engine plugin is an external process that implements one RPC:
@@ -154,11 +156,13 @@ A minimal engine that parses SQLite-style SQL and expands `*` using a schema is
154156

155157
## Architecture
156158

159+
For each `sql[]` block, `sqlc generate` branches on the configured engine: built-in engines (postgresql, mysql, sqlite) use the compiler and catalog; any engine listed under `engines:` in sqlc.yaml uses the plugin path (no compiler; schema and queries go to the plugin’s Parse RPC, and its output goes to codegen).
160+
157161
```
158162
┌─────────────────────────────────────────────────────────────────┐
159163
│ sqlc generate │
160164
│ 1. Read sqlc.yaml, find engine for this sql block │
161-
│ 2. Call plugin: parse (sql + schema_sql or connection_params)
165+
│ 2. If plugin engine: call plugin parse (sql + schema_sql etc.)
162166
│ 3. Use returned sql, parameters, columns in codegen │
163167
└─────────────────────────────────────────────────────────────────┘
164168
Lines changed: 11 additions & 24 deletions
@@ -1,9 +1,7 @@
1-
// Package plugin implements running database-engine plugins as external processes.
2-
//
3-
// It is used only by the generate path (cmd runPluginQuerySet): schema and queries
4-
// are sent via ParseRequest to the plugin; the compiler is not used for plugin engines.
5-
// Vet does not support plugin engines.
6-
package plugin
1+
// This file runs a database-engine plugin as an external process (parse RPC over stdin/stdout).
2+
// It is used only by the plugin-engine generate path (runPluginQuerySet). Vet does not support plugin engines.
3+
4+
package cmd
75

86
import (
97
"bytes"
@@ -20,29 +18,23 @@ import (
2018
pb "github.com/sqlc-dev/sqlc/pkg/engine"
2119
)
2220

23-
// ProcessRunner runs an engine plugin as an external process.
24-
type ProcessRunner struct {
21+
// engineProcessRunner runs an engine plugin as an external process.
22+
type engineProcessRunner struct {
2523
Cmd string
2624
Dir string // Working directory for the plugin (config file directory)
2725
Env []string
2826
}
2927

30-
// NewProcessRunner creates a new ProcessRunner.
31-
func NewProcessRunner(cmd, dir string, env []string) *ProcessRunner {
32-
return &ProcessRunner{
33-
Cmd: cmd,
34-
Dir: dir,
35-
Env: env,
36-
}
28+
func newEngineProcessRunner(cmd, dir string, env []string) *engineProcessRunner {
29+
return &engineProcessRunner{Cmd: cmd, Dir: dir, Env: env}
3730
}
3831

39-
func (r *ProcessRunner) invoke(ctx context.Context, method string, req, resp proto.Message) error {
32+
func (r *engineProcessRunner) invoke(ctx context.Context, method string, req, resp proto.Message) error {
4033
stdin, err := proto.Marshal(req)
4134
if err != nil {
4235
return fmt.Errorf("failed to encode request: %w", err)
4336
}
4437

45-
// Parse command string to support formats like "go run ./path"
4638
cmdParts := strings.Fields(r.Cmd)
4739
if len(cmdParts) == 0 {
4840
return fmt.Errorf("engine plugin not found: %s\n\nMake sure the plugin is installed and available in PATH.\nInstall with: go install <plugin-module>@latest", r.Cmd)
@@ -53,15 +45,12 @@ func (r *ProcessRunner) invoke(ctx context.Context, method string, req, resp pro
5345
return fmt.Errorf("engine plugin not found: %s\n\nMake sure the plugin is installed and available in PATH.\nInstall with: go install <plugin-module>@latest", r.Cmd)
5446
}
5547

56-
// Build arguments: rest of cmdParts + method
5748
args := append(cmdParts[1:], method)
5849
cmd := exec.CommandContext(ctx, path, args...)
5950
cmd.Stdin = bytes.NewReader(stdin)
60-
// Set working directory to config file directory for relative paths
6151
if r.Dir != "" {
6252
cmd.Dir = r.Dir
6353
}
64-
// Inherit the current environment and add SQLC_VERSION
6554
cmd.Env = append(os.Environ(), fmt.Sprintf("SQLC_VERSION=%s", info.Version))
6655

6756
out, err := cmd.Output()
@@ -77,13 +66,11 @@ func (r *ProcessRunner) invoke(ctx context.Context, method string, req, resp pro
7766
if err := proto.Unmarshal(out, resp); err != nil {
7867
return fmt.Errorf("failed to decode response: %w", err)
7968
}
80-
8169
return nil
8270
}
8371

84-
// ParseRequest invokes the plugin's Parse RPC with the given request (sql and optional schema_sql).
85-
// The cmd layer uses this for the plugin-engine generate path instead of the compiler.
86-
func (r *ProcessRunner) ParseRequest(ctx context.Context, req *pb.ParseRequest) (*pb.ParseResponse, error) {
72+
// parseRequest invokes the plugin's Parse RPC. Used by runPluginQuerySet.
73+
func (r *engineProcessRunner) parseRequest(ctx context.Context, req *pb.ParseRequest) (*pb.ParseResponse, error) {
8774
resp := &pb.ParseResponse{}
8875
if err := r.invoke(ctx, "parse", req, resp); err != nil {
8976
return nil, err

internal/cmd/plugin_engine_path.go

Lines changed: 2 additions & 3 deletions
@@ -13,7 +13,6 @@ import (
1313

1414
"github.com/sqlc-dev/sqlc/internal/compiler"
1515
"github.com/sqlc-dev/sqlc/internal/config"
16-
"github.com/sqlc-dev/sqlc/internal/engine/plugin"
1716
"github.com/sqlc-dev/sqlc/internal/metadata"
1817
"github.com/sqlc-dev/sqlc/internal/multierr"
1918
"github.com/sqlc-dev/sqlc/internal/source"
@@ -41,13 +40,13 @@ func runPluginQuerySet(ctx context.Context, rp ResultProcessor, name, dir string
4140
if o != nil && o.PluginParseFunc != nil {
4241
parseFn = o.PluginParseFunc
4342
} else {
44-
r := plugin.NewProcessRunner(enginePlugin.Process.Cmd, combo.Dir, enginePlugin.Env)
43+
r := newEngineProcessRunner(enginePlugin.Process.Cmd, combo.Dir, enginePlugin.Env)
4544
parseFn = func(schemaSQL, querySQL string) (*pb.ParseResponse, error) {
4645
req := &pb.ParseRequest{Sql: querySQL}
4746
if schemaSQL != "" {
4847
req.SchemaSource = &pb.ParseRequest_SchemaSql{SchemaSql: schemaSQL}
4948
}
50-
return r.ParseRequest(ctx, req)
49+
return r.parseRequest(ctx, req)
5150
}
5251
}
5352
