@asmyasnikov asmyasnikov commented Dec 28, 2025

This PR resolves issue #4158 and adds support for database engine plugins: external processes that implement a single Parse RPC and allow sqlc to work with databases that are not built-in (e.g. CockroachDB, TiDB, or custom SQL dialects). The plugin contract is deliberately minimal: no AST, no compiler in the middle, and a straight path from plugin output to codegen.

Motivation

  1. sqlc is widely used, and requests to add support for new databases come in regularly: feat(clickhouse): add ClickHouse database engine support #4244, feat(sqlserver): add Microsoft SQL Server support #4243, YDB support (draft) #4090, Add ClickHouse Engine Support to sqlc #4220, Adding support for CockroachDB #4009.

  2. Supporting new database engines increases maintenance burden on core maintainers (parsers, catalogs, dialects, and compatibility for each engine).

  3. Exposing a way to add external database engines (Refactor project for easy adding support of new database engine without sqlc core changes #4158) lets the community ship new backends without growing sqlc’s core, and lets users adopt sqlc for more SQL dialects.

  4. What should an engine plugin do?

    • 4.1) Validate the query. If the query is invalid, the plugin returns an error with a clear message. This is straightforward.
    • 4.2) Parse the query — but into what? If the plugin were to return an AST in sqlc’s format (i.e. by making internal/sql/ast public), we risk that a new dialect cannot be expressed with the existing AST node types. This PR therefore does not have the plugin pass an AST into sqlc’s core.
    • 4.3) Instead, the plugin parses the SQL and returns to the core the query parameters and the types and names of the result columns. That is enough for type-safe codegen.
    • 4.4) Optional behavior (wildcard expansion, etc.) is left to the plugin. To support it, the core passes either schema.sql or database connection parameters into the plugin.
    • 4.5) The core then feeds the plugin’s output directly into the codegen plugin. So an external database engine plugin bypasses the compiler and goes straight to code generation.

With this design, the engine plugin system can support many SQL dialects, while validation and enrichment of SQL (e.g. * expansion) are the plugin’s responsibility, not the core’s.


Pipeline: built-in engine vs external plugin

The choice between "built-in engine" and "external plugin" is made once per sql[] block in internal/cmd/process.go, based on the engine: value in config. If it is sqlite, mysql, or postgresql, the built-in path is used; if it is a name listed under the top-level engines: key, the plugin path is used. (Vet has no plugin logic; for plugin-engine blocks it fails with "unknown engine".)

flowchart TB
  subgraph input["Input (per sql[] block)"]
    schema[schema.sql]
    queries[queries.sql]
  end

  start["processQuerySets()"] --> input
  input --> branch{"engine for this sql[]"}

  branch -->|"sqlite / mysql / postgresql"| builtin["Built-in path"]

  subgraph builtin_flow["Built-in path (parse → NewCompiler → Result)"]
    direction TB
    parser[Parser]
    ast[(AST)]
    catalog[(Catalog)]
    compiler[Compiler]
    codegen_input_b[Queries + types]
    parser --> ast
    schema --> parser
    queries --> parser
    schema --> catalog
    ast --> compiler
    catalog --> compiler
    compiler --> codegen_input_b
  end

  builtin --> builtin_flow

  branch -->|"name in engines"| plugin_path["Plugin path"]

  subgraph plugin_flow["Plugin path (runPluginQuerySet)"]
    direction TB
    adapter[engine process runner]
    ext_parse["Engine plugin: Parse"]
    response["ParseResponse: sql, parameters, columns"]
    adapter -->|"stdin: ParseRequest"| ext_parse
    ext_parse -->|"stdout: ParseResponse"| response
    schema --> adapter
    queries --> adapter
  end

  plugin_path --> plugin_flow

  subgraph plugin_to_result["Adapter"]
    adapt["pluginResponseToCompilerQuery"]
    response --> adapt
  end

  builtin_flow --> merge[ProcessResult → codegen]
  plugin_flow --> plugin_to_result
  plugin_to_result --> merge

  subgraph output["Output"]
    codegen[Codegen plugin]
  end

  merge --> codegen
  • Built-in path: Parser → AST, schema → Catalog; Compiler uses both and produces Queries + types. Codegen receives that as usual.

  • Plugin path: Engine process runner sends ParseRequest (sql + schema_sql or connection_params) to the external process; plugin returns ParseResponse (sql, parameters, columns). That is passed through a thin adapter into the same ProcessResult/codegen shape. No AST, no compiler — validation and enrichment are the plugin's job.

  • No intermediate AST: the plugin returns already “resolved” data (SQL text, parameters, columns).

  • No compiler for the plugin path: type resolution, * expansion, and validation are the plugin’s job. sqlc does not run the built-in compiler on plugin output.

  • Data from the plugin is passed through to the codegen plugin as-is (or after a thin adapter that today still produces a synthetic []ast.Statement for compatibility; the useful payload is sql + parameters + columns).

So: for external engines, the pipeline is effectively schema + queries → engine plugin (Parse) → (sql, parameters, columns) → codegen, with no AST and no compiler in between.

Call flow (built-in vs plugin)

The split between built-in and plugin engines happens in internal/cmd/process.go inside processQuerySets(), which branches on config.IsBuiltinEngine(combo.Package.Engine).

  • Built-in path: For engine: sqlite | mysql | postgresql, processQuerySets calls parse() (defined in internal/cmd/generate.go). parse() calls compiler.NewCompiler(sql, combo, parserOpts). NewCompiler in internal/compiler/engine.go has only three cases (SQLite, MySQL, PostgreSQL) and default: return nil, fmt.Errorf("unknown engine: %s", conf.Engine); there is no plugin branch and no FindEnginePlugin in the compiler.

  • Plugin path: For any other engine: (e.g. a name defined under engines: in config), processQuerySets calls runPluginQuerySet() in internal/cmd/plugin_engine.go. parse() and NewCompiler are not used; the engine process runner talks to the external plugin and passes the result into the same codegen pipeline.

  • Vet: The vet command uses its own path and calls parse() / NewCompiler(), so vet fails for plugin engines with "unknown engine" (there is no plugin-specific branch in vet).

Summary: The branch and orchestration live in process.go. Built-in logic goes through generate.go → compiler/engine.go; plugin logic is in plugin_engine.go.
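The branching described above can be sketched roughly as follows. This is a minimal illustration, not the code in process.go: isBuiltinEngine stands in for config.IsBuiltinEngine, route and the declared map are hypothetical names, and returning an error for a name that is neither built-in nor declared is an assumption about behavior the description does not spell out.

```go
package main

import "fmt"

// builtinEngines lists the three engines the description names as built-in.
var builtinEngines = map[string]bool{"sqlite": true, "mysql": true, "postgresql": true}

// isBuiltinEngine is a stand-in for config.IsBuiltinEngine (hypothetical body).
func isBuiltinEngine(name string) bool { return builtinEngines[name] }

// route reports which path a sql[] block would take.
// declared holds the names listed under the top-level engines: key.
func route(engine string, declared map[string]bool) (string, error) {
	switch {
	case isBuiltinEngine(engine):
		return "builtin", nil // parse() -> compiler.NewCompiler
	case declared[engine]:
		return "plugin", nil // runPluginQuerySet()
	default:
		// Assumption: an undeclared name is rejected.
		return "", fmt.Errorf("unknown engine: %s", engine)
	}
}

func main() {
	declared := map[string]bool{"mydb": true}
	for _, e := range []string{"postgresql", "mydb", "oracle"} {
		path, err := route(e, declared)
		fmt.Println(e, path, err)
	}
}
```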

No intermediate AST for external plugins

The plugin does not return an AST or “statements + AST”:

  • Request: query text + schema (or connection).
  • Response: sql (possibly with * expanded), parameters, columns.

The plugin is the single place that defines how the query is interpreted. sqlc does not parse or analyze that SQL again; it forwards the plugin’s ParseResponse toward codegen. Any internal use of []ast.Statement for the plugin path is a compatibility shim; the semantics are driven by the plugin’s sql / parameters / columns.


No compiler for external plugins

The built-in compiler (catalog, type resolution, validation, expansion of *) is not used for external engine plugins:

  • The plugin is responsible for:
    • Resolving parameter and column types (using schema or DB).
    • Expanding SELECT * if desired.
    • Emitting whatever shape of parameters and columns the codegen expects.
  • sqlc does not run the compiler on plugin output; it passes that output through to codegen. So “compiler” is only in the built-in-engine path.
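To make the plugin's responsibilities concrete, here is a deliberately naive sketch of * expansion plus column resolution on the plugin side. Everything in it is hypothetical: the Column type, the resolveStar function, and the in-memory schema map are illustration only, and a real plugin would use its dialect's parser rather than a regular expression.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Column is a hypothetical, reduced view of a result column.
type Column struct {
	Name     string
	DataType string
	Nullable bool
}

// starQuery matches only the simplest form: SELECT * FROM <table> ...
var starQuery = regexp.MustCompile(`(?i)^\s*SELECT\s+\*\s+FROM\s+([A-Za-z_][A-Za-z0-9_]*)`)

// resolveStar expands SELECT * using a table -> columns map and returns the
// rewritten SQL together with the resolved columns. Queries that do not match
// the simple pattern are returned unchanged with no columns.
func resolveStar(sql string, schema map[string][]Column) (string, []Column, error) {
	m := starQuery.FindStringSubmatch(sql)
	if m == nil {
		return sql, nil, nil
	}
	cols, ok := schema[m[1]]
	if !ok {
		return "", nil, fmt.Errorf("unknown table %q", m[1])
	}
	names := make([]string, len(cols))
	for i, c := range cols {
		names[i] = c.Name
	}
	// The pattern is anchored at SELECT, so the first "*" is the wildcard.
	return strings.Replace(sql, "*", strings.Join(names, ", "), 1), cols, nil
}

func main() {
	schema := map[string][]Column{
		"users": {{Name: "id", DataType: "int8"}, {Name: "name", DataType: "text", Nullable: true}},
	}
	sql, cols, err := resolveStar("SELECT * FROM users WHERE id = 1", schema)
	fmt.Println(sql, len(cols), err)
}
```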

What is sent to and returned from the plugin

Invocation: one RPC, Parse, over stdin/stdout (protobuf).
Example: sqlc-engine-mydb parse with ParseRequest on stdin and ParseResponse on stdout.

Sent to the plugin (ParseRequest)

| Field | Description |
|---|---|
| sql | Query text to parse (from queries.sql or the current batch). |
| schema_sql | (optional) Contents of the schema file(s), e.g. concatenated schema.sql. |
| connection_params | (optional) DSN + options for “database-only” mode when schema is taken from the DB. |

Exactly one of schema_sql or connection_params is used per request, depending on how the project is configured (see below).

Returned from the plugin (ParseResponse)

| Field | Description |
|---|---|
| sql | Processed SQL. Can be the same as input, or e.g. SELECT * expanded to explicit columns. |
| parameters | List of parameters: name, position, data_type, nullable, is_array, array_dims. |
| columns | List of result columns: name, data_type, nullable, is_array, array_dims, optional table/schema. |

These three are enough for codegen to generate type-safe code without an AST or compiler step.
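As a rough illustration of why these fields suffice, the sketch below mirrors the response fields as plain Go structs and derives a Go type from data_type, nullable, and is_array. The struct definitions are hand-written stand-ins rather than the generated protobuf types, and goType with its scalarTypes table is a hypothetical mapping, not the one sqlc's Go codegen uses.

```go
package main

import "fmt"

// Parameter mirrors the parameter fields listed in the table above.
type Parameter struct {
	Name      string
	Position  int
	DataType  string
	Nullable  bool
	IsArray   bool
	ArrayDims int
}

// Column mirrors the column fields; Table and Schema are optional.
type Column struct {
	Name      string
	DataType  string
	Nullable  bool
	IsArray   bool
	ArrayDims int
	Table     string
	Schema    string
}

// ParseResponse is a hand-written stand-in for the plugin's response message.
type ParseResponse struct {
	SQL        string
	Parameters []Parameter
	Columns    []Column
}

// scalarTypes is a hypothetical data_type -> Go type table.
var scalarTypes = map[string]string{"int8": "int64", "text": "string", "bool": "bool"}

// goType picks a Go type from the three flags a column or parameter carries.
func goType(dataType string, nullable, isArray bool) string {
	t, ok := scalarTypes[dataType]
	if !ok {
		t = "any" // unknown types fall back to any in this sketch
	}
	if isArray {
		return "[]" + t
	}
	if nullable {
		return "*" + t
	}
	return t
}

func main() {
	resp := ParseResponse{
		SQL:        "SELECT id, name, tags FROM users WHERE id = $1",
		Parameters: []Parameter{{Name: "id", Position: 1, DataType: "int8"}},
		Columns: []Column{
			{Name: "id", DataType: "int8", Table: "users"},
			{Name: "name", DataType: "text", Nullable: true, Table: "users"},
			{Name: "tags", DataType: "text", IsArray: true, ArrayDims: 1, Table: "users"},
		},
	}
	for _, p := range resp.Parameters {
		fmt.Printf("param %d %s %s\n", p.Position, p.Name, goType(p.DataType, p.Nullable, p.IsArray))
	}
	for _, c := range resp.Columns {
		fmt.Printf("column %s %s\n", c.Name, goType(c.DataType, c.Nullable, c.IsArray))
	}
}
```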


How the schema is passed into the plugin

Schema is provided to the plugin in one of two ways, via ParseRequest.schema_source:

  1. Schema-based (files)

    • sqlc reads the configured schema files (e.g. schema: "schema.sql") and passes their contents as schema_sql (a string) in ParseRequest.
    • The plugin parses this SQL (e.g. CREATE TABLE ...) and uses it to resolve types, expand *, etc.
  2. Database-only

    • When schema is not from files, sqlc can pass connection_params (DSN + optional extra options) in ParseRequest.
    • The plugin connects to the DB and uses live metadata (e.g. INFORMATION_SCHEMA / pg_catalog) to resolve types and columns.

So: schema is either “schema.sql as text” or “connection params to the database”; the plugin chooses how to use it.
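The "exactly one source" rule might look like this on the sqlc side. This is a sketch under assumptions: the ParseRequest and ConnectionParams structs are simplified stand-ins for the protobuf messages, newParseRequest is a hypothetical helper, and preferring schema files over a DSN when both are present is a guess, not something the PR states.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ConnectionParams is a simplified stand-in: DSN plus optional extra options.
type ConnectionParams struct {
	DSN     string
	Options map[string]string
}

// ParseRequest carries the query and exactly one schema source.
type ParseRequest struct {
	SQL              string
	SchemaSQL        string            // set in schema-based mode
	ConnectionParams *ConnectionParams // set in database-only mode
}

// newParseRequest fills in one schema source. schemaContents holds the
// already-read contents of the configured schema files.
func newParseRequest(query string, schemaContents []string, dsn string) (ParseRequest, error) {
	req := ParseRequest{SQL: query}
	switch {
	case len(schemaContents) > 0:
		// Assumption: files win when both sources are configured.
		req.SchemaSQL = strings.Join(schemaContents, "\n")
	case dsn != "":
		req.ConnectionParams = &ConnectionParams{DSN: dsn}
	default:
		return ParseRequest{}, errors.New("no schema source: need schema files or a DSN")
	}
	return req, nil
}

func main() {
	req, err := newParseRequest("SELECT * FROM users", []string{"CREATE TABLE users (id BIGINT);"}, "")
	fmt.Println(req.SchemaSQL != "", req.ConnectionParams == nil, err)
}
```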


Changes in sqlc.yaml

New top-level engines

Plugins are declared under engines and referenced by name in sql[].engine:

version: "2"

engines:
  - name: mydb
    process:
      cmd: sqlc-engine-mydb
    env:
      - MYDB_DSN

sql:
  - engine: mydb
    schema: "schema.sql"
    queries: "queries.sql"
    codegen:
      - plugin: go
        out: db
  • engines: list of named engines. Each has name and either process.cmd (and optionally env) or a WASM config.
  • sql[].engine: for that SQL block, use the engine named mydb (which triggers the plugin) instead of postgresql / mysql / sqlite.

So the only new concept in config is “define engines (including plugins) by name, then point sql[].engine at them.” Schema and queries are still configured per sql[] block as today.


Who handles sqlc placeholders in queries

Support for sqlc-style placeholders (sqlc.arg(), sqlc.narg(), sqlc.slice(), sqlc.embed(), etc.) is entirely up to the plugin:

  • The plugin receives the raw query text (including those macros) in ParseRequest.sql.
  • It can parse and interpret them and reflect the result in parameters (and, if needed, in sql or in how it uses schema). There is no separate “sqlc placeholder” pass in the core for the plugin path.
  • If the plugin does not handle a placeholder, that placeholder will not be turned into proper parameters/columns by sqlc; the pipeline does not add a generic placeholder expander for external engines.

So: the database engine plugin is responsible for understanding and handling sqlc placeholders for its engine.
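A plugin could handle the two simplest macros along these lines. This is a hedged sketch, not part of the PR: rewriteArgs, the Parameter struct, and the $N output syntax are all assumptions (the target placeholder style depends on the dialect), and sqlc.slice() and sqlc.embed() are left out entirely.

```go
package main

import (
	"fmt"
	"regexp"
)

// Parameter is a reduced, hypothetical view of a ParseResponse parameter.
type Parameter struct {
	Name     string
	Position int
	Nullable bool
}

// argMacro matches sqlc.arg(name) and sqlc.narg(name), with optional quotes.
var argMacro = regexp.MustCompile(`sqlc\.(arg|narg)\(\s*'?([A-Za-z_][A-Za-z0-9_]*)'?\s*\)`)

// rewriteArgs replaces each macro with a positional placeholder and collects
// the parameters. A name used twice reuses its first position.
func rewriteArgs(sql string) (string, []Parameter) {
	var params []Parameter
	seen := map[string]int{}
	out := argMacro.ReplaceAllStringFunc(sql, func(m string) string {
		sub := argMacro.FindStringSubmatch(m)
		kind, name := sub[1], sub[2]
		pos, ok := seen[name]
		if !ok {
			pos = len(params) + 1
			seen[name] = pos
			// narg marks the parameter as nullable.
			params = append(params, Parameter{Name: name, Position: pos, Nullable: kind == "narg"})
		}
		return fmt.Sprintf("$%d", pos)
	})
	return out, params
}

func main() {
	sql, params := rewriteArgs("SELECT id FROM users WHERE name = sqlc.arg(name) AND bio = sqlc.narg(bio)")
	fmt.Println(sql)
	fmt.Printf("%+v\n", params)
}
```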


Summary for maintainers

  • One RPC: Parse(sql, schema_sql | connection_params) → (sql, parameters, columns).
  • No AST, no compiler on the plugin path; data flows from plugin to codegen.
  • Schema is passed either as schema_sql (file contents) or as connection_params (DSN) in ParseRequest.
  • Config: engines[] + sql[].engine: <name>; existing schema / queries / codegen stay as-is.
  • Placeholders: handled inside the plugin; core does not add a generic layer for external engines.

This keeps the plugin API small and leaves type resolution and dialect behavior inside the plugin, while still allowing sqlc to drive generation from a single, well-defined contract.

@asmyasnikov asmyasnikov changed the title from "Added support for the database engine plugin system for extending sqlc with new databases (in addition to PostgreSQL, Dolphin, sqlite)" to "Added support for the database engine plugin system for extending sqlc with new databases (in addition to PostgreSQL, Dolphin (MySQL), sqlite)" Jan 19, 2026
@asmyasnikov asmyasnikov force-pushed the engine-plugin branch 2 times, most recently from a5131b5 to 7609ebc Compare January 26, 2026 17:28