[PoC] Unified SQL Support #8

Draft
dai-chen wants to merge 8 commits into main from poc/unified-sql-support

dai-chen (Owner) commented Mar 5, 2026

PoC: SQL and ANSI SQL Support in Unified Query API

Adds SQL language support to the unified query API alongside existing PPL, with two parsing modes:

OpenSearch SQL (default): routes through our ANTLR-based SQLSyntaxParser → AstBuilder → CalciteRelNodeVisitor → RelNode. It shares the same AST visitor path as PPL and supports OpenSearch-specific UDFs such as match() and match_phrase().

Calcite-native SQL (opt-in via conformance): bypasses our AST layer entirely, using Calcite's built-in SqlParser → SqlValidator → SqlToRelConverter → RelNode pipeline. It supports standard SQL with dialect-specific conformance levels (MySQL, Oracle, BigQuery, Presto, etc.).

Both paths converge at RelNode, so UnifiedQueryCompiler and UnifiedQueryTranspiler work unchanged.

API

java
// OpenSearch SQL (default — ANTLR parser, supports match() etc.)
.language(QueryType.SQL)

// Standard SQL via Calcite native parser
.language(QueryType.SQL).conformance(SqlConformanceEnum.DEFAULT)

// MySQL-compatible SQL
.language(QueryType.SQL).conformance(SqlConformanceEnum.MYSQL_5)

Changes

  • QueryType — kept as PPL, SQL (no new enum values)
  • UnifiedQueryContext — added .conformance(SqlConformance) builder method; wired into FrameworkConfig's SqlParser.Config
  • UnifiedQueryPlanner — dispatches to ANTLR or Calcite parser based on conformance presence
  • CalciteRelNodeVisitor — added Alias and Limit AST node handling for SQL compatibility
  • api/build.gradle — added sql module dependency
  • Integration tests — 4 new e2e tests against live OpenSearch: simple SQL, match() UDF, standard SQL, and self-join with aggregation
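
The dispatch rule in UnifiedQueryPlanner can be sketched roughly as follows. This is a dependency-free illustration, not the PoC code: ParserPath, PlannerDispatchSketch, and choose are hypothetical names, the conformance is modeled as a plain String rather than Calcite's SqlConformance, and ignoring the conformance for PPL is an assumption.

```java
import java.util.Optional;

// Hypothetical stand-ins that mirror the PR description, not the real classes.
enum QueryType { PPL, SQL }

enum ParserPath { PPL_ANTLR, OPENSEARCH_SQL_ANTLR, CALCITE_NATIVE }

final class PlannerDispatchSketch {
  private PlannerDispatchSketch() {}

  /**
   * Picks a parser path for a query.
   *
   * @param language    the query language set on the context
   * @param conformance stand-in for Calcite's SqlConformance (a String here so
   *                    the sketch has no dependencies); empty means "not set"
   */
  static ParserPath choose(QueryType language, Optional<String> conformance) {
    if (language == QueryType.PPL) {
      // Assumption: conformance has no effect on PPL queries.
      return ParserPath.PPL_ANTLR;
    }
    // SQL: presence of a conformance opts in to Calcite's native parser;
    // absence keeps the OpenSearch ANTLR parser (the default).
    return conformance.isPresent()
        ? ParserPath.CALCITE_NATIVE
        : ParserPath.OPENSEARCH_SQL_ANTLR;
  }
}
```

For example, `choose(QueryType.SQL, Optional.empty())` selects the OpenSearch ANTLR path, while `choose(QueryType.SQL, Optional.of("MYSQL_5"))` selects the Calcite-native path. Every path still ends in a RelNode, so nothing downstream needs to know which parser ran.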

dai-chen added 8 commits March 4, 2026 13:40
Add two new language paths to UnifiedQueryPlanner alongside PPL:

- SQL: Uses ANTLR-based SQLSyntaxParser and SQL AstBuilder to produce
  the shared UnresolvedPlan AST, then follows the same
  CalciteRelNodeVisitor path as PPL to produce a RelNode.

- ANSI_SQL: Bypasses the AST layer entirely and uses Calcite's built-in
  SqlParser -> SqlValidator -> SqlToRelConverter -> RelNode pipeline.

All three paths converge at RelNode, so UnifiedQueryCompiler and
UnifiedQueryTranspiler work unchanged downstream.

Changes:
- Add ANSI_SQL to QueryType enum
- Add sql module dependency to api/build.gradle
- Refactor UnifiedQueryPlanner.plan() to branch on query type
- Refactor buildQueryParser() and parse() for SQL dispatch
- Add planWithCalcite() for ANSI_SQL path using Frameworks.getPlanner
- Update tests to reflect SQL and ANSI_SQL as supported types

Add end-to-end integration tests in UnifiedQueryOpenSearchIT that
verify both SQL and ANSI SQL query paths work against a live
OpenSearch cluster.

Also add Alias and Limit handling to CalciteRelNodeVisitor to support
SQL AST nodes that differ from PPL:
- Alias: SQL wraps all select items in Alias nodes
- Limit: SQL uses Limit AST node instead of PPL's Head node
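
A rough sketch of the two additions, assuming simplified stand-in node classes. AstNode, FieldRef, AliasNode, LimitNode, HeadNode, and VisitorSketch are all hypothetical, and the string output stands in for whatever relational operators the real visitor emits:

```java
// Hypothetical, simplified AST nodes; the real ones carry more state.
abstract class AstNode {}

final class FieldRef extends AstNode {
  final String name;
  FieldRef(String name) { this.name = name; }
}

// SQL wraps every select item in an alias node.
final class AliasNode extends AstNode {
  final String name;
  final AstNode delegated;
  AliasNode(String name, AstNode delegated) {
    this.name = name;
    this.delegated = delegated;
  }
}

// SQL-style row limit.
final class LimitNode extends AstNode {
  final int limit;
  final int offset;
  LimitNode(int limit, int offset) {
    this.limit = limit;
    this.offset = offset;
  }
}

// PPL-style row limit.
final class HeadNode extends AstNode {
  final int size;
  HeadNode(int size) { this.size = size; }
}

final class VisitorSketch {
  private VisitorSketch() {}

  /** Unwraps an alias, visits the inner expression, then applies the name. */
  static String visitSelectItem(AstNode item) {
    if (item instanceof AliasNode) {
      AliasNode alias = (AliasNode) item;
      return visitSelectItem(alias.delegated) + " AS " + alias.name;
    }
    if (item instanceof FieldRef) {
      return ((FieldRef) item).name;
    }
    throw new IllegalArgumentException("Unsupported select item: " + item);
  }

  /** Maps both row-limiting nodes onto the same offset/fetch operation. */
  static String visitRowLimit(AstNode node) {
    if (node instanceof LimitNode) {
      LimitNode l = (LimitNode) node;
      return "limit(offset=" + l.offset + ", fetch=" + l.limit + ")";
    }
    if (node instanceof HeadNode) {
      return "limit(offset=0, fetch=" + ((HeadNode) node).size + ")";
    }
    throw new IllegalArgumentException("Unsupported row-limit node: " + node);
  }
}
```

The point of the sketch is that a SQL Limit and a PPL Head with the same row count end up as the same downstream operation, so the rest of the visitor does not need a SQL-specific branch.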
Add two more integration tests to UnifiedQueryOpenSearchIT:

- testSQLWithMatchUDF: Demonstrates OpenSearch SQL with the match()
  relevance function (a PPL UDF) in the WHERE clause, verifying that
  registered UDFs work through the SQL AST path.

- testAnsiSQLSelfJoinWithAggregation: Demonstrates ANSI SQL with a
  self INNER JOIN, WHERE filtering on both sides, GROUP BY, COUNT(*)
  aggregation, and ORDER BY DESC through Calcite's native pipeline.

…tests

Rewrite all 4 new integration tests to use the consistent
verify/expectSchema/expectData assertion pattern with deterministic
queries that return exact known results.

Reflect the new language support in the unified query API README:
- Update UnifiedQueryPlanner description for all three languages
- Add QueryType.SQL and QueryType.ANSI_SQL context examples
- Add planner usage examples for SQL and ANSI SQL
- Remove completed future work item about SQL support

…patch

Remove ANSI_SQL from QueryType enum and introduce a conformance-based
model instead. SQL dialect selection is now orthogonal to language type:

- QueryType has two values: PPL and SQL
- When SQL with no conformance set: OpenSearch ANTLR parser (default)
- When SQL with conformance set: Calcite native SqlParser with that
  conformance level (DEFAULT, MYSQL_5, ORACLE_12, BIG_QUERY, etc.)

This exposes Calcite's SqlConformance directly as SPI, consistent with
the existing approach of exposing Schema, RelNode, and SqlDialect.

Changes:
- Remove ANSI_SQL from QueryType enum
- Add conformance field and builder method to UnifiedQueryContext
- Wire conformance into FrameworkConfig's SqlParser.Config
- Planner dispatches based on conformance presence, not query type
- Update all tests and README to use new API

Signed-off-by: Chen Dai <daichen@amazon.com>
Signed-off-by: Chen Dai <daichen@amazon.com>