[PoC] Unified SQL Support#8
Draft
dai-chen wants to merge 8 commits into
Add two new language paths to UnifiedQueryPlanner alongside PPL:
- SQL: Uses ANTLR-based SQLSyntaxParser and SQL AstBuilder to produce the shared UnresolvedPlan AST, then follows the same CalciteRelNodeVisitor path as PPL to produce a RelNode.
- ANSI_SQL: Bypasses the AST layer entirely and uses Calcite's built-in SqlParser -> SqlValidator -> SqlToRelConverter -> RelNode pipeline.

All three paths converge at RelNode, so UnifiedQueryCompiler and UnifiedQueryTranspiler work unchanged downstream.

Changes:
- Add ANSI_SQL to QueryType enum
- Add sql module dependency to api/build.gradle
- Refactor UnifiedQueryPlanner.plan() to branch on query type
- Refactor buildQueryParser() and parse() for SQL dispatch
- Add planWithCalcite() for ANSI_SQL path using Frameworks.getPlanner
- Update tests to reflect SQL and ANSI_SQL as supported types
Add end-to-end integration tests in UnifiedQueryOpenSearchIT that verify both SQL and ANSI SQL query paths work against a live OpenSearch cluster.

Also add Alias and Limit handling to CalciteRelNodeVisitor to support SQL AST nodes that differ from PPL:
- Alias: SQL wraps all select items in Alias nodes
- Limit: SQL uses Limit AST node instead of PPL's Head node
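The Alias and Limit handling described above can be pictured with a small stand-alone sketch. Everything here is hypothetical: the `Node`, `Field`, `Alias`, `Head`, and `Limit` classes and the `MiniVisitor` helper only borrow the node names from the commit message and are not the real AST classes or CalciteRelNodeVisitor. It also assumes, as a simplification, that Head carries only a row count.

```java
// Hypothetical miniature AST; these classes only mirror the node names in the commit message.
interface Node {}

final class Field implements Node {
    final String name;
    Field(String name) { this.name = name; }
}

final class Alias implements Node {
    final String name;
    final Node child;
    Alias(String name, Node child) { this.name = name; this.child = child; }
}

// PPL-style row cap (assumed here to hold only a size).
final class Head implements Node {
    final int size;
    Head(int size) { this.size = size; }
}

// SQL-style row cap with an optional offset.
final class Limit implements Node {
    final int limit;
    final int offset;
    Limit(int limit, int offset) { this.limit = limit; this.offset = offset; }
}

final class MiniVisitor {
    private MiniVisitor() {}

    /** Output column name: an Alias contributes its own name, a bare Field its field name. */
    static String columnName(Node item) {
        if (item instanceof Alias) return ((Alias) item).name;
        if (item instanceof Field) return ((Field) item).name;
        throw new IllegalArgumentException("unsupported select item: " + item.getClass().getSimpleName());
    }

    /** Head and Limit both reduce to the same {offset, fetch} pair. */
    static int[] offsetAndFetch(Node node) {
        if (node instanceof Head) return new int[] {0, ((Head) node).size};
        if (node instanceof Limit) return new int[] {((Limit) node).offset, ((Limit) node).limit};
        throw new IllegalArgumentException("not a row-limiting node: " + node.getClass().getSimpleName());
    }
}
```

The point of the sketch is that a visitor written for PPL needs only two extra cases to accept the SQL-shaped tree: unwrap the Alias around each select item, and treat Limit as another spelling of the row cap that Head already expresses.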
Add two more integration tests to UnifiedQueryOpenSearchIT:
- testSQLWithMatchUDF: Demonstrates OpenSearch SQL with the match() relevance function (a PPL UDF) in the WHERE clause, verifying that registered UDFs work through the SQL AST path.
- testAnsiSQLSelfJoinWithAggregation: Demonstrates ANSI SQL with a self INNER JOIN, WHERE filtering on both sides, GROUP BY, COUNT(*) aggregation, and ORDER BY DESC through Calcite's native pipeline.
…tests Rewrite all 4 new integration tests to use the consistent verify/expectSchema/expectData assertion pattern with deterministic queries that return exact known results.
Reflect the new language support in the unified query API README:
- Update UnifiedQueryPlanner description for all three languages
- Add QueryType.SQL and QueryType.ANSI_SQL context examples
- Add planner usage examples for SQL and ANSI SQL
- Remove completed future work item about SQL support
…patch
Remove ANSI_SQL from QueryType enum and introduce a conformance-based model instead. SQL dialect selection is now orthogonal to language type:
- QueryType has two values: PPL and SQL
- When SQL with no conformance set: OpenSearch ANTLR parser (default)
- When SQL with conformance set: Calcite native SqlParser with that conformance level (DEFAULT, MYSQL_5, ORACLE_12, BIG_QUERY, etc.)

This exposes Calcite's SqlConformance directly as SPI, consistent with the existing approach of exposing Schema, RelNode, and SqlDialect.

Changes:
- Remove ANSI_SQL from QueryType enum
- Add conformance field and builder method to UnifiedQueryContext
- Wire conformance into FrameworkConfig's SqlParser.Config
- Planner dispatches based on conformance presence, not query type
- Update all tests and README to use new API
Signed-off-by: Chen Dai <daichen@amazon.com>
PoC: SQL and ANSI SQL Support in Unified Query API
Adds SQL language support to the unified query API alongside existing PPL, with two parsing modes:
- OpenSearch SQL (default): Routes through our ANTLR-based SQLSyntaxParser → AstBuilder → CalciteRelNodeVisitor → RelNode. Shares the same AST visitor path as PPL, supporting OpenSearch-specific UDFs like match() and match_phrase().
- Calcite-native SQL (opt-in via conformance): Bypasses our AST layer entirely, using Calcite's built-in SqlParser → SqlValidator → SqlToRelConverter → RelNode pipeline. Supports standard SQL with dialect-specific conformance levels (MySQL, Oracle, BigQuery, Presto, etc.).
Both paths converge at RelNode, so UnifiedQueryCompiler and UnifiedQueryTranspiler work unchanged.
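A minimal sketch of how the dispatch between the paths could look, using only the JDK. The names `ParsePath` and `PlannerDispatch` are hypothetical and do not appear in the PR; conformance is modelled as an optional string such as "MYSQL_5" rather than Calcite's SqlConformance; and the sketch assumes PPL ignores any conformance setting, which this description does not state.

```java
import java.util.Optional;

// Mirrors the two-valued QueryType described in this PR.
enum QueryType { PPL, SQL }

// Hypothetical labels for the parsing paths; not part of the real API.
enum ParsePath { PPL_AST, OPENSEARCH_SQL_AST, CALCITE_NATIVE }

final class PlannerDispatch {
    private PlannerDispatch() {}

    /**
     * Chooses a parsing path. Conformance is modelled as an optional name
     * (for example "MYSQL_5"); the real context would hold a SqlConformance.
     */
    static ParsePath choose(QueryType type, Optional<String> conformance) {
        if (type == QueryType.PPL) {
            // Assumption: PPL always takes its own AST path.
            return ParsePath.PPL_AST;
        }
        // SQL: conformance present means opt-in to Calcite's native parser,
        // otherwise fall back to the OpenSearch ANTLR parser.
        return conformance.isPresent()
            ? ParsePath.CALCITE_NATIVE
            : ParsePath.OPENSEARCH_SQL_AST;
    }
}
```

Whichever branch is taken, the result in the real planner is a RelNode, which is why nothing downstream needs to know which parser produced it.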
API
Changes