[RFC] Implicit Type Coercion with Calcite

 ### Problem Statement

Implicit type coercion is missing from the current PPL validation. It is useful in at least two scenarios:

1. when the actual argument types of a function don't match with its expected types (should coerce to expected param type)
2. when the two operand types of a comparator are different (should coerce to a common type)

**Relevant issue**: https://github.com/opensearch-project/sql/issues/3761

**Status update**:

- Approach 1: https://github.com/yuancu/sql-plugin/pull/6 Discarded due to the re-introduction of `SqlNode` (**Update**: picked up with #4892)
- Approach 2 (alternative proposal): https://github.com/opensearch-project/sql/pull/3914 Merged

### Current State

As a result, the current function implementation must have an override for each parameter type it supports, even when they could be implicitly cast. For example, a function `foo` that accepts a date as an argument should also accept `"2020-12-10"` as an argument. However, without implicit type coercion, we have to define both `foo(date)` and `foo(string)`.

### How Calcite Implements Implicit Type Coercion

Calcite checks operand types and attempts to coerce them if necessary during the **validation phase**.

For example, in FamilyOperandTypeChecker, when an argument type doesn't match expectations, Calcite tries to determine if the argument can be coerced to the expected type. The simplified [code snippet](https://github.com/apache/calcite/blob/dafb8d7a1fde884daf76d2a24aa6aea9c39ee5ca/core/src/main/java/org/apache/calcite/sql/type/FamilyOperandTypeChecker.java#L144) below illustrates this process:

```java
 public boolean checkOperandTypes(
      SqlCallBinding callBinding,
      boolean throwOnFailure){
      ...
if (!checkSingleOperandType(callBinding, op.e, op.i, families.get(op.i), false)) {
	    // try to coerce type if it is allowed.
	  boolean coerced = false;
	  if (callBinding.isTypeCoercionEnabled()) {
	    TypeCoercion typeCoercion = callBinding.getValidator().getTypeCoercion();
	    ImmutableList.Builder<RelDataType> builder = ImmutableList.builder();
	    for (int i = 0; i < callBinding.getOperandCount(); i++) {
	      builder.add(callBinding.getOperandType(i));
	    }
	    ImmutableList<RelDataType> dataTypes = builder.build();
	    coerced = typeCoercion.builtinFunctionCoercion(callBinding, dataTypes, families);
	  }
}
```

In this process, `builtinFunctionCoercion` calls [`coerceOperandType`](https://github.com/apache/calcite/blob/dafb8d7a1fde884daf76d2a24aa6aea9c39ee5ca/core/src/main/java/org/apache/calcite/sql/validate/implicit/AbstractTypeCoercion.java#L119C1-L128C17) to rewrite the operands when needed. The [coerceOperandType](https://github.com/apache/calcite/blob/dafb8d7a1fde884daf76d2a24aa6aea9c39ee5ca/core/src/main/java/org/apache/calcite/sql/validate/implicit/AbstractTypeCoercion.java#L119C1-L128C17) method rewrites the query by wrapping the argument with a `CAST` call:

```java
 protected boolean coerceOperandType(SqlValidatorScope scope, SqlCall call, int index, RelDataType targetType) {
		 ...
    RelDataType targetType1 = syncAttributes(operandType, targetType);
    SqlNode desired = castTo(operand, targetType1);
    call.setOperand(index, desired);
    updateInferredType(desired, targetType1);
    return true;
  }
```

To determine whether an argument can be cast to the target type, Calcite defines rules in 
`SqlTypeCoercionRule` and `SqlTypeAssignmentRule`. Additionally, the `TypeCoercion` interface defines how a type should be cast to another.

### Challenges in Applying Calcite's Type Coercion

The validation phase of Calcite SQL validates a parsed `SqlNode` AST. It resolves identifiers, derives return types, validates argument types, etc. However, since we created a new language interface outside of SQL, this phase isn't directly applicable and is **skipped** in our current implementation.

As a result, we can't directly utilize Calcite's type coercion mechanism.


### Proposal

We faced a similar issue with function argument type checking. We [implemented](https://github.com/opensearch-project/sql/pull/3626) our own argument validation initially. ~~However, later we [discovered](https://github.com/opensearch-project/sql/pull/3626#discussion_r2104225644) that other language interfaces—like [calcite-piglet](https://calcite.apache.org/javadocAggregate/org/apache/calcite/piglet/package-summary.html)—successfully leverage Calcite's type validation.~~ **Update**: with a deeper inspection, I found that piglet actually uses its own type check, without re-using Calcite's type checking mechanism at all.

Calcite-piglet translates its logical plans into a SQL AST (represented with `SqlNode`) to reuse Calcite's type checking. It first constructs its own AST via a custom parser, then converts each node into a corresponding `SqlNode`. ~~This approach allows it to access Calcite's standard validation flow and leverage its built-in type checking. The key implementation is in [PigRelToSqlConverter](https://calcite.apache.org/javadocAggregate/org/apache/calcite/piglet/PigRelToSqlConverter.html).~~

~~I propose we adopt a similar approach by converting PPL to `SqlNode`.~~ However, we could still try doing so. This should eliminate many future issues by reusing more of Calcite's functionality.

**My concern over this proposal**

With this approach, we **re-introduce** the `SqlNode` layer. This contradicts our previous design, which skips the layer.

- Calcite's workflow
  ```mermaid
  flowchart LR
  SQL -->|parse| SqlNode1[SqlNode] -->|validation| SqlNode2[SqlNode] -->|convert| RelNode1[RelNode] -->|optimize| RelNode2[RelNode]
  ```
- Our existing design
  ```mermaid
  flowchart LR
  PPL -->|parse| AST -->|convert| RelNode1[RelNode] -->|optimize| RelNode2[RelNode]
  ```
- With this proposal (the blocks in blue are added phases)
  ```mermaid
  flowchart LR
  PPL -->|parse| AST -->|convert| RelNode1[RelNode] -->|convert| SqlNode1[SqlNode]:::blue -->|validate| SqlNode2[SqlNode]:::blue -->|convert| RelNode2[RelNode]:::blue -->|optimize| RelNode3[RelNode]
  
  classDef blue fill:#2196F3,stroke:#0D47A1,color:white
  ```

It's like converting PPL to SQL for validation purpose, making people doubting why is the SqlNode phase skipped at the first place.

### Approach

This solution requires the following steps:

- [ ]  Convert PPL's logical plan to SqlNode by creating a similar `PplRelToSqlConverter`
    - [ ]  The handling of UDT needs more investigation
- [ ]  Refactor the current argument type validation with the converted SQL AST
- [ ]  Implement function type coercion based on this

**Long-Term Goals**
* Automatic coercion makes defining functions simpler and easier to maintain.
* Make use of more Calcite's validation mechanism to reduce maintenance effort and make the implementation more robust.

**Alternative**

```mermaid
flowchart LR
PPL -->|parse| AST -->|convert & cast| RelNode1[RelNode]:::blue -->|optimize| RelNode3[RelNode]

classDef blue fill:#2196F3,stroke:#0D47A1,color:white
```

- Make coercion directly at the `RelNode` level
  - Merits: Aligns with our existing design
  - Potential drawbacks: We make less use of what's already there in Calcite. It may leads to more investment in validation in the future. For example, we could have eliminated the effort to implement type checkers if we introduced the `SqlNode` layer.

**Implementation Discussion**



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[RFC] Implicit Type Coercion with Calcite #3865

Problem Statement

Current State

How Calcite Implements Implicit Type Coercion

Challenges in Applying Calcite's Type Coercion

Proposal

Approach

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[RFC] Implicit Type Coercion with Calcite #3865

Description

Problem Statement

Current State

How Calcite Implements Implicit Type Coercion

Challenges in Applying Calcite's Type Coercion

Proposal

Approach

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions