Feature/ppl udt compilation adapter #19

Draft
dai-chen wants to merge 1 commit into main from feature/ppl-udt-compilation-adapter

@dai-chen (Owner) commented May 7, 2026

Add preCompilationRules to LanguageSpec.LanguageExtension and register DatetimeUdfCompilationAdapterRule in DatetimeExtension. The rule inserts CAST nodes to bridge the type mismatch between normalized standard types (DATE/TIME/TIMESTAMP as int/long) and PPL UDF implementors (which expect and produce String values):

Before: LAST_DAY($2:DATE) : DATE
After: CAST(LAST_DAY(CAST($2 AS VARCHAR)):VARCHAR AS DATE)

Applied only in UnifiedQueryCompiler before Enumerable code generation, so the logical plan seen by other consumers (Analytics Engine, Substrait) remains clean with standard types.
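The rewrite above can be sketched without any Calcite dependency. This is a minimal illustration only, not the code in this PR: `Expr`, `InputRef`, `Call`, `Cast`, `DATETIME_UDFS`, and `adapt` are hypothetical stand-ins for Calcite's row expressions and for DatetimeUdfCompilationAdapterRule, and types are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CastAdapterSketch {
    // Hypothetical typed expression node; a stand-in, not Calcite's RexNode.
    static abstract class Expr {
        final String type;
        Expr(String type) { this.type = type; }
    }

    static final class InputRef extends Expr {
        final int index;
        InputRef(int index, String type) { super(type); this.index = index; }
        @Override public String toString() { return "$" + index; }
    }

    static final class Call extends Expr {
        final String op;
        final List<Expr> operands;
        Call(String op, String type, List<Expr> operands) {
            super(type);
            this.op = op;
            this.operands = operands;
        }
        @Override public String toString() {
            StringBuilder sb = new StringBuilder(op).append('(');
            for (int i = 0; i < operands.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(operands.get(i));
            }
            return sb.append(')').toString();
        }
    }

    static final class Cast extends Expr {
        final Expr operand;
        Cast(Expr operand, String targetType) { super(targetType); this.operand = operand; }
        @Override public String toString() { return "CAST(" + operand + " AS " + type + ")"; }
    }

    // Assumed registry of UDFs whose implementors work on String values.
    static final Set<String> DATETIME_UDFS = new HashSet<>(Arrays.asList("LAST_DAY"));

    static boolean isDatetime(String type) {
        return type.equals("DATE") || type.equals("TIME") || type.equals("TIMESTAMP");
    }

    // Rewrites bottom-up: datetime operands of a datetime UDF are cast to VARCHAR,
    // and a datetime result is cast back from VARCHAR to its original type.
    static Expr adapt(Expr e) {
        if (e instanceof Cast) {
            Cast c = (Cast) e;
            return new Cast(adapt(c.operand), c.type);
        }
        if (!(e instanceof Call)) {
            return e; // input references are left untouched
        }
        Call call = (Call) e;
        List<Expr> operands = new ArrayList<>();
        for (Expr operand : call.operands) {
            operands.add(adapt(operand));
        }
        if (!DATETIME_UDFS.contains(call.op)) {
            return new Call(call.op, call.type, operands);
        }
        List<Expr> bridged = new ArrayList<>();
        for (Expr operand : operands) {
            bridged.add(isDatetime(operand.type) ? new Cast(operand, "VARCHAR") : operand);
        }
        boolean datetimeResult = isDatetime(call.type);
        Expr adapted = new Call(call.op, datetimeResult ? "VARCHAR" : call.type, bridged);
        return datetimeResult ? new Cast(adapted, call.type) : adapted;
    }

    public static void main(String[] args) {
        Expr before = new Call("LAST_DAY", "DATE", Arrays.<Expr>asList(new InputRef(2, "DATE")));
        System.out.println(adapt(before)); // CAST(LAST_DAY(CAST($2 AS VARCHAR)) AS DATE)
    }
}
```

The printed form omits the `:VARCHAR` annotation on the inner call that the Before/After example shows, because this toy `toString` does not render intermediate types. Calls outside the assumed registry pass through unchanged, which mirrors the intent that only datetime UDFs are wrapped.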

Description

[Describe what this change achieves]

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

… UDF

Add preCompilationRules to bridge the type mismatch between normalized
standard types (DATE/TIME/TIMESTAMP as int/long) and PPL UDF implementors
(which expect and produce String values):

1. DatetimeUdfCompilationAdapterRule inserts CAST nodes around datetime
   UDFs so implementors receive String input and produce String output,
   with CASTs bridging int/long <-> String conversion.

2. DatetimeUdtNormalizeRule enhanced to handle LogicalAggregate (rebuild
   AggregateCall with re-inferred types) and LogicalProject (refresh
   RexInputRef types from new child row type) to prevent type mismatch
   assertions when datetime UDF results feed into aggregates.

Both fixes are only needed for the UnifiedQueryCompiler (Enumerable)
path. The Analytics Engine (Substrait/DataFusion) path is unaffected.
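The LogicalProject part of item 2 (refreshing input reference types from the new child row type) can be illustrated with a small sketch. It assumes a simplified model in which a row type is a list of type names and a reference caches the type it was built with; `InputRef`, `refresh`, and the `DATE_UDT` type name are hypothetical and are not taken from the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InputRefRefreshSketch {
    // Hypothetical field reference that caches the type of the field it points at.
    static final class InputRef {
        final int index;
        final String type;
        InputRef(int index, String type) {
            this.index = index;
            this.type = type;
        }
        @Override public String toString() { return "$" + index + ":" + type; }
    }

    // Rebuilds every reference with the type the child currently reports for that
    // field, so a parent never carries a type that disagrees with its input.
    static List<InputRef> refresh(List<InputRef> refs, List<String> childRowType) {
        List<InputRef> refreshed = new ArrayList<>();
        for (InputRef ref : refs) {
            refreshed.add(new InputRef(ref.index, childRowType.get(ref.index)));
        }
        return refreshed;
    }

    public static void main(String[] args) {
        // After normalization the child exposes DATE where it previously exposed a UDT.
        List<String> childRowType = Arrays.asList("INTEGER", "DATE");
        List<InputRef> stale = Arrays.asList(
            new InputRef(1, "DATE_UDT"), // stale: built before the child was rewritten
            new InputRef(0, "INTEGER"));
        System.out.println(refresh(stale, childRowType)); // [$1:DATE, $0:INTEGER]
    }
}
```

Rebuilding an aggregate call with a re-inferred type follows the same idea: the cached type is recomputed from the rewritten input instead of being carried over.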

Signed-off-by: Chen Dai <daichen@amazon.com>
@dai-chen dai-chen force-pushed the feature/ppl-udt-compilation-adapter branch from 8995649 to 7be9575 Compare May 8, 2026 18:54