Skip to content

expr: refactor VariadicFunc to trait-based dispatch with typed inputs#34985

Draft
antiguru wants to merge 13 commits intoMaterializeInc:mainfrom
antiguru:variadic_func
Draft

expr: refactor VariadicFunc to trait-based dispatch with typed inputs#34985
antiguru wants to merge 13 commits intoMaterializeInc:mainfrom
antiguru:variadic_func

Conversation

@antiguru
Copy link
Member

Summary

  • Refactor the monolithic VariadicFunc enum to a trait-based dispatch pattern with EagerVariadicFunc / LazyVariadicFunc and a derive_variadic! macro, mirroring the existing BinaryFunc architecture.
  • Add a VariadicInput trait with typed Input/Output associated types on EagerVariadicFunc, replacing untyped &[Datum] parameters. Uses composable datum-peeling (&mut &[Datum]) with blanket DatumTypeVariadicInput bridge, OptionalArg, and tuple impls.
  • Default propagates_nulls/introduces_nulls/could_error derived from Input/Output types, removing boilerplate from most function impls.

Test plan

  • cargo check -p mz-expr — core crate compiles
  • cargo check -p mz-sql -p mz-transform -p mz-expr-parser — consumers compile
  • cargo test -p mz-expr — all 532 tests pass

🤖 Generated with Claude Code

@github-actions
Copy link

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

antiguru and others added 12 commits February 13, 2026 11:10
Convert VariadicFunc from a monolithic enum with hand-written match arms
to the same trait-based pattern used by UnaryFunc and BinaryFunc. Each of
the 42 variants becomes its own struct implementing either LazyVariadicFunc
(for lazy-evaluated functions like And/Or) or EagerVariadicFunc (for eager
functions), with a derive_variadic! macro generating the dispatch enum.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use the modern Rust module layout convention.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Change switch_and_or, unit_of_and_or, and zero_of_and_or to return
Option instead of panicking on non-And/Or variants. Refactor call sites
to use the Option directly, removing redundant And/Or match guards.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduce a `VariadicInput` trait that converts a datum slice into typed
function inputs, replacing the untyped `&[Datum<'a>]` parameter on
`EagerVariadicFunc::call`. This follows the same pattern as
`EagerBinaryFunc` with `Input1`/`Input2`/`Output` associated types.

Key design points:

- Two lifetimes `<'a: 'b, 'b>` where `'a` is datum content and `'b` is
  the slice borrow, allowing `type Input = &'b [Datum<'a>]` without
  requiring the slice to live for `'a`.

- `try_from_datums(&mut &'b [Datum<'a>])` peels consumed elements off
  the front, enabling composable parsing.

- Blanket `impl<T: DatumType> VariadicInput for T` bridges single
  elements; `OptionalArg<T>` and `&'b [Datum<'a>]` have direct impls.
  Tuple impls delegate to each element's `VariadicInput`.

- `impl_lazy_from_eager!` macro replaces the blanket `LazyVariadicFunc`
  impl since Rust HRTB cannot express `for<'a: 'b, 'b>`.

- Default `propagates_nulls`/`introduces_nulls`/`could_error` are
  derived from `Input`/`Output` types, removing boilerplate from most
  function impls.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Change VariadicInput::try_from_datums to take an iterator of
Result<Datum<'a>, E> instead of a slice reference. This avoids
allocating a Vec for fixed-arity variadic functions by lazily
evaluating expressions from the iterator.

Add Present<T> wrapper for DatumType elements in tuple inputs to
avoid coherence issues with blanket impls. Update all 30 eager
function implementations to use Present<T> wrapping and Vec<Datum>
for truly variadic inputs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Moritz Hoffmann <antiguru@gmail.com>
Replace `Result<Datum<'a>, EvalError>` with concrete types where
possible so that `introduces_nulls`, `could_error`, and `output_type`
are correctly derived from the type system rather than needing manual
overrides.

Functions updated:
- String returns (&'a str): Concat, ConcatWs, PadLeading, Substr,
  Replace, Translate, SplitPart, RegexpReplace
- Bytes returns (&'a [u8]): HmacString, HmacBytes, hmac_inner
- Timestamp returns: DateBinTimestamp, DateBinTimestampTz
- Integer returns (i64): DateDiff{Timestamp,TimestampTz,Date,Time}
- Time returns (NaiveTime): TimezoneTime
- ACL returns: MakeAclItem (AclItem), MakeMzAclItem (MzAclItem)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Generalize the Vec VariadicInput impl from Vec<Datum<'a>> to
Vec<T: VariadicInput>, enabling structured element types like
Present<Option<&str>> and (Present<K>, Present<V>) tuples.

Key changes:
- Concat/ConcatWs: Vec<Present<Option<&str>>> with owned String output
- JsonbBuild*: Vec<Present<Datum>> / Vec<(Present<Datum>, Present<Datum>)>
- MapBuild: Vec<(Present<Option<&str>>, Present<Datum>)>
- ArrayIndex: (Present<Array>, Vec<Present<i64>>) with inlined impl
- ListIndex: (Present<DatumList>, Vec<Present<i64>>)
- ArrayToString: Present<Array> input with owned String output
- MakeTimestamp: Result<Option<CheckedTimestamp<...>>, EvalError> output
- Replace/Translate: owned String output (no temp_storage needed)
- Add Deref impls and derive macros for Present<T> and OptionalArg<T>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Moritz Hoffmann <antiguru@gmail.com>
The prior commit removing is_infix_op was too aggressive — And/Or need
infix display formatting for EXPLAIN output (e.g., `(a AND b AND c)`
instead of `AND(a, b, c)`).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Moritz Hoffmann <antiguru@gmail.com>
…unctions

Fix output_type to correctly compute nullability as
`output.nullable || (propagates_nulls && in_nullable)` across all
EagerVariadicFunc impls, matching the semantics expected by the
smoketest_all_builtins test.

Add explicit `introduces_nulls() -> false` overrides for functions with
`Result<Datum<'a>, EvalError>` output that never produce null (MapBuild,
ArrayCreate, ListCreate, RecordCreate, ListSliceLinear, RangeCreate,
ArrayFill, RegexpSplitToArray).

Fix RegexpMatch to use `Present<&'a str>` instead of `Present<Datum<'a>>`
for the haystack input, so null propagation is handled by the type system
rather than panicking in regexp_match_static.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Consolidate the variadic function implementations into a single file
to reduce the diff to main.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant