Docs gap: UDA process() receives _tp_delta column for changelog inputs (JS/Py/Remote UDA) #631

@yokofly

Problem

When a JavaScript / Python / Remote UDA reads from a changelog input (CDC stream, versioned_kv, EMIT CHANGELOG, or downstream of a global aggregation), the engine appends an extra trailing _tp_delta column (+1 for insert, -1 for retract) to the arguments passed to the UDA's process(...). The UDA must use this delta to add or subtract from its accumulated state.
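To make the contract concrete, here is a minimal sketch in plain Python. It only models the behavior described above and is not the actual Timeplus UDA API: the class name `SignedSum`, the parameter name `tp_delta`, and the `finalize()` method are assumptions for illustration. Each batch is a value column plus a parallel delta column.

```python
class SignedSum:
    """Hypothetical sum aggregate; models the contract, not the real UDA API."""

    def __init__(self):
        self.total = 0

    def process(self, values, tp_delta):
        # values and tp_delta are parallel column arrays for one batch.
        for value, delta in zip(values, tp_delta):
            # delta is +1 for an insert and -1 for a retract,
            # so multiplying adds or subtracts the row's value.
            self.total += value * delta

    def finalize(self):
        return self.total


agg = SignedSum()
agg.process([10, 20, 30], [1, 1, 1])  # three inserts
agg.process([20, 25], [-1, 1])        # an update: retract 20, insert 25
result = agg.finalize()
```

After the update, the state reflects the rows 10, 25, and 30. A `process()` that ignored the delta column would have added both 20 and 25 instead.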

This contract is implemented in the engine — e.g. AggregateFunctionJavaScriptAdapter.h:

"If the input stream is changelog, aggregate function will pass _tp_delta column to JavaScript function"

and AggregateFunctionPythonAdapter.cpp does the same for Python adapters (it inserts +1 on add and -1 on negate as the trailing column).

But it is not documented anywhere in the UDA pages:

  • docs/js-udf.md
  • docs/py-udf.md
  • docs/remote-udf.md

_tp_delta is documented as a general changelog-stream concept (docs/changelog-stream.md, docs/global-aggregation.md, docs/streaming-aggregations.md). But a user writing a UDA has no way to learn that their process() will receive an extra trailing _tp_delta array when the input is a changelog. A UDA that ignores it treats retractions as inserts and produces silently wrong aggregates.

Suggested fix

Add a "Changelog input" subsection to each UDA page covering:

  • when the extra _tp_delta argument is appended (changelog inputs only)
  • its shape (trailing column, same length as the other argument arrays, values ∈ {+1, -1})
  • a small worked example showing add-vs-subtract handling in process()
  • interaction with has_customized_emit
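As a starting point for such a worked example, the following plain-Python sketch covers the first three bullets. It assumes (hypothetically) that `process()` gets the delta column only for changelog inputs, modeled here as an optional `tp_delta` argument. The class name, the optional parameter, and the validation are illustrative and not taken from the engine. The interaction with has_customized_emit is not modeled.

```python
class ChangelogAwareAvg:
    """Hypothetical average aggregate that handles both input kinds."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def process(self, values, tp_delta=None):
        # Append-only input: no delta column, so every row is an insert.
        if tp_delta is None:
            tp_delta = [1] * len(values)
        # Shape check: the delta column is parallel to the value column.
        if len(tp_delta) != len(values):
            raise ValueError("tp_delta must have the same length as values")
        # Validate before mutating state so a bad batch changes nothing.
        if any(delta not in (1, -1) for delta in tp_delta):
            raise ValueError("tp_delta values must be +1 or -1")
        for value, delta in zip(values, tp_delta):
            self.total += value * delta  # add on insert, subtract on retract
            self.count += delta          # the row count moves the same way

    def finalize(self):
        # No rows left after retractions means there is no average.
        return self.total / self.count if self.count else None


avg = ChangelogAwareAvg()
avg.process([4.0, 8.0])              # append-only batch
before = avg.finalize()
avg.process([8.0, 12.0], [-1, 1])    # changelog batch: 8.0 becomes 12.0
after = avg.finalize()
```

Note that both the running total and the row count have to follow the delta. Adjusting only one of them gives a wrong average after a retract.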
