Skip to content

ENH: Rewrite ArrayCalculator filter with new expression engine#1567

Open
joeykleingers wants to merge 13 commits intoBlueQuartzSoftware:developfrom
joeykleingers:worktree-ArrayCalculatorRewrite
Open

ENH: Rewrite ArrayCalculator filter with new expression engine#1567
joeykleingers wants to merge 13 commits intoBlueQuartzSoftware:developfrom
joeykleingers:worktree-ArrayCalculatorRewrite

Conversation

@joeykleingers
Copy link
Contributor

@joeykleingers joeykleingers commented Mar 23, 2026

Summary

Complete rewrite of the ArrayCalculator filter, replacing ~65 utility class files with a consolidated expression engine in a single algorithm file pair.

What's better about the new version

  • Dramatically simpler codebase: ~65 separate operator/utility class files (each with .hpp/.cpp) consolidated into ArrayCalculator.hpp/cpp. Adding a new operator is one line in a registry table instead of two new files.
  • Robust hand-written tokenizer replacing the fragile regex-based parser that was difficult to maintain and debug.
  • Static operator registry table with all 23 operators defined as data, replacing 30+ class files with deep inheritance hierarchies where the actual logic was a one-line lambda.
  • Clean shunting-yard RPN evaluator with proper precedence, associativity, and trig mode handling.
  • 7 bugs fixed from the original implementation: null-check-after-dereference, off-by-one in component extraction, dead code, non-unique scratch names, and more.
  • All index variables use usize — no integer overflow risk with large arrays.
  • CalculatorParameter changed from MutableDataParameter to ValueParameter — removes the framework-rendered data path selector that is no longer needed.

New features

  • Arrays from anywhere in the DataStructure: No longer restricted to a single AttributeMatrix. Bare array names are auto-searched across the entire DataStructure. If ambiguous, users can disambiguate with quoted full paths (e.g., "CellData/Confidence Index").
  • Selected group as priority hint: When a selected_group is present (from old pipelines), it takes priority for bare name resolution, ensuring backward compatibility.
  • Built-in constants: pi and e resolve to std::numbers::pi and std::numbers::e with full double precision.
  • Modulo operator: % for std::fmod(a, b).
  • Tuple+component indexing: Array[T, C] extracts a single scalar value at tuple T, component C. Broadcasts like a literal when combined with arrays.
  • Sub-expression component access: (ArrayA + ArrayB)[0] extracts component 0 from the result of a computed expression.
  • Sub-expression tuple+component access: (ArrayA + ArrayB)[T, C] extracts a single scalar value at tuple T, component C from a computed expression result. Broadcasts like a literal.
  • Quoted full-path references: "DataContainer/CellData/Array Name" resolves directly by DataPath.

Backward compatibility

  • All existing unit tests pass unchanged (3689 assertions) — this was the hard gate before any old code was removed.
  • Old pipeline JSON loads and executes identically: The selected_group key is still read from pipeline JSON. When present, it acts as a priority hint for bare name resolution, producing the same behavior as the old filter.
  • FromSIMPLJson conversion still works: Legacy SIMPL pipelines convert correctly.
  • Error code numeric values preserved: External code checking specific error codes continues to work.
  • parametersVersion bumped to 2: fromJsonImpl handles both version 1 and version 2 JSON seamlessly.
  • Backward-compatible pipeline tested via nxrunner with non-empty selected_group — all steps execute successfully.

Known issue: TBB crash on exit in Debug builds

These changes surface a pre-existing crash-on-exit in Debug builds only. The crash occurs in a TBB worker thread (tbb::detail::r1::rml::private_worker::run()) jumping to unmapped memory during exit() / __cxa_finalize_ranges. It does not occur in Release builds.

Root cause: Debug builds link libtbb_debug.12.4.dylib (from vcpkg/simplnx), but VTK is built in Release and links libtbb.12.4.dylib (release). Both TBB runtimes are loaded into the same process simultaneously. During shutdown, the release TBB's worker threads try to execute tasks whose code has been unloaded, crashing at an invalid address.

How to fix: Ensure VTK and simplnx link the same TBB variant. Options:

  1. Build VTK in Debug mode for Debug builds so it also links libtbb_debug.
  2. Configure the Debug build to use release TBB (libtbb instead of libtbb_debug) so it matches VTK.
  3. Add a TBB finalization call (e.g., oneapi::tbb::finalize()) in DREAM3D-NX's shutdown path to drain worker threads before libraries are unloaded.

This issue is not caused by the ArrayCalculator changes — it existed before but was not triggered consistently. The code changes in this PR shift the shared library layout enough to make the timing race reproducible.

Companion PR

This PR must be merged together with the DREAM3D-NX companion PR BlueQuartzSoftware/DREAM3DNX#1087 — the DREAM3D-NX widget changes depend on the CalculatorParameter base class change (MutableDataParameterValueParameter) and the new engine API.

Test Plan

  • All 8 existing + new ArrayCalculatorFilter test cases pass (100%)
  • Backward compatibility pipeline tested via nxrunner
  • DREAM3D-NX Release build compiles and runs
  • No crash on exit in Release builds

Signed-off-by: Joey Kleingers joey.kleingers@bluequartz.net

Rewrite ArrayCalculator.hpp with new enum classes (CalculatorErrorCode,
CalculatorWarningCode, TokenType), value structs (Token, OperatorDef,
CalcValue, RpnItem), parser class (ArrayCalculatorParser), and algorithm
class (ArrayCalculator). The .cpp provides stub implementations so the
algorithm files compile independently; ArrayCalculatorFilter.cpp will
break until it is updated to use the new API in the next task.
…it tests

Single-pass character scanner that produces tokens for numbers, identifiers,
quoted strings, operators, brackets, and commas with position tracking.
…lator

Populate getOperatorRegistry() with the complete table of binary infix
operators (+, -, *, /, %, ^), unary math functions (abs, sqrt, ceil,
floor, exp, ln, log10), trig functions (sin, cos, tan, asin, acos, atan),
and binary functions (log, root, min, max).  log10 is placed before log
in the vector so prefix-based identifier matching resolves correctly.
Add the core parsing logic that converts tokens into typed ParsedItems
ready for shunting-yard conversion. Includes multi-word identifier
merging, identifier resolution with priority chain (operator > constant
> selected group array > unique DataStructure array), bracket indexing
for component and tuple+component extraction, minus sign disambiguation
(unary negative vs binary subtraction), WrapFunctionArguments for
multi-arg functions, and validation (matched parentheses, consistent
tuple counts and component shapes).
…Calculator

Add the final engine components to complete the ArrayCalculator rewrite:
- Shunting-yard algorithm at the end of parse() converts ParsedItems to RPN
- RPN stack evaluator in evaluateInto() with unary/binary ops, scalar
  broadcasting, trig angle unit conversion, and component extraction
- CopyResultFunctor for type-dispatched output with scalar fill support
- ArrayCalculator::operator()() wires the parser to the filter algorithm
Relax CalculatorParameter::validate() to accept empty or non-existent
selected groups (the path is only a resolution hint). Rewrite
preflightImpl() to use ArrayCalculatorParser::parseAndValidate() and
executeImpl() to delegate to the ArrayCalculator algorithm class. Remove
old ICalculatorArray include and bump parametersVersion to 2.
…yCalculator

Replace CalculatorItem:: enum references with CalculatorErrorCode:: and
CalculatorWarningCode:: in ArrayCalculatorTest.cpp.  Remove the old
CalculatorItem.hpp include since the test now uses the new
ArrayCalculator.hpp header exclusively.

Add validation in the parser to emit the same legacy error codes:
- OperatorNoOpeningParen / OperatorNoClosingParen for bare functions
- OperatorNoLeftValue / OperatorNoRightValue for binary operators
- TooManyArguments / NotEnoughArguments for function argument counts
- NoPrecedingUnaryOperator for commas in non-function parens
- AmbiguousNameWarning for operator symbols matching array names
…ameter

The selected group is no longer a required data path -- it is just a
resolution hint. Switching to ValueParameter removes the framework-rendered
data path selector from the UI.
@joeykleingers joeykleingers added the enhancement New feature or request label Mar 23, 2026
Signed-off-by: Joey Kleingers <joey.kleingers@bluequartz.net>
Adds TupleComponentExtract as a new RPN item type that extracts a single
scalar value from a computed sub-expression result at a specific tuple
and component index. Produces a Number that broadcasts like a literal.

Signed-off-by: Joey Kleingers <joey.kleingers@bluequartz.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant