[GLUTEN-12157][VL] Fix silently-skipped math/scalar test suites; add Velox native tests for sin, tan, tanh, radians, ln#12158
Conversation
|
Hi @brijrajk, apologies for the parallel work, I didn't see your PR was already up when I started. I had an upstream Velox fix in flight at the same time: facebookincubator/velox#17648. Heads up on So a change on Velox is indeed inevitable, the Velox PR adds a Your test-framework fixes ( Happy to coordinate whichever direction maintainers prefer. |
Thanks for the detailed explanation and for flagging the parallel work — no worries at all. You're right on ln; I've dropped it from this PR so only the four semantically safe functions (sin, tan, tanh, radians) land here, along with the test-framework fixes. Once your Velox PR (facebookincubator/velox#17648) lands upstream and Gluten picks up the new sparksql::LnFunction, ln can follow in a separate PR. |
6270b01 to
9aa5e1b
Compare
|
|
||
| // Register math functions that are present in the prestosql implementation | ||
| // but not yet in the sparksql registry. These are semantically identical | ||
| // to Spark's behavior for the same names. |
There was a problem hiding this comment.
Thanks for the PR. Would it make sense to refactor the Velox code, moving these functions to a common location and registering them in Spark's registerMathFunctions, provided we verify their implementations are identical?
There was a problem hiding this comment.
Yes, that would make sense. As @n0r0shi noted above, they were already working on registering these in Velox's sparksql::registerMathFunctions (facebookincubator/velox#17648) — including a proper sparksql::LnFunction for Spark's null semantics — when this PR went up.
Given that upstream work is in flight, I'd propose narrowing this PR to just the test-framework fixes (abstract class → class, ANSI disable) and the four Scala integration tests. Those are independently valuable since those suites are silently not running today, and the tests will validate native execution once the Velox changes land in Gluten.
Happy to drop the C++ registrations if that approach works for you.
There was a problem hiding this comment.
@brijrajk, thank you for the clarification. I was just aware of the above discussion. That makes sense.
Perhaps, you may set the Velox PR ID in the following position for verifying that PR on Gluten side.
gluten/ep/build-velox/src/get-velox.sh
Line 28 in ce6e16f
There was a problem hiding this comment.
Done — set UPSTREAM_VELOX_PR_ID=17648 in get-velox.sh so CI builds Gluten with facebookincubator/velox#17648 applied and validates the five Scala integration tests natively. Also added ln back with an edge-case check verifying ln(0) and ln(-1) return null. Will clear UPSTREAM_VELOX_PR_ID and mark the PR ready once that Velox change lands in Gluten.
2568c79 to
831b2f5
Compare
|
Thanks for marking it draft, @philo-he. The PR is now narrowed to the test-framework fixes and Scala integration tests, with UPSTREAM_VELOX_PR_ID=17648 wired for CI validation. Will mark ready once facebookincubator/velox#17648 lands. |
@brijrajk, this PR totally makes sense. Generally, the community recommends marking a PR as draft if it's not ready for review and merge. Thank you so much for the update! I just triggered the CI (manual trigger is required for a new contributor's first PR). It looks like there are some Scala code formatting issues. You can run |
…Velox native tests for sin, tan, tanh, radians, ln MathFunctionsValidateSuite and ScalarFunctionsValidateSuite were left as abstract classes after PR apache#11756 (RAS removal) deleted their only concrete subclasses, causing all tests in those suites to silently not run. Promote both to concrete classes and disable ANSI mode (enabled by default in Spark 4) so checkGlutenPlan[ProjectExecTransformer] works correctly. Add Scala integration tests for sin, tan, tanh, radians, and ln. The ln test includes an edge-case check that ln(0) and ln(-1) return null, not -Infinity/NaN, matching Spark's Hive-inherited semantics. The native registrations for these five functions are provided by facebookincubator/velox#17648 (sparksql::registerMathFunctions). Set UPSTREAM_VELOX_PR_ID=17648 to validate the full pipeline on CI while that Velox PR is in flight. This will be cleared before merge once the Velox change lands in Gluten.
831b2f5 to
a55f75f
Compare
Thank you for the kind words @philo-he and for triggering CI! Fixed the Scala formatting by running ./dev/format-scala-code.sh and pushed. |
What changes are proposed in this pull request?
MathFunctionsValidateSuiteandScalarFunctionsValidateSuitewere left asabstract classafter PR #11756 (RAS removal) deleted their only concretesubclasses, causing all tests in those suites to silently not run. Both are
promoted to concrete classes here, consistent with the fix already applied to
DateFunctionsValidateSuitein that same PR.ANSI mode (enabled by default in Spark 4) wraps arithmetic expressions in ANSI
check nodes, shifting the top-level plan node away from
ProjectExecTransformer.MathFunctionsValidateSuitenow explicitly disables ANSI mode; ANSI-specificbehaviour is covered by the existing
MathFunctionsValidateSuiteAnsiOn.Adds Scala integration tests for
sin,tan,tanh,radians, andln. Thelntest includes an edge-case check thatln(0)andln(-1)returnnullrather than
-Infinity/NaN, matching Spark's Hive-inherited semantics.The native Velox registrations for these five functions are provided by
facebookincubator/velox#17648.
UPSTREAM_VELOX_PR_ID=17648is set inget-velox.shto validate the full pipelineon CI while that Velox PR is in flight. It will be cleared before merge once the
Velox change lands in Gluten.
Fixes #12157
How was this patch tested?
Tests in
MathFunctionsValidateSuiteuserunQueryAndComparewithcheckGlutenPlan[ProjectExecTransformer]to verify native execution without fallback.The
lnedge-case test usescompareResultsAgainstVanillaSparkto explicitly validatenull semantics against vanilla Spark.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude (Anthropic)