CI shows all three parse_url tests failing on Spark 4.0 with COMET: invoke is not supported. This happens because ParseUrl was changed to RuntimeReplaceable in Spark 4.0. It gets replaced by an Invoke expression during analysis, so Comet's serde never sees the ParseUrl node.
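To make the failure mode concrete, here is a minimal toy model of it. The `Expr` enum and the `serde_*` functions are hypothetical stand-ins, not Spark or Comet classes; the sketch only shows why a matcher keyed on the original node stops firing once the analyzer has rewritten it, and how also matching the rewritten form could restore coverage.

```rust
// Toy model only: these types are hypothetical and do not mirror Spark's or Comet's real classes.
#[allow(dead_code)]
#[derive(Debug)]
enum Expr {
    ParseUrl { args: Vec<String> },
    Invoke { target: String, method: String, args: Vec<String> },
}

// Stand-in for the Spark 4.0 analysis-time rewrite of a RuntimeReplaceable expression.
fn rewrite_spark4(expr: Expr) -> Expr {
    match expr {
        Expr::ParseUrl { args } => Expr::Invoke {
            target: "ParseUrlEvaluator".to_string(),
            method: "evaluate".to_string(),
            args,
        },
        other => other,
    }
}

// A serde that only knows the original node: returns None (fallback) after the rewrite.
fn serde_original_only(expr: &Expr) -> Option<&'static str> {
    match expr {
        Expr::ParseUrl { .. } => Some("parse_url"),
        _ => None,
    }
}

// A serde that also recognizes the rewritten Invoke form.
fn serde_with_invoke(expr: &Expr) -> Option<&'static str> {
    match expr {
        Expr::ParseUrl { .. } => Some("parse_url"),
        Expr::Invoke { target, method, .. }
            if target == "ParseUrlEvaluator" && method == "evaluate" =>
        {
            Some("parse_url")
        }
        _ => None,
    }
}

fn main() {
    let original = Expr::ParseUrl { args: vec!["url".into(), "HOST".into()] };
    let rewritten = rewrite_spark4(original);
    println!("original-only serde: {:?}", serde_original_only(&rewritten));
    println!("invoke-aware serde:  {:?}", serde_with_invoke(&rewritten));
}
```

In this model the first matcher yields `None` for the rewritten expression, which corresponds to the fallback seen in CI, while the second still maps it to the native function.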
On the compatibility side, I noticed that the PR doesn't distinguish between ANSI and legacy mode for invalid URL handling. DataFusion's parse_url throws errors on malformed URLs, which matches Spark's behavior when failOnError=true (ANSI mode), but in legacy mode (failOnError=false), Spark returns null for invalid URLs instead of throwing.
Spark 4.0 RuntimeReplaceable / Invoke path: ParseUrl is now handled when Spark rewrites it to Invoke(ParseUrlEvaluator.evaluate, ...), so Comet serde no longer misses it and we avoid the "invoke is not supported" fallback. Please check whether this approach is correct.
ANSI vs legacy behavior: serde now respects failOnError:
failOnError = true (ANSI) -> parse_url (error on malformed URLs)
failOnError = false (legacy) -> try_parse_url (null on malformed URLs)
I also registered try_parse_url on the native side and added a test for invalid URL handling in legacy mode (Spark 4.0 path).
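The ANSI versus legacy split can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `native_fn_name`, `parse_url_host`, and `try_parse_url_host` are hypothetical helpers, the mapping of `failOnError = false` to `try_parse_url` is inferred from the discussion above, and the URL validity check is deliberately simplistic (it ignores ports and user info).

```rust
// Hypothetical helper: choose the native function name from Spark's failOnError flag.
fn native_fn_name(fail_on_error: bool) -> &'static str {
    if fail_on_error {
        "parse_url" // ANSI mode: malformed URLs raise an error
    } else {
        "try_parse_url" // legacy mode: malformed URLs yield null (assumed mapping)
    }
}

// Toy strict variant: extracts the host part or returns an error.
// The validity rule here is a simplification, not a real URL parser.
fn parse_url_host(url: &str) -> Result<String, String> {
    let (_scheme, rest) = url
        .split_once("://")
        .ok_or_else(|| format!("invalid URL: {url}"))?;
    let host = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()
        .unwrap_or("");
    if host.is_empty() || host.contains(' ') {
        return Err(format!("invalid URL: {url}"));
    }
    Ok(host.to_string())
}

// Toy lenient variant: same logic, but errors become None (null in SQL terms).
fn try_parse_url_host(url: &str) -> Option<String> {
    parse_url_host(url).ok()
}

fn main() {
    for url in ["https://spark.apache.org/docs?x=1", "not a url"] {
        println!("strict:  {:?}", parse_url_host(url));
        println!("lenient: {:?}", try_parse_url_host(url));
    }
    println!("ANSI -> {}, legacy -> {}", native_fn_name(true), native_fn_name(false));
}
```

Layering the lenient variant on top of the strict one keeps a single parsing path, so the two modes can only differ in how a failure is reported.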
Summary
Support parse_url by mapping ParseUrl to the native scalar function parse_url in expression serde.
Mark parse_url as supported.
Why
This closes one of the missing DataFusion 50 migration functions from issue #2443.
Part of #2443