[VL][Delta] Support native OPTIMIZE ZORDER expressions #12062
Draft
malinjawi wants to merge 1 commit into apache:main
What changes are proposed in this pull request?
This PR follows #12024 and adds native support for Delta OPTIMIZE ZORDER expression execution in the Velox backend.
The change:
Why are the changes needed?
#12024 enabled offload of the plain OPTIMIZE compaction command. OPTIMIZE ZORDER still needed native expression coverage for InterleaveBits and RangePartitionId so that execution of the supported command stays native.
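For readers unfamiliar with the two expressions, here is a minimal Python sketch of the general idea: each Z-order column value is first mapped to a range partition id, and the ids are then bit-interleaved into a single sort key. This is only an illustration and is not the Delta or Velox implementation; the function names, the `bounds` lists, and the 32-bit width are assumptions made for the example.

```python
import bisect


def range_partition_id(value, bounds):
    """Return the index of the range that `value` falls into.

    `bounds` is a sorted list of upper bounds (hypothetical sample values).
    The id is the number of bounds strictly less than `value`.
    """
    return bisect.bisect_left(bounds, value)


def interleave_bits(values, bits=32):
    """Interleave the bits of non-negative integers, most significant bit first.

    Takes bit (bits-1) of every value in order, then bit (bits-2), and so on,
    and returns the result as big-endian bytes.
    """
    out = 0
    for bit in range(bits - 1, -1, -1):
        for v in values:
            out = (out << 1) | ((v >> bit) & 1)
    return out.to_bytes((bits * len(values) + 7) // 8, "big")


def z_order_key(row, bounds_per_column):
    """Build an illustrative Z-order sort key for one row."""
    ids = [range_partition_id(v, b) for v, b in zip(row, bounds_per_column)]
    return interleave_bits(ids)
```

Sorting rows by `z_order_key` keeps rows that are close in all Z-order columns close together in the output, which is what makes file-level data skipping effective after OPTIMIZE ZORDER.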
Does this PR introduce any user-facing change?
No public API change. It extends native Delta OPTIMIZE ZORDER support in the Velox backend.
How was this patch tested?
Built and ran locally:
Performance
I ran a targeted local benchmark for OPTIMIZE ZORDER on Spark 3.5. Workload: 2,000,000 rows, 14 columns, 128 input files, 1 warmup, 3 measured runs. The benchmark measures only:
OPTIMIZE delta.`path` ZORDER BY (z1, z2)
Table setup time is excluded.
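The measurement loop described above (warmup runs discarded, only the command itself timed) can be sketched as follows. This is a hedged illustration of the methodology, not the benchmark code used for this PR; `measure` and the `command` callable are hypothetical names, and table setup is assumed to happen before `measure` is called.

```python
import time


def measure(command, warmup=1, runs=3):
    """Time `command` over `runs` measured runs after `warmup` untimed runs.

    `command` is a zero-argument callable that executes only the statement
    under test (for example, the OPTIMIZE ... ZORDER BY command).
    Returns the list of measured wall-clock durations in seconds.
    """
    for _ in range(warmup):
        command()  # warmup runs are executed but not recorded

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        command()
        timings.append(time.perf_counter() - start)
    return timings
```

Because OPTIMIZE rewrites the table, each run would need a freshly prepared table for the timings to be comparable; that setup stays outside the timed region.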
Compared with native Delta write disabled:
Compared with local vanilla Spark/Delta using spark.gluten.enabled=false:
The patch extends Gluten's native ZORDER coverage and is faster than the existing Gluten fallback path on this workload. It does not yet beat local vanilla Spark/Delta; the remaining overhead is likely in Delta command planning, log, listing, and commit work, plus Gluten planning and listener overhead and small terminal jobs.
Related issue: #10215
Tracked by #12025
Was this patch authored or co-authored using generative AI tooling?
Generated-by: IBM BOB