Avoid null-restrict evaluation for predicates that reference non-join columns in PushDownFilter #20961
Draft
kosiew wants to merge 6 commits into apache:main
Introduce a test case to assert non-restricting behavior when evaluating the predicate a > b, focusing on join keys that only include a. This directly tests the new early-return branch in the is_restrict_null_predicate function in utils.rs, enhancing overall code coverage.
Extract the column-membership check into a new helper function called `predicate_uses_only_columns` in utils.rs. Update the current implementation at utils.rs:91 to use this new helper, improving code readability and maintainability.
Add call-site contract comment in push_down_filter.rs to specify that only Ok(true) is treated as null-restricting. State that both Ok(false) and Err(_) are considered non-restricting and will be skipped during processing.
Inline iterator predicate in utils.rs and streamline the null-restrict handling in push_down_filter.rs. This reduces indirections and lines of code while maintaining the same logic and behavior. No public interface or behavior changes intended.
…te_uses_only_columns function
run benchmark sql_planner_extended
🤖 Criterion benchmark running (GKE)
Which issue does this PR close?
Rationale for this change
`PushDownFilter` can spend a disproportionate amount of planning time inferring predicates across joins. One expensive path is `is_restrict_null_predicate`, which falls back to compiling and evaluating the predicate against a null-filled schema to decide whether a predicate is null-rejecting.
For predicates that reference columns outside the join-key set, that evaluation cannot succeed with the synthetic null schema built for join columns only. In practice, callers already treat evaluation failures as non-restricting, but we still pay the full cost of the physical-expression compilation and evaluation path first.
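The caller-side behavior described above, where an evaluation failure counts as non-restricting, can be sketched as follows. This is a minimal illustration, not the code in `push_down_filter.rs`: the `NullRestrictResult` alias, the `String` error type, and the `treat_as_null_restricting` helper are all hypothetical names introduced for this example.

```rust
// Hypothetical stand-in for the outcome of a null-restriction check:
//   Ok(true)  -> the predicate is null-restricting
//   Ok(false) -> the predicate is not null-restricting
//   Err(_)    -> evaluation against the null-filled schema failed
type NullRestrictResult = Result<bool, String>;

/// Only `Ok(true)` counts as null-restricting. `Ok(false)` and `Err(_)`
/// are both treated as non-restricting, so the predicate is skipped.
fn treat_as_null_restricting(result: NullRestrictResult) -> bool {
    result.unwrap_or(false)
}
```

Because an error and `Ok(false)` collapse to the same outcome at the call site, returning `Ok(false)` before the expensive evaluation is attempted leaves the caller's decision unchanged.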
This change adds a cheap guard that detects predicates referencing columns outside the allowed join columns and returns `false` early. That preserves the existing behavior while avoiding unnecessary work in a hot optimizer path.
What changes are included in this PR?
This PR makes two focused changes:
- In `is_restrict_null_predicate`, collect the join columns into a `HashSet` and add a fast-path check that verifies whether the predicate only references those columns.
- If it does not, return `Ok(false)` immediately instead of attempting null-evaluation.
Additionally:
- … `evaluate_expr_with_null_column` path.
- `InferredPredicates::insert_inferred_predicate` is simplified to use `.unwrap_or(false)` when consuming `is_restrict_null_predicate`, which matches the prior effective behavior of treating errors as non-restricting.
- A test covers `a > b`, where `b` is outside the join-key set, to verify the fast path returns `false`.
Are these changes tested?
Yes.
A test case was added to cover the scenario where a predicate references a column outside the join key set:
- `a > b` now explicitly verifies that `is_restrict_null_predicate` returns `false`.
This exercises the new early-return path and protects against regressions in predicate analysis behavior.
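The early-return path and the `a > b` scenario can be sketched in a self-contained form. This is a simplified illustration under stated assumptions, not the code in `utils.rs`: the toy `Expr` enum, the `collect_columns` helper, the `is_restrict_null_predicate_sketch` function, and the closure that stands in for the expensive null-evaluation are all hypothetical. Only the helper name `predicate_uses_only_columns` comes from the PR, and its signature here is assumed.

```rust
use std::collections::HashSet;

/// Toy expression type (hypothetical; not DataFusion's `Expr`).
#[derive(Debug)]
enum Expr {
    Column(String),
    Literal(i64),
    Gt(Box<Expr>, Box<Expr>),
}

/// Collect every column name referenced by `expr` into `out`.
fn collect_columns<'a>(expr: &'a Expr, out: &mut Vec<&'a str>) {
    match expr {
        Expr::Column(name) => out.push(name.as_str()),
        Expr::Literal(_) => {}
        Expr::Gt(left, right) => {
            collect_columns(left, out);
            collect_columns(right, out);
        }
    }
}

/// True when every column referenced by `predicate` is in `allowed`.
fn predicate_uses_only_columns(predicate: &Expr, allowed: &HashSet<&str>) -> bool {
    let mut cols = Vec::new();
    collect_columns(predicate, &mut cols);
    cols.iter().all(|c| allowed.contains(c))
}

/// Sketch of the guard: return `Ok(false)` before the expensive path when
/// the predicate references a column outside the join-column set.
/// `expensive_null_eval` is a placeholder for the null-evaluation fallback.
fn is_restrict_null_predicate_sketch(
    predicate: &Expr,
    join_cols: &[&str],
    expensive_null_eval: impl Fn(&Expr) -> Result<bool, String>,
) -> Result<bool, String> {
    let allowed: HashSet<&str> = join_cols.iter().copied().collect();
    if !predicate_uses_only_columns(predicate, &allowed) {
        // Fast path: the fallback could not succeed for this predicate.
        return Ok(false);
    }
    expensive_null_eval(predicate)
}
```

With join columns `["a"]`, the predicate `a > b` takes the fast path and the fallback closure is never invoked, while a predicate such as `a > 1` still reaches the fallback.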
Are there any user-facing changes?
No.
This change is an internal optimizer performance improvement and does not change public APIs or intended query results.
LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.