fix(postgres): correctly infer nullability for LEFT JOIN rewritten as RIGHT JOIN#4285
Open
luca992 wants to merge 1 commit into
Open
fix(postgres): correctly infer nullability for LEFT JOIN rewritten as RIGHT JOIN#4285luca992 wants to merge 1 commit into
luca992 wants to merge 1 commit into
Conversation
… RIGHT JOIN PostgreSQL's planner may execute `A LEFT JOIN B` as a hash join with `Join Type: Right` to put the smaller relation on the hash-build side. This is documented behavior of the planner: * [Postgres docs, 14.3 Controlling the Planner with Explicit JOIN Clauses] > "Most practical cases involving LEFT JOIN or RIGHT JOIN can be > rearranged to some extent." https://www.postgresql.org/docs/current/explicit-joins.html * [Postgres Pro, "Queries in PostgreSQL: 6. Hashing"] > "On the physical level, the planner determines which set is the > inner one and which is the outer one not by their positions in > the query, but by the relative join cost. ... So, the join type > switches from left to right in the plan." https://postgrespro.com/blog/pgsql/5969673 After the swap, the SQL right operand (the nullable side under LEFT JOIN semantics) appears as the plan's `Outer` child rather than the `Inner` child. The previous `visit_plan` only marked `Inner` children as nullable, which on a `Join Type: Right` plan: * incorrectly marked the SQL left operand (always preserved) as nullable — causing spurious `Option<T>` in macro output for NOT NULL columns; and * failed to mark the SQL right operand as nullable — masking real NULLs and panicking at decode time when no LEFT JOIN row matched. Thread the parent join type into `visit_plan` and decide which child is the NULL-fill side based on it: * `Left` → `Inner` child is nullable (no change) * `Right` → `Outer` child is nullable (new) * `Full` → both children are nullable (no change) Also recurse into all child plans (not only when the current node is `Left`/`Right`), so nested joins reached through non-join intermediates like `Hash` are walked. Closes transact-rs#3202.
977cbbb to
691420b
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
PostgreSQL's planner may execute
A LEFT JOIN Bas a hash join withJoin Type: Rightto put the smaller relation on the hash-build side.This is documented behavior:
[Postgres docs, 14.3 Controlling the Planner with Explicit JOIN Clauses]
[Postgres Pro, "Queries in PostgreSQL: 6. Hashing"]
After the swap the SQL right operand (the nullable side under LEFT JOIN semantics) ends up as the plan's
Outerchild instead of theInnerchild. The oldvisit_planonly markedInnerchildren nullable, which on aJoin Type: Rightplan:Option<T>in macro output for NOT NULL columnsFix threads the parent join type into
visit_planand picks the NULL-fill side based on it:Left->Innerchild is nullable (no change)Right->Outerchild is nullable (new)Full-> both children are nullable (no change)It also recurses into all child plans, not only when the current node is
Left/Right, so nested joins reached through non-join intermediates likeHashare walked.Does your PR solve an issue?
fixes #3202
Is this a breaking change?
Behavior change yes. The old behavior was wrong inference, not a documented contract.
For queries with a LEFT JOIN that postgres rewrites as a Hash Right Join (driven by
pg_statisticcost estimates, fires on production sized data withplan_cache_mode = force_generic_planwhichsqlx-macros-coreitself sets), generated types change:Option<T>toT. Code that was working around the bug with.unwrap()orOptionfield types stops compiling. Trivial to fix by dropping the workaround.TtoOption<T>. Code treating these as non-null was already at risk ofunexpected null; try decoding as an Optionpanics at runtime whenever a row had no matching join partner. After this it stops compiling and the field needs to becomeOption<T>.The second case is the bigger one. Code passed type checks before only because LEFT JOINs happened to always find a match in test data. After this fix the type system exposes the real nullability.