Skip to content

[multistage] physical optimizer lookup join support #17961

@dang-stripe

Description

@dang-stripe

While testing the physical optimizer, we get this error when trying to invoke a lookup join:

Error log:

"message": "Received 1 error from stage N on Server_1234: Right input must be leaf operator",

Query:

WITH "source_1" AS (
  SELECT
    'v1' AS "column_1",
    1234567890 AS "column_2"
  FROM "table_3"
  LIMIT 1
)
SELECT
  /*+ "joinOptions"("join_strategy" = 'lookup') */
  a."column_1",
  a."column_2",
  b."column_3"
FROM "source_1" a
INNER JOIN "table_2" b
  ON b."column_1" = a."column_1"
 AND b."column_2" = a."column_2"

Claude's analysis:

The V2 physical optimizer (usePhysicalOptimizer=true) does not correctly handle lookup joins. When a query uses /*+ joinOptions(join_strategy='lookup') */, V2 inserts a BROADCAST exchange on the right side (dim table), which splits the dim table scan into a separate fragment from the join. At runtime, LookupJoinOperator requires rightInput instanceof LeafOperator but receives a MailboxReceiveOperator instead, throwing "Right input must be leaf operator"

cc @ankitsultana @suvodeep-pyne

Metadata

Metadata

Assignees

Labels

mse-physical-optimizerMulti-stage engine physical query optimizermulti-stageRelated to the multi-stage query engine

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions