Describe the bug
PR #21743 (commit 4bff17ebd) unified ANY/ALL planning under a CASE expression to align NULL semantics with PostgreSQL.
As a side effect, the new LogicalPlan shape is no longer recognized by PruningPredicate, which disables Parquet row-group statistics pruning, bloom-filter pruning, and page-index pruning for <expr> <op> ANY(<array literal>) queries.
To Reproduce
This will fail:
#[tokio::test]
async fn prune_int32_eq_any_literal_array() {
RowGroupPruningTest::new()
.with_scenario(Scenario::Int)
.with_query("SELECT * FROM t where i32 = ANY([100])")
.with_expected_errors(Some(0))
.with_matched_by_stats(Some(0))
.with_pruned_by_stats(Some(0))
.with_pruned_files(Some(1))
.with_matched_by_bloom_filter(Some(0))
.with_pruned_by_bloom_filter(Some(0))
.with_expected_rows(0)
.test_row_group_prune()
.await;
}
Running with the datafusion CLI, you can see it as well:
CREATE EXTERNAL TABLE alltypes_plain
STORED AS PARQUET
LOCATION 'parquet-testing/data/alltypes_plain.parquet';
EXPLAIN VERBOSE SELECT id FROM alltypes_plain WHERE id = ANY([1, 2, 3]);
Gives the following pruning predicate:
pruning_predicate=id_null_count@0 > 0 OR id_null_count@0 != row_count@1
Whereas an IN statement works fine
CREATE EXTERNAL TABLE alltypes_plain
STORED AS PARQUET
LOCATION 'parquet-testing/data/alltypes_plain.parquet';
EXPLAIN VERBOSE SELECT id FROM alltypes_plain WHERE id IN (1, 2, 3);
pruning_predicate=id_null_count@2 != row_count@3 AND id_min@0 <= 1 AND 1 <= id_max@1 OR id_null_count@2 != row_count@3 AND id_min@0 <= 2 AND 2 <= id_max@1 OR id_null_count@2 != row_count@3 AND id_min@0 <= 3 AND 3 <= id_max@1,
Expected behavior
The test will pass
Additional context
No response
Describe the bug
PR #21743 (commit
4bff17ebd) unifiedANY/ALLplanning under a CASE expression to align NULL semantics with PostgreSQL.As a side effect, the new
LogicalPlanshape is no longer recognized byPruningPredicate, which disables Parquet row-group statistics pruning, bloom-filter pruning, and page-index pruning for<expr> <op> ANY(<array literal>)queries.To Reproduce
This will fail:
Running with the datafusion CLI, you can see it as well:
Gives the following pruning predicate:
Whereas an
INstatement works fineExpected behavior
The test will pass
Additional context
No response