Misc hash / hash aggregation performance improvements #19910

Dandandan · 2026-01-20T15:06:12Z

Which issue does this PR close?

Closes Optimize hash aggregate performance #19912

Rationale for this change

Just a couple of optimizations for hash table lookups in hash aggregation.

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Dandandan · 2026-01-20T15:06:22Z

run benchmarks

alamb-ghbot · 2026-01-20T15:43:32Z

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing aggregate_speed (e3529a9) to 3d90d4b diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

Dandandan · 2026-01-20T15:59:04Z

datafusion/physical-expr-common/src/binary_map.rs

                let entry = self.map.find_mut(hash, |header| {
                    // compare value if hashes match
-                    if header.len != value_len {
+                    if header.hash != hash {


The comment says "compare value if hashes match", but the actual implementation compares lengths 🤔

I am double-checking the reasoning: if the hash values aren't the same, we know the values cannot be the same either, due to the requirements of the Hash trait.

However (as codex points out to me), there is a small, subtle chance that if two different values collide on hash and their inline encodings match, they will be treated as equal even though the byte sequences differ, which is what the old len check avoided.

So I think we still need the old length check too 🤔
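
To make the concern concrete, here is a minimal sketch of the collision case. It assumes short values are packed into a single `usize` by folding their bytes, so a leading zero byte disappears in the encoding. `Header`, `inline_encode`, and both `matches_*` helpers are hypothetical names for illustration, not the actual `binary_map.rs` code.

```rust
/// Hypothetical entry header: cached hash, value length, inline encoding.
#[derive(Debug, Clone, Copy)]
struct Header {
    hash: u64,
    len: usize,
    inline: usize,
}

/// Pack a short value into one usize (illustrative encoding, an assumption).
fn inline_encode(value: &[u8]) -> usize {
    assert!(value.len() <= std::mem::size_of::<usize>());
    value.iter().fold(0usize, |acc, &b| (acc << 8) | b as usize)
}

/// Equality check that only looks at the hash and the inline encoding.
fn matches_hash_only(header: &Header, hash: u64, value: &[u8]) -> bool {
    header.hash == hash && header.inline == inline_encode(value)
}

/// Equality check that also keeps the length comparison.
fn matches_hash_and_len(header: &Header, hash: u64, value: &[u8]) -> bool {
    header.hash == hash
        && header.len == value.len()
        && header.inline == inline_encode(value)
}

fn main() {
    let stored: &[u8] = &[0x61];
    let probe: &[u8] = &[0x00, 0x61];
    // Pretend both values hash to the same u64 (a forced collision).
    let colliding_hash = 42u64;
    let header = Header {
        hash: colliding_hash,
        len: stored.len(),
        inline: inline_encode(stored),
    };
    println!("hash only:    {}", matches_hash_only(&header, colliding_hash, probe));
    println!("hash and len: {}", matches_hash_and_len(&header, colliding_hash, probe));
}
```

With only the hash check, `[0x00, 0x61]` is wrongly treated as equal to the stored `[0x61]`; keeping the length check next to the hash check rejects it.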

Dandandan · 2026-01-20T16:00:19Z

datafusion/physical-expr-common/src/binary_map.rs

                let entry = self.map.find_mut(hash, |header| {
                    // compare value if hashes match
-                    if header.len != value_len {
+                    if header.hash != hash {


This should save some random accesses when many values share the same length.

I think we also need to update this place to check header.len

Dandandan · 2026-01-20T16:00:36Z

datafusion/physical-expr-common/src/binary_view_map.rs

-                let v = self.builder.get_value(header.view_idx);
-
-                if v.len() != value.len() {
+                if header.hash != hash {


This should save some random accesses when many values share the same length.
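
A rough sketch of where the saving comes from, under simplifying assumptions: `Store`, `Header`, `toy_hash`, and the two `eq_*` helpers are hypothetical stand-ins (not the real `binary_view_map.rs` types), and every header is treated as a candidate, whereas a real hash table would only call the predicate for a subset of entries.

```rust
use std::cell::Cell;

/// Hypothetical value store that counts how often a value is fetched.
struct Store {
    values: Vec<Vec<u8>>,
    reads: Cell<usize>,
}

impl Store {
    fn get_value(&self, idx: usize) -> &[u8] {
        self.reads.set(self.reads.get() + 1);
        &self.values[idx]
    }
}

/// Hypothetical entry header: index into the store plus the cached hash.
struct Header {
    view_idx: usize,
    hash: u64,
}

/// Toy FNV-1a hash, used only to keep the example deterministic.
fn toy_hash(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Old shape: fetch the stored value first, then compare length and bytes.
fn eq_len_first(store: &Store, header: &Header, value: &[u8]) -> bool {
    let v = store.get_value(header.view_idx);
    v.len() == value.len() && v == value
}

/// New shape: reject on the cached hash before touching the store.
fn eq_hash_first(store: &Store, header: &Header, hash: u64, value: &[u8]) -> bool {
    header.hash == hash && store.get_value(header.view_idx) == value
}

/// Probe three same-length keys and return how many values were fetched.
fn count_reads(hash_first: bool) -> usize {
    let values: Vec<Vec<u8>> = vec![b"key1".to_vec(), b"key2".to_vec(), b"key3".to_vec()];
    let headers: Vec<Header> = values
        .iter()
        .enumerate()
        .map(|(i, v)| Header { view_idx: i, hash: toy_hash(v) })
        .collect();
    let store = Store { values, reads: Cell::new(0) };
    let probe = b"key3";
    let hash = toy_hash(probe);
    let found = headers.iter().any(|h| {
        if hash_first {
            eq_hash_first(&store, h, hash, probe)
        } else {
            eq_len_first(&store, h, probe)
        }
    });
    assert!(found);
    store.reads.get()
}

fn main() {
    println!("length-first fetches: {}", count_reads(false));
    println!("hash-first fetches:   {}", count_reads(true));
}
```

Because all three keys have the same length, the length-first check has to fetch every candidate, while the hash-first check only fetches the one whose cached hash matches. The full slice comparison in `eq_hash_first` still covers length and bytes.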

Dandandan · 2026-01-20T16:05:42Z

run benchmark sql_aggregation

alamb-ghbot · 2026-01-20T16:05:45Z

🤖 Hi @Dandandan, thanks for the request (#19910 (comment)).

scrape_comments.py only supports whitelisted benchmarks.

Standard: clickbench_1, clickbench_extended, clickbench_partitioned, clickbench_pushdown, external_aggr, tpcds, tpch, tpch10, tpch_mem, tpch_mem10
Criterion: aggregate_query_sql, aggregate_vectorized, case_when, character_length, in_list, left, range_and_generate_series, reset_plan_states, sort, sql_planner, strpos, substr_index, with_hashes

Please choose one or more of these with run benchmark <name> or run benchmark <name1> <name2>...
Unsupported benchmarks: sql_aggregation.

Dandandan · 2026-01-20T16:06:10Z

run benchmark aggregate_query_sql aggregate_vectorized

alamb-ghbot · 2026-01-20T16:23:01Z

🤖: Benchmark completed

Comparing HEAD and aggregate_speed
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query    ┃        HEAD ┃ aggregate_speed ┃        Change ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0 │  2403.48 ms │      2336.84 ms │     no change │
│ QQuery 1 │   947.85 ms │       930.03 ms │     no change │
│ QQuery 2 │  1878.83 ms │      1931.61 ms │     no change │
│ QQuery 3 │  1111.82 ms │      1123.10 ms │     no change │
│ QQuery 4 │  2246.72 ms │      2203.75 ms │     no change │
│ QQuery 5 │ 28371.99 ms │     28468.45 ms │     no change │
│ QQuery 6 │  4022.17 ms │      4014.38 ms │     no change │
│ QQuery 7 │  2856.77 ms │      2592.76 ms │ +1.10x faster │
└──────────┴─────────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary              ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)              │ 43839.64ms │
│ Total Time (aggregate_speed)   │ 43600.92ms │
│ Average Time (HEAD)            │  5479.96ms │
│ Average Time (aggregate_speed) │  5450.11ms │
│ Queries Faster                 │          1 │
│ Queries Slower                 │          0 │
│ Queries with No Change         │          7 │
│ Queries with Failure           │          0 │
└────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃        HEAD ┃ aggregate_speed ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │     1.89 ms │         1.91 ms │     no change │
│ QQuery 1  │    52.75 ms │        52.53 ms │     no change │
│ QQuery 2  │   131.63 ms │       131.35 ms │     no change │
│ QQuery 3  │   156.28 ms │       148.32 ms │ +1.05x faster │
│ QQuery 4  │  1047.01 ms │      1030.70 ms │     no change │
│ QQuery 5  │  1423.72 ms │      1356.43 ms │     no change │
│ QQuery 6  │     1.87 ms │         1.89 ms │     no change │
│ QQuery 7  │    59.81 ms │        57.16 ms │     no change │
│ QQuery 8  │  1451.76 ms │      1395.50 ms │     no change │
│ QQuery 9  │  1871.42 ms │      1790.36 ms │     no change │
│ QQuery 10 │   354.65 ms │       347.13 ms │     no change │
│ QQuery 11 │   397.94 ms │       393.80 ms │     no change │
│ QQuery 12 │  1322.98 ms │      1266.57 ms │     no change │
│ QQuery 13 │  1984.51 ms │      1957.52 ms │     no change │
│ QQuery 14 │  1307.29 ms │      1232.98 ms │ +1.06x faster │
│ QQuery 15 │  1235.27 ms │      1172.21 ms │ +1.05x faster │
│ QQuery 16 │  2627.51 ms │      2507.42 ms │     no change │
│ QQuery 17 │  2576.70 ms │      2449.13 ms │     no change │
│ QQuery 18 │  6191.78 ms │      4883.16 ms │ +1.27x faster │
│ QQuery 19 │   119.68 ms │       121.97 ms │     no change │
│ QQuery 20 │  1986.41 ms │      1885.87 ms │ +1.05x faster │
│ QQuery 21 │  2311.67 ms │      2172.49 ms │ +1.06x faster │
│ QQuery 22 │  9447.87 ms │      3746.09 ms │ +2.52x faster │
│ QQuery 23 │ 29056.75 ms │     12216.76 ms │ +2.38x faster │
│ QQuery 24 │   206.59 ms │       208.64 ms │     no change │
│ QQuery 25 │   489.89 ms │       465.04 ms │ +1.05x faster │
│ QQuery 26 │   230.74 ms │       219.60 ms │     no change │
│ QQuery 27 │  2688.78 ms │      2630.32 ms │     no change │
│ QQuery 28 │ 24272.19 ms │     23084.82 ms │     no change │
│ QQuery 29 │   969.98 ms │       990.96 ms │     no change │
│ QQuery 30 │  1304.27 ms │      1284.35 ms │     no change │
│ QQuery 31 │  1389.29 ms │      1341.89 ms │     no change │
│ QQuery 32 │  4495.41 ms │      4252.35 ms │ +1.06x faster │
│ QQuery 33 │  5729.23 ms │      5307.72 ms │ +1.08x faster │
│ QQuery 34 │  6123.65 ms │      5553.47 ms │ +1.10x faster │
│ QQuery 35 │  1953.57 ms │      1836.05 ms │ +1.06x faster │
│ QQuery 36 │    70.49 ms │        67.39 ms │     no change │
│ QQuery 37 │    45.16 ms │        44.55 ms │     no change │
│ QQuery 38 │    65.60 ms │        67.41 ms │     no change │
│ QQuery 39 │   102.02 ms │       101.12 ms │     no change │
│ QQuery 40 │    26.52 ms │        27.72 ms │     no change │
│ QQuery 41 │    23.87 ms │        23.67 ms │     no change │
│ QQuery 42 │    20.90 ms │        20.31 ms │     no change │
└───────────┴─────────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary              ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)              │ 117327.29ms │
│ Total Time (aggregate_speed)   │  89846.64ms │
│ Average Time (HEAD)            │   2728.54ms │
│ Average Time (aggregate_speed) │   2089.46ms │
│ Queries Faster                 │          13 │
│ Queries Slower                 │           0 │
│ Queries with No Change         │          30 │
│ Queries with Failure           │           0 │
└────────────────────────────────┴─────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃      HEAD ┃ aggregate_speed ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 108.84 ms │       108.00 ms │     no change │
│ QQuery 2  │  34.44 ms │        33.68 ms │     no change │
│ QQuery 3  │  42.36 ms │        34.28 ms │ +1.24x faster │
│ QQuery 4  │  32.50 ms │        31.19 ms │     no change │
│ QQuery 5  │  93.89 ms │        91.25 ms │     no change │
│ QQuery 6  │  21.08 ms │        21.31 ms │     no change │
│ QQuery 7  │ 161.81 ms │       164.72 ms │     no change │
│ QQuery 8  │  43.30 ms │        40.74 ms │ +1.06x faster │
│ QQuery 9  │ 119.92 ms │       103.41 ms │ +1.16x faster │
│ QQuery 10 │  70.89 ms │        67.39 ms │     no change │
│ QQuery 11 │  19.75 ms │        18.99 ms │     no change │
│ QQuery 12 │  54.82 ms │        50.48 ms │ +1.09x faster │
│ QQuery 13 │  52.23 ms │        48.03 ms │ +1.09x faster │
│ QQuery 14 │  16.38 ms │        15.04 ms │ +1.09x faster │
│ QQuery 15 │  33.03 ms │        30.49 ms │ +1.08x faster │
│ QQuery 16 │  30.95 ms │        28.20 ms │ +1.10x faster │
│ QQuery 17 │ 157.94 ms │       143.92 ms │ +1.10x faster │
│ QQuery 18 │ 291.57 ms │       293.75 ms │     no change │
│ QQuery 19 │  41.36 ms │        39.79 ms │     no change │
│ QQuery 20 │  60.35 ms │        54.05 ms │ +1.12x faster │
│ QQuery 21 │ 195.07 ms │       198.62 ms │     no change │
│ QQuery 22 │  23.12 ms │        22.47 ms │     no change │
└───────────┴───────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary              ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)              │ 1705.57ms │
│ Total Time (aggregate_speed)   │ 1639.81ms │
│ Average Time (HEAD)            │   77.53ms │
│ Average Time (aggregate_speed) │   74.54ms │
│ Queries Faster                 │        10 │
│ Queries Slower                 │         0 │
│ Queries with No Change         │        12 │
│ Queries with Failure           │         0 │
└────────────────────────────────┴───────────┘
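
For reading the "Change" column, the rows in this thread are consistent with a simple rule: a relative difference of at most 5% against HEAD is reported as "no change", otherwise the ratio is shown. The sketch below is inferred from the tables only; the threshold and the `classify` helper are assumptions, not taken from the comparison script.

```rust
/// Classify a timing pair the way the "Change" column appears to.
/// The 5% threshold is inferred from the tables, not from the script itself.
fn classify(head_ms: f64, branch_ms: f64) -> String {
    let delta = (branch_ms - head_ms) / head_ms;
    if delta.abs() <= 0.05 {
        "no change".to_string()
    } else if branch_ms < head_ms {
        format!("+{:.2}x faster", head_ms / branch_ms)
    } else {
        format!("{:.2}x slower", branch_ms / head_ms)
    }
}

fn main() {
    // Timings copied from clickbench_partitioned rows in this thread.
    println!("{}", classify(9447.87, 3746.09));
    println!("{}", classify(2576.70, 2449.13));
    println!("{}", classify(1399.01, 1619.04));
}
```

Note that the threshold applies to the difference relative to HEAD, so a row such as 2576.70 ms vs 2449.13 ms still counts as "no change" even though HEAD divided by the branch time is slightly above 1.05.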

Dandandan · 2026-01-20T16:38:51Z

run benchmarks

alamb-ghbot · 2026-01-20T17:02:03Z

🤖 ./gh_compare_branch_bench.sh compare_branch_bench.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing aggregate_speed (e3529a9) to 3d90d4b diff
BENCH_NAME=aggregate_query_sql
BENCH_COMMAND=cargo bench --features=parquet --bench aggregate_query_sql
BENCH_FILTER=
BENCH_BRANCH_NAME=aggregate_speed
Results will be posted here when complete

alamb-ghbot · 2026-01-20T17:35:28Z

🤖: Benchmark completed

group                                                                         aggregate_speed                        main
-----                                                                         ---------------                        ----
aggregate_query_approx_percentile_cont_on_f32                                 1.00      4.5±0.18ms        ? ?/sec    1.04      4.6±0.21ms        ? ?/sec
aggregate_query_approx_percentile_cont_on_u64                                 1.00      4.8±0.21ms        ? ?/sec    1.02      4.9±0.26ms        ? ?/sec
aggregate_query_distinct_median                                               1.00      3.2±0.04ms        ? ?/sec    1.02      3.2±0.12ms        ? ?/sec
aggregate_query_group_by                                                      1.00      2.0±0.07ms        ? ?/sec    1.00      2.0±0.05ms        ? ?/sec
aggregate_query_group_by_u64 15 12                                            1.01  1972.8±53.98µs        ? ?/sec    1.00  1948.2±53.18µs        ? ?/sec
aggregate_query_group_by_u64_multiple_keys                                    1.00      4.7±0.21ms        ? ?/sec    1.00      4.8±0.30ms        ? ?/sec
aggregate_query_group_by_wide_u64_and_f32_without_aggregate_expressions       1.00      2.5±0.12ms        ? ?/sec    1.03      2.6±0.13ms        ? ?/sec
aggregate_query_group_by_wide_u64_and_string_without_aggregate_expressions    1.00      3.0±0.16ms        ? ?/sec    1.03      3.1±0.19ms        ? ?/sec
aggregate_query_group_by_with_filter                                          1.01      2.1±0.04ms        ? ?/sec    1.00      2.1±0.04ms        ? ?/sec
aggregate_query_group_by_with_filter_u64 15 12                                1.03      2.1±0.07ms        ? ?/sec    1.00      2.0±0.03ms        ? ?/sec
aggregate_query_no_group_by 15 12                                             1.02  1163.5±27.21µs        ? ?/sec    1.00  1142.3±17.08µs        ? ?/sec
aggregate_query_no_group_by_count_distinct_narrow                             1.00  1755.8±39.45µs        ? ?/sec    1.00  1758.8±62.43µs        ? ?/sec
aggregate_query_no_group_by_count_distinct_wide                               1.05      2.7±0.13ms        ? ?/sec    1.00      2.6±0.12ms        ? ?/sec
aggregate_query_no_group_by_min_max_f64                                       1.02  1110.9±21.40µs        ? ?/sec    1.00  1088.9±14.11µs        ? ?/sec
first_last_ignore_nulls                                                       1.00      2.8±0.10ms        ? ?/sec    1.04      2.9±0.13ms        ? ?/sec
first_last_many_columns                                                       1.00      2.8±0.06ms        ? ?/sec    1.04      2.9±0.15ms        ? ?/sec
first_last_one_column                                                         1.00      2.4±0.08ms        ? ?/sec    1.02      2.5±0.09ms        ? ?/sec

alamb-ghbot · 2026-01-20T17:35:31Z

🤖 ./gh_compare_branch_bench.sh compare_branch_bench.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing aggregate_speed (e3529a9) to 3d90d4b diff
BENCH_NAME=aggregate_vectorized
BENCH_COMMAND=cargo bench --features=parquet --bench aggregate_vectorized
BENCH_FILTER=
BENCH_BRANCH_NAME=aggregate_speed
Results will be posted here when complete

alamb-ghbot · 2026-01-20T17:35:32Z

Benchmark script failed with exit code 101.

Last 10 lines of output:

++ BENCH_BRANCH_NAME=aggregate_speed
++ rm -f /tmp/comment.txt
++ cat
+++ uname -a
++ gh pr comment -F /tmp/comment.txt https://github.com/apache/datafusion/pull/19910
https://github.com/apache/datafusion/pull/19910#issuecomment-3774128559
++ rm -rf target/criterion/
++ cargo bench --features=parquet --bench aggregate_vectorized -- --save-baseline aggregate_speed
error: target `aggregate_vectorized` in package `datafusion-physical-plan` requires the features: `test_utils`
Consider enabling them by passing, e.g., `--features="test_utils"`

alamb-ghbot · 2026-01-20T17:35:36Z

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing aggregate_speed (e3529a9) to 3d90d4b diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

alamb-ghbot · 2026-01-20T18:15:35Z

🤖: Benchmark completed

Comparing HEAD and aggregate_speed
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query    ┃        HEAD ┃ aggregate_speed ┃        Change ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0 │  2555.74 ms │      2355.10 ms │ +1.09x faster │
│ QQuery 1 │   945.81 ms │       963.86 ms │     no change │
│ QQuery 2 │  1966.14 ms │      1924.57 ms │     no change │
│ QQuery 3 │  1131.70 ms │      1093.71 ms │     no change │
│ QQuery 4 │  2210.45 ms │      2211.00 ms │     no change │
│ QQuery 5 │ 29577.76 ms │     29100.78 ms │     no change │
│ QQuery 6 │  4196.72 ms │      4136.06 ms │     no change │
│ QQuery 7 │  2924.04 ms │      2689.02 ms │ +1.09x faster │
└──────────┴─────────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary              ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)              │ 45508.36ms │
│ Total Time (aggregate_speed)   │ 44474.11ms │
│ Average Time (HEAD)            │  5688.55ms │
│ Average Time (aggregate_speed) │  5559.26ms │
│ Queries Faster                 │          2 │
│ Queries Slower                 │          0 │
│ Queries with No Change         │          6 │
│ Queries with Failure           │          0 │
└────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃        HEAD ┃ aggregate_speed ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │     2.15 ms │         1.94 ms │ +1.11x faster │
│ QQuery 1  │    52.09 ms │        52.97 ms │     no change │
│ QQuery 2  │   134.86 ms │       134.26 ms │     no change │
│ QQuery 3  │   153.65 ms │       155.96 ms │     no change │
│ QQuery 4  │  1165.81 ms │      1183.09 ms │     no change │
│ QQuery 5  │  1399.01 ms │      1619.04 ms │  1.16x slower │
│ QQuery 6  │     1.90 ms │         1.90 ms │     no change │
│ QQuery 7  │    55.28 ms │        56.83 ms │     no change │
│ QQuery 8  │  1500.54 ms │      1562.75 ms │     no change │
│ QQuery 9  │  1897.65 ms │      1948.93 ms │     no change │
│ QQuery 10 │   356.77 ms │       362.33 ms │     no change │
│ QQuery 11 │   409.29 ms │       415.60 ms │     no change │
│ QQuery 12 │  1320.58 ms │      1549.13 ms │  1.17x slower │
│ QQuery 13 │  1982.48 ms │      2258.48 ms │  1.14x slower │
│ QQuery 14 │  1304.04 ms │      1373.55 ms │  1.05x slower │
│ QQuery 15 │  1288.37 ms │      1381.33 ms │  1.07x slower │
│ QQuery 16 │  2596.80 ms │      2857.70 ms │  1.10x slower │
│ QQuery 17 │  2600.57 ms │      2805.02 ms │  1.08x slower │
│ QQuery 18 │  5539.00 ms │      5016.02 ms │ +1.10x faster │
│ QQuery 19 │   122.35 ms │       125.05 ms │     no change │
│ QQuery 20 │  2017.72 ms │      1899.26 ms │ +1.06x faster │
│ QQuery 21 │  2353.44 ms │      2213.35 ms │ +1.06x faster │
│ QQuery 22 │  5976.27 ms │      3794.97 ms │ +1.57x faster │
│ QQuery 23 │ 22227.07 ms │     12525.75 ms │ +1.77x faster │
│ QQuery 24 │   230.60 ms │       217.93 ms │ +1.06x faster │
│ QQuery 25 │   490.15 ms │       483.64 ms │     no change │
│ QQuery 26 │   233.50 ms │       224.88 ms │     no change │
│ QQuery 27 │  2744.67 ms │      2655.57 ms │     no change │
│ QQuery 28 │ 24195.33 ms │     23455.91 ms │     no change │
│ QQuery 29 │   968.99 ms │       981.79 ms │     no change │
│ QQuery 30 │  1327.17 ms │      1414.24 ms │  1.07x slower │
│ QQuery 31 │  1312.72 ms │      1439.56 ms │  1.10x slower │
│ QQuery 32 │  4265.82 ms │      4338.96 ms │     no change │
│ QQuery 33 │  5658.56 ms │      5791.79 ms │     no change │
│ QQuery 34 │  5952.48 ms │      6118.01 ms │     no change │
│ QQuery 35 │  2069.38 ms │      2191.56 ms │  1.06x slower │
│ QQuery 36 │    69.53 ms │        68.57 ms │     no change │
│ QQuery 37 │    48.22 ms │        49.73 ms │     no change │
│ QQuery 38 │    67.67 ms │        68.00 ms │     no change │
│ QQuery 39 │   103.51 ms │       109.07 ms │  1.05x slower │
│ QQuery 40 │    29.57 ms │        29.85 ms │     no change │
│ QQuery 41 │    23.88 ms │        25.12 ms │  1.05x slower │
│ QQuery 42 │    22.10 ms │        21.58 ms │     no change │
└───────────┴─────────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary              ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)              │ 106271.53ms │
│ Total Time (aggregate_speed)   │  94980.99ms │
│ Average Time (HEAD)            │   2471.43ms │
│ Average Time (aggregate_speed) │   2208.86ms │
│ Queries Faster                 │           7 │
│ Queries Slower                 │          12 │
│ Queries with No Change         │          24 │
│ Queries with Failure           │           0 │
└────────────────────────────────┴─────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃      HEAD ┃ aggregate_speed ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 111.69 ms │       106.94 ms │     no change │
│ QQuery 2  │  34.09 ms │        32.55 ms │     no change │
│ QQuery 3  │  42.08 ms │        40.20 ms │     no change │
│ QQuery 4  │  32.18 ms │        31.14 ms │     no change │
│ QQuery 5  │  89.66 ms │        92.21 ms │     no change │
│ QQuery 6  │  20.69 ms │        20.99 ms │     no change │
│ QQuery 7  │ 155.52 ms │       164.85 ms │  1.06x slower │
│ QQuery 8  │  43.74 ms │        40.79 ms │ +1.07x faster │
│ QQuery 9  │ 105.57 ms │       121.53 ms │  1.15x slower │
│ QQuery 10 │  68.54 ms │        70.12 ms │     no change │
│ QQuery 11 │  19.44 ms │        19.61 ms │     no change │
│ QQuery 12 │  52.91 ms │        52.51 ms │     no change │
│ QQuery 13 │  52.48 ms │        48.75 ms │ +1.08x faster │
│ QQuery 14 │  15.95 ms │        15.65 ms │     no change │
│ QQuery 15 │  31.75 ms │        30.70 ms │     no change │
│ QQuery 16 │  29.44 ms │        27.92 ms │ +1.05x faster │
│ QQuery 17 │ 148.05 ms │       144.17 ms │     no change │
│ QQuery 18 │ 287.18 ms │       278.10 ms │     no change │
│ QQuery 19 │  39.62 ms │        40.96 ms │     no change │
│ QQuery 20 │  57.44 ms │        56.94 ms │     no change │
│ QQuery 21 │ 190.26 ms │       190.38 ms │     no change │
│ QQuery 22 │  23.47 ms │        23.41 ms │     no change │
└───────────┴───────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary              ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)              │ 1651.76ms │
│ Total Time (aggregate_speed)   │ 1650.42ms │
│ Average Time (HEAD)            │   75.08ms │
│ Average Time (aggregate_speed) │   75.02ms │
│ Queries Faster                 │         3 │
│ Queries Slower                 │         2 │
│ Queries with No Change         │        17 │
│ Queries with Failure           │         0 │
└────────────────────────────────┴───────────┘

Dandandan · 2026-01-20T18:16:28Z

run benchmarks

alamb-ghbot · 2026-01-20T18:16:34Z

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing aggregate_speed (e3529a9) to 3d90d4b diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

alamb-ghbot · 2026-01-20T18:42:16Z

🤖: Benchmark completed

Comparing HEAD and aggregate_speed
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query    ┃        HEAD ┃ aggregate_speed ┃        Change ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0 │  2461.45 ms │      2326.22 ms │ +1.06x faster │
│ QQuery 1 │   925.18 ms │       951.03 ms │     no change │
│ QQuery 2 │  1884.15 ms │      1857.46 ms │     no change │
│ QQuery 3 │  1134.41 ms │      1085.88 ms │     no change │
│ QQuery 4 │  2264.97 ms │      2192.02 ms │     no change │
│ QQuery 5 │ 28889.42 ms │     28399.99 ms │     no change │
│ QQuery 6 │  4052.38 ms │      4006.24 ms │     no change │
│ QQuery 7 │  2763.45 ms │      2562.32 ms │ +1.08x faster │
└──────────┴─────────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary              ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)              │ 44375.40ms │
│ Total Time (aggregate_speed)   │ 43381.16ms │
│ Average Time (HEAD)            │  5546.93ms │
│ Average Time (aggregate_speed) │  5422.65ms │
│ Queries Faster                 │          2 │
│ Queries Slower                 │          0 │
│ Queries with No Change         │          6 │
│ Queries with Failure           │          0 │
└────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃        HEAD ┃ aggregate_speed ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │     2.05 ms │         2.13 ms │     no change │
│ QQuery 1  │    55.38 ms │        52.26 ms │ +1.06x faster │
│ QQuery 2  │   135.55 ms │       135.50 ms │     no change │
│ QQuery 3  │   160.80 ms │       160.69 ms │     no change │
│ QQuery 4  │  1152.99 ms │      1025.83 ms │ +1.12x faster │
│ QQuery 5  │  1415.92 ms │      1386.90 ms │     no change │
│ QQuery 6  │     1.97 ms │         1.87 ms │ +1.06x faster │
│ QQuery 7  │    57.78 ms │        57.69 ms │     no change │
│ QQuery 8  │  1494.66 ms │      1404.82 ms │ +1.06x faster │
│ QQuery 9  │  1920.79 ms │      1829.07 ms │     no change │
│ QQuery 10 │   357.85 ms │       347.42 ms │     no change │
│ QQuery 11 │   410.14 ms │       400.30 ms │     no change │
│ QQuery 12 │  1339.89 ms │      1302.11 ms │     no change │
│ QQuery 13 │  2032.48 ms │      1974.56 ms │     no change │
│ QQuery 14 │  1305.76 ms │      1246.62 ms │     no change │
│ QQuery 15 │  1325.00 ms │      1193.04 ms │ +1.11x faster │
│ QQuery 16 │  2633.58 ms │      2496.11 ms │ +1.06x faster │
│ QQuery 17 │  2601.68 ms │      2489.65 ms │     no change │
│ QQuery 18 │  5655.53 ms │      4839.48 ms │ +1.17x faster │
│ QQuery 19 │   125.48 ms │       121.24 ms │     no change │
│ QQuery 20 │  1981.10 ms │      1878.02 ms │ +1.05x faster │
│ QQuery 21 │  2322.64 ms │      2207.96 ms │     no change │
│ QQuery 22 │  4264.28 ms │      3869.57 ms │ +1.10x faster │
│ QQuery 23 │ 22344.02 ms │     12331.73 ms │ +1.81x faster │
│ QQuery 24 │   231.59 ms │       217.00 ms │ +1.07x faster │
│ QQuery 25 │   515.26 ms │       480.71 ms │ +1.07x faster │
│ QQuery 26 │   226.90 ms │       229.51 ms │     no change │
│ QQuery 27 │  2763.10 ms │      2687.23 ms │     no change │
│ QQuery 28 │ 23709.81 ms │     23373.70 ms │     no change │
│ QQuery 29 │   988.97 ms │       989.85 ms │     no change │
│ QQuery 30 │  1308.66 ms │      1263.47 ms │     no change │
│ QQuery 31 │  1414.41 ms │      1328.20 ms │ +1.06x faster │
│ QQuery 32 │  4496.81 ms │      4324.06 ms │     no change │
│ QQuery 33 │  5905.91 ms │      5411.06 ms │ +1.09x faster │
│ QQuery 34 │  6079.85 ms │      5744.92 ms │ +1.06x faster │
│ QQuery 35 │  2050.82 ms │      1881.73 ms │ +1.09x faster │
│ QQuery 36 │    68.84 ms │        68.87 ms │     no change │
│ QQuery 37 │    47.36 ms │        47.89 ms │     no change │
│ QQuery 38 │    67.72 ms │        69.07 ms │     no change │
│ QQuery 39 │   108.92 ms │       104.18 ms │     no change │
│ QQuery 40 │    30.29 ms │        28.96 ms │     no change │
│ QQuery 41 │    24.05 ms │        24.19 ms │     no change │
│ QQuery 42 │    20.58 ms │        22.13 ms │  1.08x slower │
└───────────┴─────────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary              ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)              │ 105157.18ms │
│ Total Time (aggregate_speed)   │  91051.26ms │
│ Average Time (HEAD)            │   2445.52ms │
│ Average Time (aggregate_speed) │   2117.47ms │
│ Queries Faster                 │          16 │
│ Queries Slower                 │           1 │
│ Queries with No Change         │          26 │
│ Queries with Failure           │           0 │
└────────────────────────────────┴─────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃      HEAD ┃ aggregate_speed ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 108.94 ms │       105.43 ms │     no change │
│ QQuery 2  │  34.04 ms │        34.50 ms │     no change │
│ QQuery 3  │  41.93 ms │        35.96 ms │ +1.17x faster │
│ QQuery 4  │  32.86 ms │        30.81 ms │ +1.07x faster │
│ QQuery 5  │  96.05 ms │        91.83 ms │     no change │
│ QQuery 6  │  21.62 ms │        21.41 ms │     no change │
│ QQuery 7  │ 165.29 ms │       174.71 ms │  1.06x slower │
│ QQuery 8  │  45.35 ms │        42.40 ms │ +1.07x faster │
│ QQuery 9  │ 110.97 ms │       107.82 ms │     no change │
│ QQuery 10 │  72.22 ms │        70.33 ms │     no change │
│ QQuery 11 │  19.77 ms │        20.45 ms │     no change │
│ QQuery 12 │  53.10 ms │        53.98 ms │     no change │
│ QQuery 13 │  55.05 ms │        52.61 ms │     no change │
│ QQuery 14 │  16.46 ms │        17.10 ms │     no change │
│ QQuery 15 │  31.80 ms │        33.28 ms │     no change │
│ QQuery 16 │  31.54 ms │        33.05 ms │     no change │
│ QQuery 17 │ 157.18 ms │       164.98 ms │     no change │
│ QQuery 18 │ 303.04 ms │       301.56 ms │     no change │
│ QQuery 19 │  42.04 ms │        40.01 ms │     no change │
│ QQuery 20 │  61.20 ms │        58.75 ms │     no change │
│ QQuery 21 │ 197.90 ms │       196.74 ms │     no change │
│ QQuery 22 │  23.46 ms │        23.10 ms │     no change │
└───────────┴───────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary              ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)              │ 1721.80ms │
│ Total Time (aggregate_speed)   │ 1710.82ms │
│ Average Time (HEAD)            │   78.26ms │
│ Average Time (aggregate_speed) │   77.76ms │
│ Queries Faster                 │         3 │
│ Queries Slower                 │         1 │
│ Queries with No Change         │        18 │
│ Queries with Failure           │         0 │
└────────────────────────────────┴───────────┘

Dandandan · 2026-01-20T19:05:29Z

Unfortunately, the benchmark runner has been very noisy lately 😢
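
One way to put a number on the noise is to compare the slowest and fastest measurement for the same query across the three runs above. This is only a rough sketch: `spread` is a hypothetical helper, and the timings are copied from the clickbench_partitioned Query 22 rows.

```rust
/// Ratio of the slowest to the fastest measurement in a set of runs.
fn spread(runs_ms: &[f64]) -> f64 {
    let max = runs_ms.iter().cloned().fold(f64::MIN, f64::max);
    let min = runs_ms.iter().cloned().fold(f64::MAX, f64::min);
    max / min
}

fn main() {
    // clickbench_partitioned Query 22, one entry per benchmark run.
    let head = [9447.87, 5976.27, 4264.28];
    let branch = [3746.09, 3794.97, 3869.57];
    println!("HEAD spread:            {:.2}x", spread(&head));
    println!("aggregate_speed spread: {:.2}x", spread(&branch));
}
```

For this query the HEAD timings differ by more than 2x between runs, while the aggregate_speed timings stay within about 5% of each other, so the reported speedup depends heavily on which run is read.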

Dandandan · 2026-01-20T19:06:13Z

run benchmark tpch tpcds

alamb-ghbot · 2026-01-20T19:06:20Z

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing aggregate_speed (e3529a9) to 3d90d4b diff using: tpch
Results will be posted here when complete

alamb-ghbot · 2026-01-20T19:06:58Z

🤖: Benchmark completed

Comparing HEAD and aggregate_speed
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Query     ┃      HEAD ┃ aggregate_speed ┃       Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ QQuery 1  │ 186.48 ms │       180.60 ms │    no change │
│ QQuery 2  │  89.84 ms │        89.80 ms │    no change │
│ QQuery 3  │ 126.52 ms │       127.35 ms │    no change │
│ QQuery 4  │  79.92 ms │        78.07 ms │    no change │
│ QQuery 5  │ 177.08 ms │       179.54 ms │    no change │
│ QQuery 6  │  63.69 ms │        70.21 ms │ 1.10x slower │
│ QQuery 7  │ 214.24 ms │       211.79 ms │    no change │
│ QQuery 8  │ 178.18 ms │       175.70 ms │    no change │
│ QQuery 9  │ 238.74 ms │       235.68 ms │    no change │
│ QQuery 10 │ 195.07 ms │       191.07 ms │    no change │
│ QQuery 11 │  66.54 ms │        65.49 ms │    no change │
│ QQuery 12 │ 121.78 ms │       122.40 ms │    no change │
│ QQuery 13 │ 225.19 ms │       221.62 ms │    no change │
│ QQuery 14 │  90.34 ms │        90.66 ms │    no change │
│ QQuery 15 │ 121.87 ms │       127.93 ms │    no change │
│ QQuery 16 │  63.67 ms │        60.70 ms │    no change │
│ QQuery 17 │ 275.21 ms │       262.08 ms │    no change │
│ QQuery 18 │ 320.66 ms │       310.73 ms │    no change │
│ QQuery 19 │ 135.76 ms │       138.37 ms │    no change │
│ QQuery 20 │ 132.05 ms │       131.29 ms │    no change │
│ QQuery 21 │ 256.70 ms │       262.80 ms │    no change │
│ QQuery 22 │  38.92 ms │        41.56 ms │ 1.07x slower │
└───────────┴───────────┴─────────────────┴──────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary              ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)              │ 3398.45ms │
│ Total Time (aggregate_speed)   │ 3375.43ms │
│ Average Time (HEAD)            │  154.47ms │
│ Average Time (aggregate_speed) │  153.43ms │
│ Queries Faster                 │         0 │
│ Queries Slower                 │         2 │
│ Queries with No Change         │        20 │
│ Queries with Failure           │         0 │
└────────────────────────────────┴───────────┘

alamb-ghbot · 2026-01-20T19:07:00Z

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing aggregate_speed (e3529a9) to 3d90d4b diff using: tpcds
Results will be posted here when complete

alamb-ghbot · 2026-01-20T19:15:38Z

🤖: Benchmark completed

Details

Comparing HEAD and aggregate_speed
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃        HEAD ┃ aggregate_speed ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │    73.94 ms │        75.51 ms │     no change │
│ QQuery 2  │   213.82 ms │       209.50 ms │     no change │
│ QQuery 3  │   158.20 ms │       161.40 ms │     no change │
│ QQuery 4  │  1940.23 ms │      1927.31 ms │     no change │
│ QQuery 5  │   292.48 ms │       284.12 ms │     no change │
│ QQuery 6  │  1443.73 ms │      1367.92 ms │ +1.06x faster │
│ QQuery 7  │   505.41 ms │       514.24 ms │     no change │
│ QQuery 8  │   171.82 ms │       175.31 ms │     no change │
│ QQuery 9  │   281.81 ms │       305.82 ms │  1.09x slower │
│ QQuery 10 │   173.86 ms │       178.03 ms │     no change │
│ QQuery 11 │  1280.71 ms │      1303.97 ms │     no change │
│ QQuery 12 │    70.61 ms │        69.03 ms │     no change │
│ QQuery 13 │   537.45 ms │       541.72 ms │     no change │
│ QQuery 14 │  1877.03 ms │      1834.97 ms │     no change │
│ QQuery 15 │    31.70 ms │        30.81 ms │     no change │
│ QQuery 16 │    67.87 ms │        65.87 ms │     no change │
│ QQuery 17 │   372.01 ms │       375.16 ms │     no change │
│ QQuery 18 │   194.91 ms │       198.28 ms │     no change │
│ QQuery 19 │   229.02 ms │       234.36 ms │     no change │
│ QQuery 20 │    26.37 ms │        26.91 ms │     no change │
│ QQuery 21 │    40.54 ms │        39.60 ms │     no change │
│ QQuery 22 │   711.65 ms │       733.01 ms │     no change │
│ QQuery 23 │  1780.16 ms │      1776.26 ms │     no change │
│ QQuery 24 │   692.62 ms │       698.28 ms │     no change │
│ QQuery 25 │   528.11 ms │       538.48 ms │     no change │
│ QQuery 26 │   131.75 ms │       126.95 ms │     no change │
│ QQuery 27 │   503.14 ms │       521.14 ms │     no change │
│ QQuery 28 │   290.49 ms │       303.85 ms │     no change │
│ QQuery 29 │   458.87 ms │       464.78 ms │     no change │
│ QQuery 30 │    76.24 ms │        74.98 ms │     no change │
│ QQuery 31 │   308.67 ms │       305.30 ms │     no change │
│ QQuery 32 │    85.95 ms │        84.32 ms │     no change │
│ QQuery 33 │   208.09 ms │       203.62 ms │     no change │
│ QQuery 34 │   163.69 ms │       158.36 ms │     no change │
│ QQuery 35 │   171.36 ms │       177.28 ms │     no change │
│ QQuery 36 │   279.28 ms │       293.90 ms │  1.05x slower │
│ QQuery 37 │   257.34 ms │       262.42 ms │     no change │
│ QQuery 38 │   153.31 ms │       150.11 ms │     no change │
│ QQuery 39 │   216.44 ms │       213.13 ms │     no change │
│ QQuery 40 │   180.55 ms │       175.25 ms │     no change │
│ QQuery 41 │    24.00 ms │        24.17 ms │     no change │
│ QQuery 42 │   142.63 ms │       142.55 ms │     no change │
│ QQuery 43 │   125.85 ms │       125.74 ms │     no change │
│ QQuery 44 │    30.23 ms │        28.98 ms │     no change │
│ QQuery 45 │    92.96 ms │        90.79 ms │     no change │
│ QQuery 46 │   324.92 ms │       327.44 ms │     no change │
│ QQuery 47 │  1053.10 ms │      1077.19 ms │     no change │
│ QQuery 48 │   401.51 ms │       419.27 ms │     no change │
│ QQuery 49 │   374.87 ms │       381.15 ms │     no change │
│ QQuery 50 │   343.31 ms │       356.69 ms │     no change │
│ QQuery 51 │   303.39 ms │       305.00 ms │     no change │
│ QQuery 52 │   142.46 ms │       147.20 ms │     no change │
│ QQuery 53 │   149.85 ms │       154.47 ms │     no change │
│ QQuery 54 │   221.79 ms │       227.92 ms │     no change │
│ QQuery 55 │   142.17 ms │       143.62 ms │     no change │
│ QQuery 56 │   208.50 ms │       208.65 ms │     no change │
│ QQuery 57 │   323.21 ms │       306.19 ms │ +1.06x faster │
│ QQuery 58 │   496.74 ms │       505.25 ms │     no change │
│ QQuery 59 │   284.23 ms │       292.14 ms │     no change │
│ QQuery 60 │   209.85 ms │       213.48 ms │     no change │
│ QQuery 61 │   245.12 ms │       243.68 ms │     no change │
│ QQuery 62 │  1296.56 ms │      1309.90 ms │     no change │
│ QQuery 63 │   154.90 ms │       152.87 ms │     no change │
│ QQuery 64 │  1203.66 ms │      1199.56 ms │     no change │
│ QQuery 65 │   349.11 ms │       349.67 ms │     no change │
│ QQuery 66 │   387.06 ms │       402.55 ms │     no change │
│ QQuery 67 │   544.87 ms │       548.53 ms │     no change │
│ QQuery 68 │   367.01 ms │       376.94 ms │     no change │
│ QQuery 69 │   171.51 ms │       171.84 ms │     no change │
│ QQuery 70 │   487.53 ms │       497.42 ms │     no change │
│ QQuery 71 │   182.36 ms │       183.64 ms │     no change │
│ QQuery 72 │  2136.85 ms │      2135.82 ms │     no change │
│ QQuery 73 │   158.53 ms │       158.04 ms │     no change │
│ QQuery 74 │   792.34 ms │       834.02 ms │  1.05x slower │
│ QQuery 75 │   421.94 ms │       413.72 ms │     no change │
│ QQuery 76 │   183.07 ms │       190.52 ms │     no change │
│ QQuery 77 │   283.94 ms │       289.18 ms │     no change │
│ QQuery 78 │   946.58 ms │       950.42 ms │     no change │
│ QQuery 79 │   322.27 ms │       334.42 ms │     no change │
│ QQuery 80 │   520.80 ms │       521.91 ms │     no change │
│ QQuery 81 │    52.20 ms │        53.23 ms │     no change │
│ QQuery 82 │   283.33 ms │       289.71 ms │     no change │
│ QQuery 83 │    83.26 ms │        84.61 ms │     no change │
│ QQuery 84 │    66.36 ms │        68.12 ms │     no change │
│ QQuery 85 │   227.84 ms │       230.77 ms │     no change │
│ QQuery 86 │    60.53 ms │        58.62 ms │     no change │
│ QQuery 87 │   155.30 ms │       148.60 ms │     no change │
│ QQuery 88 │   267.44 ms │       266.70 ms │     no change │
│ QQuery 89 │   169.42 ms │       174.88 ms │     no change │
│ QQuery 90 │    48.16 ms │        47.22 ms │     no change │
│ QQuery 91 │    96.87 ms │        98.30 ms │     no change │
│ QQuery 92 │    82.86 ms │        85.31 ms │     no change │
│ QQuery 93 │   276.08 ms │       284.63 ms │     no change │
│ QQuery 94 │    93.03 ms │        93.25 ms │     no change │
│ QQuery 95 │   260.39 ms │       248.90 ms │     no change │
│ QQuery 96 │   115.31 ms │       112.92 ms │     no change │
│ QQuery 97 │   193.03 ms │       186.08 ms │     no change │
│ QQuery 98 │   217.64 ms │       215.38 ms │     no change │
│ QQuery 99 │ 13993.98 ms │     14007.00 ms │     no change │
└───────────┴─────────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary              ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)              │ 50475.83ms │
│ Total Time (aggregate_speed)   │ 50678.07ms │
│ Average Time (HEAD)            │   509.86ms │
│ Average Time (aggregate_speed) │   511.90ms │
│ Queries Faster                 │          2 │
│ Queries Slower                 │          3 │
│ Queries with No Change         │         94 │
│ Queries with Failure           │          0 │
└────────────────────────────────┴────────────┘

Dandandan · 2026-01-20T19:17:08Z

run benchmark tpch_mem

alamb-ghbot · 2026-01-20T19:17:14Z

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing aggregate_speed (e3529a9) to 3d90d4b diff using: tpch_mem
Results will be posted here when complete

alamb-ghbot · 2026-01-20T19:17:36Z

🤖: Benchmark completed

Details

Comparing HEAD and aggregate_speed
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃      HEAD ┃ aggregate_speed ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 108.05 ms │       105.63 ms │     no change │
│ QQuery 2  │  33.05 ms │        32.34 ms │     no change │
│ QQuery 3  │  37.35 ms │        39.73 ms │  1.06x slower │
│ QQuery 4  │  31.51 ms │        30.39 ms │     no change │
│ QQuery 5  │  91.97 ms │        88.91 ms │     no change │
│ QQuery 6  │  21.04 ms │        25.32 ms │  1.20x slower │
│ QQuery 7  │ 158.43 ms │       162.47 ms │     no change │
│ QQuery 8  │  42.52 ms │        40.97 ms │     no change │
│ QQuery 9  │ 107.65 ms │       102.56 ms │     no change │
│ QQuery 10 │  69.96 ms │        66.23 ms │ +1.06x faster │
│ QQuery 11 │  18.87 ms │        19.12 ms │     no change │
│ QQuery 12 │  52.96 ms │        51.89 ms │     no change │
│ QQuery 13 │  50.97 ms │        48.26 ms │ +1.06x faster │
│ QQuery 14 │  15.37 ms │        15.54 ms │     no change │
│ QQuery 15 │  31.98 ms │        30.85 ms │     no change │
│ QQuery 16 │  29.40 ms │        29.00 ms │     no change │
│ QQuery 17 │ 148.93 ms │       145.53 ms │     no change │
│ QQuery 18 │ 289.32 ms │       280.93 ms │     no change │
│ QQuery 19 │  40.38 ms │        39.52 ms │     no change │
│ QQuery 20 │  56.09 ms │        55.31 ms │     no change │
│ QQuery 21 │ 187.91 ms │       193.07 ms │     no change │
│ QQuery 22 │  22.92 ms │        22.36 ms │     no change │
└───────────┴───────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary              ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)              │ 1646.65ms │
│ Total Time (aggregate_speed)   │ 1625.93ms │
│ Average Time (HEAD)            │   74.85ms │
│ Average Time (aggregate_speed) │   73.91ms │
│ Queries Faster                 │         2 │
│ Queries Slower                 │         2 │
│ Queries with No Change         │        18 │
│ Queries with Failure           │         0 │
└────────────────────────────────┴───────────┘

alamb · 2026-01-23T19:05:32Z

datafusion/physical-plan/src/aggregates/group_values/single_group_by/primitive.rs

                        hash,
-                        |&(g, _)| unsafe { self.values.get_unchecked(g).is_eq(key) },
+                        |&(g, h)| unsafe {
+                            hash == h && self.values.get_unchecked(g).is_eq(key)


I double checked that Hashbrown says

https://docs.rs/hashbrown/latest/hashbrown/struct.HashTable.html#method.entry

This method will call eq for all entries with the given hash, but may also call it for entries with a different hash. eq should only return true for the desired entry, at which point the search is stopped.

So checking the hash first makes sense to me, as it will avoid a memory access
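To illustrate the point above, here is a minimal std-only sketch. `ToyTable` and `lookup` are hypothetical stand-ins invented for this example: it is a plain linear-probing table, not hashbrown (which probes differently), and the hash values are made up. It only shows that when the `eq` closure can be called for entries with a different hash, comparing the stored hash first skips reads into the separate `values` array.

```rust
use std::cell::Cell;

/// Toy open-addressing table: each slot stores (group index, full hash).
/// Hypothetical stand-in for a `HashTable<(usize, u64)>`; not hashbrown.
struct ToyTable {
    slots: Vec<Option<(usize, u64)>>,
}

impl ToyTable {
    fn new(capacity: usize) -> Self {
        Self { slots: vec![None; capacity] }
    }

    /// Linear probe from `hash % len`, calling `eq` on every occupied slot
    /// visited, so `eq` may see entries whose stored hash differs.
    fn find(&self, hash: u64, mut eq: impl FnMut(&(usize, u64)) -> bool) -> Option<usize> {
        let len = self.slots.len();
        let start = (hash % len as u64) as usize;
        for i in 0..len {
            match &self.slots[(start + i) % len] {
                Some(entry) if eq(entry) => return Some(entry.0),
                Some(_) => continue,
                None => return None,
            }
        }
        None
    }

    fn insert(&mut self, hash: u64, group: usize) {
        let len = self.slots.len();
        let start = (hash % len as u64) as usize;
        for i in 0..len {
            let idx = (start + i) % len;
            if self.slots[idx].is_none() {
                self.slots[idx] = Some((group, hash));
                return;
            }
        }
        panic!("table full");
    }
}

/// Returns (lookup result, number of reads into `values`).
fn lookup(
    table: &ToyTable,
    values: &[u64],
    key: u64,
    hash: u64,
    hash_first: bool,
) -> (Option<usize>, usize) {
    let reads = Cell::new(0usize);
    let found = table.find(hash, |&(g, h)| {
        if hash_first && h != hash {
            return false; // rejected without touching `values`
        }
        reads.set(reads.get() + 1);
        values[g] == key
    });
    (found, reads.get())
}

fn main() {
    // Three keys whose made-up hashes all start probing at slot 0.
    let values = vec![10u64, 20, 30];
    let mut table = ToyTable::new(8);
    for (g, h) in [8u64, 16, 24].into_iter().enumerate() {
        table.insert(h, g);
    }
    let (found_a, reads_a) = lookup(&table, &values, 30, 24, false);
    let (found_b, reads_b) = lookup(&table, &values, 30, 24, true);
    println!("value-only: {found_a:?} after {reads_a} reads into values");
    println!("hash-first: {found_b:?} after {reads_b} reads into values");
}
```

In this toy setup both lookups find group 2, but the value-only closure reads `values` three times while the hash-first closure reads it once.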

alamb

Looks good to me

alamb · 2026-01-23T19:09:05Z

datafusion/common/src/utils/proxy.rs

+            self.reserve(bump_elements, &hasher);
        }
+
+        // still need to insert the element since first try failed


I don't see a first attempt to insert -- maybe this comment needs to be updated
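For readers following the control flow in question, here is a rough sketch of a "reserve if full, then insert once" pattern. It uses a `Vec` as a stand-in for the hash table, and `insert_accounted` plus the bump size of 16 are assumptions made for this example, not the code in proxy.rs. In this shape there is no failed first attempt, only a capacity check followed by a single unconditional insert.

```rust
/// Hypothetical helper: push `value`, growing by at least `bump_elements`
/// when the vector is full, and add the newly allocated bytes to `accounting`.
fn insert_accounted<T>(items: &mut Vec<T>, value: T, accounting: &mut usize) {
    if items.len() == items.capacity() {
        // Assumed growth policy for this sketch: at least double, minimum 16.
        let bump_elements = items.len().max(16);
        let before = items.capacity();
        items.reserve(bump_elements);
        *accounting += (items.capacity() - before) * std::mem::size_of::<T>();
    }
    // Capacity is guaranteed at this point, so insert unconditionally.
    items.push(value);
}

fn main() {
    let mut items: Vec<u64> = Vec::new();
    let mut accounting = 0usize;
    for i in 0..100u64 {
        insert_accounted(&mut items, i, &mut accounting);
    }
    println!("len = {}, accounted bytes = {}", items.len(), accounting);
}
```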

alamb · 2026-01-23T19:11:35Z

Thank you @Dandandan

Dandandan · 2026-01-23T21:41:17Z

datafusion/physical-expr-common/src/binary_view_map.rs

+                if header.hash != hash {
                    return false;
                }
+                let v = self.builder.get_value(header.view_idx);


I saw in profiles that this is pretty expensive (as are self.builder.append_value and always comparing by bytes). I think we are not really using the potential of binary views here, so I opened #19961
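As a rough illustration of what "using the potential of binary views" could mean, the sketch below packs values into the 16-byte Arrow view layout (4-byte length, then either up to 12 inline bytes, or a 4-byte prefix plus buffer index and offset) and compares the views before touching any out-of-line bytes. `make_view`, `compare_views`, and `ViewCmp` are hypothetical names for this example. This is not the code in this PR or in #19961.

```rust
#[derive(Debug, PartialEq)]
enum ViewCmp {
    Equal,
    NotEqual,
    /// Length and 4-byte prefix match; the full bytes must be compared.
    NeedBytes,
}

/// Hypothetical helper: pack `data` into a 16-byte little-endian view.
fn make_view(data: &[u8], buffer_index: u32, offset: u32) -> u128 {
    let mut bytes = [0u8; 16];
    bytes[0..4].copy_from_slice(&(data.len() as u32).to_le_bytes());
    if data.len() <= 12 {
        // Short value: stored inline, zero padded.
        bytes[4..4 + data.len()].copy_from_slice(data);
    } else {
        // Long value: 4-byte prefix, then location of the full bytes.
        bytes[4..8].copy_from_slice(&data[..4]);
        bytes[8..12].copy_from_slice(&buffer_index.to_le_bytes());
        bytes[12..16].copy_from_slice(&offset.to_le_bytes());
    }
    u128::from_le_bytes(bytes)
}

/// Decide as much as possible from the two views alone.
fn compare_views(a: u128, b: u128) -> ViewCmp {
    // The low 64 bits hold the length and first 4 data bytes in both layouts.
    if a as u64 != b as u64 {
        return ViewCmp::NotEqual;
    }
    let len = a as u32;
    if len <= 12 {
        // Fully inline: the whole value lives in the view.
        if a == b { ViewCmp::Equal } else { ViewCmp::NotEqual }
    } else {
        ViewCmp::NeedBytes
    }
}

fn main() {
    let short_a = make_view(b"hello", 0, 0);
    let short_b = make_view(b"hellp", 0, 0);
    let long_a = make_view(b"a long string value", 0, 0);
    let long_b = make_view(b"a long string valuf", 0, 64);
    println!("{:?}", compare_views(short_a, short_a));
    println!("{:?}", compare_views(short_a, short_b));
    println!("{:?}", compare_views(long_a, long_b));
}
```

Only the `NeedBytes` case would require loading the value from the data buffers; short values and mismatched prefixes are resolved from the views themselves.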

Dandandan · 2026-01-23T21:44:04Z

run benchmarks

alamb-ghbot · 2026-01-23T21:44:14Z

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing aggregate_speed (0661704) to 3d90d4b diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

alamb-ghbot · 2026-01-23T22:25:01Z

🤖: Benchmark completed

Details

Comparing HEAD and aggregate_speed
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query    ┃        HEAD ┃ aggregate_speed ┃        Change ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0 │  2415.22 ms │      2367.93 ms │     no change │
│ QQuery 1 │   965.85 ms │       990.74 ms │     no change │
│ QQuery 2 │  1868.57 ms │      1920.79 ms │     no change │
│ QQuery 3 │  1118.23 ms │      1105.22 ms │     no change │
│ QQuery 4 │  2244.74 ms │      2164.26 ms │     no change │
│ QQuery 5 │ 28553.68 ms │     28312.67 ms │     no change │
│ QQuery 6 │  4001.53 ms │      3984.25 ms │     no change │
│ QQuery 7 │  2954.52 ms │      2733.24 ms │ +1.08x faster │
└──────────┴─────────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary              ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)              │ 44122.34ms │
│ Total Time (aggregate_speed)   │ 43579.10ms │
│ Average Time (HEAD)            │  5515.29ms │
│ Average Time (aggregate_speed) │  5447.39ms │
│ Queries Faster                 │          1 │
│ Queries Slower                 │          0 │
│ Queries with No Change         │          7 │
│ Queries with Failure           │          0 │
└────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃        HEAD ┃ aggregate_speed ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │     1.98 ms │         2.17 ms │  1.09x slower │
│ QQuery 1  │    52.69 ms │        55.25 ms │     no change │
│ QQuery 2  │   137.78 ms │       136.84 ms │     no change │
│ QQuery 3  │   155.86 ms │       155.90 ms │     no change │
│ QQuery 4  │  1114.82 ms │      1079.15 ms │     no change │
│ QQuery 5  │  1432.87 ms │      1399.32 ms │     no change │
│ QQuery 6  │     1.93 ms │         2.08 ms │  1.07x slower │
│ QQuery 7  │    58.60 ms │        58.02 ms │     no change │
│ QQuery 8  │  1466.43 ms │      1465.33 ms │     no change │
│ QQuery 9  │  1857.41 ms │      1934.46 ms │     no change │
│ QQuery 10 │   364.68 ms │       371.38 ms │     no change │
│ QQuery 11 │   417.25 ms │       408.53 ms │     no change │
│ QQuery 12 │  1341.12 ms │      1333.18 ms │     no change │
│ QQuery 13 │  2037.05 ms │      1980.11 ms │     no change │
│ QQuery 14 │  1305.13 ms │      1255.63 ms │     no change │
│ QQuery 15 │  1296.10 ms │      1164.30 ms │ +1.11x faster │
│ QQuery 16 │  2647.76 ms │      2488.81 ms │ +1.06x faster │
│ QQuery 17 │  2580.05 ms │      2464.00 ms │     no change │
│ QQuery 18 │  5204.25 ms │      4896.05 ms │ +1.06x faster │
│ QQuery 19 │   127.56 ms │       122.63 ms │     no change │
│ QQuery 20 │  1980.32 ms │      1918.05 ms │     no change │
│ QQuery 21 │  2310.31 ms │      2173.63 ms │ +1.06x faster │
│ QQuery 22 │  3947.64 ms │      3708.05 ms │ +1.06x faster │
│ QQuery 23 │ 12516.96 ms │     12158.37 ms │     no change │
│ QQuery 24 │   217.26 ms │       210.08 ms │     no change │
│ QQuery 25 │   505.20 ms │       468.19 ms │ +1.08x faster │
│ QQuery 26 │   228.53 ms │       222.32 ms │     no change │
│ QQuery 27 │  2804.98 ms │      2633.21 ms │ +1.07x faster │
│ QQuery 28 │ 23993.73 ms │     21719.75 ms │ +1.10x faster │
│ QQuery 29 │   986.99 ms │       966.20 ms │     no change │
│ QQuery 30 │  1306.56 ms │      1225.37 ms │ +1.07x faster │
│ QQuery 31 │  1379.96 ms │      1313.86 ms │     no change │
│ QQuery 32 │  5193.01 ms │      4478.30 ms │ +1.16x faster │
│ QQuery 33 │  6382.28 ms │      5374.40 ms │ +1.19x faster │
│ QQuery 34 │  6648.72 ms │      6118.85 ms │ +1.09x faster │
│ QQuery 35 │  2090.80 ms │      1880.91 ms │ +1.11x faster │
│ QQuery 36 │    72.43 ms │        69.09 ms │     no change │
│ QQuery 37 │    47.60 ms │        44.70 ms │ +1.06x faster │
│ QQuery 38 │    69.43 ms │        67.13 ms │     no change │
│ QQuery 39 │   109.89 ms │       103.33 ms │ +1.06x faster │
│ QQuery 40 │    29.11 ms │        28.60 ms │     no change │
│ QQuery 41 │    25.81 ms │        23.67 ms │ +1.09x faster │
│ QQuery 42 │    22.14 ms │        20.13 ms │ +1.10x faster │
└───────────┴─────────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary              ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)              │ 96471.00ms │
│ Total Time (aggregate_speed)   │ 89699.33ms │
│ Average Time (HEAD)            │  2243.51ms │
│ Average Time (aggregate_speed) │  2086.03ms │
│ Queries Faster                 │         17 │
│ Queries Slower                 │          2 │
│ Queries with No Change         │         24 │
│ Queries with Failure           │          0 │
└────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃      HEAD ┃ aggregate_speed ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 112.72 ms │       103.89 ms │ +1.09x faster │
│ QQuery 2  │  32.83 ms │        33.72 ms │     no change │
│ QQuery 3  │  40.40 ms │        42.46 ms │  1.05x slower │
│ QQuery 4  │  31.73 ms │        32.18 ms │     no change │
│ QQuery 5  │  93.34 ms │        93.43 ms │     no change │
│ QQuery 6  │  20.78 ms │        21.45 ms │     no change │
│ QQuery 7  │ 160.15 ms │       168.23 ms │  1.05x slower │
│ QQuery 8  │  39.63 ms │        42.97 ms │  1.08x slower │
│ QQuery 9  │ 102.61 ms │       111.07 ms │  1.08x slower │
│ QQuery 10 │  69.07 ms │        69.48 ms │     no change │
│ QQuery 11 │  18.96 ms │        19.59 ms │     no change │
│ QQuery 12 │  50.33 ms │        52.67 ms │     no change │
│ QQuery 13 │  51.51 ms │        51.95 ms │     no change │
│ QQuery 14 │  15.41 ms │        16.33 ms │  1.06x slower │
│ QQuery 15 │  31.01 ms │        31.86 ms │     no change │
│ QQuery 16 │  29.40 ms │        29.20 ms │     no change │
│ QQuery 17 │ 150.18 ms │       157.23 ms │     no change │
│ QQuery 18 │ 294.52 ms │       298.02 ms │     no change │
│ QQuery 19 │  39.20 ms │        42.60 ms │  1.09x slower │
│ QQuery 20 │  57.90 ms │        59.21 ms │     no change │
│ QQuery 21 │ 192.77 ms │       194.49 ms │     no change │
│ QQuery 22 │  22.89 ms │        22.53 ms │     no change │
└───────────┴───────────┴─────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary              ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)              │ 1657.35ms │
│ Total Time (aggregate_speed)   │ 1694.59ms │
│ Average Time (HEAD)            │   75.33ms │
│ Average Time (aggregate_speed) │   77.03ms │
│ Queries Faster                 │         1 │
│ Queries Slower                 │         6 │
│ Queries with No Change         │        15 │
│ Queries with Failure           │         0 │
└────────────────────────────────┴───────────┘

Always use hash first as check

91abc0a

github-actions bot added the physical-expr (Changes to the physical-expr crates) and physical-plan (Changes to the physical-plan crate) labels Jan 20, 2026

Dandandan changed the title ~~Always use hash first as check~~ Always use hash first as check for hashmap equality Jan 20, 2026

Use insert_unique / avoid eq check

aad3b41

Dandandan changed the title ~~Always use hash first as check for hashmap equality~~ [Bench] Aggregation performance Jan 20, 2026

github-actions bot added the common (Related to common crate) label Jan 20, 2026

Update doc

e3529a9

Dandandan commented Jan 20, 2026

Dandandan marked this pull request as ready for review January 20, 2026 16:34

Dandandan changed the title ~~[Bench] Aggregation performance~~ Aggregation performance improvements Jan 20, 2026

Dandandan changed the title ~~Aggregation performance improvements~~ Misc aggregation performance improvements Jan 20, 2026

Dandandan requested a review from Rachelint January 20, 2026 19:06

alamb reviewed Jan 23, 2026

alamb approved these changes Jan 23, 2026

alamb changed the title ~~Misc aggregation performance improvements~~ Misc hash / hash aggregation performance improvements Jan 23, 2026

alamb added the performance (Make DataFusion faster) label Jan 23, 2026

Feedback

0661704

Dandandan commented Jan 23, 2026

Merge branch 'main' into aggregate_speed

17fef56

Dandandan added this pull request to the merge queue Jan 24, 2026

Merged via the queue into apache:main with commit 17cbff0 Jan 24, 2026
32 checks passed
