
fix(parquet): bound data page byte size for large variable-width values#9972

Open
adriangb wants to merge 6 commits into apache:main from pydantic:parquet-page-size-mid-batch

@adriangb adriangb commented May 14, 2026

We write large values into our parquet files (e.g. a 5 MB LLM prompt). With default write settings, a naive write produces massive pages (we've seen up to 2 GB). The main knob to control this is write_batch_size, which defaults to 1024. But if each row is 5 MB, a single batch of 1024 rows is about 5 GB. On the other hand, setting it to something small like 32 kills write performance and is completely unnecessary for the other, fixed-width columns.

The writer even documents this (parquet/src/column/writer/mod.rs):

We check for DataPage limits only after we have inserted the values. If a user writes a large number of values, the DataPage size can be well above the limit.

This PR makes the mini-batch size byte-budget aware:

  • For each chunk, compute bytes_per_value from the values about to be written and pick sub_batch_size = page_byte_limit / bytes_per_value (clamped to ≥ 1).
  • For typical small values (numeric columns, short strings), sub_batch_size ≥ chunk size, so we stay on the existing batched fast path with zero behavior change.
  • Only when individual values are large enough that a full chunk would exceed the page byte limit does the sub-batch shrink, down to one row per mini-batch in the limit, matching the format minimum of one record per page.
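
The sizing rule in the bullets above can be sketched in a few lines of plain Rust. This is only an illustration, not the code in this PR: the function names `sub_batch_size` and `stays_on_fast_path` are hypothetical, and using the mean value size as `bytes_per_value` is an assumption (the description does not say how the estimate is computed).

```rust
/// Hypothetical helper (not the PR's actual code): picks how many values to
/// write in the next mini-batch so that their estimated size stays within
/// `page_byte_limit`. `value_sizes` holds the byte length of each value in
/// the upcoming chunk.
fn sub_batch_size(page_byte_limit: usize, value_sizes: &[usize]) -> usize {
    if value_sizes.is_empty() {
        // Nothing to size; fall back to the minimum of one value.
        return 1;
    }
    let total: usize = value_sizes.iter().sum();
    // Assumption: use the mean size, rounded up, and never let it be zero.
    let bytes_per_value = ((total + value_sizes.len() - 1) / value_sizes.len()).max(1);
    // Clamp to at least one value per mini-batch.
    (page_byte_limit / bytes_per_value).max(1)
}

/// True when the whole chunk fits in a single mini-batch, i.e. the existing
/// batched path is taken unchanged.
fn stays_on_fast_path(page_byte_limit: usize, value_sizes: &[usize]) -> bool {
    sub_batch_size(page_byte_limit, value_sizes) >= value_sizes.len()
}
```

With a 1 MiB limit, a chunk of 1024 four-byte values yields a sub-batch far larger than the chunk, so nothing changes; a chunk of 256 KiB values yields a sub-batch of 4; and 64 KiB values against a 16 KiB limit clamp to 1.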

Implementation notes

Skip the byte-size check while parquet dictionary encoding is active: estimated_value_bytes returns the plain-encoded size, but a dict-encoded data page stores only small RLE indices, so the estimate would spuriously shrink pages. Dictionary fallback bounds dict-encoded pages independently.
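
A minimal sketch of that gating, assuming a hypothetical helper `effective_sub_batch` whose parameters are illustrative and do not mirror the writer's real fields:

```rust
/// Hypothetical sketch: how many values to hand to the encoder in one
/// mini-batch. The byte budget is ignored while dictionary encoding is
/// active, because the plain-encoded estimate overstates the page size.
fn effective_sub_batch(
    chunk_len: usize,
    page_byte_limit: usize,
    plain_bytes_per_value: usize,
    dict_encoding_active: bool,
) -> usize {
    if dict_encoding_active {
        // Dict-encoded pages hold only small indices; keep the full chunk.
        return chunk_len;
    }
    // Plain encoding: bound the mini-batch by the byte budget, at least 1.
    let by_bytes = (page_byte_limit / plain_bytes_per_value.max(1)).max(1);
    by_bytes.min(chunk_len)
}
```

For a chunk of 1024 values of 256 KiB each and a 1 MiB limit, this returns 1024 while the dictionary is active and 4 once the writer is plain-encoding.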

For repeated/nested columns the sub-batch steps record-by-record (rep == 0 boundaries) so a record never spans data pages, matching the parquet format rule.
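
As a rough illustration of stepping record by record, the sketch below splits a slice of repetition levels into sub-batches that always end where the next level is 0. The helper names `sub_batch_end` and `split_by_records` are hypothetical, and capping by a record count (rather than by bytes) is a simplification made for the example.

```rust
use std::ops::Range;

/// Returns the exclusive end index of a sub-batch that starts at `start`
/// and holds at most `max_records` whole records. A record starts at every
/// position whose repetition level is 0.
fn sub_batch_end(rep_levels: &[i16], start: usize, max_records: usize) -> usize {
    let mut records = 0;
    let mut i = start;
    while i < rep_levels.len() {
        if rep_levels[i] == 0 {
            if records == max_records {
                // The next record would exceed the cap; stop on its boundary.
                break;
            }
            records += 1;
        }
        i += 1;
    }
    i
}

/// Splits all levels into sub-batches of at most `max_records` records each,
/// so that no record is divided between two sub-batches.
fn split_by_records(rep_levels: &[i16], max_records: usize) -> Vec<Range<usize>> {
    // At least one record per sub-batch, which also guarantees progress.
    let max_records = max_records.max(1);
    let mut out = Vec::new();
    let mut start = 0;
    while start < rep_levels.len() {
        let end = sub_batch_end(rep_levels, start, max_records);
        out.push(start..end);
        start = end;
    }
    out
}
```

For the levels `[0, 1, 1, 0, 1, 0, 0, 1, 1]` (four records), a cap of one record gives the ranges `0..3`, `3..5`, `5..6`, `6..9`, and a cap of two gives `0..5`, `5..9`.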

Regression test

test_column_writer_caps_page_size_for_large_byte_array_values writes 64 × 64 KiB BYTE_ARRAY values with a 16 KiB page byte limit. Before this fix that produced a single ~4 MiB page; after, it's one page per value (~64 pages, all within ~2× the value size).
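
The numbers in this test can be reproduced with a toy model of the flush logic. This is a sketch under stated assumptions, not the real column writer: `simulate_pages` is a hypothetical function that counts only value bytes (no levels, headers, or compression) and checks the limit only after each mini-batch has been buffered.

```rust
/// Toy model: buffers `value_sizes` in mini-batches of `batch` values and
/// closes a page whenever the buffered bytes reach `page_byte_limit`.
/// The check runs only after a whole mini-batch has been added.
/// Returns the byte size of every page produced.
fn simulate_pages(value_sizes: &[usize], page_byte_limit: usize, batch: usize) -> Vec<usize> {
    let mut pages = Vec::new();
    let mut current = 0usize;
    for chunk in value_sizes.chunks(batch.max(1)) {
        current += chunk.iter().sum::<usize>();
        if current >= page_byte_limit {
            pages.push(current);
            current = 0;
        }
    }
    // Flush whatever is left as a final page.
    if current > 0 {
        pages.push(current);
    }
    pages
}
```

With 64 values of 64 KiB and a 16 KiB limit, a mini-batch of 1024 (the default write_batch_size) buffers everything before the first check and yields one 4 MiB page, while a mini-batch of 1 yields 64 pages of 64 KiB each.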

Bench results

5-run medians, criterion arrow_writer bench, default writer properties, on a noisy laptop (run-to-run variance ~±1.6%):

| bench | Δ vs main |
|---|---|
| primitive/default (i32 25% null) | −1.0% |
| primitive_non_null/default | −0.0% |
| bool_non_null/default | −1.2% |
| string/default | +0.6% |
| short_string_non_null/default (new, 1M × 8 B) | +0.2% |
| large_string_non_null/default (new, 1024 × 256 KiB) | +1.2% |
| string_non_null/default | −2.1% |
| string_dictionary/default | +0.4% |
| list_primitive/default | +0.5% |
| list_primitive_non_null/default | +0.1% |
list_primitive_non_null/default +0.1%

🤖 Generated with Claude Code

@github-actions github-actions Bot added the parquet Changes to the parquet crate label May 14, 2026
@adriangb

run benchmark arrow_writer

@adriangb adriangb force-pushed the parquet-page-size-mid-batch branch from 393ead0 to 4823429 on May 14, 2026 04:21
@adriangbot

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4447473325-58-pzct6 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing parquet-page-size-mid-batch (4823429) to 48fa8a7 (merge-base) diff
BENCH_NAME=arrow_writer
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_writer
BENCH_FILTER=
Results will be posted here when complete


File an issue against this benchmark runner

@adriangbot

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

group                                              main                                   parquet-page-size-mid-batch
-----                                              ----                                   ---------------------------
bool/bloom_filter                                  1.00     13.1±0.03ms    19.2 MB/sec    1.01     13.2±0.03ms    18.9 MB/sec
bool/cdc                                           1.00     15.7±0.05ms    16.0 MB/sec    1.03     16.1±0.07ms    15.5 MB/sec
bool/default                                       1.00     11.0±0.02ms    22.8 MB/sec    1.01     11.1±0.03ms    22.5 MB/sec
bool/parquet_2                                     1.00     14.7±0.04ms    17.0 MB/sec    1.01     14.8±0.03ms    16.9 MB/sec
bool/zstd                                          1.00     11.5±0.03ms    21.8 MB/sec    1.01     11.6±0.03ms    21.5 MB/sec
bool/zstd_parquet_2                                1.00     15.1±0.04ms    16.6 MB/sec    1.01     15.2±0.05ms    16.4 MB/sec
bool_non_null/bloom_filter                         1.00      7.0±0.03ms    17.8 MB/sec    1.01      7.1±0.03ms    17.6 MB/sec
bool_non_null/cdc                                  1.00      6.8±0.03ms    18.4 MB/sec    1.01      6.9±0.03ms    18.2 MB/sec
bool_non_null/default                              1.00      4.3±0.02ms    29.4 MB/sec    1.02      4.3±0.02ms    28.8 MB/sec
bool_non_null/parquet_2                            1.01      9.1±0.04ms    13.8 MB/sec    1.00      9.0±0.03ms    13.9 MB/sec
bool_non_null/zstd                                 1.00      4.6±0.02ms    27.1 MB/sec    1.02      4.7±0.02ms    26.6 MB/sec
bool_non_null/zstd_parquet_2                       1.01      9.5±0.03ms    13.2 MB/sec    1.00      9.4±0.03ms    13.3 MB/sec
float_with_nans/bloom_filter                       1.00     91.9±0.45ms   152.3 MB/sec    1.03     95.0±0.49ms   147.4 MB/sec
float_with_nans/cdc                                1.00     81.2±0.33ms   172.4 MB/sec    1.02     82.7±0.17ms   169.4 MB/sec
float_with_nans/default                            1.00     74.0±0.32ms   189.3 MB/sec    1.03     76.3±0.28ms   183.4 MB/sec
float_with_nans/parquet_2                          1.00     93.7±0.44ms   149.4 MB/sec    1.01     94.8±0.26ms   147.7 MB/sec
float_with_nans/zstd                               1.00    111.5±0.25ms   125.5 MB/sec    1.02    114.2±0.26ms   122.6 MB/sec
float_with_nans/zstd_parquet_2                     1.00    131.7±0.83ms   106.3 MB/sec    1.00    131.8±0.19ms   106.2 MB/sec
large_string_non_null/bloom_filter                                                        1.00     78.3±0.17ms     3.2 GB/sec
large_string_non_null/cdc                                                                 1.00    241.5±1.40ms  1059.9 MB/sec
large_string_non_null/default                                                             1.00     59.9±0.14ms     4.2 GB/sec
large_string_non_null/parquet_2                                                           1.00     59.9±0.17ms     4.2 GB/sec
large_string_non_null/zstd                                                                1.00     60.2±0.60ms     4.2 GB/sec
large_string_non_null/zstd_parquet_2                                                      1.00     60.0±0.29ms     4.2 GB/sec
list_primitive/bloom_filter                        1.00    321.4±1.04ms  1696.6 MB/sec    1.01    325.1±0.77ms  1677.4 MB/sec
list_primitive/cdc                                 1.01    362.7±4.79ms  1503.8 MB/sec    1.00    360.4±0.58ms  1513.3 MB/sec
list_primitive/default                             1.00    245.4±0.60ms     2.2 GB/sec    1.01    248.7±0.79ms     2.1 GB/sec
list_primitive/parquet_2                           1.00    267.1±0.44ms  2041.6 MB/sec    1.01    270.4±1.01ms  2016.9 MB/sec
list_primitive/zstd                                1.00    495.4±0.86ms  1100.9 MB/sec    1.00    496.4±2.54ms  1098.7 MB/sec
list_primitive/zstd_parquet_2                      1.00    490.1±0.48ms  1112.9 MB/sec    1.01    494.1±0.92ms  1103.8 MB/sec
list_primitive_non_null/bloom_filter               1.00    426.6±3.62ms  1275.7 MB/sec    1.00    427.6±3.63ms  1272.8 MB/sec
list_primitive_non_null/cdc                        1.01    440.0±7.70ms  1236.8 MB/sec    1.00    434.8±8.76ms  1251.6 MB/sec
list_primitive_non_null/default                    1.00    287.9±2.90ms  1890.4 MB/sec    1.01    291.1±3.72ms  1869.5 MB/sec
list_primitive_non_null/parquet_2                  1.00   308.6±12.82ms  1763.4 MB/sec    1.05    323.0±9.12ms  1684.9 MB/sec
list_primitive_non_null/zstd                       1.00    714.5±3.78ms   761.7 MB/sec    1.00    712.8±5.58ms   763.6 MB/sec
list_primitive_non_null/zstd_parquet_2             1.00    683.0±0.52ms   796.8 MB/sec    1.00    686.0±0.81ms   793.4 MB/sec
list_primitive_sparse_99pct_null/bloom_filter      1.00     11.1±0.22ms     3.3 GB/sec    1.02     11.3±0.02ms     3.2 GB/sec
list_primitive_sparse_99pct_null/cdc               1.00     22.6±0.18ms  1651.1 MB/sec    1.00     22.7±0.06ms  1648.7 MB/sec
list_primitive_sparse_99pct_null/default           1.00     10.8±0.06ms     3.4 GB/sec    1.02     11.0±0.09ms     3.3 GB/sec
list_primitive_sparse_99pct_null/parquet_2         1.00     10.8±0.07ms     3.4 GB/sec    1.02     11.0±0.02ms     3.3 GB/sec
list_primitive_sparse_99pct_null/zstd              1.00     12.6±0.09ms     2.9 GB/sec    1.01     12.8±0.02ms     2.9 GB/sec
list_primitive_sparse_99pct_null/zstd_parquet_2    1.00     10.9±0.03ms     3.4 GB/sec    1.02     11.1±0.07ms     3.3 GB/sec
primitive/bloom_filter                             1.00    147.4±0.75ms   304.6 MB/sec    1.03    151.3±0.39ms   296.6 MB/sec
primitive/cdc                                      1.00    158.2±0.59ms   283.7 MB/sec    1.02    160.7±0.64ms   279.2 MB/sec
primitive/default                                  1.00    117.5±0.77ms   382.0 MB/sec    1.02    119.7±0.47ms   375.0 MB/sec
primitive/parquet_2                                1.00    131.9±0.39ms   340.1 MB/sec    1.02    134.4±0.21ms   334.0 MB/sec
primitive/zstd                                     1.00    146.2±0.26ms   306.9 MB/sec    1.02    149.2±0.34ms   300.8 MB/sec
primitive/zstd_parquet_2                           1.00    165.0±0.33ms   271.9 MB/sec    1.02    167.9±0.38ms   267.2 MB/sec
primitive_all_null/bloom_filter                    1.00     11.5±0.15ms     3.8 GB/sec    1.00     11.5±0.17ms     3.8 GB/sec
primitive_all_null/cdc                             1.05     30.5±0.34ms  1469.9 MB/sec    1.00     29.2±0.33ms  1537.6 MB/sec
primitive_all_null/default                         1.00     10.9±0.10ms     4.0 GB/sec    1.01     10.9±0.11ms     4.0 GB/sec
primitive_all_null/parquet_2                       1.00     10.9±0.14ms     4.0 GB/sec    1.00     10.9±0.11ms     4.0 GB/sec
primitive_all_null/zstd                            1.00     11.0±0.15ms     4.0 GB/sec    1.00     11.0±0.12ms     4.0 GB/sec
primitive_all_null/zstd_parquet_2                  1.00     11.0±0.23ms     4.0 GB/sec    1.00     11.1±0.24ms     4.0 GB/sec
primitive_non_null/bloom_filter                    1.04    110.6±1.27ms   397.8 MB/sec    1.00    106.2±0.53ms   414.5 MB/sec
primitive_non_null/cdc                             1.00     89.3±0.47ms   492.9 MB/sec    1.02     91.3±0.47ms   482.0 MB/sec
primitive_non_null/default                         1.00     67.1±0.31ms   655.8 MB/sec    1.02     68.6±0.50ms   641.6 MB/sec
primitive_non_null/parquet_2                       1.00     88.7±0.30ms   495.9 MB/sec    1.01     89.8±0.33ms   489.9 MB/sec
primitive_non_null/zstd                            1.04    103.9±0.21ms   423.7 MB/sec    1.00     99.6±0.49ms   441.6 MB/sec
primitive_non_null/zstd_parquet_2                  1.04    128.8±1.61ms   341.5 MB/sec    1.00    123.6±0.32ms   356.0 MB/sec
primitive_sparse_99pct_null/bloom_filter           1.01     18.1±0.21ms     2.4 GB/sec    1.00     18.0±0.06ms     2.4 GB/sec
primitive_sparse_99pct_null/cdc                    1.03     37.0±0.31ms  1214.3 MB/sec    1.00     35.8±0.35ms  1253.1 MB/sec
primitive_sparse_99pct_null/default                1.00     16.7±0.06ms     2.6 GB/sec    1.00     16.7±0.03ms     2.6 GB/sec
primitive_sparse_99pct_null/parquet_2              1.00     16.8±0.07ms     2.6 GB/sec    1.00     16.7±0.03ms     2.6 GB/sec
primitive_sparse_99pct_null/zstd                   1.00     20.0±0.11ms     2.2 GB/sec    1.00     20.0±0.10ms     2.2 GB/sec
primitive_sparse_99pct_null/zstd_parquet_2         1.01     18.7±0.13ms     2.3 GB/sec    1.00     18.6±0.04ms     2.4 GB/sec
short_string_non_null/bloom_filter                                                        1.00     27.9±0.10ms   429.7 MB/sec
short_string_non_null/cdc                                                                 1.00     19.9±0.09ms   602.3 MB/sec
short_string_non_null/default                                                             1.00     15.7±0.09ms   764.8 MB/sec
short_string_non_null/parquet_2                                                           1.00     25.4±0.06ms   472.0 MB/sec
short_string_non_null/zstd                                                                1.00     35.3±0.09ms   339.9 MB/sec
short_string_non_null/zstd_parquet_2                                                      1.00     28.3±0.07ms   424.6 MB/sec
string/bloom_filter                                1.06   226.9±24.81ms     2.3 GB/sec    1.00   214.3±22.17ms     2.4 GB/sec
string/cdc                                         1.00    220.1±5.61ms     2.3 GB/sec    1.00    219.4±7.14ms     2.3 GB/sec
string/default                                     1.20   140.7±24.25ms     3.6 GB/sec    1.00   116.9±11.73ms     4.4 GB/sec
string/parquet_2                                   1.00    124.5±0.65ms     4.1 GB/sec    1.01    125.6±0.79ms     4.1 GB/sec
string/zstd                                        1.00    423.4±2.87ms  1238.1 MB/sec    1.04   440.9±19.31ms  1189.0 MB/sec
string/zstd_parquet_2                              1.00    394.3±0.42ms  1329.7 MB/sec    1.03   406.0±10.72ms  1291.4 MB/sec
string_and_binary_view/bloom_filter                1.00     62.8±0.33ms   513.4 MB/sec    1.05     65.7±0.35ms   491.1 MB/sec
string_and_binary_view/cdc                         1.00     58.2±0.13ms   553.9 MB/sec    1.05     61.0±0.41ms   528.6 MB/sec
string_and_binary_view/default                     1.00     47.7±0.18ms   675.9 MB/sec    1.05     50.0±0.31ms   645.3 MB/sec
string_and_binary_view/parquet_2                   1.00     58.5±0.18ms   551.2 MB/sec    1.04     61.1±0.35ms   527.8 MB/sec
string_and_binary_view/zstd                        1.00     84.1±0.18ms   383.3 MB/sec    1.03     86.6±0.31ms   372.4 MB/sec
string_and_binary_view/zstd_parquet_2              1.00     72.4±0.35ms   445.3 MB/sec    1.04     75.1±0.30ms   429.6 MB/sec
string_dictionary/bloom_filter                     1.00     88.7±0.91ms     2.9 GB/sec    1.02     90.6±0.58ms     2.8 GB/sec
string_dictionary/cdc                              1.61     83.8±0.82ms     3.1 GB/sec    1.00     52.2±1.18ms     4.9 GB/sec
string_dictionary/default                          1.00     48.0±0.33ms     5.4 GB/sec    1.03     49.2±0.94ms     5.2 GB/sec
string_dictionary/parquet_2                        1.00     53.7±0.14ms     4.8 GB/sec    1.02     55.0±0.21ms     4.7 GB/sec
string_dictionary/zstd                             1.00    208.4±1.00ms  1267.2 MB/sec    1.01    209.5±0.64ms  1260.8 MB/sec
string_dictionary/zstd_parquet_2                   1.00    198.5±0.49ms  1330.4 MB/sec    1.00    199.4±0.17ms  1324.8 MB/sec
string_non_null/bloom_filter                       1.05   250.1±14.82ms     2.0 GB/sec    1.00    238.4±4.31ms     2.1 GB/sec
string_non_null/cdc                                1.01    266.1±8.91ms  1969.6 MB/sec    1.00    263.5±3.32ms  1988.9 MB/sec
string_non_null/default                            1.00   125.4±12.46ms     4.1 GB/sec    1.02    128.4±9.65ms     4.0 GB/sec
string_non_null/parquet_2                          1.05   139.7±11.29ms     3.7 GB/sec    1.00    132.7±0.46ms     3.9 GB/sec
string_non_null/zstd                               1.00    528.6±1.85ms   991.3 MB/sec    1.01    533.2±1.33ms   982.8 MB/sec
string_non_null/zstd_parquet_2                     1.00    504.8±2.15ms  1038.0 MB/sec    1.00    503.0±0.48ms  1041.7 MB/sec
struct_all_null/bloom_filter                       1.00      2.5±0.01ms     6.2 GB/sec    1.00      2.5±0.00ms     6.3 GB/sec
struct_all_null/cdc                                1.00      9.9±0.12ms  1633.5 MB/sec    1.01     10.0±0.11ms  1614.5 MB/sec
struct_all_null/default                            1.00      2.3±0.00ms     7.0 GB/sec    1.00      2.3±0.00ms     7.0 GB/sec
struct_all_null/parquet_2                          1.00      2.3±0.01ms     7.0 GB/sec    1.00      2.3±0.00ms     7.0 GB/sec
struct_all_null/zstd                               1.00      2.3±0.00ms     6.8 GB/sec    1.00      2.3±0.00ms     6.8 GB/sec
struct_all_null/zstd_parquet_2                     1.00      2.3±0.00ms     6.9 GB/sec    1.00      2.3±0.00ms     6.9 GB/sec
struct_non_null/bloom_filter                       1.00     45.9±0.18ms   348.8 MB/sec    1.05     48.3±0.23ms   331.4 MB/sec
struct_non_null/cdc                                1.00     45.1±0.17ms   354.6 MB/sec    1.02     45.9±0.27ms   348.3 MB/sec
struct_non_null/default                            1.00     31.8±0.17ms   503.2 MB/sec    1.03     32.7±0.14ms   488.8 MB/sec
struct_non_null/parquet_2                          1.00     40.6±0.49ms   394.5 MB/sec    1.03     41.6±0.11ms   384.7 MB/sec
struct_non_null/zstd                               1.00     40.6±0.15ms   394.2 MB/sec    1.02     41.5±0.15ms   385.6 MB/sec
struct_non_null/zstd_parquet_2                     1.00     54.5±0.13ms   293.7 MB/sec    1.02     55.3±0.17ms   289.1 MB/sec
struct_sparse_99pct_null/bloom_filter              1.01      7.4±0.02ms     2.1 GB/sec    1.00      7.4±0.05ms     2.1 GB/sec
struct_sparse_99pct_null/cdc                       1.07     15.3±0.08ms  1053.1 MB/sec    1.00     14.4±0.07ms  1121.7 MB/sec
struct_sparse_99pct_null/default                   1.01      6.9±0.05ms     2.3 GB/sec    1.00      6.9±0.06ms     2.3 GB/sec
struct_sparse_99pct_null/parquet_2                 1.00      6.9±0.03ms     2.3 GB/sec    1.00      6.9±0.04ms     2.3 GB/sec
struct_sparse_99pct_null/zstd                      1.00      8.3±0.01ms  1954.5 MB/sec    1.00      8.2±0.02ms  1963.5 MB/sec
struct_sparse_99pct_null/zstd_parquet_2            1.01      7.7±0.03ms     2.0 GB/sec    1.00      7.6±0.02ms     2.1 GB/sec

Resource Usage

base (merge-base)

| Metric | Value |
|---|---|
| Wall time | 1935.4s |
| Peak memory | 6.6 GiB |
| Avg memory | 6.4 GiB |
| CPU user | 1876.9s |
| CPU sys | 54.7s |
| Peak spill | 0 B |

branch

| Metric | Value |
|---|---|
| Wall time | 2075.5s |
| Peak memory | 6.8 GiB |
| Avg memory | 6.7 GiB |
| CPU user | 2028.4s |
| CPU sys | 44.8s |
| Peak spill | 0 B |


@adriangb

run benchmark arrow_writer

@adriangb adriangb force-pushed the parquet-page-size-mid-batch branch 2 times, most recently from 0fd6dcb to 24b83c7 on May 14, 2026 06:15
@adriangb

run benchmark arrow_writer

@adriangbot

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4448145440-71-5z4xr 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux

Comparing parquet-page-size-mid-batch (24b83c7) to 48fa8a7 (merge-base) diff
BENCH_NAME=arrow_writer
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_writer
BENCH_FILTER=
Results will be posted here when complete



@adriangbot

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

group                                              main                                   parquet-page-size-mid-batch
-----                                              ----                                   ---------------------------
bool/bloom_filter                                  1.02     13.2±0.11ms    18.9 MB/sec    1.00     13.0±0.08ms    19.2 MB/sec
bool/cdc                                           1.01     16.0±0.10ms    15.6 MB/sec    1.00     15.8±0.15ms    15.8 MB/sec
bool/default                                       1.02     11.2±0.09ms    22.4 MB/sec    1.00     10.9±0.04ms    22.9 MB/sec
bool/parquet_2                                     1.01     14.9±0.11ms    16.8 MB/sec    1.00     14.7±0.05ms    17.0 MB/sec
bool/zstd                                          1.02     11.6±0.07ms    21.5 MB/sec    1.00     11.5±0.06ms    21.8 MB/sec
bool/zstd_parquet_2                                1.02     15.3±0.10ms    16.4 MB/sec    1.00     15.1±0.07ms    16.6 MB/sec
bool_non_null/bloom_filter                         1.00      7.1±0.04ms    17.7 MB/sec    1.00      7.0±0.03ms    17.8 MB/sec
bool_non_null/cdc                                  1.01      7.0±0.08ms    18.0 MB/sec    1.00      6.9±0.10ms    18.1 MB/sec
bool_non_null/default                              1.00      4.3±0.03ms    29.1 MB/sec    1.00      4.3±0.02ms    29.1 MB/sec
bool_non_null/parquet_2                            1.00      9.1±0.04ms    13.7 MB/sec    1.00      9.1±0.04ms    13.8 MB/sec
bool_non_null/zstd                                 1.00      4.6±0.02ms    26.9 MB/sec    1.00      4.7±0.03ms    26.8 MB/sec
bool_non_null/zstd_parquet_2                       1.01      9.5±0.04ms    13.1 MB/sec    1.00      9.5±0.05ms    13.2 MB/sec
float_with_nans/bloom_filter                       1.00     94.5±1.08ms   148.1 MB/sec    1.01     95.7±2.49ms   146.3 MB/sec
float_with_nans/cdc                                1.00     83.0±0.82ms   168.8 MB/sec    1.01     83.7±1.57ms   167.2 MB/sec
float_with_nans/default                            1.00     75.5±1.07ms   185.5 MB/sec    1.00     75.3±1.01ms   185.9 MB/sec
float_with_nans/parquet_2                          1.00     97.2±0.72ms   144.1 MB/sec    1.00     97.4±1.82ms   143.7 MB/sec
float_with_nans/zstd                               1.00    113.2±1.37ms   123.7 MB/sec    1.01    114.0±1.22ms   122.8 MB/sec
float_with_nans/zstd_parquet_2                     1.00    133.3±1.99ms   105.0 MB/sec    1.02    135.3±1.48ms   103.4 MB/sec
large_string_non_null/bloom_filter                                                        1.00     84.8±1.96ms     2.9 GB/sec
large_string_non_null/cdc                                                                 1.00    243.8±1.68ms  1050.1 MB/sec
large_string_non_null/default                                                             1.00     64.6±1.09ms     3.9 GB/sec
large_string_non_null/parquet_2                                                           1.00     65.2±2.66ms     3.8 GB/sec
large_string_non_null/zstd                                                                1.00     63.7±2.72ms     3.9 GB/sec
large_string_non_null/zstd_parquet_2                                                      1.00     63.6±2.71ms     3.9 GB/sec
list_primitive/bloom_filter                        1.22   417.0±14.54ms  1308.0 MB/sec    1.00   340.4±11.36ms  1602.2 MB/sec
list_primitive/cdc                                 1.03   376.9±13.32ms  1447.0 MB/sec    1.00    364.3±4.07ms  1496.8 MB/sec
list_primitive/default                             1.28    324.3±7.20ms  1681.8 MB/sec    1.00    252.9±4.29ms     2.1 GB/sec
list_primitive/parquet_2                           1.24    336.6±2.67ms  1620.3 MB/sec    1.00    271.6±1.48ms  2008.1 MB/sec
list_primitive/zstd                                1.11    566.9±7.39ms   962.0 MB/sec    1.00    508.8±5.43ms  1072.0 MB/sec
list_primitive/zstd_parquet_2                      1.00    494.3±2.36ms  1103.3 MB/sec    1.01    497.0±2.60ms  1097.3 MB/sec
list_primitive_non_null/bloom_filter               1.11   496.8±24.03ms  1095.5 MB/sec    1.00   447.3±18.74ms  1216.7 MB/sec
list_primitive_non_null/cdc                        1.00    444.9±8.81ms  1223.3 MB/sec    1.00   445.4±13.12ms  1221.9 MB/sec
list_primitive_non_null/default                    1.15   344.7±21.96ms  1579.1 MB/sec    1.00    300.1±6.76ms  1813.4 MB/sec
list_primitive_non_null/parquet_2                  1.09    352.8±3.69ms  1542.5 MB/sec    1.00   322.6±22.73ms  1687.1 MB/sec
list_primitive_non_null/zstd                       1.05   765.1±24.38ms   711.3 MB/sec    1.00   730.8±20.34ms   744.7 MB/sec
list_primitive_non_null/zstd_parquet_2             1.00    695.9±3.50ms   782.1 MB/sec    1.01    703.1±7.88ms   774.1 MB/sec
list_primitive_sparse_99pct_null/bloom_filter      1.00     11.6±0.13ms     3.2 GB/sec    1.05     12.2±0.27ms     3.0 GB/sec
list_primitive_sparse_99pct_null/cdc               1.00     22.8±0.34ms  1637.4 MB/sec    1.03     23.6±0.18ms  1584.4 MB/sec
list_primitive_sparse_99pct_null/default           1.00     10.8±0.21ms     3.4 GB/sec    1.09     11.9±0.06ms     3.1 GB/sec
list_primitive_sparse_99pct_null/parquet_2         1.00     11.0±0.19ms     3.3 GB/sec    1.06     11.6±0.21ms     3.1 GB/sec
list_primitive_sparse_99pct_null/zstd              1.00     13.1±0.05ms     2.8 GB/sec    1.02     13.3±0.25ms     2.7 GB/sec
list_primitive_sparse_99pct_null/zstd_parquet_2    1.00     11.3±0.26ms     3.2 GB/sec    1.05     11.9±0.16ms     3.1 GB/sec
primitive/bloom_filter                             1.00    153.9±2.09ms   291.5 MB/sec    1.00    154.2±1.49ms   291.1 MB/sec
primitive/cdc                                      1.00    161.3±1.88ms   278.2 MB/sec    1.00    161.2±1.30ms   278.4 MB/sec
primitive/default                                  1.01    120.2±0.79ms   373.4 MB/sec    1.00    118.9±1.69ms   377.3 MB/sec
primitive/parquet_2                                1.00    134.5±1.65ms   333.7 MB/sec    1.01    135.2±1.74ms   332.0 MB/sec
primitive/zstd                                     1.01    150.0±0.77ms   299.2 MB/sec    1.00    149.1±1.93ms   301.0 MB/sec
primitive/zstd_parquet_2                           1.01    169.4±1.51ms   264.9 MB/sec    1.00    167.5±1.20ms   267.9 MB/sec
primitive_all_null/bloom_filter                    1.01     11.8±0.11ms     3.7 GB/sec    1.00     11.7±0.22ms     3.7 GB/sec
primitive_all_null/cdc                             1.01     30.8±0.48ms  1458.4 MB/sec    1.00     30.6±0.35ms  1468.6 MB/sec
primitive_all_null/default                         1.00     11.0±0.18ms     4.0 GB/sec    1.00     11.0±0.14ms     4.0 GB/sec
primitive_all_null/parquet_2                       1.01     11.0±0.18ms     4.0 GB/sec    1.00     10.9±0.10ms     4.0 GB/sec
primitive_all_null/zstd                            1.00     11.1±0.16ms     3.9 GB/sec    1.00     11.1±0.15ms     4.0 GB/sec
primitive_all_null/zstd_parquet_2                  1.00     11.0±0.20ms     4.0 GB/sec    1.01     11.1±0.19ms     3.9 GB/sec
primitive_non_null/bloom_filter                    1.06    115.9±2.51ms   379.8 MB/sec    1.00    109.3±2.72ms   402.4 MB/sec
primitive_non_null/cdc                             1.02     92.5±1.13ms   475.8 MB/sec    1.00     91.0±1.65ms   483.7 MB/sec
primitive_non_null/default                         1.00     69.2±0.52ms   635.6 MB/sec    1.00     69.2±1.37ms   636.1 MB/sec
primitive_non_null/parquet_2                       1.00     91.0±1.37ms   483.3 MB/sec    1.00     90.7±0.56ms   484.9 MB/sec
primitive_non_null/zstd                            1.07    106.1±1.85ms   414.7 MB/sec    1.00     99.5±1.06ms   442.1 MB/sec
primitive_non_null/zstd_parquet_2                  1.06    132.2±2.10ms   332.8 MB/sec    1.00    124.4±1.51ms   353.8 MB/sec
primitive_sparse_99pct_null/bloom_filter           1.00     18.6±0.64ms     2.4 GB/sec    1.05     19.6±0.16ms     2.2 GB/sec
primitive_sparse_99pct_null/cdc                    1.00     37.9±0.58ms  1185.4 MB/sec    1.00     37.8±0.30ms  1187.6 MB/sec
primitive_sparse_99pct_null/default                1.00     17.1±0.27ms     2.6 GB/sec    1.00     17.2±0.25ms     2.6 GB/sec
primitive_sparse_99pct_null/parquet_2              1.00     17.3±0.07ms     2.5 GB/sec    1.00     17.3±0.31ms     2.5 GB/sec
primitive_sparse_99pct_null/zstd                   1.00     20.4±0.26ms     2.2 GB/sec    1.00     20.4±0.28ms     2.1 GB/sec
primitive_sparse_99pct_null/zstd_parquet_2         1.01     19.3±0.10ms     2.3 GB/sec    1.00     19.2±0.32ms     2.3 GB/sec
short_string_non_null/bloom_filter                                                        1.00     28.2±0.27ms   425.8 MB/sec
short_string_non_null/cdc                                                                 1.00     20.3±0.24ms   590.7 MB/sec
short_string_non_null/default                                                             1.00     16.1±0.06ms   743.5 MB/sec
short_string_non_null/parquet_2                                                           1.00     25.7±0.05ms   466.2 MB/sec
short_string_non_null/zstd                                                                1.00     35.6±0.13ms   336.7 MB/sec
short_string_non_null/zstd_parquet_2                                                      1.00     28.5±0.09ms   421.1 MB/sec
string/bloom_filter                                1.05   241.4±29.36ms     2.1 GB/sec    1.00   229.5±20.33ms     2.2 GB/sec
string/cdc                                         1.01    226.3±8.62ms     2.3 GB/sec    1.00    224.0±7.48ms     2.3 GB/sec
string/default                                     1.22   149.0±25.96ms     3.4 GB/sec    1.00   121.8±12.81ms     4.2 GB/sec
string/parquet_2                                   1.05    128.5±2.18ms     4.0 GB/sec    1.00    121.8±2.90ms     4.2 GB/sec
string/zstd                                        1.01    430.1±4.75ms  1219.0 MB/sec    1.00    425.7±4.35ms  1231.6 MB/sec
string/zstd_parquet_2                              1.00    395.1±1.99ms  1326.7 MB/sec    1.01    399.0±1.66ms  1313.8 MB/sec
string_and_binary_view/bloom_filter                1.00     66.1±2.70ms   488.2 MB/sec    1.04     68.8±2.99ms   469.0 MB/sec
string_and_binary_view/cdc                         1.00     60.1±1.44ms   537.0 MB/sec    1.05     62.9±1.53ms   513.0 MB/sec
string_and_binary_view/default                     1.00     49.8±1.52ms   646.9 MB/sec    1.04     51.8±1.41ms   622.2 MB/sec
string_and_binary_view/parquet_2                   1.00     59.7±1.11ms   540.2 MB/sec    1.04     62.1±1.55ms   519.5 MB/sec
string_and_binary_view/zstd                        1.00     85.2±0.56ms   378.7 MB/sec    1.03     88.0±0.54ms   366.6 MB/sec
string_and_binary_view/zstd_parquet_2              1.00     74.0±0.88ms   436.0 MB/sec    1.04     76.9±1.08ms   419.5 MB/sec
string_dictionary/bloom_filter                     1.00    110.0±7.61ms     2.3 GB/sec    1.28    141.3±7.91ms  1868.8 MB/sec
string_dictionary/cdc                              1.00     78.8±2.29ms     3.3 GB/sec    1.30    102.3±3.80ms     2.5 GB/sec
string_dictionary/default                          1.00     65.6±3.25ms     3.9 GB/sec    1.39     90.9±3.71ms     2.8 GB/sec
string_dictionary/parquet_2                        1.00     55.2±0.59ms     4.7 GB/sec    1.85    102.1±1.87ms     2.5 GB/sec
string_dictionary/zstd                             1.00    222.4±3.69ms  1187.8 MB/sec    1.01   224.7±15.20ms  1175.5 MB/sec
string_dictionary/zstd_parquet_2                   1.00    198.5±1.33ms  1330.5 MB/sec    1.19    236.0±1.85ms  1119.4 MB/sec
string_non_null/bloom_filter                       1.00   254.5±21.18ms     2.0 GB/sec    1.00   255.3±15.69ms     2.0 GB/sec
string_non_null/cdc                                1.00   277.0±13.67ms  1891.6 MB/sec    1.02   281.6±12.38ms  1861.0 MB/sec
string_non_null/default                            1.09   149.0±12.15ms     3.4 GB/sec    1.00   136.7±14.33ms     3.7 GB/sec
string_non_null/parquet_2                          1.00    124.0±2.41ms     4.1 GB/sec    1.25    154.5±2.23ms     3.3 GB/sec
string_non_null/zstd                               1.00    565.8±9.78ms   926.1 MB/sec    1.04   591.1±33.18ms   886.5 MB/sec
string_non_null/zstd_parquet_2                     1.00    505.3±2.87ms  1037.0 MB/sec    1.04   523.1±10.81ms  1001.6 MB/sec
struct_all_null/bloom_filter                       1.01      2.6±0.02ms     6.2 GB/sec    1.00      2.5±0.03ms     6.2 GB/sec
struct_all_null/cdc                                1.01      9.8±0.13ms  1640.3 MB/sec    1.00      9.8±0.10ms  1648.9 MB/sec
struct_all_null/default                            1.00      2.3±0.00ms     7.0 GB/sec    1.00      2.3±0.00ms     7.0 GB/sec
struct_all_null/parquet_2                          1.00      2.3±0.00ms     7.0 GB/sec    1.00      2.3±0.00ms     7.0 GB/sec
struct_all_null/zstd                               1.00      2.3±0.00ms     6.8 GB/sec    1.00      2.3±0.00ms     6.8 GB/sec
struct_all_null/zstd_parquet_2                     1.00      2.3±0.00ms     6.9 GB/sec    1.00      2.3±0.00ms     6.9 GB/sec
struct_non_null/bloom_filter                       1.00     47.6±1.08ms   336.1 MB/sec    1.02     48.7±1.46ms   328.6 MB/sec
struct_non_null/cdc                                1.00     46.0±0.28ms   348.1 MB/sec    1.01     46.3±0.59ms   345.9 MB/sec
struct_non_null/default                            1.00     32.5±0.27ms   491.8 MB/sec    1.01     33.0±0.36ms   485.5 MB/sec
struct_non_null/parquet_2                          1.02     41.7±0.54ms   384.1 MB/sec    1.00     41.0±0.61ms   390.3 MB/sec
struct_non_null/zstd                               1.00     41.2±0.74ms   387.9 MB/sec    1.00     41.2±0.24ms   388.6 MB/sec
struct_non_null/zstd_parquet_2                     1.00     55.5±0.60ms   288.5 MB/sec    1.00     55.3±0.96ms   289.4 MB/sec
struct_sparse_99pct_null/bloom_filter              1.04      7.9±0.28ms  2047.0 MB/sec    1.00      7.6±0.29ms     2.1 GB/sec
struct_sparse_99pct_null/cdc                       1.00     15.9±0.12ms  1014.2 MB/sec    1.00     15.8±0.27ms  1018.4 MB/sec
struct_sparse_99pct_null/default                   1.03      7.3±0.09ms     2.2 GB/sec    1.00      7.1±0.17ms     2.2 GB/sec
struct_sparse_99pct_null/parquet_2                 1.00      7.2±0.20ms     2.2 GB/sec    1.01      7.2±0.16ms     2.2 GB/sec
struct_sparse_99pct_null/zstd                      1.00      8.4±0.18ms  1924.8 MB/sec    1.03      8.6±0.06ms  1868.8 MB/sec
struct_sparse_99pct_null/zstd_parquet_2            1.00      8.0±0.14ms  2015.1 MB/sec    1.00      8.0±0.18ms  2013.8 MB/sec
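
A note on reading the table above: the leading number in each column appears to be that side's time divided by the faster of the two sides, so the faster side shows 1.00 and the other shows its slowdown factor. The sketch below reproduces that arithmetic from the displayed mean timings. The function name `relative_ratios` is hypothetical and not part of the benchmark runner, and because the published ratios are presumably computed from unrounded timings, recomputing from the rounded values shown here may differ in the last digit.

```rust
/// Relative cost of each side versus the faster side, rounded to two decimals.
/// Returns (base_ratio, branch_ratio); the faster side is always 1.00.
/// Hypothetical helper for illustration only.
fn relative_ratios(base_ms: f64, branch_ms: f64) -> (f64, f64) {
    // The faster (smaller) mean time is the reference for both columns.
    let fastest = base_ms.min(branch_ms);
    // Round to two decimals, matching the precision shown in the table.
    let round2 = |x: f64| (x * 100.0).round() / 100.0;
    (round2(base_ms / fastest), round2(branch_ms / fastest))
}
```

For example, the `string_dictionary/parquet_2` row (55.2 ms vs 102.1 ms) gives `(1.0, 1.85)`, and the `list_primitive/default` row (324.3 ms vs 252.9 ms) gives `(1.28, 1.0)`, which matches the ratios printed in those rows.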

Resource Usage

Metric         base (merge-base)    branch
------         -----------------    ------
Wall time      2015.4s              2115.5s
Peak memory    6.2 GiB              6.5 GiB
Avg memory     6.0 GiB              6.3 GiB
CPU user       1897.3s              2032.5s
CPU sys        113.7s               80.9s
Peak spill     0 B                  0 B

@adriangb
Contributor Author

run benchmark arrow_writer

@adriangbot

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4451856049-90-g2tzh 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing parquet-page-size-mid-batch (24b83c7) to 48fa8a7 (merge-base) diff
BENCH_NAME=arrow_writer
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_writer
BENCH_FILTER=
Results will be posted here when complete


@adriangb
Contributor Author

run benchmark arrow_writer

@adriangbot

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4451897215-91-nvnwl 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux

Comparing parquet-page-size-mid-batch (70dc497) to 48fa8a7 (merge-base) diff
BENCH_NAME=arrow_writer
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_writer
BENCH_FILTER=
Results will be posted here when complete


@adriangbot

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                                              main                                   parquet-page-size-mid-batch
-----                                              ----                                   ---------------------------
bool/bloom_filter                                  1.02     13.3±0.09ms    18.8 MB/sec    1.00     13.1±0.05ms    19.1 MB/sec
bool/cdc                                           1.01     15.9±0.08ms    15.7 MB/sec    1.00     15.9±0.05ms    15.8 MB/sec
bool/default                                       1.02     11.1±0.07ms    22.4 MB/sec    1.00     10.9±0.04ms    22.9 MB/sec
bool/parquet_2                                     1.01     14.9±0.10ms    16.8 MB/sec    1.00     14.7±0.05ms    17.0 MB/sec
bool/zstd                                          1.02     11.7±0.07ms    21.5 MB/sec    1.00     11.4±0.04ms    21.9 MB/sec
bool/zstd_parquet_2                                1.01     15.3±0.07ms    16.4 MB/sec    1.00     15.1±0.05ms    16.6 MB/sec
bool_non_null/bloom_filter                         1.00      7.1±0.02ms    17.7 MB/sec    1.00      7.1±0.02ms    17.7 MB/sec
bool_non_null/cdc                                  1.00      6.9±0.03ms    18.1 MB/sec    1.01      7.0±0.02ms    18.0 MB/sec
bool_non_null/default                              1.00      4.3±0.02ms    29.3 MB/sec    1.01      4.3±0.02ms    29.1 MB/sec
bool_non_null/parquet_2                            1.01      9.1±0.04ms    13.7 MB/sec    1.00      9.0±0.03ms    13.8 MB/sec
bool_non_null/zstd                                 1.00      4.6±0.02ms    27.0 MB/sec    1.02      4.7±0.08ms    26.5 MB/sec
bool_non_null/zstd_parquet_2                       1.00      9.5±0.02ms    13.1 MB/sec    1.00      9.5±0.04ms    13.2 MB/sec
float_with_nans/bloom_filter                       1.00     95.0±0.41ms   147.3 MB/sec    1.01     95.8±0.62ms   146.2 MB/sec
float_with_nans/cdc                                1.02     84.7±2.06ms   165.4 MB/sec    1.00     83.3±0.31ms   168.2 MB/sec
float_with_nans/default                            1.00     74.8±0.24ms   187.2 MB/sec    1.01     75.3±0.28ms   185.9 MB/sec
float_with_nans/parquet_2                          1.00     96.5±0.41ms   145.1 MB/sec    1.00     96.9±0.54ms   144.5 MB/sec
float_with_nans/zstd                               1.00    113.0±0.29ms   123.9 MB/sec    1.00    113.3±0.32ms   123.5 MB/sec
float_with_nans/zstd_parquet_2                     1.00    133.4±0.51ms   105.0 MB/sec    1.01    134.2±0.49ms   104.3 MB/sec
large_string_non_null/bloom_filter                                                        1.00     80.9±0.27ms     3.1 GB/sec
large_string_non_null/cdc                                                                 1.00    244.3±2.51ms  1048.1 MB/sec
large_string_non_null/default                                                             1.00     62.0±0.20ms     4.0 GB/sec
large_string_non_null/parquet_2                                                           1.00     61.8±0.21ms     4.0 GB/sec
large_string_non_null/zstd                                                                1.00     61.9±0.20ms     4.0 GB/sec
large_string_non_null/zstd_parquet_2                                                      1.00     61.8±0.22ms     4.0 GB/sec
list_primitive/bloom_filter                        1.00    331.7±2.11ms  1644.3 MB/sec    1.03    341.7±2.31ms  1596.0 MB/sec
list_primitive/cdc                                 1.00    360.2±1.78ms  1514.3 MB/sec    1.00    361.7±1.29ms  1507.9 MB/sec
list_primitive/default                             1.00    251.7±1.44ms     2.1 GB/sec    1.01    255.2±1.27ms     2.1 GB/sec
list_primitive/parquet_2                           1.00    269.9±0.95ms  2020.5 MB/sec    1.01    271.9±0.47ms  2005.8 MB/sec
list_primitive/zstd                                1.00    503.3±2.86ms  1083.6 MB/sec    1.01    509.6±6.03ms  1070.3 MB/sec
list_primitive/zstd_parquet_2                      1.00    493.2±0.54ms  1105.8 MB/sec    1.00    495.2±0.60ms  1101.4 MB/sec
list_primitive_non_null/bloom_filter               1.00   435.7±11.23ms  1249.1 MB/sec    1.01    439.1±9.47ms  1239.6 MB/sec
list_primitive_non_null/cdc                        1.01   443.8±10.63ms  1226.3 MB/sec    1.00   441.2±10.64ms  1233.5 MB/sec
list_primitive_non_null/default                    1.00    287.6±3.39ms  1892.4 MB/sec    1.05    300.7±4.21ms  1810.1 MB/sec
list_primitive_non_null/parquet_2                  1.00    318.2±0.89ms  1710.6 MB/sec    1.00   318.5±21.42ms  1708.6 MB/sec
list_primitive_non_null/zstd                       1.00    713.5±7.79ms   762.8 MB/sec    1.02   725.9±18.34ms   749.8 MB/sec
list_primitive_non_null/zstd_parquet_2             1.00    679.9±0.90ms   800.4 MB/sec    1.03    700.5±9.97ms   777.0 MB/sec
list_primitive_sparse_99pct_null/bloom_filter      1.00     11.6±0.05ms     3.2 GB/sec    1.05     12.1±0.07ms     3.0 GB/sec
list_primitive_sparse_99pct_null/cdc               1.00     23.1±0.05ms  1616.8 MB/sec    1.02     23.7±0.07ms  1578.7 MB/sec
list_primitive_sparse_99pct_null/default           1.00     11.3±0.29ms     3.2 GB/sec    1.04     11.8±0.03ms     3.1 GB/sec
list_primitive_sparse_99pct_null/parquet_2         1.00     11.1±0.04ms     3.3 GB/sec    1.06     11.8±0.04ms     3.1 GB/sec
list_primitive_sparse_99pct_null/zstd              1.00     13.0±0.03ms     2.8 GB/sec    1.05     13.7±0.04ms     2.7 GB/sec
list_primitive_sparse_99pct_null/zstd_parquet_2    1.00     11.3±0.04ms     3.2 GB/sec    1.06     12.0±0.04ms     3.0 GB/sec
primitive/bloom_filter                             1.01    155.5±5.71ms   288.5 MB/sec    1.00    153.7±0.79ms   292.0 MB/sec
primitive/cdc                                      1.00    161.6±0.56ms   277.6 MB/sec    1.00    161.2±0.64ms   278.4 MB/sec
primitive/default                                  1.01    121.2±1.14ms   370.3 MB/sec    1.00    119.8±0.50ms   374.7 MB/sec
primitive/parquet_2                                1.00    135.0±0.50ms   332.5 MB/sec    1.00    135.2±0.48ms   331.8 MB/sec
primitive/zstd                                     1.01    149.6±0.53ms   299.9 MB/sec    1.00    148.7±0.41ms   301.7 MB/sec
primitive/zstd_parquet_2                           1.01    168.6±0.51ms   266.2 MB/sec    1.00    167.7±0.39ms   267.5 MB/sec
primitive_all_null/bloom_filter                    1.00     11.8±0.07ms     3.7 GB/sec    1.01     11.9±0.26ms     3.7 GB/sec
primitive_all_null/cdc                             1.00     30.7±0.42ms  1461.3 MB/sec    1.00     30.8±0.41ms  1458.2 MB/sec
primitive_all_null/default                         1.00     10.9±0.10ms     4.0 GB/sec    1.00     11.0±0.07ms     4.0 GB/sec
primitive_all_null/parquet_2                       1.00     11.0±0.17ms     4.0 GB/sec    1.00     11.0±0.17ms     4.0 GB/sec
primitive_all_null/zstd                            1.00     11.1±0.19ms     3.9 GB/sec    1.00     11.1±0.14ms     3.9 GB/sec
primitive_all_null/zstd_parquet_2                  1.00     11.0±0.10ms     4.0 GB/sec    1.01     11.1±0.20ms     3.9 GB/sec
primitive_non_null/bloom_filter                    1.04    114.2±1.46ms   385.2 MB/sec    1.00    109.4±0.48ms   402.4 MB/sec
primitive_non_null/cdc                             1.00     91.1±0.54ms   482.9 MB/sec    1.01     91.6±0.24ms   480.3 MB/sec
primitive_non_null/default                         1.00     68.5±0.26ms   642.5 MB/sec    1.01     69.3±0.33ms   634.8 MB/sec
primitive_non_null/parquet_2                       1.00     91.3±0.51ms   482.0 MB/sec    1.00     91.0±0.33ms   483.3 MB/sec
primitive_non_null/zstd                            1.07    106.7±0.26ms   412.4 MB/sec    1.00     99.9±0.28ms   440.5 MB/sec
primitive_non_null/zstd_parquet_2                  1.06    131.5±1.88ms   334.5 MB/sec    1.00    124.3±0.38ms   353.9 MB/sec
primitive_sparse_99pct_null/bloom_filter           1.00     19.1±0.08ms     2.3 GB/sec    1.01     19.3±0.07ms     2.3 GB/sec
primitive_sparse_99pct_null/cdc                    1.00     37.7±0.28ms  1189.3 MB/sec    1.01     37.9±0.31ms  1183.2 MB/sec
primitive_sparse_99pct_null/default                1.00     17.2±0.04ms     2.5 GB/sec    1.00     17.3±0.04ms     2.5 GB/sec
primitive_sparse_99pct_null/parquet_2              1.00     17.2±0.04ms     2.5 GB/sec    1.00     17.3±0.03ms     2.5 GB/sec
primitive_sparse_99pct_null/zstd                   1.00     20.5±0.06ms     2.1 GB/sec    1.01     20.6±0.04ms     2.1 GB/sec
primitive_sparse_99pct_null/zstd_parquet_2         1.00     19.1±0.04ms     2.3 GB/sec    1.01     19.2±0.05ms     2.3 GB/sec
short_string_non_null/bloom_filter                                                        1.00     28.2±0.09ms   425.5 MB/sec
short_string_non_null/cdc                                                                 1.00     20.4±0.05ms   587.2 MB/sec
short_string_non_null/default                                                             1.00     16.1±0.05ms   744.8 MB/sec
short_string_non_null/parquet_2                                                           1.00     25.8±0.09ms   464.8 MB/sec
short_string_non_null/zstd                                                                1.00     35.8±0.10ms   335.6 MB/sec
short_string_non_null/zstd_parquet_2                                                      1.00     28.7±0.05ms   418.3 MB/sec
string/bloom_filter                                1.05   238.5±27.20ms     2.1 GB/sec    1.00   226.3±17.74ms     2.3 GB/sec
string/cdc                                         1.00    223.2±8.00ms     2.3 GB/sec    1.00    223.1±5.65ms     2.3 GB/sec
string/default                                     1.20   147.1±29.69ms     3.5 GB/sec    1.00   122.2±12.93ms     4.2 GB/sec
string/parquet_2                                   1.05    127.3±0.36ms     4.0 GB/sec    1.00    121.8±0.70ms     4.2 GB/sec
string/zstd                                        1.01    428.5±3.10ms  1223.6 MB/sec    1.00    423.7±1.44ms  1237.3 MB/sec
string/zstd_parquet_2                              1.00    396.2±1.24ms  1323.1 MB/sec    1.01    399.1±0.90ms  1313.7 MB/sec
string_and_binary_view/bloom_filter                1.00     65.9±0.55ms   489.5 MB/sec    1.07     70.4±0.76ms   458.4 MB/sec
string_and_binary_view/cdc                         1.00     59.2±0.33ms   544.4 MB/sec    1.05     62.4±0.26ms   516.8 MB/sec
string_and_binary_view/default                     1.00     48.6±0.22ms   664.2 MB/sec    1.07     52.0±0.23ms   620.7 MB/sec
string_and_binary_view/parquet_2                   1.00     59.3±0.27ms   543.4 MB/sec    1.06     62.8±0.34ms   513.3 MB/sec
string_and_binary_view/zstd                        1.00     85.0±0.28ms   379.2 MB/sec    1.04     88.3±0.22ms   365.4 MB/sec
string_and_binary_view/zstd_parquet_2              1.00     73.3±0.29ms   440.0 MB/sec    1.05     76.8±0.23ms   419.8 MB/sec
string_dictionary/bloom_filter                     1.00    109.2±1.30ms     2.4 GB/sec    1.33    145.2±2.49ms  1818.7 MB/sec
string_dictionary/cdc                              1.00     81.6±5.02ms     3.2 GB/sec    1.27    103.3±2.71ms     2.5 GB/sec
string_dictionary/default                          1.00     64.1±1.61ms     4.0 GB/sec    1.48     94.7±1.24ms     2.7 GB/sec
string_dictionary/parquet_2                        1.00     67.1±0.25ms     3.8 GB/sec    1.56    104.9±0.45ms     2.5 GB/sec
string_dictionary/zstd                             1.00    217.7±2.62ms  1213.5 MB/sec    1.04   225.3±15.26ms  1172.3 MB/sec
string_dictionary/zstd_parquet_2                   1.00    199.2±0.26ms  1326.0 MB/sec    1.20    238.4±0.39ms  1107.9 MB/sec
string_non_null/bloom_filter                       1.00   256.9±15.35ms  2039.7 MB/sec    1.05   269.0±13.29ms  1948.2 MB/sec
string_non_null/cdc                                1.00    258.5±1.05ms  2026.8 MB/sec    1.10   284.5±12.16ms  1841.6 MB/sec
string_non_null/default                            1.01   138.3±15.37ms     3.7 GB/sec    1.00   137.4±12.80ms     3.7 GB/sec
string_non_null/parquet_2                          1.00    151.5±0.40ms     3.4 GB/sec    1.04    157.7±0.75ms     3.2 GB/sec
string_non_null/zstd                               1.01   602.1±23.07ms   870.2 MB/sec    1.00   598.9±35.12ms   874.9 MB/sec
string_non_null/zstd_parquet_2                     1.01   528.8±12.56ms   990.9 MB/sec    1.00   525.6±11.06ms   996.9 MB/sec
struct_all_null/bloom_filter                       1.00      2.5±0.00ms     6.2 GB/sec    1.00      2.5±0.01ms     6.2 GB/sec
struct_all_null/cdc                                1.00      9.8±0.13ms  1639.3 MB/sec    1.00      9.8±0.12ms  1641.0 MB/sec
struct_all_null/default                            1.00      2.3±0.00ms     7.0 GB/sec    1.00      2.3±0.00ms     7.0 GB/sec
struct_all_null/parquet_2                          1.00      2.3±0.00ms     7.0 GB/sec    1.00      2.3±0.00ms     7.0 GB/sec
struct_all_null/zstd                               1.00      2.3±0.00ms     6.8 GB/sec    1.00      2.3±0.00ms     6.8 GB/sec
struct_all_null/zstd_parquet_2                     1.00      2.3±0.00ms     6.9 GB/sec    1.00      2.3±0.00ms     6.9 GB/sec
struct_non_null/bloom_filter                       1.00     46.3±0.19ms   345.3 MB/sec    1.07     49.5±1.67ms   323.1 MB/sec
struct_non_null/cdc                                1.00     45.7±0.18ms   349.9 MB/sec    1.01     46.2±0.18ms   346.5 MB/sec
struct_non_null/default                            1.00     32.2±0.11ms   496.3 MB/sec    1.02     32.8±0.12ms   487.7 MB/sec
struct_non_null/parquet_2                          1.00     41.1±0.18ms   389.3 MB/sec    1.01     41.4±0.11ms   386.1 MB/sec
struct_non_null/zstd                               1.00     41.2±0.14ms   388.4 MB/sec    1.01     41.4±0.09ms   386.0 MB/sec
struct_non_null/zstd_parquet_2                     1.00     55.1±0.23ms   290.5 MB/sec    1.00     55.3±0.15ms   289.6 MB/sec
struct_sparse_99pct_null/bloom_filter              1.00      7.7±0.07ms     2.0 GB/sec    1.04      8.0±0.08ms  2019.0 MB/sec
struct_sparse_99pct_null/cdc                       1.00     15.9±0.13ms  1014.7 MB/sec    1.00     15.9±0.10ms  1014.9 MB/sec
struct_sparse_99pct_null/default                   1.00      7.2±0.03ms     2.2 GB/sec    1.00      7.3±0.03ms     2.2 GB/sec
struct_sparse_99pct_null/parquet_2                 1.00      7.3±0.04ms     2.2 GB/sec    1.01      7.3±0.03ms     2.2 GB/sec
struct_sparse_99pct_null/zstd                      1.00      8.6±0.05ms  1869.8 MB/sec    1.00      8.6±0.05ms  1877.6 MB/sec
struct_sparse_99pct_null/zstd_parquet_2            1.01      8.1±0.68ms  1988.4 MB/sec    1.00      8.0±0.03ms  2004.7 MB/sec

Resource Usage

base (merge-base)

| Metric | Value |
| --- | --- |
| Wall time | 1965.4s |
| Peak memory | 6.6 GiB |
| Avg memory | 6.4 GiB |
| CPU user | 1892.3s |
| CPU sys | 70.4s |
| Peak spill | 0 B |

branch

| Metric | Value |
| --- | --- |
| Wall time | 2115.5s |
| Peak memory | 6.8 GiB |
| Avg memory | 6.6 GiB |
| CPU user | 2029.6s |
| CPU sys | 83.4s |
| Peak spill | 0 B |

File an issue against this benchmark runner

@adriangbot

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected
Details

group                                              main                                   parquet-page-size-mid-batch
-----                                              ----                                   ---------------------------
bool/bloom_filter                                  1.00     13.2±0.10ms    19.0 MB/sec    1.00     13.1±0.12ms    19.1 MB/sec
bool/cdc                                           1.00     15.9±0.11ms    15.8 MB/sec    1.00     15.8±0.11ms    15.8 MB/sec
bool/default                                       1.00     11.0±0.08ms    22.7 MB/sec    1.00     11.0±0.11ms    22.7 MB/sec
bool/parquet_2                                     1.00     14.9±0.08ms    16.8 MB/sec    1.00     14.9±0.11ms    16.8 MB/sec
bool/zstd                                          1.00     11.5±0.11ms    21.7 MB/sec    1.00     11.5±0.11ms    21.7 MB/sec
bool/zstd_parquet_2                                1.00     15.2±0.10ms    16.4 MB/sec    1.00     15.2±0.12ms    16.4 MB/sec
bool_non_null/bloom_filter                         1.01      7.1±0.05ms    17.7 MB/sec    1.00      7.0±0.03ms    17.8 MB/sec
bool_non_null/cdc                                  1.01      6.9±0.05ms    18.2 MB/sec    1.00      6.8±0.04ms    18.4 MB/sec
bool_non_null/default                              1.00      4.3±0.02ms    29.2 MB/sec    1.00      4.3±0.02ms    29.4 MB/sec
bool_non_null/parquet_2                            1.00      9.0±0.03ms    13.8 MB/sec    1.01      9.1±0.03ms    13.8 MB/sec
bool_non_null/zstd                                 1.01      4.6±0.04ms    27.0 MB/sec    1.00      4.6±0.02ms    27.1 MB/sec
bool_non_null/zstd_parquet_2                       1.00      9.5±0.05ms    13.2 MB/sec    1.00      9.5±0.04ms    13.1 MB/sec
float_with_nans/bloom_filter                       1.00     93.6±0.60ms   149.5 MB/sec    1.00     93.7±0.37ms   149.4 MB/sec
float_with_nans/cdc                                1.00     81.9±0.47ms   171.0 MB/sec    1.00     81.8±0.18ms   171.1 MB/sec
float_with_nans/default                            1.00     74.6±0.31ms   187.8 MB/sec    1.00     74.4±0.19ms   188.1 MB/sec
float_with_nans/parquet_2                          1.00     95.5±0.77ms   146.6 MB/sec    1.00     95.2±0.27ms   147.1 MB/sec
float_with_nans/zstd                               1.00    112.2±0.36ms   124.7 MB/sec    1.00    112.2±0.20ms   124.8 MB/sec
float_with_nans/zstd_parquet_2                     1.00    132.2±0.73ms   105.9 MB/sec    1.00    132.6±0.42ms   105.6 MB/sec
large_string_non_null/bloom_filter                                                        1.00     81.4±0.25ms     3.1 GB/sec
large_string_non_null/cdc                                                                 1.00    242.9±1.01ms  1053.9 MB/sec
large_string_non_null/default                                                             1.00     62.7±0.16ms     4.0 GB/sec
large_string_non_null/parquet_2                                                           1.00     62.6±0.20ms     4.0 GB/sec
large_string_non_null/zstd                                                                1.00     62.6±0.25ms     4.0 GB/sec
large_string_non_null/zstd_parquet_2                                                      1.00     62.7±0.23ms     4.0 GB/sec
list_primitive/bloom_filter                        1.00    326.1±2.39ms  1672.5 MB/sec    1.01    330.5±0.96ms  1650.2 MB/sec
list_primitive/cdc                                 1.00    358.5±1.37ms  1521.0 MB/sec    1.00    358.3±0.87ms  1522.2 MB/sec
list_primitive/default                             1.00    247.5±1.16ms     2.2 GB/sec    1.01    250.2±2.22ms     2.1 GB/sec
list_primitive/parquet_2                           1.00    267.9±0.57ms  2036.0 MB/sec    1.01    270.1±0.72ms  2019.5 MB/sec
list_primitive/zstd                                1.01    499.4±1.63ms  1092.1 MB/sec    1.00    494.7±1.32ms  1102.5 MB/sec
list_primitive/zstd_parquet_2                      1.00    491.4±0.65ms  1109.9 MB/sec    1.00    493.1±0.33ms  1106.0 MB/sec
list_primitive_non_null/bloom_filter               1.00    433.6±5.40ms  1255.1 MB/sec    1.00    434.0±8.63ms  1254.0 MB/sec
list_primitive_non_null/cdc                        1.00    441.9±8.75ms  1231.5 MB/sec    1.00   442.5±19.30ms  1229.8 MB/sec
list_primitive_non_null/default                    1.00    293.6±4.39ms  1853.7 MB/sec    1.04    304.1±7.62ms  1789.9 MB/sec
list_primitive_non_null/parquet_2                  1.01   311.0±13.65ms  1750.2 MB/sec    1.00   307.5±25.04ms  1769.6 MB/sec
list_primitive_non_null/zstd                       1.00    717.4±8.92ms   758.6 MB/sec    1.00   719.1±27.93ms   756.8 MB/sec
list_primitive_non_null/zstd_parquet_2             1.00    671.4±0.87ms   810.6 MB/sec    1.03    690.7±1.03ms   787.9 MB/sec
list_primitive_sparse_99pct_null/bloom_filter      1.00     11.2±0.15ms     3.3 GB/sec    1.09     12.3±0.10ms     3.0 GB/sec
list_primitive_sparse_99pct_null/cdc               1.00     22.8±0.15ms  1638.9 MB/sec    1.03     23.5±0.10ms  1592.1 MB/sec
list_primitive_sparse_99pct_null/default           1.00     10.9±0.12ms     3.3 GB/sec    1.08     11.8±0.12ms     3.1 GB/sec
list_primitive_sparse_99pct_null/parquet_2         1.00     10.9±0.12ms     3.3 GB/sec    1.09     11.9±0.07ms     3.1 GB/sec
list_primitive_sparse_99pct_null/zstd              1.00     12.9±0.10ms     2.8 GB/sec    1.05     13.6±0.07ms     2.7 GB/sec
list_primitive_sparse_99pct_null/zstd_parquet_2    1.00     11.0±0.13ms     3.3 GB/sec    1.08     11.9±0.09ms     3.1 GB/sec
primitive/bloom_filter                             1.01    152.5±0.81ms   294.2 MB/sec    1.00    151.5±1.06ms   296.3 MB/sec
primitive/cdc                                      1.01    161.3±0.76ms   278.2 MB/sec    1.00    160.4±0.80ms   279.8 MB/sec
primitive/default                                  1.00    119.6±0.69ms   375.3 MB/sec    1.00    119.0±0.83ms   377.2 MB/sec
primitive/parquet_2                                1.00    134.3±0.66ms   334.0 MB/sec    1.00    134.8±0.72ms   333.0 MB/sec
primitive/zstd                                     1.00    149.1±0.93ms   300.9 MB/sec    1.00    148.6±0.60ms   301.9 MB/sec
primitive/zstd_parquet_2                           1.00    167.7±0.76ms   267.6 MB/sec    1.00    167.5±0.65ms   267.9 MB/sec
primitive_all_null/bloom_filter                    1.00     11.6±0.18ms     3.8 GB/sec    1.00     11.6±0.14ms     3.8 GB/sec
primitive_all_null/cdc                             1.01     30.8±0.42ms  1458.6 MB/sec    1.00     30.6±0.40ms  1466.5 MB/sec
primitive_all_null/default                         1.00     10.9±0.13ms     4.0 GB/sec    1.01     11.0±0.17ms     4.0 GB/sec
primitive_all_null/parquet_2                       1.00     10.9±0.20ms     4.0 GB/sec    1.00     11.0±0.21ms     4.0 GB/sec
primitive_all_null/zstd                            1.00     11.0±0.16ms     4.0 GB/sec    1.01     11.1±0.19ms     3.9 GB/sec
primitive_all_null/zstd_parquet_2                  1.00     11.1±0.23ms     4.0 GB/sec    1.00     11.1±0.23ms     3.9 GB/sec
primitive_non_null/bloom_filter                    1.08    116.5±1.55ms   377.6 MB/sec    1.00    107.9±0.66ms   408.0 MB/sec
primitive_non_null/cdc                             1.00     90.9±0.67ms   484.0 MB/sec    1.00     90.7±0.51ms   485.0 MB/sec
primitive_non_null/default                         1.00     68.2±0.23ms   645.1 MB/sec    1.00     68.3±0.29ms   644.2 MB/sec
primitive_non_null/parquet_2                       1.00     90.0±0.38ms   488.7 MB/sec    1.00     89.6±0.36ms   491.1 MB/sec
primitive_non_null/zstd                            1.07    105.8±0.53ms   415.7 MB/sec    1.00     98.9±0.33ms   445.1 MB/sec
primitive_non_null/zstd_parquet_2                  1.06    130.8±1.86ms   336.3 MB/sec    1.00    123.3±0.36ms   357.0 MB/sec
primitive_sparse_99pct_null/bloom_filter           1.00     18.9±0.22ms     2.3 GB/sec    1.00     18.9±0.17ms     2.3 GB/sec
primitive_sparse_99pct_null/cdc                    1.00     37.6±0.32ms  1194.9 MB/sec    1.00     37.6±0.23ms  1194.8 MB/sec
primitive_sparse_99pct_null/default                1.00     16.9±0.07ms     2.6 GB/sec    1.01     17.1±0.06ms     2.6 GB/sec
primitive_sparse_99pct_null/parquet_2              1.00     17.0±0.07ms     2.6 GB/sec    1.00     17.0±0.09ms     2.6 GB/sec
primitive_sparse_99pct_null/zstd                   1.00     20.2±0.07ms     2.2 GB/sec    1.01     20.3±0.10ms     2.2 GB/sec
primitive_sparse_99pct_null/zstd_parquet_2         1.00     18.9±0.07ms     2.3 GB/sec    1.01     19.0±0.08ms     2.3 GB/sec
short_string_non_null/bloom_filter                                                        1.00     28.2±0.08ms   425.5 MB/sec
short_string_non_null/cdc                                                                 1.00     20.2±0.07ms   594.8 MB/sec
short_string_non_null/default                                                             1.00     15.9±0.10ms   752.7 MB/sec
short_string_non_null/parquet_2                                                           1.00     25.6±0.06ms   469.3 MB/sec
short_string_non_null/zstd                                                                1.00     36.0±0.10ms   333.5 MB/sec
short_string_non_null/zstd_parquet_2                                                      1.00     28.5±0.07ms   421.1 MB/sec
string/bloom_filter                                1.08   231.7±25.73ms     2.2 GB/sec    1.00   213.8±15.26ms     2.4 GB/sec
string/cdc                                         1.00    221.6±6.09ms     2.3 GB/sec    1.00    221.2±5.30ms     2.3 GB/sec
string/default                                     1.20   143.8±25.23ms     3.6 GB/sec    1.00   119.3±12.07ms     4.3 GB/sec
string/parquet_2                                   1.05    125.4±1.02ms     4.1 GB/sec    1.00    119.2±1.11ms     4.3 GB/sec
string/zstd                                        1.01    426.4±2.98ms  1229.6 MB/sec    1.00    420.4±1.72ms  1247.1 MB/sec
string/zstd_parquet_2                              1.00    394.0±0.73ms  1330.7 MB/sec    1.01    397.8±0.72ms  1317.9 MB/sec
string_and_binary_view/bloom_filter                1.00     66.2±0.68ms   486.9 MB/sec    1.02     67.8±0.21ms   475.8 MB/sec
string_and_binary_view/cdc                         1.00     59.0±0.28ms   546.6 MB/sec    1.05     61.7±0.13ms   522.6 MB/sec
string_and_binary_view/default                     1.00     48.6±0.34ms   664.1 MB/sec    1.05     50.9±0.14ms   633.0 MB/sec
string_and_binary_view/parquet_2                   1.00     59.3±0.43ms   544.3 MB/sec    1.04     61.7±0.13ms   522.3 MB/sec
string_and_binary_view/zstd                        1.00     85.1±0.44ms   378.8 MB/sec    1.03     87.6±0.11ms   368.1 MB/sec
string_and_binary_view/zstd_parquet_2              1.00     73.2±0.27ms   440.9 MB/sec    1.03     75.6±0.14ms   426.3 MB/sec
string_dictionary/bloom_filter                     1.00     91.1±1.88ms     2.8 GB/sec    1.51    137.4±1.18ms  1922.0 MB/sec
string_dictionary/cdc                              1.00     86.0±1.23ms     3.0 GB/sec    1.16     99.8±2.45ms     2.6 GB/sec
string_dictionary/default                          1.00     49.3±0.58ms     5.2 GB/sec    1.86     91.8±1.06ms     2.8 GB/sec
string_dictionary/parquet_2                        1.00     54.3±0.47ms     4.7 GB/sec    1.89    102.6±0.37ms     2.5 GB/sec
string_dictionary/zstd                             1.00    211.1±1.33ms  1251.3 MB/sec    1.06   222.9±14.78ms  1185.0 MB/sec
string_dictionary/zstd_parquet_2                   1.00    198.4±0.36ms  1331.0 MB/sec    1.19    236.2±0.36ms  1118.3 MB/sec
string_non_null/bloom_filter                       1.01   256.4±16.13ms  2043.6 MB/sec    1.00   254.1±12.29ms     2.0 GB/sec
string_non_null/cdc                                1.00    269.9±9.26ms  1941.5 MB/sec    1.03   279.0±11.17ms  1878.1 MB/sec
string_non_null/default                            1.00   128.9±13.26ms     4.0 GB/sec    1.03   133.0±12.02ms     3.8 GB/sec
string_non_null/parquet_2                          1.00   141.2±11.53ms     3.6 GB/sec    1.09    154.6±0.46ms     3.3 GB/sec
string_non_null/zstd                               1.00    533.5±2.20ms   982.2 MB/sec    1.10   586.3±34.00ms   893.7 MB/sec
string_non_null/zstd_parquet_2                     1.00    506.4±2.28ms  1034.7 MB/sec    1.03   520.9±10.85ms  1005.9 MB/sec
struct_all_null/bloom_filter                       1.01      2.5±0.01ms     6.2 GB/sec    1.00      2.5±0.00ms     6.2 GB/sec
struct_all_null/cdc                                1.00      9.8±0.13ms  1640.5 MB/sec    1.00      9.8±0.16ms  1638.0 MB/sec
struct_all_null/default                            1.00      2.3±0.00ms     7.0 GB/sec    1.00      2.3±0.00ms     7.0 GB/sec
struct_all_null/parquet_2                          1.00      2.3±0.00ms     7.0 GB/sec    1.00      2.3±0.00ms     7.0 GB/sec
struct_all_null/zstd                               1.00      2.3±0.00ms     6.8 GB/sec    1.00      2.3±0.00ms     6.8 GB/sec
struct_all_null/zstd_parquet_2                     1.00      2.3±0.00ms     6.9 GB/sec    1.00      2.3±0.00ms     6.9 GB/sec
struct_non_null/bloom_filter                       1.02     48.6±0.29ms   329.1 MB/sec    1.00     47.6±0.16ms   336.2 MB/sec
struct_non_null/cdc                                1.00     46.1±0.21ms   347.3 MB/sec    1.00     46.0±0.16ms   347.6 MB/sec
struct_non_null/default                            1.01     32.6±0.19ms   491.5 MB/sec    1.00     32.4±0.12ms   494.2 MB/sec
struct_non_null/parquet_2                          1.01     41.4±0.18ms   386.5 MB/sec    1.00     41.1±0.11ms   389.2 MB/sec
struct_non_null/zstd                               1.01     41.3±0.18ms   387.5 MB/sec    1.00     41.1±0.10ms   389.7 MB/sec
struct_non_null/zstd_parquet_2                     1.01     55.4±0.14ms   288.6 MB/sec    1.00     55.1±0.14ms   290.3 MB/sec
struct_sparse_99pct_null/bloom_filter              1.00      7.6±0.15ms     2.1 GB/sec    1.00      7.6±0.10ms     2.1 GB/sec
struct_sparse_99pct_null/cdc                       1.00     15.7±0.17ms  1028.6 MB/sec    1.00     15.7±0.18ms  1026.7 MB/sec
struct_sparse_99pct_null/default                   1.00      7.0±0.10ms     2.2 GB/sec    1.00      7.0±0.03ms     2.2 GB/sec
struct_sparse_99pct_null/parquet_2                 1.00      7.0±0.10ms     2.2 GB/sec    1.00      7.0±0.04ms     2.2 GB/sec
struct_sparse_99pct_null/zstd                      1.00      8.4±0.10ms  1909.6 MB/sec    1.00      8.4±0.04ms  1918.0 MB/sec
struct_sparse_99pct_null/zstd_parquet_2            1.00      7.8±0.12ms     2.0 GB/sec    1.00      7.8±0.04ms     2.0 GB/sec

Resource Usage

base (merge-base)

| Metric | Value |
| --- | --- |
| Wall time | 1945.4s |
| Peak memory | 6.6 GiB |
| Avg memory | 6.4 GiB |
| CPU user | 1889.0s |
| CPU sys | 55.7s |
| Peak spill | 0 B |

branch

| Metric | Value |
| --- | --- |
| Wall time | 2105.5s |
| Peak memory | 6.8 GiB |
| Avg memory | 6.6 GiB |
| CPU user | 2019.6s |
| CPU sys | 81.0s |
| Peak spill | 0 B |


@adriangb
Contributor Author

run benchmark arrow_writer

@adriangbot

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4452557769-94-2nbcr 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux


Comparing parquet-page-size-mid-batch (bbe2b7e) to 48fa8a7 (merge-base) diff
BENCH_NAME=arrow_writer
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_writer
BENCH_FILTER=
Results will be posted here when complete



@adriangb
Contributor Author

run benchmark arrow_writer

@adriangbot

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4453035593-96-5mrpz 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux


Comparing parquet-page-size-mid-batch (bbe2b7e) to 48fa8a7 (merge-base) diff
BENCH_NAME=arrow_writer
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_writer
BENCH_FILTER=
Results will be posted here when complete



@etseidl
Contributor

etseidl commented May 14, 2026

Have you considered making the batch size configurable per column?

@adriangbot

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                                              main                                   parquet-page-size-mid-batch
-----                                              ----                                   ---------------------------
bool/bloom_filter                                  1.00     13.0±0.03ms    19.2 MB/sec    1.00     13.0±0.09ms    19.2 MB/sec
bool/cdc                                           1.00     15.6±0.04ms    16.0 MB/sec    1.00     15.7±0.12ms    16.0 MB/sec
bool/default                                       1.00     10.9±0.03ms    22.9 MB/sec    1.01     11.0±0.12ms    22.8 MB/sec
bool/parquet_2                                     1.00     14.7±0.04ms    17.0 MB/sec    1.00     14.8±0.12ms    16.9 MB/sec
bool/zstd                                          1.00     11.4±0.03ms    21.9 MB/sec    1.00     11.5±0.10ms    21.8 MB/sec
bool/zstd_parquet_2                                1.00     15.1±0.04ms    16.6 MB/sec    1.00     15.1±0.12ms    16.5 MB/sec
bool_non_null/bloom_filter                         1.00      7.0±0.02ms    17.8 MB/sec    1.00      7.0±0.02ms    17.8 MB/sec
bool_non_null/cdc                                  1.00      6.8±0.04ms    18.4 MB/sec    1.00      6.8±0.03ms    18.4 MB/sec
bool_non_null/default                              1.00      4.3±0.02ms    29.3 MB/sec    1.00      4.3±0.02ms    29.2 MB/sec
bool_non_null/parquet_2                            1.00      9.0±0.03ms    13.8 MB/sec    1.00      9.1±0.03ms    13.8 MB/sec
bool_non_null/zstd                                 1.00      4.6±0.02ms    27.1 MB/sec    1.00      4.6±0.02ms    27.0 MB/sec
bool_non_null/zstd_parquet_2                       1.00      9.5±0.04ms    13.2 MB/sec    1.00      9.5±0.04ms    13.2 MB/sec
float_with_nans/bloom_filter                       1.00     92.4±0.36ms   151.5 MB/sec    1.04     96.2±0.72ms   145.5 MB/sec
float_with_nans/cdc                                1.00     81.2±0.20ms   172.4 MB/sec    1.03     83.7±1.01ms   167.3 MB/sec
float_with_nans/default                            1.00     74.1±0.24ms   188.9 MB/sec    1.02     75.9±0.43ms   184.4 MB/sec
float_with_nans/parquet_2                          1.00     94.1±0.39ms   148.8 MB/sec    1.03     97.3±0.55ms   143.8 MB/sec
float_with_nans/zstd                               1.00    111.6±0.21ms   125.4 MB/sec    1.02    114.2±1.07ms   122.6 MB/sec
float_with_nans/zstd_parquet_2                     1.00    131.1±0.39ms   106.8 MB/sec    1.03    134.9±0.57ms   103.8 MB/sec
large_string_non_null/bloom_filter                                                        1.00     82.2±0.16ms     3.0 GB/sec
large_string_non_null/cdc                                                                 1.00    242.6±1.01ms  1055.1 MB/sec
large_string_non_null/default                                                             1.00     63.1±0.22ms     4.0 GB/sec
large_string_non_null/parquet_2                                                           1.00     63.3±0.20ms     4.0 GB/sec
large_string_non_null/zstd                                                                1.00     63.4±0.22ms     3.9 GB/sec
large_string_non_null/zstd_parquet_2                                                      1.00     63.4±0.23ms     3.9 GB/sec
list_primitive/bloom_filter                        1.00    323.2±0.77ms  1687.2 MB/sec    1.05    340.5±3.10ms  1601.6 MB/sec
list_primitive/cdc                                 1.00    357.4±1.04ms  1526.1 MB/sec    1.02    364.5±2.63ms  1496.3 MB/sec
list_primitive/default                             1.00   255.4±46.59ms     2.1 GB/sec    1.00    255.8±1.89ms     2.1 GB/sec
list_primitive/parquet_2                           1.00    268.4±0.72ms  2031.7 MB/sec    1.02    272.7±1.05ms  1999.7 MB/sec
list_primitive/zstd                                1.00    500.1±1.93ms  1090.5 MB/sec    1.02    511.1±1.79ms  1067.0 MB/sec
list_primitive/zstd_parquet_2                      1.00    490.5±0.49ms  1111.9 MB/sec    1.02    500.3±2.01ms  1090.1 MB/sec
list_primitive_non_null/bloom_filter               1.00    418.7±6.33ms  1300.0 MB/sec    1.26   526.0±24.06ms  1034.8 MB/sec
list_primitive_non_null/cdc                        1.00    440.4±7.41ms  1235.7 MB/sec    1.01    442.8±6.00ms  1229.0 MB/sec
list_primitive_non_null/default                    1.00    285.9±4.54ms  1903.7 MB/sec    1.32   376.1±18.32ms  1446.9 MB/sec
list_primitive_non_null/parquet_2                  1.00   306.8±12.97ms  1773.9 MB/sec    1.30    397.9±5.13ms  1367.7 MB/sec
list_primitive_non_null/zstd                       1.00    719.5±6.21ms   756.4 MB/sec    1.09   782.7±16.12ms   695.4 MB/sec
list_primitive_non_null/zstd_parquet_2             1.00    686.4±1.31ms   792.9 MB/sec    1.09    749.3±3.12ms   726.3 MB/sec
list_primitive_sparse_99pct_null/bloom_filter      1.00     11.1±0.22ms     3.3 GB/sec    1.05     11.7±0.11ms     3.1 GB/sec
list_primitive_sparse_99pct_null/cdc               1.00     22.5±0.09ms  1663.4 MB/sec    1.03     23.1±0.11ms  1615.5 MB/sec
list_primitive_sparse_99pct_null/default           1.00     10.7±0.04ms     3.4 GB/sec    1.05     11.2±0.11ms     3.2 GB/sec
list_primitive_sparse_99pct_null/parquet_2         1.00     10.7±0.04ms     3.4 GB/sec    1.04     11.2±0.05ms     3.3 GB/sec
list_primitive_sparse_99pct_null/zstd              1.00     12.6±0.05ms     2.9 GB/sec    1.04     13.1±0.05ms     2.8 GB/sec
list_primitive_sparse_99pct_null/zstd_parquet_2    1.00     10.9±0.04ms     3.4 GB/sec    1.04     11.4±0.08ms     3.2 GB/sec
primitive/bloom_filter                             1.00    148.3±0.47ms   302.7 MB/sec    1.00    148.8±0.66ms   301.6 MB/sec
primitive/cdc                                      1.00    158.7±0.53ms   282.8 MB/sec    1.00    158.8±0.86ms   282.6 MB/sec
primitive/default                                  1.00    117.3±0.24ms   382.6 MB/sec    1.01    117.9±0.68ms   380.5 MB/sec
primitive/parquet_2                                1.00    132.2±0.27ms   339.4 MB/sec    1.00    132.7±0.66ms   338.1 MB/sec
primitive/zstd                                     1.00    146.8±0.28ms   305.8 MB/sec    1.01    147.7±0.71ms   303.7 MB/sec
primitive/zstd_parquet_2                           1.00    165.5±0.33ms   271.1 MB/sec    1.00    166.1±0.59ms   270.1 MB/sec
primitive_all_null/bloom_filter                    1.00     11.5±0.11ms     3.8 GB/sec    1.02     11.8±0.17ms     3.7 GB/sec
primitive_all_null/cdc                             1.03     30.6±0.39ms  1465.9 MB/sec    1.00     29.8±0.46ms  1503.4 MB/sec
primitive_all_null/default                         1.00     10.9±0.21ms     4.0 GB/sec    1.00     10.9±0.11ms     4.0 GB/sec
primitive_all_null/parquet_2                       1.01     11.0±0.20ms     4.0 GB/sec    1.00     10.9±0.15ms     4.0 GB/sec
primitive_all_null/zstd                            1.00     11.0±0.14ms     4.0 GB/sec    1.02     11.2±0.21ms     3.9 GB/sec
primitive_all_null/zstd_parquet_2                  1.00     11.0±0.16ms     4.0 GB/sec    1.01     11.1±0.22ms     4.0 GB/sec
primitive_non_null/bloom_filter                    1.08    113.3±1.61ms   388.4 MB/sec    1.00    105.2±0.20ms   418.4 MB/sec
primitive_non_null/cdc                             1.01     90.0±0.50ms   489.1 MB/sec    1.00     89.3±0.29ms   492.9 MB/sec
primitive_non_null/default                         1.01     67.4±0.22ms   653.2 MB/sec    1.00     67.0±0.18ms   656.5 MB/sec
primitive_non_null/parquet_2                       1.00     89.1±0.20ms   493.9 MB/sec    1.00     88.8±0.14ms   495.7 MB/sec
primitive_non_null/zstd                            1.08    105.6±0.45ms   416.8 MB/sec    1.00     97.6±0.12ms   450.8 MB/sec
primitive_non_null/zstd_parquet_2                  1.06    130.1±1.86ms   338.1 MB/sec    1.00    122.2±0.18ms   360.0 MB/sec
primitive_sparse_99pct_null/bloom_filter           1.00     18.3±0.16ms     2.4 GB/sec    1.06     19.3±0.26ms     2.3 GB/sec
primitive_sparse_99pct_null/cdc                    1.00     37.2±0.38ms  1207.0 MB/sec    1.00     37.2±0.37ms  1205.8 MB/sec
primitive_sparse_99pct_null/default                1.00     16.8±0.06ms     2.6 GB/sec    1.03     17.3±0.08ms     2.5 GB/sec
primitive_sparse_99pct_null/parquet_2              1.00     16.7±0.06ms     2.6 GB/sec    1.03     17.3±0.10ms     2.5 GB/sec
primitive_sparse_99pct_null/zstd                   1.00     20.0±0.10ms     2.2 GB/sec    1.03     20.6±0.12ms     2.1 GB/sec
primitive_sparse_99pct_null/zstd_parquet_2         1.00     18.7±0.09ms     2.3 GB/sec    1.03     19.3±0.12ms     2.3 GB/sec
short_string_non_null/bloom_filter                                                        1.00     29.7±0.13ms   403.9 MB/sec
short_string_non_null/cdc                                                                 1.00     20.2±0.06ms   594.2 MB/sec
short_string_non_null/default                                                             1.00     16.4±0.13ms   733.4 MB/sec
short_string_non_null/parquet_2                                                           1.00     26.0±0.14ms   461.6 MB/sec
short_string_non_null/zstd                                                                1.00     38.2±6.26ms   314.3 MB/sec
short_string_non_null/zstd_parquet_2                                                      1.00     29.1±0.18ms   412.8 MB/sec
string/bloom_filter                                1.05   228.5±26.93ms     2.2 GB/sec    1.00   217.9±17.20ms     2.3 GB/sec
string/cdc                                         1.00    220.9±5.71ms     2.3 GB/sec    1.00    220.6±5.33ms     2.3 GB/sec
string/default                                     1.15   140.0±24.70ms     3.7 GB/sec    1.00   122.1±13.31ms     4.2 GB/sec
string/parquet_2                                   1.01    126.4±1.85ms     4.1 GB/sec    1.00    124.8±1.03ms     4.1 GB/sec
string/zstd                                        1.01    424.6±2.74ms  1234.6 MB/sec    1.00    421.6±1.32ms  1243.6 MB/sec
string/zstd_parquet_2                              1.00    394.7±1.28ms  1328.2 MB/sec    1.02    401.3±0.35ms  1306.2 MB/sec
string_and_binary_view/bloom_filter                1.00     63.4±0.24ms   508.7 MB/sec    1.09     69.2±0.17ms   465.9 MB/sec
string_and_binary_view/cdc                         1.00     58.3±0.12ms   553.1 MB/sec    1.05     61.4±0.15ms   525.6 MB/sec
string_and_binary_view/default                     1.00     47.7±0.10ms   676.3 MB/sec    1.10     52.5±0.18ms   614.2 MB/sec
string_and_binary_view/parquet_2                   1.00     58.5±0.12ms   551.0 MB/sec    1.08     63.2±0.32ms   510.4 MB/sec
string_and_binary_view/zstd                        1.00     84.2±0.15ms   382.9 MB/sec    1.06     89.3±0.37ms   360.9 MB/sec
string_and_binary_view/zstd_parquet_2              1.00     72.4±0.11ms   445.6 MB/sec    1.07     77.1±0.16ms   418.0 MB/sec
string_dictionary/bloom_filter                     1.00     88.8±0.66ms     2.9 GB/sec    1.54    136.8±2.03ms  1930.6 MB/sec
string_dictionary/cdc                              1.00     86.6±2.67ms     3.0 GB/sec    1.14     98.8±3.49ms     2.6 GB/sec
string_dictionary/default                          1.00     48.2±0.34ms     5.3 GB/sec    1.91     91.9±2.19ms     2.8 GB/sec
string_dictionary/parquet_2                        1.00     53.8±0.16ms     4.8 GB/sec    1.94    104.2±2.36ms     2.5 GB/sec
string_dictionary/zstd                             1.00    208.7±0.68ms  1265.8 MB/sec    1.07   223.1±14.95ms  1184.1 MB/sec
string_dictionary/zstd_parquet_2                   1.00    197.8±0.12ms  1335.0 MB/sec    1.20    236.6±1.77ms  1116.4 MB/sec
string_non_null/bloom_filter                       1.00   252.4±15.95ms     2.0 GB/sec    1.01   254.2±12.25ms     2.0 GB/sec
string_non_null/cdc                                1.00    267.6±9.15ms  1958.5 MB/sec    1.06   283.6±12.31ms  1847.7 MB/sec
string_non_null/default                            1.00   126.2±12.63ms     4.1 GB/sec    1.07   135.2±12.42ms     3.8 GB/sec
string_non_null/parquet_2                          1.00   141.3±12.34ms     3.6 GB/sec    1.11    157.1±1.99ms     3.3 GB/sec
string_non_null/zstd                               1.00    531.4±2.22ms   986.0 MB/sec    1.11   589.4±34.73ms   889.0 MB/sec
string_non_null/zstd_parquet_2                     1.00    505.7±2.52ms  1036.1 MB/sec    1.04   527.6±11.72ms   993.2 MB/sec
struct_all_null/bloom_filter                       1.00      2.5±0.00ms     6.2 GB/sec    1.01      2.6±0.02ms     6.2 GB/sec
struct_all_null/cdc                                1.00      9.9±0.12ms  1634.9 MB/sec    1.01     10.0±0.12ms  1611.5 MB/sec
struct_all_null/default                            1.00      2.2±0.00ms     7.0 GB/sec    1.00      2.3±0.00ms     7.0 GB/sec
struct_all_null/parquet_2                          1.00      2.3±0.00ms     7.0 GB/sec    1.00      2.3±0.00ms     7.0 GB/sec
struct_all_null/zstd                               1.00      2.3±0.00ms     6.9 GB/sec    1.00      2.3±0.00ms     6.8 GB/sec
struct_all_null/zstd_parquet_2                     1.00      2.3±0.00ms     6.9 GB/sec    1.00      2.3±0.00ms     6.9 GB/sec
struct_non_null/bloom_filter                       1.00     47.1±0.20ms   339.8 MB/sec    1.02     47.9±0.83ms   334.2 MB/sec
struct_non_null/cdc                                1.00     45.7±0.22ms   350.3 MB/sec    1.01     46.1±0.46ms   346.9 MB/sec
struct_non_null/default                            1.00     32.1±0.15ms   499.0 MB/sec    1.01     32.4±0.44ms   493.3 MB/sec
struct_non_null/parquet_2                          1.00     40.8±0.51ms   392.0 MB/sec    1.01     41.2±0.49ms   388.4 MB/sec
struct_non_null/zstd                               1.00     40.8±0.11ms   392.1 MB/sec    1.01     41.2±0.54ms   387.9 MB/sec
struct_non_null/zstd_parquet_2                     1.00     54.9±0.17ms   291.5 MB/sec    1.03     56.3±2.02ms   284.1 MB/sec
struct_sparse_99pct_null/bloom_filter              1.00      7.5±0.05ms     2.1 GB/sec    1.07      8.0±0.11ms  2019.0 MB/sec
struct_sparse_99pct_null/cdc                       1.00     15.3±0.08ms  1051.6 MB/sec    1.01     15.5±0.09ms  1040.0 MB/sec
struct_sparse_99pct_null/default                   1.00      7.0±0.04ms     2.3 GB/sec    1.04      7.3±0.07ms     2.2 GB/sec
struct_sparse_99pct_null/parquet_2                 1.00      6.9±0.03ms     2.3 GB/sec    1.05      7.3±0.07ms     2.2 GB/sec
struct_sparse_99pct_null/zstd                      1.00      8.3±0.02ms  1949.3 MB/sec    1.05      8.7±0.07ms  1864.2 MB/sec
struct_sparse_99pct_null/zstd_parquet_2            1.00      7.7±0.02ms     2.0 GB/sec    1.05      8.1±0.04ms  1998.4 MB/sec

Resource Usage

base (merge-base)

Metric Value
Wall time 1940.4s
Peak memory 6.6 GiB
Avg memory 6.4 GiB
CPU user 1879.7s
CPU sys 57.5s
Peak spill 0 B

branch

Metric Value
Wall time 2155.5s
Peak memory 6.8 GiB
Avg memory 6.7 GiB
CPU user 2078.6s
CPU sys 76.3s
Peak spill 0 B

File an issue against this benchmark runner

@adriangb
Contributor Author

> Have you considered making the batch size configurable per column?

Yes, that may be a simpler approach. But I'm hoping we can get to a place where users don't have to think about / configure this. Given they gave us a page size limit it'd be nice if we can always adhere to that...
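
To make the "always adhere to the page size limit" goal concrete, here is a minimal sketch of the sizing rule from the PR description (`sub_batch_size = page_byte_limit / bytes_per_value`, clamped to at least 1). The function name, its parameters, and the 1 MiB limit are assumptions made for illustration; this is not the actual writer code.

```rust
/// Hypothetical helper, for illustration only: pick how many values to write
/// per mini-batch so that one mini-batch stays within the page byte budget.
fn sub_batch_size(page_byte_limit: usize, chunk_bytes: usize, chunk_len: usize) -> usize {
    if chunk_len == 0 {
        return 1;
    }
    // Average size of a value in this chunk, rounded up and never zero.
    let bytes_per_value = chunk_bytes.div_ceil(chunk_len).max(1);
    // Clamp to [1, chunk_len]: at least one value, at most the whole chunk.
    (page_byte_limit / bytes_per_value).clamp(1, chunk_len)
}

fn main() {
    let page_limit = 1 << 20; // 1 MiB, assumed for the example

    // 1024 eight-byte values: the whole chunk fits, so the fast path is kept.
    println!("{}", sub_batch_size(page_limit, 1024 * 8, 1024)); // prints 1024

    // Four values of 5 MiB each: the sub-batch shrinks to one value.
    println!("{}", sub_batch_size(page_limit, 4 * 5 * (1 << 20), 4)); // prints 1
}
```

With small values the result equals the chunk length, which is the "zero behavior change" case described in the PR; only large values shrink the sub-batch.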

Comment thread parquet/src/data_type.rs Outdated
/// push a page far past the configured page byte limit before the
/// post-write size check fires.
#[inline]
fn byte_size(&self) -> usize {
Contributor

This seems to duplicate dict_encoding_size. Also, #9700 wants to rename dict_encoding_size and instead implement it pretty much the same way as here.

@etseidl
Contributor

etseidl commented May 14, 2026

Another thought... maybe add another chunker like the one the CDC work added (see fn write_with_chunker). If we compute batches up front, when we know the shape of the data, that might be faster 🤷
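
As a rough illustration of computing batches up front, the sketch below plans batch boundaries from the byte length of each value before anything is written. `plan_batches` and its signature are hypothetical and unrelated to the real `write_with_chunker` API; nulls and repetition levels are ignored, so it only covers flat, non-null columns.

```rust
/// Hypothetical planner, for illustration only: given the byte length of each
/// value, return the exclusive end index of every batch. A batch is closed
/// before a value that would push it past `page_byte_limit`; a single value
/// larger than the limit gets a batch of its own.
fn plan_batches(value_lens: &[usize], page_byte_limit: usize) -> Vec<usize> {
    let mut ends = Vec::new();
    let mut batch_start = 0usize;
    let mut batch_bytes = 0usize;
    for (i, &len) in value_lens.iter().enumerate() {
        // Only close a batch that already holds at least one value.
        if i > batch_start && batch_bytes + len > page_byte_limit {
            ends.push(i);
            batch_start = i;
            batch_bytes = 0;
        }
        batch_bytes += len;
    }
    if !value_lens.is_empty() {
        ends.push(value_lens.len());
    }
    ends
}

fn main() {
    // Byte lengths of six values and a 64-byte budget (both made up).
    let lens = [10, 20, 100, 5, 5, 200];
    println!("{:?}", plan_batches(&lens, 64)); // prints [2, 3, 5, 6]
}
```

The trade-off is one extra pass over the value lengths in exchange for not having to re-check the page size while writing.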

@adriangbot

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

group                                              main                                   parquet-page-size-mid-batch
-----                                              ----                                   ---------------------------
bool/bloom_filter                                  1.01     13.2±0.08ms    18.9 MB/sec    1.00     13.1±0.06ms    19.1 MB/sec
bool/cdc                                           1.01     16.0±0.08ms    15.6 MB/sec    1.00     15.8±0.26ms    15.8 MB/sec
bool/default                                       1.01     11.0±0.07ms    22.6 MB/sec    1.00     10.9±0.09ms    22.9 MB/sec
bool/parquet_2                                     1.00     14.7±0.05ms    17.0 MB/sec    1.00     14.7±0.07ms    17.0 MB/sec
bool/zstd                                          1.02     11.6±0.06ms    21.6 MB/sec    1.00     11.4±0.06ms    22.0 MB/sec
bool/zstd_parquet_2                                1.01     15.2±0.08ms    16.5 MB/sec    1.00     15.1±0.27ms    16.6 MB/sec
bool_non_null/bloom_filter                         1.00      7.1±0.04ms    17.6 MB/sec    1.00      7.1±0.15ms    17.6 MB/sec
bool_non_null/cdc                                  1.00      7.0±0.05ms    17.9 MB/sec    1.00      7.0±0.13ms    17.8 MB/sec
bool_non_null/default                              1.00      4.3±0.03ms    29.1 MB/sec    1.00      4.3±0.10ms    29.0 MB/sec
bool_non_null/parquet_2                            1.00      9.1±0.05ms    13.7 MB/sec    1.00      9.1±0.24ms    13.7 MB/sec
bool_non_null/zstd                                 1.00      4.7±0.02ms    26.9 MB/sec    1.00      4.6±0.03ms    27.0 MB/sec
bool_non_null/zstd_parquet_2                       1.00      9.5±0.05ms    13.1 MB/sec    1.00      9.5±0.24ms    13.1 MB/sec
float_with_nans/bloom_filter                       1.00     95.7±2.15ms   146.3 MB/sec    1.01     97.0±2.24ms   144.3 MB/sec
float_with_nans/cdc                                1.00     82.9±1.46ms   168.8 MB/sec    1.03     85.0±1.07ms   164.7 MB/sec
float_with_nans/default                            1.00     76.1±2.46ms   184.0 MB/sec    1.01     76.9±1.74ms   182.0 MB/sec
float_with_nans/parquet_2                          1.00     97.5±2.31ms   143.7 MB/sec    1.00     97.7±2.52ms   143.3 MB/sec
float_with_nans/zstd                               1.01    114.5±2.02ms   122.2 MB/sec    1.00    113.0±1.76ms   123.9 MB/sec
float_with_nans/zstd_parquet_2                     1.00    134.1±2.59ms   104.4 MB/sec    1.01    135.2±2.50ms   103.6 MB/sec
large_string_non_null/bloom_filter                                                        1.00     84.5±3.68ms     3.0 GB/sec
large_string_non_null/cdc                                                                 1.00    244.0±2.19ms  1049.1 MB/sec
large_string_non_null/default                                                             1.00     64.3±2.55ms     3.9 GB/sec
large_string_non_null/parquet_2                                                           1.00     63.7±3.92ms     3.9 GB/sec
large_string_non_null/zstd                                                                1.00     60.7±0.26ms     4.1 GB/sec
large_string_non_null/zstd_parquet_2                                                      1.00     62.2±1.88ms     4.0 GB/sec
list_primitive/bloom_filter                        1.00    340.8±6.92ms  1600.0 MB/sec    1.08   369.1±15.47ms  1477.7 MB/sec
list_primitive/cdc                                 1.00    364.0±3.08ms  1498.2 MB/sec    1.02    371.0±8.13ms  1469.8 MB/sec
list_primitive/default                             1.00    255.5±6.60ms     2.1 GB/sec    1.07    274.6±8.33ms  1986.2 MB/sec
list_primitive/parquet_2                           1.00    271.3±3.24ms  2010.0 MB/sec    1.06    288.7±2.44ms  1888.7 MB/sec
list_primitive/zstd                                1.00    506.5±6.71ms  1076.6 MB/sec    1.03    520.8±7.25ms  1047.1 MB/sec
list_primitive/zstd_parquet_2                      1.00    496.0±3.27ms  1099.5 MB/sec    1.00    496.6±4.31ms  1098.2 MB/sec
list_primitive_non_null/bloom_filter               1.00   447.4±21.93ms  1216.4 MB/sec    1.16   517.1±33.75ms  1052.5 MB/sec
list_primitive_non_null/cdc                        1.01   450.7±12.72ms  1207.4 MB/sec    1.00   447.8±11.71ms  1215.3 MB/sec
list_primitive_non_null/default                    1.00   303.8±11.10ms  1791.3 MB/sec    1.25   379.8±19.28ms  1432.9 MB/sec
list_primitive_non_null/parquet_2                  1.00   318.8±16.26ms  1707.3 MB/sec    1.33   422.5±16.94ms  1288.2 MB/sec
list_primitive_non_null/zstd                       1.00   735.5±14.23ms   740.0 MB/sec    1.07   783.7±19.04ms   694.4 MB/sec
list_primitive_non_null/zstd_parquet_2             1.00    702.1±6.10ms   775.1 MB/sec    1.06    747.6±4.61ms   728.0 MB/sec
list_primitive_sparse_99pct_null/bloom_filter      1.00     11.3±0.24ms     3.2 GB/sec    1.04     11.7±0.06ms     3.1 GB/sec
list_primitive_sparse_99pct_null/cdc               1.02     23.0±0.23ms  1625.1 MB/sec    1.00     22.6±0.10ms  1652.1 MB/sec
list_primitive_sparse_99pct_null/default           1.03     11.2±0.05ms     3.3 GB/sec    1.00     10.9±0.04ms     3.4 GB/sec
list_primitive_sparse_99pct_null/parquet_2         1.00     11.2±0.14ms     3.3 GB/sec    1.02     11.3±0.07ms     3.2 GB/sec
list_primitive_sparse_99pct_null/zstd              1.00     13.1±0.29ms     2.8 GB/sec    1.01     13.2±0.29ms     2.8 GB/sec
list_primitive_sparse_99pct_null/zstd_parquet_2    1.00     10.9±0.05ms     3.3 GB/sec    1.03     11.3±0.31ms     3.2 GB/sec
primitive/bloom_filter                             1.02    156.1±3.23ms   287.4 MB/sec    1.00    153.2±2.93ms   293.0 MB/sec
primitive/cdc                                      1.00    161.7±3.06ms   277.5 MB/sec    1.01    162.6±3.18ms   276.1 MB/sec
primitive/default                                  1.02    120.6±0.82ms   372.0 MB/sec    1.00    118.3±1.10ms   379.3 MB/sec
primitive/parquet_2                                1.00    134.6±2.25ms   333.5 MB/sec    1.00    134.9±1.97ms   332.7 MB/sec
primitive/zstd                                     1.01    149.1±1.65ms   300.9 MB/sec    1.00    147.7±0.76ms   303.8 MB/sec
primitive/zstd_parquet_2                           1.00    168.8±1.53ms   265.9 MB/sec    1.00    168.8±1.52ms   265.8 MB/sec
primitive_all_null/bloom_filter                    1.00     11.9±0.18ms     3.7 GB/sec    1.00     11.8±0.23ms     3.7 GB/sec
primitive_all_null/cdc                             1.03     30.9±0.49ms  1450.1 MB/sec    1.00     30.1±0.51ms  1492.4 MB/sec
primitive_all_null/default                         1.00     10.9±0.11ms     4.0 GB/sec    1.00     11.0±0.16ms     4.0 GB/sec
primitive_all_null/parquet_2                       1.00     11.0±0.18ms     4.0 GB/sec    1.00     11.0±0.23ms     4.0 GB/sec
primitive_all_null/zstd                            1.00     11.0±0.13ms     4.0 GB/sec    1.00     11.0±0.17ms     4.0 GB/sec
primitive_all_null/zstd_parquet_2                  1.01     11.0±0.15ms     4.0 GB/sec    1.00     10.9±0.10ms     4.0 GB/sec
primitive_non_null/bloom_filter                    1.09    117.1±2.94ms   375.6 MB/sec    1.00    107.7±1.65ms   408.5 MB/sec
primitive_non_null/cdc                             1.00     92.0±1.42ms   478.3 MB/sec    1.00     91.8±1.31ms   479.5 MB/sec
primitive_non_null/default                         1.03     69.7±1.09ms   631.7 MB/sec    1.00     67.7±0.30ms   649.9 MB/sec
primitive_non_null/parquet_2                       1.01     91.7±1.91ms   479.9 MB/sec    1.00     91.2±1.32ms   482.7 MB/sec
primitive_non_null/zstd                            1.07    107.1±2.41ms   410.8 MB/sec    1.00     99.8±1.92ms   441.0 MB/sec
primitive_non_null/zstd_parquet_2                  1.05    131.4±1.99ms   334.8 MB/sec    1.00    124.6±1.30ms   353.2 MB/sec
primitive_sparse_99pct_null/bloom_filter           1.01     19.0±0.38ms     2.3 GB/sec    1.00     18.9±0.49ms     2.3 GB/sec
primitive_sparse_99pct_null/cdc                    1.01     37.5±0.40ms  1198.2 MB/sec    1.00     37.1±0.54ms  1208.8 MB/sec
primitive_sparse_99pct_null/default                1.02     17.3±0.16ms     2.5 GB/sec    1.00     16.9±0.06ms     2.6 GB/sec
primitive_sparse_99pct_null/parquet_2              1.00     17.4±0.09ms     2.5 GB/sec    1.00     17.5±0.08ms     2.5 GB/sec
primitive_sparse_99pct_null/zstd                   1.00     20.4±0.24ms     2.1 GB/sec    1.02     20.7±0.28ms     2.1 GB/sec
primitive_sparse_99pct_null/zstd_parquet_2         1.00     19.2±0.38ms     2.3 GB/sec    1.01     19.4±0.40ms     2.3 GB/sec
short_string_non_null/bloom_filter                                                        1.00     29.7±0.21ms   404.3 MB/sec
short_string_non_null/cdc                                                                 1.00     20.6±0.17ms   581.8 MB/sec
short_string_non_null/default                                                             1.00     16.9±0.20ms   710.8 MB/sec
short_string_non_null/parquet_2                                                           1.00     26.5±0.24ms   452.8 MB/sec
short_string_non_null/zstd                                                                1.00     36.5±0.12ms   328.8 MB/sec
short_string_non_null/zstd_parquet_2                                                      1.00     29.2±0.17ms   410.8 MB/sec
string/bloom_filter                                1.09   231.5±23.89ms     2.2 GB/sec    1.00   211.7±14.75ms     2.4 GB/sec
string/cdc                                         1.01    226.1±9.11ms     2.3 GB/sec    1.00   224.7±11.94ms     2.3 GB/sec
string/default                                     1.12   146.7±24.93ms     3.5 GB/sec    1.00   131.6±11.00ms     3.9 GB/sec
string/parquet_2                                   1.07    129.2±2.88ms     4.0 GB/sec    1.00    120.6±6.92ms     4.2 GB/sec
string/zstd                                        1.00    430.5±8.49ms  1217.7 MB/sec    1.03    441.5±9.87ms  1187.4 MB/sec
string/zstd_parquet_2                              1.00    397.8±4.26ms  1318.0 MB/sec    1.02    406.3±4.56ms  1290.2 MB/sec
string_and_binary_view/bloom_filter                1.00     67.2±3.49ms   479.6 MB/sec    1.01     68.2±0.39ms   472.6 MB/sec
string_and_binary_view/cdc                         1.00     59.4±1.21ms   543.2 MB/sec    1.06     62.8±2.45ms   513.8 MB/sec
string_and_binary_view/default                     1.00     48.7±0.95ms   662.0 MB/sec    1.11     54.0±2.59ms   597.0 MB/sec
string_and_binary_view/parquet_2                   1.00     58.9±0.58ms   547.7 MB/sec    1.11     65.2±0.86ms   494.4 MB/sec
string_and_binary_view/zstd                        1.00     85.5±0.95ms   377.2 MB/sec    1.06     90.8±1.48ms   355.0 MB/sec
string_and_binary_view/zstd_parquet_2              1.00     74.2±0.66ms   434.4 MB/sec    1.07     79.3±2.37ms   406.8 MB/sec
string_dictionary/bloom_filter                     1.03    104.5±0.90ms     2.5 GB/sec    1.00    101.1±5.73ms     2.6 GB/sec
string_dictionary/cdc                              1.38     78.4±2.15ms     3.3 GB/sec    1.00     56.7±3.11ms     4.6 GB/sec
string_dictionary/default                          1.29     65.3±3.55ms     3.9 GB/sec    1.00     50.6±0.56ms     5.1 GB/sec
string_dictionary/parquet_2                        1.19     67.8±0.44ms     3.8 GB/sec    1.00     56.8±1.57ms     4.5 GB/sec
string_dictionary/zstd                             1.02    218.7±3.32ms  1207.8 MB/sec    1.00    215.4±4.94ms  1226.0 MB/sec
string_dictionary/zstd_parquet_2                   1.00    200.1±2.51ms  1319.8 MB/sec    1.01    201.9±1.54ms  1308.5 MB/sec
string_non_null/bloom_filter                       1.03   270.4±28.27ms  1937.7 MB/sec    1.00   263.2±19.14ms  1990.8 MB/sec
string_non_null/cdc                                1.01   274.5±13.60ms  1909.1 MB/sec    1.00   272.9±11.94ms  1919.9 MB/sec
string_non_null/default                            1.00   148.1±14.93ms     3.5 GB/sec    1.03   152.1±16.41ms     3.4 GB/sec
string_non_null/parquet_2                          1.00    145.3±9.80ms     3.5 GB/sec    1.08    156.8±3.28ms     3.3 GB/sec
string_non_null/zstd                               1.00   573.6±17.33ms   913.5 MB/sec    1.02   584.3±20.28ms   896.8 MB/sec
string_non_null/zstd_parquet_2                     1.00   527.8±11.88ms   992.8 MB/sec    1.00   529.4±13.85ms   989.8 MB/sec
struct_all_null/bloom_filter                       1.00      2.6±0.04ms     6.1 GB/sec    1.01      2.6±0.04ms     6.1 GB/sec
struct_all_null/cdc                                1.00      9.9±0.17ms  1637.0 MB/sec    1.02     10.0±0.13ms  1612.6 MB/sec
struct_all_null/default                            1.00      2.3±0.00ms     7.0 GB/sec    1.00      2.3±0.00ms     7.0 GB/sec
struct_all_null/parquet_2                          1.00      2.3±0.00ms     7.0 GB/sec    1.00      2.3±0.00ms     7.0 GB/sec
struct_all_null/zstd                               1.00      2.3±0.00ms     6.8 GB/sec    1.00      2.3±0.00ms     6.8 GB/sec
struct_all_null/zstd_parquet_2                     1.00      2.3±0.00ms     6.9 GB/sec    1.00      2.3±0.00ms     6.9 GB/sec
struct_non_null/bloom_filter                       1.00     47.1±0.89ms   339.4 MB/sec    1.03     48.4±1.17ms   330.9 MB/sec
struct_non_null/cdc                                1.00     46.4±0.36ms   344.8 MB/sec    1.00     46.3±0.38ms   345.9 MB/sec
struct_non_null/default                            1.01     32.8±0.62ms   487.6 MB/sec    1.00     32.6±0.23ms   490.3 MB/sec
struct_non_null/parquet_2                          1.00     40.9±0.33ms   391.4 MB/sec    1.01     41.4±0.79ms   386.5 MB/sec
struct_non_null/zstd                               1.02     41.3±0.53ms   387.0 MB/sec    1.00     40.7±0.23ms   392.8 MB/sec
struct_non_null/zstd_parquet_2                     1.01     55.3±0.45ms   289.4 MB/sec    1.00     54.9±0.14ms   291.6 MB/sec
struct_sparse_99pct_null/bloom_filter              1.00      7.9±0.36ms  2033.3 MB/sec    1.02      8.1±0.33ms  1993.7 MB/sec
struct_sparse_99pct_null/cdc                       1.04     16.0±0.15ms  1005.3 MB/sec    1.00     15.5±0.39ms  1041.4 MB/sec
struct_sparse_99pct_null/default                   1.00      7.2±0.18ms     2.2 GB/sec    1.03      7.4±0.08ms     2.1 GB/sec
struct_sparse_99pct_null/parquet_2                 1.00      7.0±0.12ms     2.2 GB/sec    1.03      7.3±0.22ms     2.2 GB/sec
struct_sparse_99pct_null/zstd                      1.05      8.9±0.27ms  1820.8 MB/sec    1.00      8.5±0.09ms  1903.6 MB/sec
struct_sparse_99pct_null/zstd_parquet_2            1.00      7.8±0.10ms     2.0 GB/sec    1.03      8.1±0.22ms  1992.9 MB/sec

Resource Usage

base (merge-base)

Metric Value
Wall time 1970.4s
Peak memory 6.6 GiB
Avg memory 6.4 GiB
CPU user 1897.3s
CPU sys 71.4s
Peak spill 0 B

branch

Metric Value
Wall time 2145.5s
Peak memory 6.8 GiB
Avg memory 6.6 GiB
CPU user 2083.9s
CPU sys 60.2s
Peak spill 0 B


@adriangb
Contributor Author

run benchmark arrow_writer

@adriangbot

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4453799534-97-tqn4k 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux


Comparing parquet-page-size-mid-batch (145ea5d) to 48fa8a7 (merge-base) diff
BENCH_NAME=arrow_writer
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_writer
BENCH_FILTER=
Results will be posted here when complete



@adriangb
Contributor Author

run benchmark arrow_writer

@adriangbot

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4536140027-311-mrpcx 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux


Comparing parquet-page-size-mid-batch (11c9d51) to e28fd0d (merge-base) diff
BENCH_NAME=arrow_writer
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_writer
BENCH_FILTER=
Results will be posted here when complete



@adriangbot

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                                              main                                   parquet-page-size-mid-batch
-----                                              ----                                   ---------------------------
bool/bloom_filter                                  1.00     13.1±0.04ms    19.2 MB/sec    1.00     13.1±0.06ms    19.1 MB/sec
bool/cdc                                           1.01     15.9±0.06ms    15.7 MB/sec    1.00     15.8±0.11ms    15.8 MB/sec
bool/default                                       1.00     10.9±0.03ms    22.8 MB/sec    1.00     11.0±0.06ms    22.8 MB/sec
bool/parquet_2                                     1.00     14.8±0.04ms    16.9 MB/sec    1.00     14.8±0.07ms    16.9 MB/sec
bool/zstd                                          1.00     11.5±0.04ms    21.8 MB/sec    1.00     11.5±0.06ms    21.7 MB/sec
bool/zstd_parquet_2                                1.00     15.2±0.05ms    16.5 MB/sec    1.00     15.2±0.06ms    16.5 MB/sec
bool_non_null/bloom_filter                         1.00      7.0±0.03ms    17.9 MB/sec    1.00      7.0±0.02ms    18.0 MB/sec
bool_non_null/cdc                                  1.00      6.8±0.02ms    18.5 MB/sec    1.00      6.8±0.09ms    18.4 MB/sec
bool_non_null/default                              1.01      4.2±0.02ms    29.6 MB/sec    1.00      4.2±0.03ms    29.8 MB/sec
bool_non_null/parquet_2                            1.00      9.0±0.03ms    13.9 MB/sec    1.00      9.0±0.05ms    13.9 MB/sec
bool_non_null/zstd                                 1.01      4.6±0.02ms    27.4 MB/sec    1.00      4.5±0.02ms    27.5 MB/sec
bool_non_null/zstd_parquet_2                       1.00      9.4±0.03ms    13.3 MB/sec    1.00      9.4±0.03ms    13.3 MB/sec
float_with_nans/bloom_filter                       1.00     92.9±0.23ms   150.6 MB/sec    1.01     93.5±0.46ms   149.8 MB/sec
float_with_nans/cdc                                1.00     81.7±0.20ms   171.4 MB/sec    1.00     81.9±0.24ms   171.0 MB/sec
float_with_nans/default                            1.00     74.5±0.14ms   187.9 MB/sec    1.01     75.0±0.36ms   186.8 MB/sec
float_with_nans/parquet_2                          1.00     94.7±0.22ms   147.9 MB/sec    1.01     95.5±0.28ms   146.6 MB/sec
float_with_nans/zstd                               1.00    111.9±0.19ms   125.1 MB/sec    1.00    112.5±0.28ms   124.5 MB/sec
float_with_nans/zstd_parquet_2                     1.00    131.9±0.23ms   106.2 MB/sec    1.01    132.7±0.24ms   105.5 MB/sec
large_string_non_null/bloom_filter                                                        1.00     72.9±0.14ms     3.4 GB/sec
large_string_non_null/cdc                                                                 1.00    242.5±1.13ms  1055.8 MB/sec
large_string_non_null/default                                                             1.00     55.3±0.10ms     4.5 GB/sec
large_string_non_null/parquet_2                                                           1.00     55.3±0.10ms     4.5 GB/sec
large_string_non_null/zstd                                                                1.00     55.4±0.12ms     4.5 GB/sec
large_string_non_null/zstd_parquet_2                                                      1.00     55.4±0.13ms     4.5 GB/sec
list_primitive/bloom_filter                        1.00    334.0±1.24ms  1632.6 MB/sec    1.07    355.8±2.91ms  1532.9 MB/sec
list_primitive/cdc                                 1.00    367.8±1.39ms  1482.6 MB/sec    1.03    377.7±6.60ms  1444.0 MB/sec
list_primitive/default                             1.00    256.3±1.58ms     2.1 GB/sec    1.07    274.2±6.32ms  1989.1 MB/sec
list_primitive/parquet_2                           1.00    277.5±0.59ms  1965.6 MB/sec    1.00    278.8±0.38ms  1956.0 MB/sec
list_primitive/zstd                                1.00    509.7±3.72ms  1070.0 MB/sec    1.00    511.4±2.46ms  1066.5 MB/sec
list_primitive/zstd_parquet_2                      1.00    500.7±0.46ms  1089.2 MB/sec    1.00    502.0±0.43ms  1086.4 MB/sec
list_primitive_non_null/bloom_filter               1.00    393.9±4.65ms  1381.8 MB/sec    1.15   454.5±10.66ms  1197.5 MB/sec
list_primitive_non_null/cdc                        1.01    441.7±7.12ms  1232.2 MB/sec    1.00    438.8±7.79ms  1240.2 MB/sec
list_primitive_non_null/default                    1.00    263.3±3.61ms     2.0 GB/sec    1.19   312.1±12.73ms  1743.8 MB/sec
list_primitive_non_null/parquet_2                  1.00    293.7±1.12ms  1853.2 MB/sec    1.19    349.2±0.64ms  1558.7 MB/sec
list_primitive_non_null/zstd                       1.00    686.2±4.92ms   793.1 MB/sec    1.04   714.2±15.84ms   762.0 MB/sec
list_primitive_non_null/zstd_parquet_2             1.00    669.8±1.26ms   812.6 MB/sec    1.00    669.5±0.80ms   812.9 MB/sec
list_primitive_sparse_99pct_null/bloom_filter      1.00     11.6±0.05ms     3.1 GB/sec    1.05     12.3±0.06ms     3.0 GB/sec
list_primitive_sparse_99pct_null/cdc               1.00     23.4±0.08ms  1598.7 MB/sec    1.04     24.2±0.09ms  1541.4 MB/sec
list_primitive_sparse_99pct_null/default           1.00     11.3±0.04ms     3.2 GB/sec    1.06     12.0±0.07ms     3.0 GB/sec
list_primitive_sparse_99pct_null/parquet_2         1.00     11.3±0.04ms     3.2 GB/sec    1.06     12.0±0.07ms     3.0 GB/sec
list_primitive_sparse_99pct_null/zstd              1.00     13.1±0.04ms     2.8 GB/sec    1.06     13.9±0.08ms     2.6 GB/sec
list_primitive_sparse_99pct_null/zstd_parquet_2    1.00     11.5±0.07ms     3.2 GB/sec    1.06     12.2±0.07ms     3.0 GB/sec
primitive/bloom_filter                             1.00    150.6±0.40ms   297.9 MB/sec    1.01    151.6±0.59ms   295.9 MB/sec
primitive/cdc                                      1.00    159.8±0.56ms   280.9 MB/sec    1.01    161.1±0.68ms   278.5 MB/sec
primitive/default                                  1.00    118.7±0.30ms   378.1 MB/sec    1.01    119.8±0.45ms   374.6 MB/sec
primitive/parquet_2                                1.00    133.6±0.30ms   335.8 MB/sec    1.01    134.6±0.42ms   333.3 MB/sec
primitive/zstd                                     1.00    148.5±0.31ms   302.1 MB/sec    1.01    149.4±0.42ms   300.4 MB/sec
primitive/zstd_parquet_2                           1.00    167.2±0.27ms   268.4 MB/sec    1.01    168.1±0.71ms   267.0 MB/sec
primitive_all_null/bloom_filter                    1.00    900.6±3.03µs    48.7 GB/sec    1.00    896.2±2.17µs    48.9 GB/sec
primitive_all_null/cdc                             1.03     19.2±0.37ms     2.3 GB/sec    1.00     18.7±0.33ms     2.3 GB/sec
primitive_all_null/default                         1.00    274.0±0.74µs   159.9 GB/sec    1.01    277.2±0.89µs   158.1 GB/sec
primitive_all_null/parquet_2                       1.01    280.5±1.09µs   156.2 GB/sec    1.00    279.0±0.89µs   157.1 GB/sec
primitive_all_null/zstd                            1.00    388.0±0.90µs   112.9 GB/sec    1.01    393.2±0.86µs   111.5 GB/sec
primitive_all_null/zstd_parquet_2                  1.00    356.4±1.22µs   122.9 GB/sec    1.01    360.4±0.92µs   121.6 GB/sec
primitive_non_null/bloom_filter                    1.00    108.7±0.29ms   404.7 MB/sec    1.00    108.4±0.28ms   406.1 MB/sec
primitive_non_null/cdc                             1.00     91.0±0.28ms   483.4 MB/sec    1.00     90.7±0.63ms   484.9 MB/sec
primitive_non_null/default                         1.00     68.1±0.16ms   646.4 MB/sec    1.00     68.3±0.20ms   644.1 MB/sec
primitive_non_null/parquet_2                       1.01     90.7±0.27ms   485.3 MB/sec    1.00     89.9±0.27ms   489.5 MB/sec
primitive_non_null/zstd                            1.07    105.6±0.94ms   416.8 MB/sec    1.00     98.9±0.23ms   445.0 MB/sec
primitive_non_null/zstd_parquet_2                  1.05    129.7±2.61ms   339.2 MB/sec    1.00    123.7±0.16ms   355.8 MB/sec
primitive_sparse_99pct_null/bloom_filter           1.00     12.3±0.09ms     3.6 GB/sec    1.03     12.7±0.16ms     3.4 GB/sec
primitive_sparse_99pct_null/cdc                    1.00     30.1±0.31ms  1489.3 MB/sec    1.05     31.6±0.24ms  1421.6 MB/sec
primitive_sparse_99pct_null/default                1.00     10.7±0.06ms     4.1 GB/sec    1.00     10.8±0.06ms     4.1 GB/sec
primitive_sparse_99pct_null/parquet_2              1.00     10.8±0.07ms     4.1 GB/sec    1.01     10.9±0.06ms     4.0 GB/sec
primitive_sparse_99pct_null/zstd                   1.00     14.2±0.07ms     3.1 GB/sec    1.01     14.3±0.07ms     3.1 GB/sec
primitive_sparse_99pct_null/zstd_parquet_2         1.00     12.8±0.06ms     3.4 GB/sec    1.00     12.8±0.09ms     3.4 GB/sec
short_string_non_null/bloom_filter                                                        1.00     27.5±0.06ms   436.9 MB/sec
short_string_non_null/cdc                                                                 1.00     20.1±0.06ms   597.2 MB/sec
short_string_non_null/default                                                             1.00     15.9±0.04ms   753.4 MB/sec
short_string_non_null/parquet_2                                                           1.00     25.7±0.19ms   466.6 MB/sec
short_string_non_null/zstd                                                                1.00     35.5±0.08ms   338.2 MB/sec
short_string_non_null/zstd_parquet_2                                                      1.00     28.4±0.06ms   422.1 MB/sec
string/bloom_filter                                1.00   215.7±19.88ms     2.4 GB/sec    1.00   215.9±14.00ms     2.4 GB/sec
string/cdc                                         1.00    220.3±4.46ms     2.3 GB/sec    1.00   221.2±10.07ms     2.3 GB/sec
string/default                                     1.00   125.3±20.12ms     4.1 GB/sec    1.03   129.1±15.98ms     4.0 GB/sec
string/parquet_2                                   1.00    111.2±6.38ms     4.6 GB/sec    1.67    186.2±0.55ms     2.7 GB/sec
string/zstd                                        1.00    415.9±1.93ms  1260.4 MB/sec    1.08   449.9±21.48ms  1165.2 MB/sec
string/zstd_parquet_2                              1.00    402.1±6.51ms  1303.9 MB/sec    1.01   405.4±16.19ms  1293.2 MB/sec
string_and_binary_view/bloom_filter                1.00     63.7±0.17ms   506.5 MB/sec    1.02     65.0±0.30ms   496.1 MB/sec
string_and_binary_view/cdc                         1.00     58.7±0.11ms   549.1 MB/sec    1.03     60.3±0.24ms   534.9 MB/sec
string_and_binary_view/default                     1.00     48.0±0.13ms   672.1 MB/sec    1.02     48.9±0.25ms   659.8 MB/sec
string_and_binary_view/parquet_2                   1.00     59.0±0.14ms   546.4 MB/sec    1.02     60.1±0.32ms   536.7 MB/sec
string_and_binary_view/zstd                        1.00     84.5±0.13ms   381.7 MB/sec    1.01     85.6±0.25ms   376.8 MB/sec
string_and_binary_view/zstd_parquet_2              1.00     72.9±0.12ms   442.7 MB/sec    1.02     73.9±0.31ms   436.1 MB/sec
string_dictionary/bloom_filter                     1.00     89.8±1.05ms     2.9 GB/sec    1.05     94.3±0.57ms     2.7 GB/sec
string_dictionary/cdc                              1.00     52.4±0.95ms     4.9 GB/sec    1.04     54.4±1.61ms     4.7 GB/sec
string_dictionary/default                          1.00     46.6±0.85ms     5.5 GB/sec    1.10     51.0±0.29ms     5.1 GB/sec
string_dictionary/parquet_2                        1.00     54.1±0.24ms     4.8 GB/sec    1.02     55.1±0.50ms     4.7 GB/sec
string_dictionary/zstd                             1.00    209.2±1.56ms  1262.7 MB/sec    1.01    212.1±0.80ms  1245.0 MB/sec
string_dictionary/zstd_parquet_2                   1.00    198.5±0.33ms  1330.6 MB/sec    1.00    199.1±0.42ms  1326.4 MB/sec
string_non_null/bloom_filter                       1.01   250.8±13.88ms     2.0 GB/sec    1.00    247.6±6.37ms     2.1 GB/sec
string_non_null/cdc                                1.00    266.8±2.89ms  1963.7 MB/sec    1.02   271.2±11.39ms  1932.2 MB/sec
string_non_null/default                            1.00   137.2±11.83ms     3.7 GB/sec    1.00    136.7±8.38ms     3.7 GB/sec
string_non_null/parquet_2                          1.06    131.1±3.11ms     3.9 GB/sec    1.00    123.3±0.71ms     4.2 GB/sec
string_non_null/zstd                               1.00    537.0±3.08ms   975.7 MB/sec    1.04    558.4±5.22ms   938.5 MB/sec
string_non_null/zstd_parquet_2                     1.00    503.2±0.58ms  1041.3 MB/sec    1.00    504.0±0.48ms  1039.7 MB/sec
struct_all_null/bloom_filter                       1.02    377.9±1.34µs    41.7 GB/sec    1.00    370.1±0.96µs    42.5 GB/sec
struct_all_null/cdc                                1.04      7.9±0.18ms  2047.5 MB/sec    1.00      7.6±0.07ms     2.1 GB/sec
struct_all_null/default                            1.00    119.0±0.37µs   132.4 GB/sec    1.01    120.5±0.43µs   130.7 GB/sec
struct_all_null/parquet_2                          1.00    120.6±0.36µs   130.6 GB/sec    1.00    121.0±0.35µs   130.2 GB/sec
struct_all_null/zstd                               1.00    166.6±0.44µs    94.5 GB/sec    1.01    168.1±0.27µs    93.7 GB/sec
struct_all_null/zstd_parquet_2                     1.00    153.0±0.54µs   102.9 GB/sec    1.01    154.5±0.51µs   101.9 GB/sec
struct_non_null/bloom_filter                       1.01     46.8±0.14ms   341.7 MB/sec    1.00     46.4±0.16ms   344.9 MB/sec
struct_non_null/cdc                                1.00     45.9±0.12ms   348.2 MB/sec    1.00     45.7±0.15ms   349.8 MB/sec
struct_non_null/default                            1.00     32.3±0.10ms   495.1 MB/sec    1.00     32.2±0.17ms   496.6 MB/sec
struct_non_null/parquet_2                          1.01     41.1±0.12ms   389.6 MB/sec    1.00     40.9±0.11ms   391.6 MB/sec
struct_non_null/zstd                               1.00     41.0±0.11ms   390.0 MB/sec    1.00     41.0±0.10ms   390.5 MB/sec
struct_non_null/zstd_parquet_2                     1.00     55.1±0.12ms   290.6 MB/sec    1.00     54.8±0.09ms   292.0 MB/sec
struct_sparse_99pct_null/bloom_filter              1.01      6.6±0.07ms     2.4 GB/sec    1.00      6.5±0.03ms     2.4 GB/sec
struct_sparse_99pct_null/cdc                       1.00     13.7±0.09ms  1179.1 MB/sec    1.07     14.6±0.10ms  1105.2 MB/sec
struct_sparse_99pct_null/default                   1.00      6.0±0.04ms     2.6 GB/sec    1.00      6.0±0.02ms     2.6 GB/sec
struct_sparse_99pct_null/parquet_2                 1.00      6.0±0.04ms     2.6 GB/sec    1.00      5.9±0.02ms     2.6 GB/sec
struct_sparse_99pct_null/zstd                      1.00      7.3±0.04ms     2.1 GB/sec    1.00      7.4±0.04ms     2.1 GB/sec
struct_sparse_99pct_null/zstd_parquet_2            1.00      6.8±0.05ms     2.3 GB/sec    1.01      6.8±0.04ms     2.3 GB/sec
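
As a reading aid for the table above: the ratio column appears to be each side's mean time divided by the faster of the two means, so the faster side shows 1.00. A minimal Rust sketch of that arithmetic follows; `relative_ratios` and `fmt_ratio` are hypothetical names introduced here, not part of the benchmark runner, and the normalization rule is an assumption inferred from the rows.

```rust
/// Hypothetical helper for reading the table; not part of the benchmark runner.
/// Returns (main_ratio, branch_ratio), each relative to the faster of the two means.
fn relative_ratios(main_ms: f64, branch_ms: f64) -> (f64, f64) {
    // Assumption: the faster side is the baseline and is reported as 1.00.
    let fastest = main_ms.min(branch_ms);
    (main_ms / fastest, branch_ms / fastest)
}

/// Formats a ratio with two decimal places, matching the table's layout.
fn fmt_ratio(ratio: f64) -> String {
    format!("{:.2}", ratio)
}
```

For example, `relative_ratios(111.2, 186.2)` (the `string/parquet_2` row) formats as 1.00 and 1.67. The means printed in the table are already rounded, so ratios recomputed this way for other rows may differ in the last digit.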

Resource Usage

base (merge-base)

Metric Value
Wall time 1920.4s
Peak memory 6.6 GiB
Avg memory 6.4 GiB
CPU user 1883.3s
CPU sys 33.0s
Peak spill 0 B

branch

Metric Value
Wall time 2080.5s
Peak memory 6.8 GiB
Avg memory 6.6 GiB
CPU user 2011.8s
CPU sys 64.3s
Peak spill 0 B


@adriangb adriangb requested a review from etseidl May 26, 2026 15:34
@adriangb
Contributor Author

@etseidl after much profiling and debugging, I've been able to get this to work with no performance impact (within noise). I recognize this is a non-trivial change, but it introduces no public APIs, and if it turns out to be problematic in any way we can back it out. The benefit of doing things this way is that we automatically fix problematic page size blowouts for everyone, with no code changes on their end and no need to guess column sizes.

One thing we could do to derisk if you want: add a config option to disable this behavior.

Thanks for reviewing this, I hope we can make it work 😄

@etseidl sorry to bug you again. I've re-stacked the commits to make the diff easier to review. A good chunk of the diff is regression tests and benchmarks, which I've split out into their own commits. Let me know if there's anything else I can do to make this more palatable.

@etseidl
Contributor

etseidl commented May 26, 2026

Sorry @adriangb, I've been too slammed recently to follow along on this one. I'll try to carve out some time to do a deep dive. Thanks for your patience 🙏

@adriangb
Contributor Author

No worries, thanks for taking the time. I've seen all of the amazing work you've been doing for Parquet itself!

/// for the case where individual values are small enough that the byte-budget
/// based sub-batch sizing in `write_batch_internal` should always resolve to
/// the full chunk (no granular splitting, no regression vs. current behavior).
fn create_short_string_bench_batch(size: usize) -> Result<RecordBatch> {
Contributor

Let's break the bench update into a separate PR so we can see the difference in the large string case. I'm seeing a 13% slowdown vs main, but that may just be the price for getting smaller batches.

Contributor Author

Makes sense, here you go! #10021

etseidl pushed a commit that referenced this pull request May 26, 2026
…10021)

# Which issue does this PR close?

Split out of #9972 per [this review
comment](#9972 (comment)).

# Rationale for this change

#9972 makes the parquet writer's mini-batch sizing byte-budget aware so
large variable-width values don't produce oversized data pages. To
measure that change against a stable baseline — and in particular to see
the difference in the large-string case — these benchmarks belong on
`main` first.

# What changes are included in this PR?

Adds two BYTE_ARRAY write cases to the `arrow_writer` criterion bench:

- **`short_string_non_null`** — 1M fixed-width 8-byte strings. The
small-value hot path, where byte-budget-based sub-batch sizing should
always resolve to the full chunk (no granular splitting, no regression).
- **`large_string_non_null`** — 1024 × 256 KiB strings (256 MiB total).
The large-value case: with the default 1 MiB page byte limit each value
needs its own page, and a `write_batch_size` of 1024 would otherwise
buffer all 256 MiB before the post-write size check runs.

No library code changes — benchmarks only.
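
For context, the sizing rule from the #9972 description that these cases exercise (`sub_batch_size = page_byte_limit / bytes_per_value`, clamped to at least 1) can be sketched in Rust as below. This is an illustration under stated assumptions, not the writer's actual code: `sub_batch_size` is a hypothetical free function, the average is taken over the whole chunk, and per-value encoding overhead is ignored.

```rust
/// Hypothetical sketch of byte-budget-aware sub-batch sizing; not the
/// parquet writer's real implementation.
///
/// `page_byte_limit`: target data page size in bytes.
/// `chunk_bytes`: total estimated bytes of the values about to be written.
/// `chunk_len`: number of values in the chunk.
fn sub_batch_size(page_byte_limit: usize, chunk_bytes: usize, chunk_len: usize) -> usize {
    // Nothing to size: keep the chunk as-is.
    if chunk_len == 0 || chunk_bytes == 0 {
        return chunk_len;
    }
    // Average bytes per value, rounded up so large values are never undercounted.
    let bytes_per_value = chunk_bytes.div_ceil(chunk_len).max(1);
    // How many average-sized values fit in one page, at least 1 and at most
    // the whole chunk (small values stay on the full-batch path).
    (page_byte_limit / bytes_per_value).clamp(1, chunk_len)
}
```

With a 1 MiB limit, a chunk of 1024 8-byte strings keeps the full 1024-value batch, while a chunk of 5 MB values (the size mentioned in the PR description) drops to one value per sub-batch.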

# Are there any user-facing changes?

No.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@etseidl
Contributor

etseidl commented May 26, 2026

run benchmark arrow_writer

env:
  BENCH_FILTER: large|short

@adriangbot

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4549962922-339-bslq9 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux


Comparing parquet-page-size-mid-batch (11c9d51) to e28fd0d (merge-base) diff
BENCH_NAME=arrow_writer
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_writer
BENCH_FILTER=large|short
Results will be posted here when complete



@adriangbot

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                                   parquet-page-size-mid-batch
-----                                   ---------------------------
large_string_non_null/bloom_filter      1.00     72.7±0.09ms     3.4 GB/sec
large_string_non_null/cdc               1.00    242.0±0.99ms  1058.0 MB/sec
large_string_non_null/default           1.00     55.3±0.08ms     4.5 GB/sec
large_string_non_null/parquet_2         1.00     55.2±0.09ms     4.5 GB/sec
large_string_non_null/zstd              1.00     55.5±1.70ms     4.5 GB/sec
large_string_non_null/zstd_parquet_2    1.00     55.4±0.26ms     4.5 GB/sec
short_string_non_null/bloom_filter      1.00     27.9±0.37ms   429.8 MB/sec
short_string_non_null/cdc               1.00     19.9±0.06ms   601.9 MB/sec
short_string_non_null/default           1.00     15.7±0.08ms   762.9 MB/sec
short_string_non_null/parquet_2         1.00     25.5±0.11ms   469.8 MB/sec
short_string_non_null/zstd              1.00     35.5±0.09ms   338.3 MB/sec
short_string_non_null/zstd_parquet_2    1.00     28.3±0.08ms   423.9 MB/sec

Resource Usage

base (merge-base)

Metric Value
Wall time 15.0s
Peak memory 5.9 GiB
Avg memory 4.9 GiB
CPU user 13.5s
CPU sys 1.2s
Peak spill 0 B

branch

Metric Value
Wall time 155.0s
Peak memory 6.4 GiB
Avg memory 6.2 GiB
CPU user 150.8s
CPU sys 0.8s
Peak spill 0 B

File an issue against this benchmark runner

@etseidl
Contributor

etseidl commented May 27, 2026

run benchmark arrow_writer

env:
  BENCH_FILTER: large|short

@adriangbot

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4550051754-340-kfv4w 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing parquet-page-size-mid-batch (79507a2) to bbbe8a6 (merge-base) diff
BENCH_NAME=arrow_writer
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_writer
BENCH_FILTER=large|short
Results will be posted here when complete



@adriangbot

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                                   main                                   parquet-page-size-mid-batch
-----                                   ----                                   ---------------------------
large_string_non_null/bloom_filter      1.08     76.5±0.45ms     3.3 GB/sec    1.00     70.9±0.11ms     3.5 GB/sec
large_string_non_null/cdc               1.00    239.5±1.07ms  1068.8 MB/sec    1.01    240.9±0.98ms  1062.5 MB/sec
large_string_non_null/default           1.07     57.3±0.40ms     4.4 GB/sec    1.00     53.4±0.10ms     4.7 GB/sec
large_string_non_null/parquet_2         1.08     57.2±0.41ms     4.4 GB/sec    1.00     53.2±0.10ms     4.7 GB/sec
large_string_non_null/zstd              1.08     57.3±0.38ms     4.4 GB/sec    1.00     53.2±0.09ms     4.7 GB/sec
large_string_non_null/zstd_parquet_2    1.07     57.2±0.38ms     4.4 GB/sec    1.00     53.4±0.23ms     4.7 GB/sec
short_string_non_null/bloom_filter      1.00     27.0±0.08ms   443.7 MB/sec    1.03     27.9±0.10ms   429.4 MB/sec
short_string_non_null/cdc               1.00     19.7±0.11ms   608.2 MB/sec    1.00     19.8±0.07ms   605.4 MB/sec
short_string_non_null/default           1.00     15.5±0.07ms   773.7 MB/sec    1.01     15.7±0.07ms   766.4 MB/sec
short_string_non_null/parquet_2         1.01     25.6±0.12ms   469.0 MB/sec    1.00     25.4±0.23ms   471.6 MB/sec
short_string_non_null/zstd              1.00     35.2±0.13ms   340.6 MB/sec    1.00     35.4±0.11ms   339.3 MB/sec
short_string_non_null/zstd_parquet_2    1.01     28.4±0.06ms   423.1 MB/sec    1.00     28.2±0.08ms   425.7 MB/sec

Resource Usage

base (merge-base)

Metric Value
Wall time 155.0s
Peak memory 6.4 GiB
Avg memory 6.2 GiB
CPU user 152.1s
CPU sys 1.5s
Peak spill 0 B

branch

Metric Value
Wall time 150.0s
Peak memory 6.4 GiB
Avg memory 6.3 GiB
CPU user 148.9s
CPU sys 0.9s
Peak spill 0 B


@etseidl
Contributor

etseidl commented May 27, 2026

Interesting, on my workstation the large string benches were consistently slower (probably due to earlier abandonment of the dictionary encoder). On ARM it seems to tilt the other way 🤷

Anyway, I did a first pass and I think this looks nice. I want to test it locally with some icky files I have, then I'll do a final pass.

Thanks @adriangb!

@adriangb
Contributor Author

Yeah I had to spend a lot of time messing with code structure because small layout differences cascade into measurable differences. Thanks for digging deep into this.

let len = (offsets[idx + 1] - offsets[*idx]).as_usize() + prefix_overhead;
cum = cum.saturating_add(len);
if cum > byte_budget {
return i.max(1);
Contributor

I did some testing on a file consisting of 128-byte strings. With the max page size set to 64000, I wind up with a file whose pages follow a repeating pattern of 968/540/540 values. This is because this line returns a size of 484 (the floor of 64000/132). The first mini-batch of 484 lands just under the 64k threshold, so we add the next 484 from the batch to get 968, which leaves 56 rows in that batch. On the next batch, the first 484 rows are appended to those 56 to get a page of 540; the remaining 484 + 56 rows of that batch of 1024 then make another page of 540. And then the pattern repeats. If we instead just return i + 1 here, that eliminates the need for the .max(1) and gives us a mini-batch size just over the requested threshold. Now I see a repeating pattern of 485/539.

I wonder if there's a way to smooth this some. We can't really change the batch size being passed in, but given a batch, maybe we can add some kind of heuristic here that can figure out first that multiple mini-batches are needed then divides the batch size by the number of batches to smooth this out some. Naively something like

        if cum > byte_budget {
            // return (i + 1).max(1);
            // Split the batch evenly across the number of mini-batches it needs.
            let num_batches = (n / (i + 1)).max(1);
            return (n / num_batches).max(1);
        }

This overshoots some by producing mini-batches of 512. Dividing by 3 would undershoot, but then we need two mini-batches to fill a page so that winds up overshooting even more.
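
The page patterns described above can be reproduced with a small model. This is a minimal sketch under stated assumptions, not the writer's actual code: `simulate_pages` is a hypothetical helper, every value is assumed to encode to the same number of bytes (132 here, i.e. 128 bytes of payload plus a 4-byte length prefix), and a page is assumed to be flushed after a mini-batch once its byte size has reached the limit.

```rust
/// Simulate how many values land on each data page.
///
/// Hypothetical model: rows arrive in chunks of `chunk_rows`, each chunk is
/// written in mini-batches of at most `sub_batch` rows, and the page size is
/// only checked after a mini-batch has been written.
fn simulate_pages(
    total_rows: usize,
    chunk_rows: usize,
    value_bytes: usize,
    page_limit: usize,
    sub_batch: usize,
) -> Vec<usize> {
    assert!(chunk_rows > 0 && sub_batch > 0);
    let mut pages = Vec::new();
    let mut rows_in_page = 0;
    let mut written = 0;
    while written < total_rows {
        let chunk = chunk_rows.min(total_rows - written);
        let mut done = 0;
        while done < chunk {
            let n = sub_batch.min(chunk - done);
            rows_in_page += n;
            done += n;
            // Post-write check: flush once the page has reached the limit.
            if rows_in_page * value_bytes >= page_limit {
                pages.push(rows_in_page);
                rows_in_page = 0;
            }
        }
        written += chunk;
    }
    if rows_in_page > 0 {
        pages.push(rows_in_page); // final partial page
    }
    pages
}
```

Under this model, three batches of 1024 rows with a mini-batch of 484 give pages of 968, 540, 540, 968 and a trailing 56; a mini-batch of 485 gives 485, 485, 539, 485, 539, 485 and a trailing 54; a mini-batch of 512 gives six pages of 512.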

Contributor

Another option, given the above, is to do 3 batches of around 341 but change write_mini_batch to take a flag that forces a page flush immediately after writing. Then we wind up with more pages, but they're all under the budget.

Contributor Author

Good catch.

I applied your return i + 1 suggestion in f8f2a52.

I'd rather keep something simple and obviously cheap, some overshoot is okay as long as it's not unbounded as it was before, I don't think we need to be exact, best effort is still okay.

Contributor

> I'd rather keep something simple and obviously cheap, some overshoot is okay as long as it's not unbounded as it was before, I don't think we need to be exact, best effort is still okay.

Sounds reasonable. I like simple :D

adriangb and others added 5 commits May 28, 2026 10:54
Add the regression tests first, before any fix (TDD). With unmodified
`main` the page-bounding assertions fail: the column writer only checks
the data/dictionary page byte limit *after* each `write_batch_size`
mini-batch, so large variable-width values pile into a single oversized
page (we've observed 2 GiB data pages and ~256x dictionary-page
overshoot at default settings).

Column-writer tests (`ColumnValueEncoderImpl` path):
- large BYTE_ARRAY values cap data pages near one value each
- large values inside a repeated/list column (record-boundary stepping)
- nullable column (value vs level counting)
- dictionary spill then plain-encode large values
- large distinct values bound the dictionary page
- FIXED_LEN_BYTE_ARRAY byte budget

Arrow-writer tests (`ByteArrayEncoder` path, what real users hit):
- large `Utf8` strings via `ArrowWriter`
- mixed small/large strings round-trip bit-identically
- large `Utf8View` strings
- all-null string column stays correct

The subsequent commits make each of these pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make the mini-batch size byte-budget aware so the post-write page check
fires before a data page grows unbounded:

- New `ColumnValueEncoder::count_values_within_byte_budget{,_gather}`
  trait methods (default `None` = "no estimate; stay batched"), with a
  concrete impl on `ColumnValueEncoderImpl` driven by
  `plain_encoded_byte_size`. Fixed-width physical types answer in one
  division; only variable-width BYTE_ARRAY/FLBA walk values, exiting at
  the first that overruns the budget.
- New `LevelDataRef::value_count` converts a chunk's level span into a
  leaf-value count (O(1) for flat columns, def-level scan for
  nullable/nested), with a unit test.
- New `ByteBudgetChunker` picks the largest sub-batch that fits one page
  budget. For the common case (small or fixed-width values) it returns
  the full chunk with no value inspection, so the hot path is unchanged.
- `write_batch_internal` consults the chunker per chunk and, only when a
  chunk would overflow, routes through the new `write_granular_chunk`,
  which sub-batches so the post-write check fires in time.
  Repeated/nested columns step on record (rep == 0) boundaries so a
  record never spans pages.

This makes the column-writer data-page, list, nullable and FLBA
regression tests pass. Dictionary-encoding columns are still left on the
batched path (the data page holds only small RLE indices) — bounding the
dictionary page is a separate commit, so the two dictionary tests and
the arrow `ByteArrayEncoder` tests do not pass yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
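
As a rough illustration of the counting step this commit describes (one division for fixed-width types, an early-exit walk for variable-width ones), here is a minimal sketch. The `Widths` enum and `count_within_budget` are hypothetical names invented for this example and are not the trait methods added by the commit; the sketch uses the "largest count that stays under the budget" convention and assumes `n >= 1`.

```rust
/// Hypothetical description of a column's value widths; not a parquet type.
enum Widths<'a> {
    /// Every value encodes to the same number of bytes (e.g. 8 for INT64).
    Fixed(usize),
    /// Per-value payload lengths plus a fixed per-value length prefix.
    Variable { lens: &'a [usize], prefix: usize },
}

/// Number of leading values (out of `n`) whose encoded size stays within
/// `budget`, clamped to at least 1 so the writer always makes progress.
fn count_within_budget(widths: &Widths, n: usize, budget: usize) -> usize {
    match widths {
        // Fixed width: a single division, no value inspection.
        Widths::Fixed(width) => (budget / (*width).max(1)).min(n).max(1),
        // Variable width: walk the lengths, stop at the first overrun.
        Widths::Variable { lens, prefix } => {
            let mut cum = 0usize;
            for (i, len) in lens.iter().take(n).enumerate() {
                cum = cum.saturating_add(*len + *prefix);
                if cum > budget {
                    return i.max(1);
                }
            }
            n.min(lens.len()).max(1)
        }
    }
}
```

With a 100-byte budget and a 4-byte prefix, lengths of 10, 20, 5000, 10 yield 2 (the third value overruns), while a first value of 5000 still yields 1 because of the clamp.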
Implement `ColumnValueEncoder::count_values_within_byte_budget_gather`
for `ByteArrayEncoder`, the encoder real `ArrowWriter` users hit. This
makes the page-size bound fire for arrow string/binary columns; the
previous commit only wired the generic column-writer path. Makes the
arrow-writer regression tests pass.

The implementation stays off the hot path for small values via cheap
O(1) upper bounds before any per-value walk:

- Offset-backed arrays (`Utf8`/`LargeUtf8`/`Binary`/`LargeBinary`): the
  span `offsets[last+1] - offsets[first]` bounds the chunk's payload in
  O(1); if it fits, every value fits. The span is exact even for
  nullable columns (skipped positions are nulls with zero offset delta),
  so sparse `indices` skip the per-value walk too.
- View arrays (`Utf8View`/`BinaryView`): lengths live in the low 32 bits
  of each view word, so an O(1) `n * (max_value_len + 4)` bound skips the
  scan in the common case; otherwise scan lengths with no data-buffer
  deref.
- Dictionary input: treated as always-fitting — dict-encoded arrow input
  implies values small enough to dedup, the opposite of the blob case
  this targets, and a per-key walk measurably regressed the bench.
- FixedSizeBinary: falls through to the generic accessor walk.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
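
The O(1) upper bounds listed above can be sketched as follows. This is an illustration under assumptions rather than the code in this commit: the function names are made up, offsets are taken to be `i32` as in Arrow's `Utf8`/`Binary` layout, and a 4-byte length prefix per value is assumed for the plain encoding.

```rust
/// Assumed per-value length prefix of the plain BYTE_ARRAY encoding.
const LEN_PREFIX: usize = 4;

/// Offset-backed arrays: the payload of values `first..=last` is the
/// difference of two offsets, so the fit check needs no per-value walk.
/// `offsets` has one more entry than there are values.
fn offset_chunk_fits(offsets: &[i32], first: usize, last: usize, budget: usize) -> bool {
    let payload = (offsets[last + 1] - offsets[first]) as usize;
    let n = last - first + 1;
    payload.saturating_add(n.saturating_mul(LEN_PREFIX)) <= budget
}

/// View arrays: the value length sits in the low 32 bits of each 128-bit view.
fn view_len(view: u128) -> usize {
    view as u32 as usize
}

/// View arrays: `n * (max_value_len + 4)` bounds the chunk from above, so a
/// chunk that passes this check can skip the per-view length scan.
fn view_chunk_fits(n: usize, max_value_len: usize, budget: usize) -> bool {
    n.saturating_mul(max_value_len.saturating_add(LEN_PREFIX)) <= budget
}
```

Null slots contribute a zero offset delta, which is why the span stays exact for nullable offset-backed columns in this model.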
While a column dictionary-encodes, the data page holds only small RLE
indices but the *dictionary* page accumulates the distinct values
themselves, and its spill check runs only once per mini-batch. A
mini-batch of large distinct values therefore interns
`write_batch_size * value_size` bytes into the dictionary page before
the check fires — ~256x the limit in the worst case.

Extend `ByteBudgetChunker` to bound the dictionary-encoding phase too:
when `has_dictionary()`, size the mini-batch against the dictionary
page's *remaining* budget (`dict_page_byte_limit - estimated_dict_page_size`)
rather than the data page. Fixed-width columns short-circuit via a
precomputed `static_dict_always_fits`, so only large variable-width
distinct values pay the per-value walk. Makes the two dictionary
regression tests pass.

`arrow_writer_layout`'s `test_string` updates accordingly: the
dictionary page is now bounded at its 1000-byte limit and spills one
mini-batch earlier (125 rows rather than 130) instead of overshooting to
1040.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
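
A small sketch of sizing a mini-batch against the dictionary page's remaining budget, as described in this commit. Both helpers are hypothetical, and the sketch assumes the worst case in which every incoming value is a new distinct entry of the same encoded size. The 8-byte value size is inferred from the figures quoted above (1000 bytes for 125 rows, 1040 bytes for 130 rows).

```rust
/// Bytes still available before the dictionary page reaches its limit.
fn remaining_dict_budget(dict_page_byte_limit: usize, estimated_dict_page_size: usize) -> usize {
    dict_page_byte_limit.saturating_sub(estimated_dict_page_size)
}

/// Hypothetical sizing rule: admit as many values as fit in the remaining
/// budget, never more than the chunk holds and never fewer than one.
fn dict_mini_batch_len(remaining: usize, value_bytes: usize, chunk_len: usize) -> usize {
    (remaining / value_bytes.max(1)).min(chunk_len).max(1)
}
```

With a 1000-byte limit and an empty dictionary, 8-byte values give a mini-batch of 125 rows; once the dictionary already holds 600 bytes, only 50 more rows are admitted before the spill check runs again.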
The variable-width byte-budget walks returned the largest count whose
cumulative encoded size was *under* the budget, so each mini-batch
ended just short of the page threshold. When the input row batch did
not divide evenly into mini-batches, the remainder rolled into the
next page and produced a bimodal page-size pattern (e.g. 128B values,
64KB budget, 1024-row batches: 968 / 540 / 540 ... values per page).

Return the boundary value's index + 1 instead, so the mini-batch
crosses the threshold by exactly one value and the caller's page-flush
check trips immediately, with no leftover sliver carried into the next
page. The worst-case overshoot per page is one value's encoded size,
which already matched the previous behavior whenever a single value
alone exceeded the budget (the dropped .max(1) floor).

Reported by Ed Seidel in apache#9972 review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@adriangb adriangb force-pushed the parquet-page-size-mid-batch branch from 79507a2 to f8f2a52 Compare May 28, 2026 16:29
@adriangb
Contributor Author

run benchmark arrow_writer

@adriangbot

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4566182879-352-rtw82 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux

Comparing parquet-page-size-mid-batch (f8f2a52) to e470187 (merge-base) diff
BENCH_NAME=arrow_writer
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_writer
BENCH_FILTER=
Results will be posted here when complete



The mini-batch byte-budget walk now includes the value that crosses
the budget, so the dictionary in the spill sub-test fills at 126 rows
(1008 bytes) instead of 125 rows (1000 bytes), and the downstream
plain page picks up 1254 rows / 10032 bytes instead of 1255 / 10040.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
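
For uniform value sizes the two return conventions reduce to simple arithmetic, which is where the 125-versus-126 row figures come from. The sketch below is a worked example with hypothetical function names; the 8-byte encoded size is inferred from the byte and row counts quoted in this commit message.

```rust
/// Earlier convention: largest count whose cumulative size does not exceed
/// the budget (clamped to at least 1).
fn largest_under(budget: usize, value_bytes: usize) -> usize {
    (budget / value_bytes.max(1)).max(1)
}

/// Current convention: include the value that crosses the budget, so the
/// post-write page check trips immediately.
fn first_crossing(budget: usize, value_bytes: usize) -> usize {
    budget / value_bytes.max(1) + 1
}
```

For a 1000-byte budget and 8-byte values this gives 125 rows (1000 bytes) versus 126 rows (1008 bytes); for a 64000-byte budget and 132-byte values it gives 484 versus 485.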
Contributor

@etseidl etseidl left a comment


Thanks @adriangb, I think this is a nice addition. It's a lot to go through, but it is well documented and pretty easy to follow. I'd like to get more eyes on this, however. Perhaps @alamb or @HippoBaro can take a look.

@adriangbot

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                                              main                                   parquet-page-size-mid-batch
-----                                              ----                                   ---------------------------
bool/bloom_filter                                  1.04     13.2±0.07ms    18.9 MB/sec    1.00     12.8±0.04ms    19.6 MB/sec
bool/cdc                                           1.03     16.2±0.08ms    15.4 MB/sec    1.00     15.7±0.04ms    15.9 MB/sec
bool/default                                       1.04     11.1±0.06ms    22.5 MB/sec    1.00     10.7±0.03ms    23.4 MB/sec
bool/parquet_2                                     1.03     15.0±0.07ms    16.7 MB/sec    1.00     14.5±0.04ms    17.2 MB/sec
bool/zstd                                          1.04     11.7±0.06ms    21.4 MB/sec    1.00     11.2±0.03ms    22.3 MB/sec
bool/zstd_parquet_2                                1.05     15.6±0.13ms    16.0 MB/sec    1.00     14.9±0.04ms    16.8 MB/sec
bool_non_null/bloom_filter                         1.02      7.0±0.02ms    17.8 MB/sec    1.00      6.9±0.03ms    18.1 MB/sec
bool_non_null/cdc                                  1.04      6.9±0.23ms    18.1 MB/sec    1.00      6.6±0.04ms    18.8 MB/sec
bool_non_null/default                              1.03      4.3±0.02ms    29.2 MB/sec    1.00      4.1±0.02ms    30.2 MB/sec
bool_non_null/parquet_2                            1.01      9.1±0.03ms    13.8 MB/sec    1.00      9.0±0.04ms    13.9 MB/sec
bool_non_null/zstd                                 1.03      4.6±0.02ms    27.0 MB/sec    1.00      4.5±0.02ms    27.7 MB/sec
bool_non_null/zstd_parquet_2                       1.00      9.5±0.03ms    13.2 MB/sec    1.00      9.4±0.04ms    13.3 MB/sec
float_with_nans/bloom_filter                       1.00     92.4±0.31ms   151.5 MB/sec    1.00     92.5±0.28ms   151.3 MB/sec
float_with_nans/cdc                                1.00     81.4±0.17ms   172.0 MB/sec    1.00     81.6±0.70ms   171.5 MB/sec
float_with_nans/default                            1.00     73.9±0.13ms   189.5 MB/sec    1.00     74.2±0.23ms   188.8 MB/sec
float_with_nans/parquet_2                          1.00     94.2±0.26ms   148.6 MB/sec    1.00     94.2±0.18ms   148.6 MB/sec
float_with_nans/zstd                               1.00    111.5±0.14ms   125.6 MB/sec    1.00    111.9±0.29ms   125.1 MB/sec
float_with_nans/zstd_parquet_2                     1.00    131.1±0.24ms   106.8 MB/sec    1.00    131.7±0.42ms   106.3 MB/sec
large_string_non_null/bloom_filter                 1.00     72.4±0.20ms     3.5 GB/sec    1.01     72.8±0.09ms     3.4 GB/sec
large_string_non_null/cdc                          1.00    239.1±1.09ms  1070.7 MB/sec    1.02    244.3±1.15ms  1048.1 MB/sec
large_string_non_null/default                      1.00     52.5±0.19ms     4.8 GB/sec    1.05     55.0±0.08ms     4.5 GB/sec
large_string_non_null/parquet_2                    1.00     52.6±0.17ms     4.7 GB/sec    1.05     55.0±0.08ms     4.5 GB/sec
large_string_non_null/zstd                         1.00     52.6±0.20ms     4.8 GB/sec    1.05     55.0±0.08ms     4.5 GB/sec
large_string_non_null/zstd_parquet_2               1.00     52.6±0.18ms     4.8 GB/sec    1.05     55.1±0.09ms     4.5 GB/sec
list_primitive/bloom_filter                        1.07    356.5±1.63ms  1529.9 MB/sec    1.00    332.2±0.96ms  1641.5 MB/sec
list_primitive/cdc                                 1.02   374.1±10.22ms  1457.9 MB/sec    1.00    365.8±0.76ms  1490.7 MB/sec
list_primitive/default                             1.09    278.1±2.15ms  1961.0 MB/sec    1.00    255.4±1.31ms     2.1 GB/sec
list_primitive/parquet_2                           1.08    298.0±0.66ms  1830.3 MB/sec    1.00    277.0±0.27ms  1969.1 MB/sec
list_primitive/zstd                                1.02    516.4±2.55ms  1056.2 MB/sec    1.00    505.5±0.62ms  1078.9 MB/sec
list_primitive/zstd_parquet_2                      1.00    500.8±1.42ms  1088.9 MB/sec    1.00    500.1±0.29ms  1090.5 MB/sec
list_primitive_non_null/bloom_filter               1.09   444.9±10.80ms  1223.4 MB/sec    1.00    408.9±4.09ms  1331.0 MB/sec
list_primitive_non_null/cdc                        1.00   435.7±10.38ms  1249.1 MB/sec    1.00    436.8±7.09ms  1246.1 MB/sec
list_primitive_non_null/default                    1.13   314.2±14.83ms  1732.1 MB/sec    1.00    277.3±3.98ms  1962.8 MB/sec
list_primitive_non_null/parquet_2                  1.20    363.5±0.50ms  1497.2 MB/sec    1.00    302.5±7.83ms  1798.9 MB/sec
list_primitive_non_null/zstd                       1.00   698.8±14.82ms   778.9 MB/sec    1.00    698.1±5.95ms   779.7 MB/sec
list_primitive_non_null/zstd_parquet_2             1.06   714.0±18.42ms   762.3 MB/sec    1.00    670.5±0.49ms   811.7 MB/sec
list_primitive_sparse_99pct_null/bloom_filter      1.00     11.8±0.04ms     3.1 GB/sec    1.03     12.1±0.03ms     3.0 GB/sec
list_primitive_sparse_99pct_null/cdc               1.00     23.2±0.06ms  1609.8 MB/sec    1.02     23.6±0.05ms  1581.8 MB/sec
list_primitive_sparse_99pct_null/default           1.00     11.5±0.02ms     3.2 GB/sec    1.03     11.8±0.02ms     3.1 GB/sec
list_primitive_sparse_99pct_null/parquet_2         1.00     11.5±0.03ms     3.2 GB/sec    1.03     11.8±0.03ms     3.1 GB/sec
list_primitive_sparse_99pct_null/zstd              1.00     13.3±0.03ms     2.7 GB/sec    1.03     13.6±0.03ms     2.7 GB/sec
list_primitive_sparse_99pct_null/zstd_parquet_2    1.00     11.6±0.03ms     3.1 GB/sec    1.03     12.0±0.03ms     3.0 GB/sec
primitive/bloom_filter                             1.01    150.6±0.51ms   297.9 MB/sec    1.00    148.6±0.44ms   301.9 MB/sec
primitive/cdc                                      1.01    159.0±0.60ms   282.3 MB/sec    1.00    157.8±0.52ms   284.3 MB/sec
primitive/default                                  1.01    118.7±0.47ms   378.0 MB/sec    1.00    117.6±0.26ms   381.5 MB/sec
primitive/parquet_2                                1.01    133.7±0.47ms   335.7 MB/sec    1.00    132.2±0.23ms   339.5 MB/sec
primitive/zstd                                     1.01    148.7±0.73ms   301.8 MB/sec    1.00    146.9±0.26ms   305.6 MB/sec
primitive/zstd_parquet_2                           1.01    166.9±0.45ms   268.9 MB/sec    1.00    165.3±0.26ms   271.5 MB/sec
primitive_all_null/bloom_filter                    1.00    890.3±2.09µs    49.2 GB/sec    1.02    905.9±2.28µs    48.4 GB/sec
primitive_all_null/cdc                             1.00     18.4±0.28ms     2.4 GB/sec    1.00     18.4±0.41ms     2.4 GB/sec
primitive_all_null/default                         1.00    272.5±0.60µs   160.8 GB/sec    1.02    278.3±1.40µs   157.5 GB/sec
primitive_all_null/parquet_2                       1.00    279.5±0.86µs   156.8 GB/sec    1.00    279.2±0.93µs   157.0 GB/sec
primitive_all_null/zstd                            1.00    386.5±0.94µs   113.4 GB/sec    1.02    392.9±1.14µs   111.5 GB/sec
primitive_all_null/zstd_parquet_2                  1.00    355.3±1.04µs   123.3 GB/sec    1.00    356.5±1.07µs   122.9 GB/sec
primitive_non_null/bloom_filter                    1.00    106.1±0.42ms   414.8 MB/sec    1.00    106.5±0.45ms   413.2 MB/sec
primitive_non_null/cdc                             1.00     89.8±0.34ms   490.1 MB/sec    1.01     90.7±0.48ms   485.4 MB/sec
primitive_non_null/default                         1.00     66.9±0.26ms   657.3 MB/sec    1.01     67.8±0.15ms   648.7 MB/sec
primitive_non_null/parquet_2                       1.00     88.6±0.23ms   496.5 MB/sec    1.01     89.5±0.23ms   491.7 MB/sec
primitive_non_null/zstd                            1.00     97.6±0.23ms   450.7 MB/sec    1.01     98.4±0.15ms   447.3 MB/sec
primitive_non_null/zstd_parquet_2                  1.00    122.5±0.27ms   359.1 MB/sec    1.00    123.0±0.14ms   357.7 MB/sec
primitive_sparse_99pct_null/bloom_filter           1.00     11.8±0.07ms     3.7 GB/sec    1.01     11.9±0.05ms     3.7 GB/sec
primitive_sparse_99pct_null/cdc                    1.00     29.0±0.25ms  1549.6 MB/sec    1.01     29.1±0.38ms  1539.5 MB/sec
primitive_sparse_99pct_null/default                1.00     10.5±0.03ms     4.2 GB/sec    1.01     10.6±0.03ms     4.2 GB/sec
primitive_sparse_99pct_null/parquet_2              1.00     10.5±0.03ms     4.2 GB/sec    1.01     10.6±0.04ms     4.1 GB/sec
primitive_sparse_99pct_null/zstd                   1.00     13.8±0.04ms     3.2 GB/sec    1.02     14.0±0.04ms     3.1 GB/sec
primitive_sparse_99pct_null/zstd_parquet_2         1.00     12.4±0.04ms     3.5 GB/sec    1.01     12.5±0.03ms     3.5 GB/sec
short_string_non_null/bloom_filter                 1.02     27.7±0.06ms   433.6 MB/sec    1.00     27.0±0.07ms   443.8 MB/sec
short_string_non_null/cdc                          1.00     19.8±0.05ms   607.5 MB/sec    1.01     19.9±0.06ms   604.1 MB/sec
short_string_non_null/default                      1.00     15.6±0.04ms   770.1 MB/sec    1.01     15.7±0.05ms   762.4 MB/sec
short_string_non_null/parquet_2                    1.00     25.5±0.07ms   471.3 MB/sec    1.00     25.4±0.10ms   471.8 MB/sec
short_string_non_null/zstd                         1.00     35.2±0.10ms   340.7 MB/sec    1.00     35.3±0.07ms   339.7 MB/sec
short_string_non_null/zstd_parquet_2               1.00     28.3±0.04ms   424.8 MB/sec    1.00     28.2±0.09ms   425.1 MB/sec
string/bloom_filter                                1.04   225.7±26.00ms     2.3 GB/sec    1.00   216.1±16.66ms     2.4 GB/sec
string/cdc                                         1.00    213.8±7.38ms     2.4 GB/sec    1.03    221.1±9.36ms     2.3 GB/sec
string/default                                     1.08   128.5±22.92ms     4.0 GB/sec    1.00    119.0±8.24ms     4.3 GB/sec
string/parquet_2                                   1.00    108.0±5.73ms     4.7 GB/sec    1.74   187.5±10.07ms     2.7 GB/sec
string/zstd                                        1.00    415.9±1.51ms  1260.5 MB/sec    1.08   451.0±19.27ms  1162.3 MB/sec
string/zstd_parquet_2                              1.00   408.4±11.63ms  1283.7 MB/sec    1.02   416.6±19.27ms  1258.3 MB/sec
string_and_binary_view/bloom_filter                1.00     64.1±0.31ms   503.1 MB/sec    1.00     64.0±0.21ms   503.9 MB/sec
string_and_binary_view/cdc                         1.00     59.1±0.22ms   546.0 MB/sec    1.00     59.2±0.16ms   544.7 MB/sec
string_and_binary_view/default                     1.00     48.4±0.17ms   666.0 MB/sec    1.00     48.3±0.11ms   668.0 MB/sec
string_and_binary_view/parquet_2                   1.00     59.7±0.23ms   539.8 MB/sec    1.00     59.5±0.14ms   541.8 MB/sec
string_and_binary_view/zstd                        1.01     85.4±0.25ms   377.5 MB/sec    1.00     85.0±0.14ms   379.5 MB/sec
string_and_binary_view/zstd_parquet_2              1.00     73.3±0.21ms   440.0 MB/sec    1.00     73.5±0.14ms   438.9 MB/sec
string_dictionary/bloom_filter                     1.00     89.8±0.67ms     2.9 GB/sec    1.02     91.4±0.20ms     2.8 GB/sec
string_dictionary/cdc                              1.00     53.3±0.51ms     4.8 GB/sec    1.00     53.2±0.58ms     4.9 GB/sec
string_dictionary/default                          1.01     48.9±0.43ms     5.3 GB/sec    1.00     48.6±0.98ms     5.3 GB/sec
string_dictionary/parquet_2                        1.00     54.0±0.22ms     4.8 GB/sec    1.00     54.2±0.30ms     4.8 GB/sec
string_dictionary/zstd                             1.00    208.7±0.42ms  1265.4 MB/sec    1.01    210.0±1.04ms  1257.9 MB/sec
string_dictionary/zstd_parquet_2                   1.00    198.5±0.17ms  1330.5 MB/sec    1.00    199.0±0.22ms  1327.0 MB/sec
string_non_null/bloom_filter                       1.07   252.6±15.97ms     2.0 GB/sec    1.00    236.3±1.68ms     2.2 GB/sec
string_non_null/cdc                                1.02   275.1±14.26ms  1904.4 MB/sec    1.00    269.3±6.78ms  1945.7 MB/sec
string_non_null/default                            1.00   137.2±15.05ms     3.7 GB/sec    1.00   136.9±15.20ms     3.7 GB/sec
string_non_null/parquet_2                          1.16    139.1±5.97ms     3.7 GB/sec    1.00    120.2±0.66ms     4.3 GB/sec
string_non_null/zstd                               1.06   567.8±14.14ms   922.9 MB/sec    1.00    533.5±0.97ms   982.1 MB/sec
string_non_null/zstd_parquet_2                     1.00    503.0±4.83ms  1041.7 MB/sec    1.00    502.5±0.50ms  1042.8 MB/sec
struct_all_null/bloom_filter                       1.00    378.2±1.64µs    41.6 GB/sec    1.00    377.0±1.08µs    41.8 GB/sec
struct_all_null/cdc                                1.00      7.3±0.08ms     2.2 GB/sec    1.01      7.4±0.09ms     2.1 GB/sec
struct_all_null/default                            1.00    118.5±0.26µs   132.9 GB/sec    1.02    121.3±0.55µs   129.8 GB/sec
struct_all_null/parquet_2                          1.00    119.7±0.33µs   131.6 GB/sec    1.01    121.0±0.31µs   130.1 GB/sec
struct_all_null/zstd                               1.00    164.8±1.10µs    95.6 GB/sec    1.03    169.5±0.46µs    92.9 GB/sec
struct_all_null/zstd_parquet_2                     1.00    151.7±0.41µs   103.8 GB/sec    1.02    154.0±0.24µs   102.2 GB/sec
struct_non_null/bloom_filter                       1.00     45.7±0.18ms   350.0 MB/sec    1.01     46.1±0.26ms   347.2 MB/sec
struct_non_null/cdc                                1.00     45.1±0.18ms   354.6 MB/sec    1.01     45.7±0.16ms   350.2 MB/sec
struct_non_null/default                            1.00     31.7±0.16ms   505.4 MB/sec    1.03     32.5±0.09ms   493.0 MB/sec
struct_non_null/parquet_2                          1.00     40.5±0.20ms   395.4 MB/sec    1.01     41.0±0.12ms   390.0 MB/sec
struct_non_null/zstd                               1.00     40.3±0.17ms   397.3 MB/sec    1.02     40.9±0.11ms   390.7 MB/sec
struct_non_null/zstd_parquet_2                     1.00     54.4±0.17ms   294.4 MB/sec    1.01     55.1±0.12ms   290.6 MB/sec
struct_sparse_99pct_null/bloom_filter              1.00      6.4±0.02ms     2.5 GB/sec    1.01      6.5±0.02ms     2.4 GB/sec
struct_sparse_99pct_null/cdc                       1.00     13.2±0.09ms  1220.7 MB/sec    1.01     13.3±0.09ms  1214.5 MB/sec
struct_sparse_99pct_null/default                   1.00      5.9±0.01ms     2.7 GB/sec    1.01      5.9±0.01ms     2.7 GB/sec
struct_sparse_99pct_null/parquet_2                 1.00      5.9±0.02ms     2.7 GB/sec    1.00      5.9±0.01ms     2.7 GB/sec
struct_sparse_99pct_null/zstd                      1.00      7.2±0.02ms     2.2 GB/sec    1.00      7.3±0.01ms     2.2 GB/sec
struct_sparse_99pct_null/zstd_parquet_2            1.00      6.7±0.02ms     2.4 GB/sec    1.00      6.7±0.01ms     2.4 GB/sec

Resource Usage

base (merge-base)

| Metric | Value |
| --- | --- |
| Wall time | 2090.5s |
| Peak memory | 6.8 GiB |
| Avg memory | 6.6 GiB |
| CPU user | 2008.3s |
| CPU sys | 77.0s |
| Peak spill | 0 B |

branch

| Metric | Value |
| --- | --- |
| Wall time | 2045.4s |
| Peak memory | 6.8 GiB |
| Avg memory | 6.7 GiB |
| CPU user | 1997.3s |
| CPU sys | 45.7s |
| Peak spill | 0 B |
