feat: Native Parquet Iceberg Data File Writes In Comet by jordepic · Pull Request #4487 · apache/datafusion-comet

jordepic · 2026-05-28T01:50:42Z

Which issue does this PR close?

Closes #4322.

Rationale for this change

Comet, up until this point, has mainly been focused with accelerating reads from iceberg tables. However, a significant resources are being spent across various companies in order to rewrite iceberg data. Large tables need to be compacted to maintain their sort/Z order, and general data pipelines may write significant amounts of iceberg data. Having to do a large transpose between columnar and row-wise data is inefficient, and we'd much prefer to go directly from arrow-based column batches to parquet on disk.

What changes are included in this PR?

This change is split into three parts.

Splitting the existing iceberg V2 spark write command into two: a "writer" spark operator and a "committer" spark operator.

This allows the iceberg data file write to be treated identically to a spark V1 parquet writing command, thereby allowing using similar code to handle the operator
Without it, both the data file writing and committing would live outside of the scope of spark adaptive query execution operator wrappers - we instead want the write operator to be within AQE so that as the data feeds into it gets re-planned we can determine whether the upstream data of the write is columnar

Determining which operators should be converted to native code

This follows a simple philosophy: only convert writes to native code if they produce an "identical" outcome as the Java path
Disclaimer: nothing truly produces an identical result because parquet row group flushing is different
Besides that though, we can generally convert "normal" writes (parquet, default settings, some flexibility to change other settings as outlined in iceberg-writes.md, no delete files since iceberg-rust doesn't support positional deletes/DVs)

Native iceberg write operator

The JVM is responsible for computing all iceberg writing settings and using a protobuf object to pass them to rust
Rust uses iceberg-rs in order to write the file and return avro-encoded dummy manifest bytes back to the JVM

How are these changes tested?

This change is tested extensively.

Tests to ensure that iceberg writes are replaced by our "two-operator" structure
Tests to ensure that the comet JVM properly serializes relevant data to protocol buffer form to go to the native layer
Tests to ensure that native writes are only performed under very specific iceberg table properties
Tests to ensure that native writes actually function as expected
Tests to ensure that compaction/sorting/z-ordering can now be fully accelerated with native writing

Jordan Epstein added 3 commits May 27, 2026 16:33

feat: add split-operator plan for Iceberg V2 writes

e111676

feat: detect Iceberg V2 writes and emit fall-back reasons

b8a51b3

feat: implement native Iceberg V2 writer via iceberg-rust

7d93223

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Native Parquet Iceberg Data File Writes In Comet#4487

feat: Native Parquet Iceberg Data File Writes In Comet#4487
jordepic wants to merge 3 commits into
apache:mainfrom
jordepic:main

jordepic commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jordepic commented May 28, 2026

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

How are these changes tested?

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant