4 changes: 2 additions & 2 deletions benchmarks/tpc/README.md
@@ -179,8 +179,8 @@ sudo ./drop-caches.sh
python3 run.py --engine comet-iceberg --benchmark tpch
```

- The benchmark uses `spark.comet.scan.icebergNative.enabled=true` to enable Comet's native iceberg-rust
- integration. Verify native scanning is active by checking for `CometIcebergNativeScanExec` in the
+ The benchmark uses Comet's native iceberg-rust integration, which is enabled by default.
+ Verify native scanning is active by checking for `CometIcebergNativeScanExec` in the
physical plan output.
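
As an illustration of the check described above (not part of this change), the verification can be sketched as a plain substring test on the plan text. The helper name `uses_native_iceberg_scan` and both sample plan strings are hypothetical stand-ins; a real plan would come from Spark, for example the text printed by `df.explain()`.

```python
# Hypothetical helper; it is not part of Comet or the benchmark scripts.
NATIVE_SCAN_OPERATOR = "CometIcebergNativeScanExec"


def uses_native_iceberg_scan(plan_text: str) -> bool:
    """Return True if the physical plan text mentions the native Iceberg scan operator."""
    return NATIVE_SCAN_OPERATOR in plan_text


# Made-up plan fragments for illustration only; real plans are longer
# and their exact layout depends on the Spark and Comet versions.
sample_native = "CometIcebergNativeScanExec [l_orderkey#0L]"
sample_fallback = "BatchScan spark_catalog.tpch.lineitem [l_orderkey#0L]"

print(uses_native_iceberg_scan(sample_native))    # True
print(uses_native_iceberg_scan(sample_fallback))  # False
```

A substring test only confirms that the operator appears somewhere in the plan; it does not show that every Iceberg scan in the query ran natively.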

### create-iceberg-tables.py options
2 changes: 1 addition & 1 deletion common/src/main/scala/org/apache/comet/CometConf.scala
@@ -140,7 +140,7 @@ object CometConf extends ShimCometConf {
"Iceberg tables are read directly through native execution, bypassing Spark's " +
"DataSource V2 API for better performance.")

https://datafusion.apache.org/comet/user-guide/latest/iceberg.html says that Iceberg support is experimental. Some other config entries mention that they are experimental.

Suggested change
- "DataSource V2 API for better performance.")
+ "DataSource V2 API for better performance. This feature is experimental.")

.booleanConf
- .createWithDefault(false)
+ .createWithDefault(true)

val COMET_ICEBERG_DATA_FILE_CONCURRENCY_LIMIT: ConfigEntry[Int] =
conf("spark.comet.scan.icebergNative.dataFileConcurrencyLimit")
6 changes: 2 additions & 4 deletions docs/source/user-guide/latest/iceberg.md
@@ -29,8 +29,8 @@ then serialized to Comet's native execution engine (see
[PR #2528](https://github.com/apache/datafusion-comet/pull/2528)).

The example below uses Spark's package downloader to retrieve Comet 0.14.0 and Iceberg
- 1.8.1, but Comet has been tested with Iceberg 1.5, 1.7, 1.8, 1.9, and 1.10. The key configuration
- to enable fully-native Iceberg is `spark.comet.scan.icebergNative.enabled=true`.
+ 1.8.1, but Comet has been tested with Iceberg 1.5, 1.7, 1.8, 1.9, and 1.10. The native Iceberg
+ reader is enabled by default. To disable it, set `spark.comet.scan.icebergNative.enabled=false`.

```shell
$SPARK_HOME/bin/spark-shell \
@@ -43,7 +43,6 @@ $SPARK_HOME/bin/spark-shell \
--conf spark.plugins=org.apache.spark.CometPlugin \
--conf spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager \
--conf spark.sql.extensions=org.apache.comet.CometSparkSessionExtensions \
- --conf spark.comet.scan.icebergNative.enabled=true \
--conf spark.comet.explainFallback.enabled=true \
--conf spark.memory.offHeap.enabled=true \
--conf spark.memory.offHeap.size=2g
@@ -120,7 +119,6 @@ $SPARK_HOME/bin/spark-shell \
--conf spark.plugins=org.apache.spark.CometPlugin \
--conf spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager \
--conf spark.sql.extensions=org.apache.comet.CometSparkSessionExtensions \
- --conf spark.comet.scan.icebergNative.enabled=true \
--conf spark.comet.explainFallback.enabled=true \
--conf spark.memory.offHeap.enabled=true \
--conf spark.memory.offHeap.size=2g