@majinghe majinghe commented Dec 25, 2025

As discussed in #14638, MinIO is in maintenance mode, so this PR replaces MinIO with RustFS in the quick start. I tested the change locally and it works fine.

Testing steps:

  • Generate the Spark default conf

     spark.sql.extensions                   org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
     spark.sql.catalog.demo                 org.apache.iceberg.spark.SparkCatalog
     spark.sql.catalog.demo.type            rest
     spark.sql.catalog.demo.uri             http://rest:8181
     spark.sql.catalog.demo.io-impl         org.apache.iceberg.aws.s3.S3FileIO
     spark.sql.catalog.demo.warehouse       s3://warehouse/wh
     spark.sql.catalog.demo.s3.endpoint     http://rustfs:9000
     spark.sql.defaultCatalog               demo
     spark.eventLog.enabled                 true
     spark.eventLog.dir                     /home/iceberg/spark-events
     spark.history.fs.logDirectory          /home/iceberg/spark-events
     spark.sql.catalogImplementation        in-memory
     spark.sql.catalog.demo.s3.path-style-access true
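
A quick sanity check for the conf above can be sketched as a small shell function. It assumes the settings are saved to a file and passed in as an argument; the function name `check_spark_conf` is hypothetical and not part of this PR. It only confirms that the catalog uses S3FileIO and points at the RustFS endpoint with path-style access enabled.

```shell
# Hypothetical helper: verify that a Spark conf file targets RustFS.
# Usage: check_spark_conf path/to/conf-file
check_spark_conf() {
  conf_file="$1"
  status=0
  # Each entry is "key=expected value", taken from the conf shown above.
  for pair in \
    "spark.sql.catalog.demo.io-impl=org.apache.iceberg.aws.s3.S3FileIO" \
    "spark.sql.catalog.demo.s3.endpoint=http://rustfs:9000" \
    "spark.sql.catalog.demo.s3.path-style-access=true"
  do
    key="${pair%%=*}"
    expected="${pair#*=}"
    # Use the value from the last line whose first field is exactly the key.
    actual=$(awk -v k="$key" '$1 == k { v = $2 } END { print v }' "$conf_file")
    if [ "$actual" != "$expected" ]; then
      echo "MISMATCH $key: expected '$expected', got '$actual'"
      status=1
    fi
  done
  return "$status"
}
```

The function returns a non-zero status and prints one line per mismatching key, so it can be used as a guard before starting the containers.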
    
  • Run the containers

    Start all containers with Docker Compose:

     docker compose up -d
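
To confirm that everything came up, a check along these lines may help. It is only a sketch: it assumes Docker Compose v2 (for `ps --services --status`), and the service names `spark-iceberg`, `rest`, and `rustfs` are assumptions inferred from the hostnames in the conf and the container name used in the next step. The function name `check_services` is hypothetical.

```shell
# Hypothetical helper: report expected services that are not running.
# Service names are assumptions; adjust them to match the compose file.
check_services() {
  running=$(docker compose ps --services --status running)
  missing=0
  for svc in spark-iceberg rest rustfs; do
    # -x matches the whole line, -F treats the name literally, -q is silent.
    if ! printf '%s\n' "$running" | grep -qxF -- "$svc"; then
      echo "not running: $svc"
      missing=1
    fi
  done
  return "$missing"
}
```

Run `check_services` from the directory that holds the compose file; a zero exit status means all three services were reported as running.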
    
  • Insert data

     docker exec -it spark-iceberg spark-sql
     Setting default log level to "WARN".
     To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
     25/12/25 01:41:54 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
     25/12/25 01:42:05 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
     Spark Web UI available at http://4dbf6384ac7a:4041
     Spark master: local[*], Application Id: local-1766626926764
     spark-sql ()> 
                 > CREATE NAMESPACE demo.nyc;
     Time taken: 4.89 seconds
     spark-sql ()> CREATE TABLE demo.nyc.taxis
                 > (
                 >   vendor_id bigint,
                 >   trip_id bigint,
                 >   trip_distance float,
                 >   fare_amount double,
                 >   store_and_fwd_flag string
                 > )
                 > PARTITIONED BY (vendor_id);
     Time taken: 6.362 seconds
     spark-sql ()> INSERT INTO demo.nyc.taxis
                 > VALUES (1, 1000371, 1.8, 15.32, 'N'), (2, 1000372, 2.5, 22.15, 'N'), (2, 1000373, 0.9, 9.01, 'N'), (1, 1000374, 8.4, 42.13, 'Y');
     Time taken: 17.706 seconds
     spark-sql ()> 
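
The insert can also be read back non-interactively. The sketch below uses the `-e` option of `spark-sql` to run a single query and assumes the `spark-iceberg` container from the session above is still running; the function name `verify_taxis` is hypothetical. Given the four rows inserted above, each of the two `vendor_id` values should come back with a count of 2.

```shell
# Hypothetical helper: read back per-vendor row counts without an
# interactive session.
verify_taxis() {
  docker exec spark-iceberg spark-sql -e \
    "SELECT vendor_id, count(*) AS row_count FROM demo.nyc.taxis GROUP BY vendor_id ORDER BY vendor_id;"
}
```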
    
  • Verify the data

    Check that the inserted data is present on the RustFS instance:

    [Screenshot 2025-12-25 09 44 13]
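
As a command-line alternative to checking in the browser, the warehouse path can be listed over the S3 API. This is a sketch under several assumptions that are not stated in this PR: the AWS CLI is installed, RustFS port 9000 is published on localhost, and `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` are set to the RustFS credentials from the compose file. The function name `list_warehouse` is hypothetical.

```shell
# Hypothetical helper: list the objects written under the warehouse path.
# The optional first argument overrides the assumed endpoint.
list_warehouse() {
  endpoint="${1:-http://localhost:9000}"
  aws s3 ls s3://warehouse/wh/ --recursive --endpoint-url "$endpoint"
}
```

If the insert succeeded, the listing should include Iceberg metadata files and data files for the `nyc.taxis` table; the exact prefix depends on how the REST catalog assigns the table location.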

@github-actions github-actions bot added the docs label Dec 25, 2025
@majinghe majinghe changed the title from "replace minio with rustfs in quick start" to "docs: replace minio with rustfs in quick start" Dec 25, 2025