Since we started this, there has been more research into Parquet compression for geospatial data, and the latest consensus is that zstd at compression level 15 is ideal.
It would be good to change the default compression to zstd. I think it's fine not to expose the compression level in the CLI and to just hard-code it to 15 (instead of 3, which is the default for geopandas / pyarrow). We should still be able to pass the compression level even though we rewrote the geopandas writer, since it's just an extra argument passed to the underlying library:
gdf.to_parquet(
    "data.parquet",
    compression="zstd",
    compression_level=3,  # same as implicit default
)
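For the rewritten writer, here is a rough sketch of how the hard-coded level could be wired in. The names `DEFAULT_COMPRESSION`, `DEFAULT_COMPRESSION_LEVEL`, `parquet_writer_kwargs`, and `write_table` are hypothetical and not taken from our codebase. It assumes the writer ends up calling `pyarrow.parquet.write_table`, which accepts `compression` and `compression_level`:

```python
# Hypothetical names; the real writer may be structured differently.
DEFAULT_COMPRESSION = "zstd"
DEFAULT_COMPRESSION_LEVEL = 15  # level proposed in this issue, not exposed in the CLI


def parquet_writer_kwargs(compression=DEFAULT_COMPRESSION):
    """Build the compression keyword arguments for the Parquet writer."""
    kwargs = {"compression": compression}
    # Only attach the level for zstd: other codecs use different level ranges.
    if compression is not None and compression.lower() == "zstd":
        kwargs["compression_level"] = DEFAULT_COMPRESSION_LEVEL
    return kwargs


def write_table(table, path, compression=DEFAULT_COMPRESSION):
    """Write a pyarrow Table, applying the hard-coded zstd level when relevant."""
    # Imported lazily so the kwargs helper above has no third-party dependency.
    import pyarrow.parquet as pq

    pq.write_table(table, path, **parquet_writer_kwargs(compression))
```

With this shape, the CLI would only need to pass the codec name, and a user who picks another codec such as snappy would not get a zstd-specific level applied.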