Use zstd with compression 15 by default #178

@cholmes

Since we started this, there's been more research into Parquet compression for geospatial data, and the latest consensus is that zstd at compression level 15 is ideal.

It'd be good to change the default compression to zstd. And I think it's ok if we don't expose the compression level in the CLI, but just hard-code it to 15 (instead of 3, which is the default for geopandas / pyarrow). We should be able to pass the compression level even though we rewrote the geopandas writer, as it's just an extra argument passed through to the underlying library:

gdf.to_parquet(
    "data.parquet",
    compression="zstd",
    compression_level=15  # proposed hard-coded level, rather than leaving it unset
)
