TL;DR: It would be great if we could use rustac to search STAC items in geoparquet and get back enough info to open the Icechunk stores in Xarray.
I made a catalog of STAC items for each of the dynamical.org Icechunk datasets on AWS Open Data, and then used rustac to create a geoparquet version. I pushed both to a bucket, mostly because I wanted to keep the regular catalog for the STAC browser.
I then asked Claude Code to make a notebook that queries the geoparquet with rustac and opens one of the returned items in Xarray: https://nbviewer.org/gist/rsignell/42cba4d8db34f49ed91538ef47375b32
What it built works. But after querying the geoparquet file, it goes back to the JSON STAC items to get the asset info needed to load the data into Xarray, saying:
"rustac drops non-spec top-level attributes (like storage:schemes) when writing GeoParquet, so we look the item up by ID in the original JSON catalog, which has the full metadata needed by xpystac to open the icechunk store."
So it would be great if we could also store `storage:schemes` in the geoparquet, so that we could open and start working with the Icechunk assets directly from the items the search returns.
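For reference, the lookup-by-ID workaround quoted above boils down to something like the sketch below. This is not the notebook's actual code: plain dicts stand in for the items returned from the geoparquet search and for the original JSON catalog, the helper name `restore_top_level_fields` is hypothetical, and the sample `storage:schemes` values are made up for illustration.

```python
from copy import deepcopy


def restore_top_level_fields(parquet_items, json_items, fields=("storage:schemes",)):
    """Copy selected top-level fields from the JSON items onto the
    search results that share the same item ID (hypothetical helper)."""
    by_id = {item["id"]: item for item in json_items}
    restored = []
    for item in parquet_items:
        merged = deepcopy(item)  # leave the search results untouched
        source = by_id.get(item["id"], {})
        for field in fields:
            # Only fill in fields that are missing from the search result
            if field not in merged and field in source:
                merged[field] = deepcopy(source[field])
        restored.append(merged)
    return restored


# Made-up stand-in for an item from the original JSON catalog
json_items = [
    {
        "type": "Feature",
        "id": "example-item",
        "storage:schemes": {"aws": {"type": "aws-s3", "region": "us-west-2"}},
        "assets": {},
    }
]

# Made-up stand-in for the same item as returned from the geoparquet search
parquet_items = [{"type": "Feature", "id": "example-item", "assets": {}}]

items = restore_top_level_fields(parquet_items, json_items)
print(items[0]["storage:schemes"])
```

If `storage:schemes` survived the round trip through geoparquet, this extra pass over the JSON catalog would not be needed.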
In case it's useful, here are the notebooks I used for the dynamical data:
The notebook to create the STAC catalog: https://github.com/OpenScienceComputing/NCICS-2026/blob/main/build_dynamical_catalog_full.ipynb
The notebook to create geoparquet: https://github.com/OpenScienceComputing/NCICS-2026/blob/main/build_geoparquet.ipynb