feat(rust/sedona-geoparquet,rust/sedona-datasource): Add support for partition columns and discovery#906

Draft
paleolimbot wants to merge 5 commits into
apache:main from
paleolimbot:discover-partitioning

@paleolimbot
Member

Still have to sort out some of the details here, but the gist is:

import sedona.db

sd = sedona.db.connect()

t = sd.funcs.table.sd_random_geometry()
t.select(
    grp=sd.funcs.floor(t.id / 100), geom=sd.funcs.st_setsrid(t.geometry, 3857)
).to_parquet("out_dir", partition_by="grp")

sd.read_parquet("out_dir").show(5)
# ┌──────────────────────────────────────────────┬──────┐
# │                     geom                     ┆  grp │
# │                   geometry                   ┆ utf8 │
# ╞══════════════════════════════════════════════╪══════╡
# │ POINT(59.63310565314948 22.488904786369556)  ┆ 9    │
# ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
# │ POINT(83.12076970430842 21.796712564259458)  ┆ 9    │
# ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
# │ POINT(30.891948399833048 18.511790453362643) ┆ 9    │
# ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
# │ POINT(87.62900829596556 90.36699740467581)   ┆ 9    │
# ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
# │ POINT(89.95104383376538 57.37266977967324)   ┆ 9    │
# └──────────────────────────────────────────────┴──────┘

Before this PR, the grp column wasn't shown (and there was no way to reconstruct it).
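
For context, a rough sketch of what hive-style partition discovery involves: the partition value is stored in the directory name (`key=value`) rather than inside the Parquet file, so a reader has to recover it from the path. This is not the code in this PR; the helper name `hive_partition_values` and the file name `part-0.parquet` are made up for illustration, and the real on-disk layout may differ. Values parsed this way are strings, which would be consistent with the utf8 type shown for grp above.

```python
from pathlib import PurePosixPath


def hive_partition_values(path: str, root: str) -> dict[str, str]:
    """Parse key=value directory segments between `root` and the file name.

    Hypothetical helper for illustration only; not part of sedona.db.
    """
    relative = PurePosixPath(path).relative_to(PurePosixPath(root))
    values: dict[str, str] = {}
    # The last component is the file itself, so only inspect the directories.
    for segment in relative.parts[:-1]:
        key, sep, value = segment.partition("=")
        if sep and key:
            # Values stay as strings: the path carries no type information.
            values[key] = value
    return values


# Assumed layout: out_dir/grp=9/part-0.parquet (file name is hypothetical)
print(hive_partition_values("out_dir/grp=9/part-0.parquet", "out_dir"))
```

Directories that are not of the form `key=value` are skipped, so a file sitting directly under the root yields no partition columns.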

Closes #765.

Development

Successfully merging this pull request may close these issues.

rust/sedona-geoparquet: Read of hive style partitions are not picked up by default
