Open
Labels
api (API design and consistency), backend-coverage (Adding missing dask/cupy/dask+cupy backend support), enhancement (New feature or request), high-priority, zonal tools
Description
Problem
zonal.apply() (xrspatial/zonal.py lines 1190–1297) has multiple issues that make it unusable in production:
- Calls `.values` (lines 1256, 1260), which silently materialises dask arrays and copies GPU arrays to host, destroying scalability.
- Uses `np.vectorize(func)` (line 1289), a Python loop disguised as vectorisation, with massive per-element overhead.
- Mutates the input DataArray in-place (`values.values = ...`, line 1291). It is the only function in the library that does this; the behaviour is incompatible with dask's lazy evaluation and violates the principle of least surprise.
- No `ArrayTypeFunctionMapping` dispatch: pure numpy-only, unlike every other function in the library.
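
To make the `np.vectorize` point concrete, here is a small standalone sketch (not taken from `zonal.py`). The `double` function and the call counter are hypothetical, used only to show that `np.vectorize` invokes the Python callable roughly once per element, whereas an array-aware function is called once for the whole array.

```python
import numpy as np

calls = 0


def double(x):
    """Hypothetical user function; counts how often Python calls it."""
    global calls
    calls += 1
    return x * 2


values = np.arange(12, dtype=np.float64).reshape(3, 4)

# np.vectorize loops in Python: at least one call per element.
looped = np.vectorize(double)(values)
calls_vectorize = calls

# Passing the whole array relies on numpy's own elementwise arithmetic:
# a single Python-level call.
calls = 0
direct = double(values)
calls_direct = calls

print(calls_vectorize, calls_direct)
```

Both paths produce the same result here, but only because `double` already works on arrays. A user function that is genuinely scalar-only would need a different contract, which is a design question for the rewrite.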
Proposed Fix
- Rewrite with the `ArrayTypeFunctionMapping` dispatch pattern.
- Add `_apply_numpy`, `_apply_dask_numpy`, `_apply_cupy`, `_apply_dask_cupy` backends.
- Return a new DataArray instead of mutating the input (breaking change; document in changelog).
- For dask: use `map_blocks`, since zones and values are chunk-aligned.
- Replace `np.vectorize` with proper masked array operations.
- Update the README feature matrix row for Apply.
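
A minimal sketch of what the numpy and dask backends could look like, operating on raw arrays rather than DataArrays. Several things here are assumptions rather than the library's current behaviour: `func` is assumed to accept and return an array (a change from the per-element contract), `nodata` is assumed to mark zone cells to skip, only 2D inputs are handled, and the output is always `float64`. The function names follow the ones proposed above; the bodies are illustrative only.

```python
from functools import partial

import numpy as np


def _apply_numpy(zones, values, func, nodata=0):
    """Apply `func` to cells whose zone is not `nodata`; return a new array."""
    mask = zones != nodata
    # Copy so the caller's array is never written to.
    out = values.astype(np.float64, copy=True)
    # Boolean indexing replaces the np.vectorize loop (assumes array-aware func).
    out[mask] = func(values[mask])
    return out


def _apply_dask_numpy(zones, values, func, nodata=0):
    """Lazy variant: run the numpy backend on each pair of aligned chunks."""
    import dask.array as da  # optional dependency, imported lazily

    # partial binds func/nodata so they cannot clash with map_blocks' own
    # parameter names.
    block_fn = partial(_apply_numpy, func=func, nodata=nodata)
    return da.map_blocks(block_fn, zones, values, dtype=np.float64)


zones = np.array([[0, 1], [2, 2]])
values = np.array([[1.0, 2.0], [3.0, 4.0]])
original = values.copy()

result = _apply_numpy(zones, values, lambda x: x * 10)
print(result)
```

The cell in zone 0 keeps its original value, the others are transformed, and `values` is left untouched. Wrapping the result back into a DataArray with the input's coords and attrs, plus the cupy variants, is left out of this sketch.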
Breaking Change
The current API mutates values in-place and returns None. The new API should return a new DataArray. This is a deliberate breaking change to align with the rest of the library.