Replace O(n⁴) regions() with scipy union-find, add dask/cupy backends #898

Merged
brendancol merged 1 commit into master from fix-regions-algorithm on Feb 25, 2026

Conversation

@brendancol
Contributor

Summary

  • Replace O(n⁴) _area_connectivity with scipy.ndimage.label (union-find, ~O(n) per unique value). The old algorithm did full-array scans inside the main pixel loop for conflict resolution.
  • Add backend dispatch for dask (compute-and-delegate with memory guard), cupy (cupyx.scipy.ndimage.label), and dask+cupy arrays.
  • Add scipy to install_requires — it was already a transitive dependency of datashader and in test deps.
  • Parametrise region tests over ['numpy', 'dask+numpy'] backends and add edge-case tests (single pixel, all-same, all-NaN, numpy/dask match).
  • Update README feature matrix: Regions row now shows ✅ for all four backends.
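The per-unique-value labelling described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the merged implementation: the helper name `label_regions`, the `neighborhood` parameter, and the choice to leave NaN pixels as 0 are all assumptions made for the example. Only `scipy.ndimage.label` and `scipy.ndimage.generate_binary_structure` are real library calls.

```python
import numpy as np
from scipy import ndimage


def label_regions(data, neighborhood=4):
    """Hypothetical helper: give each connected patch of equal value its own id.

    Assumptions (not taken from the PR): ids start at 1, NaN pixels stay 0.
    """
    data = np.asarray(data, dtype=float)
    if neighborhood == 4:
        # Cross-shaped structure: only edge-sharing pixels are connected.
        structure = ndimage.generate_binary_structure(2, 1)
    elif neighborhood == 8:
        # Full 3x3 structure: diagonal neighbours are connected too.
        structure = ndimage.generate_binary_structure(2, 2)
    else:
        raise ValueError("neighborhood must be 4 or 8")

    out = np.zeros(data.shape, dtype=np.int64)
    next_id = 0  # highest region id handed out so far
    valid = ~np.isnan(data)

    # One labelling pass per distinct value; each pass is a single call
    # to scipy.ndimage.label on a boolean mask.
    for value in np.unique(data[valid]):
        labeled, count = ndimage.label(data == value, structure=structure)
        mask = labeled > 0
        out[mask] = labeled[mask] + next_id  # offset so ids stay unique
        next_id += count
    return out


if __name__ == "__main__":
    raster = np.array([[1, 1, 2],
                       [1, 2, 2],
                       [3, 3, np.nan]])
    print(label_regions(raster))
```

Each pass touches every pixel once, so the total cost grows with the number of distinct values times the array size, rather than with repeated full-array scans inside a per-pixel loop.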

Test plan

  • All 16 region tests pass (pytest -k regions -v)
  • Full zonal test suite passes (61 tests)
  • Smoke test confirms numpy and dask produce identical results
  • Verify cupy backend on a GPU-equipped machine

The old _area_connectivity algorithm did full-array scans inside the
main pixel loop for conflict resolution, giving O(n⁴) worst case.
Replace it with scipy.ndimage.label (union-find, ~O(n)) per unique
value, and add backend dispatch for dask, cupy, and dask+cupy arrays.
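The "compute-and-delegate with memory guard" path mentioned for dask might look something like the sketch below. Everything here is illustrative: `regions_dispatch`, `numpy_impl`, and the `max_bytes` default are hypothetical names and values, and a duck-typed `.compute()` check stands in for a real dask type check so that the example runs without dask installed.

```python
import numpy as np

# Hypothetical limit; the actual guard threshold in the PR is not stated.
DEFAULT_MAX_BYTES = 2 ** 30


def regions_dispatch(arr, numpy_impl, max_bytes=DEFAULT_MAX_BYTES):
    """Route an array to a NumPy implementation, materialising lazy arrays.

    `numpy_impl` is any callable that accepts a NumPy array.
    """
    if isinstance(arr, np.ndarray):
        return numpy_impl(arr)

    # Assumption: anything exposing .compute(), .shape and .dtype behaves
    # like a dask array for the purposes of this sketch.
    if hasattr(arr, "compute"):
        nbytes = int(np.prod(arr.shape)) * np.dtype(arr.dtype).itemsize
        if nbytes > max_bytes:
            # Memory guard: refuse to pull an oversized array into RAM.
            raise MemoryError(
                f"array needs {nbytes} bytes, limit is {max_bytes}"
            )
        return numpy_impl(np.asarray(arr.compute()))

    raise TypeError(f"unsupported array type: {type(arr).__name__}")
```

The guard runs before `.compute()` is called, so an oversized lazy array fails fast instead of exhausting memory halfway through materialisation.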
@brendancol merged commit cffcc4e into master on Feb 25, 2026
10 checks passed
@brendancol deleted the fix-regions-algorithm branch on February 26, 2026 14:46