
Fixes #875: add out-of-core dask CPU viewshed #897

Merged
brendancol merged 5 commits into master from fix-875-dask-viewshed on Feb 25, 2026
Conversation

brendancol (Contributor) commented Feb 25, 2026

Summary

  • Add dask+numpy backend for viewshed() with a three-tier strategy:
    • Tier A (max_distance set): extract spatial window from dask array, compute only relevant chunks, run exact R2 angular sweep on the small numpy window
    • Tier B (grid fits in memory): compute full dask array, run R2 directly — gives exact results
    • Tier C (out-of-core): horizon-profile distance-sweep algorithm processes cells in Chebyshev-ring order with an LRU chunk cache, keeping memory bounded regardless of grid size
  • Add max_distance parameter to viewshed() for all backends
  • Add memory guard that raises MemoryError before allocating if output grid exceeds 80% of available RAM
  • Add four new dask-specific tests (flat terrain, dask-vs-numpy match, max_distance, forced Tier C)
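The tier selection and memory guard described above can be sketched roughly as follows. This is an illustrative sketch, not the actual xarray-spatial implementation: the helper names (`choose_tier`, `memory_guard`), the default RAM figure, and the 280 bytes/cell constant for the R2 working set are assumptions based on the description in this PR.

```python
# Illustrative sketch of the three-tier dispatch and memory guard.
# R2_BYTES_PER_CELL approximates the working-set cost of the exact R2 sweep;
# the real constant and thresholds live in the library, not here.
R2_BYTES_PER_CELL = 280


def choose_tier(shape, max_distance=None, available_ram=8 * 2**30):
    """Pick the processing tier for a dask-backed viewshed input."""
    if max_distance is not None:
        return "A"  # windowed: only chunks near the observer are computed
    if shape[0] * shape[1] * R2_BYTES_PER_CELL < 0.5 * available_ram:
        return "B"  # full grid fits in memory: materialize, run exact R2
    return "C"      # out-of-core horizon-profile distance sweep


def memory_guard(shape, itemsize=8, available_ram=8 * 2**30):
    """Raise before allocating if the output grid would exceed 80% of RAM."""
    if shape[0] * shape[1] * itemsize > 0.8 * available_ram:
        raise MemoryError("viewshed output grid exceeds 80% of available RAM")
```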

Closes #875

Test plan

  • All 18 viewshed tests pass (pytest xrspatial/tests/test_viewshed.py -x -v)
  • Smoke test: flat 20×20 dask grid, all cells visible, observer = 180
  • Verify on larger dask-backed raster (e.g. 1000×1000 chunked) to confirm Tier B path
  • Verify Tier A path produces correct partial viewshed with max_distance

Add a dask+numpy backend for viewshed() using a three-tier strategy:

A. max_distance specified → extract the spatial window, compute only
   its chunks, run exact R2 angular sweep on the small numpy window.
B. Full R2 fits in memory (280 bytes/cell < 50% available RAM) →
   compute the full dask array, run R2 directly.
C. Otherwise → horizon-profile distance sweep algorithm that processes
   cells in Chebyshev-ring order with an LRU chunk cache, keeping
   memory bounded regardless of grid size.

Also adds a max_distance parameter to viewshed() for all backends,
a memory guard for Tier C, and four new dask-specific tests.
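The Chebyshev-ring traversal that makes Tier C's LRU chunk cache effective can be sketched as below: cells are visited ring by ring at increasing Chebyshev distance from the observer, so each dask chunk is needed during a contiguous phase of the sweep and a small cache of materialized chunks stays hot. `chebyshev_rings` is a hypothetical helper, not the library's internal name.

```python
# Hedged sketch of the ring-order traversal assumed by Tier C.
def chebyshev_rings(oy, ox, h, w):
    """Yield (row, col) cells ring by ring outward from the observer."""
    yield (oy, ox)
    max_r = max(oy, ox, h - 1 - oy, w - 1 - ox)
    for r in range(1, max_r + 1):
        for y in range(oy - r, oy + r + 1):
            for x in range(ox - r, ox + r + 1):
                on_ring = max(abs(y - oy), abs(x - ox)) == r
                if on_ring and 0 <= y < h and 0 <= x < w:
                    yield (y, x)
```

Because consecutive cells in this order land in the same or adjacent chunks, an LRU cache keyed on chunk indices rarely has to re-materialize a chunk.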
- Extract max_distance window logic into _viewshed_windowed() that
  works for numpy, cupy, dask+numpy, and dask+cupy uniformly
- Handle dask+cupy in _viewshed_dask(): Tier B computes to cupy and
  uses GPU RTX when available; Tier C converts cupy chunks to numpy
  for the distance-sweep fallback
- Add tests: numpy max_distance, max_distance-vs-full comparison
  (numpy + cupy parametrized)
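The backend-uniform window extraction works because plain slicing behaves identically on numpy, cupy, and dask arrays. A minimal sketch, with an illustrative name (`extract_window`) rather than the actual `_viewshed_windowed` internals:

```python
# Sketch: clip a square (Chebyshev) window of radius_cells around the
# observer, clamped to the grid bounds.  The returned slices can index a
# numpy, cupy, or dask array unchanged.
def extract_window(shape, row, col, radius_cells):
    """Return (row_slice, col_slice) for the clipped square window."""
    r0 = max(0, row - radius_cells)
    r1 = min(shape[0], row + radius_cells + 1)
    c0 = max(0, col - radius_cells)
    c1 = min(shape[1], col + radius_cells + 1)
    return slice(r0, r1), slice(c0, c1)
```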
_viewshed_windowed was allocating np.full((H, W), ...) for the output
before wrapping it as dask — instant OOM on a 30TB input even with
max_distance set.  Now for dask inputs the output is built chunk-by-chunk:
overlapping chunks get a concrete numpy block, all others are lazy
da.full blocks that consume no memory until materialized.

Adds test_viewshed_dask_max_distance_lazy_output which creates a
100k x 100k (80GB) dask raster and verifies the output stays lazy.
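The chunk-by-chunk assembly described above can be sketched with `da.block`: only the block overlapping the window is a concrete numpy array, while every other block is a lazy `da.full` placeholder that allocates nothing until materialized. The helper name and the chunk-aligned window are simplifying assumptions, not the actual `_viewshed_windowed` code.

```python
# Sketch: lazy, tiled output where only the window block is concrete.
import numpy as np
import dask.array as da

INVISIBLE = -1.0


def assemble_lazy_output(grid_shape, chunk, window_block, win_r, win_c):
    """Tile the output grid; only the block at (win_r, win_c) is numpy."""
    rows = []
    for r in range(0, grid_shape[0], chunk):
        row = []
        for c in range(0, grid_shape[1], chunk):
            h = min(chunk, grid_shape[0] - r)
            w = min(chunk, grid_shape[1] - c)
            if (r, c) == (win_r, win_c):
                # the computed viewshed window: a real numpy array
                row.append(da.from_array(window_block, chunks=(h, w)))
            else:
                # lazy placeholder: no memory until .compute()
                row.append(da.full((h, w), INVISIBLE, dtype=np.float64))
        rows.append(row)
    return da.block(rows)
```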
The window extraction uses a square (Chebyshev) region but max_distance
is a Euclidean radius.  Add a circular mask after computing the window
result so cells at the corners beyond max_distance are set to INVISIBLE.
Update tests to account for the circular boundary.
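The circular post-mask amounts to a Euclidean-distance test over the square window. A minimal sketch, assuming an `INVISIBLE` sentinel of -1.0 and a radius given in cells (names illustrative):

```python
# Sketch: reset window cells beyond the Euclidean max_distance radius,
# since the square extraction covers Chebyshev distance <= radius and
# therefore includes corner cells that are too far away.
import numpy as np

INVISIBLE = -1.0


def apply_circular_mask(window, obs_row, obs_col, radius_cells):
    """Set cells beyond the Euclidean radius to INVISIBLE (in place)."""
    rr, cc = np.ogrid[: window.shape[0], : window.shape[1]]
    dist2 = (rr - obs_row) ** 2 + (cc - obs_col) ** 2
    window[dist2 > radius_cells**2] = INVISIBLE
    return window
```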
@brendancol brendancol merged commit 05fc5f0 into master Feb 25, 2026
9 of 10 checks passed
@brendancol brendancol deleted the fix-875-dask-viewshed branch February 26, 2026 14:46


Development

Successfully merging this pull request may close these issues.

viewshed: no dask support, .values forces full materialisation
