Add zonal.stats() dask+cupy backend #902

@brendancol

Problem

zonal.stats() supports the numpy, dask+numpy, and cupy backends, but not dask+cupy. Line 687 currently wires the dask+cupy slot to a not-implemented stub:

dask_cupy_func=lambda *args: not_implemented_func(
    *args, messages='stats() does not support dask with cupy backed DataArray'
),

The cupy backend (_stats_cupy at line 370) already works for single-GPU arrays. The only gap is dask+cupy, which covers the multi-GPU and larger-than-VRAM cases.

Proposed Fix

Implement _stats_dask_cupy() by adapting the existing _stats_dask_numpy() approach:

  • Use dask.delayed to process each (zones_block, values_block) pair with the existing cupy sort-and-stride logic from _stats_cupy.
  • Aggregate block-level results on the host (same pattern as the dask+numpy path).
  • The existing cupy function handles sorting and unique-finding on GPU; the dask wrapper just orchestrates blocks.
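
A minimal sketch of the block-then-merge pattern described above, not the actual xarray-spatial implementation. The names block_partial_stats, merge_partials, and the xp parameter are hypothetical. The example passes numpy as the array module so it runs without a GPU; the assumption is that cupy could be passed instead, and that in the dask+cupy path each block_partial_stats call would be wrapped in dask.delayed before the host-side merge.

```python
import numpy as np


def block_partial_stats(zones_block, values_block, xp=np):
    """Per-zone (count, sum, min, max) for one block via sort-and-stride.

    `xp` is the array module: numpy here; cupy is assumed to work the same way.
    """
    zones = zones_block.ravel()
    values = values_block.ravel()

    # Ignore NaN cells so they do not contaminate the statistics.
    keep = ~xp.isnan(values)
    zones, values = zones[keep], values[keep]

    # Sort by zone id so each zone occupies one contiguous run.
    order = xp.argsort(zones)
    zones, values = zones[order], values[order]
    ids, starts, counts = xp.unique(zones, return_index=True, return_counts=True)

    partial = {}
    for zid, start, n in zip(ids.tolist(), starts.tolist(), counts.tolist()):
        seg = values[start:start + n]
        # float() brings the scalar results back to the host.
        partial[zid] = (n, float(seg.sum()), float(seg.min()), float(seg.max()))
    return partial


def merge_partials(partials):
    """Combine block-level partial results on the host."""
    merged = {}
    for partial in partials:
        for zid, (n, s, lo, hi) in partial.items():
            if zid in merged:
                n0, s0, lo0, hi0 = merged[zid]
                merged[zid] = (n0 + n, s0 + s, min(lo0, lo), max(hi0, hi))
            else:
                merged[zid] = (n, s, lo, hi)
    return {
        zid: {"count": n, "sum": s, "mean": s / n, "min": lo, "max": hi}
        for zid, (n, s, lo, hi) in merged.items()
    }


# Toy raster split into two column blocks; zone 2 straddles both blocks.
zones = np.array([[1, 1, 2, 2],
                  [1, 1, 2, 2]])
values = np.arange(8, dtype=float).reshape(2, 4)

blocks = [(zones[:, :3], values[:, :3]), (zones[:, 3:], values[:, 3:])]
result = merge_partials(block_partial_stats(z, v) for z, v in blocks)
```

Only statistics that merge from partial results (count, sum, min, max, and mean derived from sum and count) are shown. Statistics such as the median do not combine this way and would need a different strategy.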

Impact

Zonal statistics is a core GIS operation and arguably the most commonly used analytical function in the library. Users with multi-GPU setups or rasters larger than GPU VRAM need dask+cupy support.

Metadata

Labels

  • backend-coverage: Adding missing dask/cupy/dask+cupy backend support
  • enhancement: New feature or request
  • gpu: CuPy / CUDA GPU support
  • zonal tools