Use flox for median in groupby/resample operations #11239
sdiebolt wants to merge 3 commits into pydata:main
Add flox support to `median()` methods in:

- DataArrayGroupByAggregations
- DatasetGroupByAggregations
- DataArrayResampleAggregations
- DatasetResampleAggregations

This aligns the implementation with the documentation, which already claimed flox was used when available. The fix provides significant performance improvements when flox can process the data. A fallback to the non-flox implementation is included for cases where flox's median aggregation requires blockwise processing but the data chunking doesn't support it.

Closes pydata#11238
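For readers unfamiliar with the fallback described above, here is a minimal, library-free sketch of the pattern. It is not the code in this PR: the function names (`blockwise_group_median`, `generic_group_median`, `group_median`), the list-of-chunks data layout, and the use of `NotImplementedError` as the "cannot do this blockwise" signal are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def blockwise_group_median(chunks, labels_per_chunk):
    """Hypothetical fast path: reduce each chunk on its own.

    Only valid when every group lives entirely inside one chunk,
    since a median cannot be combined from per-chunk medians.
    """
    result = {}
    for values, labels in zip(chunks, labels_per_chunk):
        groups = defaultdict(list)
        for value, label in zip(values, labels):
            groups[label].append(value)
        for label, vals in groups.items():
            if label in result:
                # The group spans several chunks: give up on the fast path.
                raise NotImplementedError(f"group {label!r} spans multiple chunks")
            result[label] = median(vals)
    return result


def generic_group_median(chunks, labels_per_chunk):
    """Hypothetical slow path: gather all values per group, then reduce."""
    groups = defaultdict(list)
    for values, labels in zip(chunks, labels_per_chunk):
        for value, label in zip(values, labels):
            groups[label].append(value)
    return {label: median(vals) for label, vals in groups.items()}


def group_median(chunks, labels_per_chunk):
    """Try the fast path first and fall back when chunking rules it out."""
    try:
        return blockwise_group_median(chunks, labels_per_chunk)
    except NotImplementedError:
        return generic_group_median(chunks, labels_per_chunk)


# Groups aligned with chunks: the fast path succeeds.
aligned = group_median([[1, 2, 3], [10, 20]], [["a", "a", "a"], ["b", "b"]])
# Group "a" is split across chunks: the fallback is used.
split = group_median([[1, 2], [3, 10]], [["a", "a"], ["a", "b"]])
```

The fallback keeps results correct regardless of chunk layout, at the cost of losing the per-chunk speedup whenever a group straddles a chunk boundary.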
Thanks for taking this on!
Let's raise a FutureWarning here asking the user to shuffle_to_chunks or rechunk appropriately, and saying that this behaviour will be deprecated in 6 months (please open an issue for that). Now that I think about it, we can do this in flox automatically: xarray-contrib/flox#501
The reason I hadn't done this so far is that dask will do some auto-rechunking to make it work; and so some workloads will break.
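A rough sketch of what such a warning could look like, using only the standard `warnings` module. The helper name `warn_median_fallback` and the message wording are hypothetical, not taken from xarray or flox:

```python
import warnings


def warn_median_fallback(stacklevel=2):
    """Hypothetical helper: tell the user the non-flox fallback was taken."""
    warnings.warn(
        "median could not use flox with the current chunking and fell back "
        "to the non-flox implementation. Rechunk so each group sits in a "
        "single chunk, or use shuffle_to_chunks. This fallback will be "
        "deprecated in a future release.",
        FutureWarning,
        stacklevel=stacklevel,
    )


# Capture the warning to show that exactly one FutureWarning is emitted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_median_fallback()
```

`FutureWarning` is shown to end users by default, unlike `DeprecationWarning`, which is why it suits a behaviour change that affects user workloads.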
Also, this file is generated by generate_aggregations.py. Please make the edit there instead.
Will do! Should I also run
Yes, and hmmm.... the output shouldn't change that much.