feat!: fetch only specific bands instead of all bands#251
feat!: fetch only specific bands instead of all bands#251
Conversation
| &self, | ||
| x: usize, | ||
| y: usize, | ||
| bands: Option<Range<usize>>, |
There was a problem hiding this comment.
- maybe it makes sense to create an options object for cleaner forwards compatibility
- This should probably be a
Vec<usize>instead of a Range. There's no requirement for the bands to be contiguous.
There was a problem hiding this comment.
yep, agree with 2. Can you elaborate on 1? What's an options object?
There was a problem hiding this comment.
I just mean something like
struct FetchOptions {
bands: Option<Vec<usize>>,
}so that we can add more options in the future if necessary without changing the primary function signature here again.
There was a problem hiding this comment.
Right, I was thinking something like that with a Default impl trait (that returns all bands by default) so Option<> won't be needed (unsure if it's possible since bands are runtime determined), or using a builder pattern like .fetch_tiles(x, y, ...).with_bands(0..3).build().
Ok to go with FetchOptions, since we can reuse it for .fetch_tiles() as well.
There was a problem hiding this comment.
Yeah we can implement Default for FetchOptions. But right I think we'll still need to represent bands: None to support all bands when those band indices aren't known till later.
Yes we can reuse it for fetch_tiles and I think we can ignore the niche case where someone might want to fetch different bands for each tile in fetch_tiles.
Change from passing in a Range<usize> to a Vec<usize> so specific bands can be indexed.
Create a FetchOptions struct that allows passing specific parameters (currently only `bands`) when doing a fetch_tile operation. Keep the existing option of passing only None to do the default action of fetching of all bands.
With some special handling to handle 0-based indexing.
Use the same FetchOptions struct, included another unit test using the eox_cloudless.tif data as test case.
23a18f6 to
7a3ca77
Compare
| &self, | ||
| x: usize, | ||
| y: usize, | ||
| fetch_options: Option<FetchOptions>, |
There was a problem hiding this comment.
Continuing from #251 (comment):
Yeah we can implement
DefaultforFetchOptions. But right I think we'll still need to representbands: Noneto support all bands when those band indices aren't known till later.Yes we can reuse it for
fetch_tilesand I think we can ignore the niche case where someone might want to fetch different bands for each tile infetch_tiles.
I went with Option<FetchOptions> here, which makes it easier to pass just None if wanting the default all bands instead of instantiating something like FetchOptions {bands: None}, but not sure how I feel about this tbh.
Almost thinking if we should just go back to plain bands: Option<Vec<usize>>. Realistically speaking, are there other fetch related configuration options we might need to handle later to require future-proofing the API like this?
There was a problem hiding this comment.
Realistically speaking, are there other fetch related configuration options we might need to handle later to require future-proofing the API like this?
Rust doesn't have keyword arguments, only positional arguments. In Python, I have a set of heuristics for when I define positional vs keyword-only arguments. In particular, as much as possible (at least for public facing APIs), arguments that are allowed to be positional should be plainly obvious what they do. E.g. in fetch_tile(x, y) it can be reasonably understood what fetch_tile(1, 2) means without having to look up the function definition or documentation.
Once you get further away from that, though, it becomes hard to know what fetch_tile(1, 2, None) means, without going back to look at what the named parameter for None was.
My suggestion above is essentially applying this same heuristic to Rust. Even if we don't expect more options (which I'm not confident in), then for public APIs like this it may still be better to have an "options object" which essentially acts as Python keyword-only parameters.
I went with
Option<FetchOptions>here, which makes it easier to pass justNoneif wanting the default all bands instead of instantiating something likeFetchOptions {bands: None}, but not sure how I feel about this tbh.
I'd suggest against it. I'd rather have FetchOptions and the user can pass in Default::default() if they want default options. Then it's more readable at the callsite that you're getting the default options, instead of having to recall that None means "use default options".
There was a problem hiding this comment.
I went with
Option<FetchOptions>here, which makes it easier to pass justNoneif wanting the default all bands instead of instantiating something likeFetchOptions {bands: None}, but not sure how I feel about this tbh.I'd suggest against it. I'd rather have
FetchOptionsand the user can pass inDefault::default()if they want default options. Then it's more readable at the callsite that you're getting the default options, instead of having to recall thatNonemeans "use default options".
Gotcha. I'll refactor this after #261. Using Option<FetchOptions> did sound weird, I need to change my Python-way of thinking that None means default 😆
|
I think after #261 this should become very simple and straightforward of just indexing into the ranges array for planar data. |
|
I went ahead and updated this PR given #261... but then I realized we have to make a choice... is our band indexing 0-based or 1-based 🙈 async-geotiff has adopted 1-based indexing to match GDAL/Rasterio, but I think it may be best here in async-tiff to use 0-based indexing, since we're trying to be generic for all TIFFs. And here the band numbers are used to index into a list of offsets, which for a general computer science person would obviously be 0-based. @weiji14 do you have any thoughts? |
|
Thanks for handling the merge conflict and adding the Python-bindings!
Yeah, would prefer 0-based here (math is easier that way). I secretly wanted to 👎 developmentseed/async-geotiff#14 but won't argue with the experts 🙈 |
|
I think given the geospatial ecosystem in Python is already extremely reliant on 1-based band indexing, it would be more disruptive to switch to 0-based band indexing there. But here I'm open to it. |
| fn fetch_tile<'py>( | ||
| &'py self, | ||
| py: Python<'py>, | ||
| x: usize, | ||
| y: usize, | ||
| z: usize, | ||
| ) -> PyResult<Bound<'py, PyAny>> { | ||
| let reader = self.reader.clone(); | ||
| let ifd = self | ||
| .ifds | ||
| .get(z) | ||
| .ok_or_else(|| PyIndexError::new_err(format!("No IFD found for z={z}")))? | ||
| .clone(); | ||
| future_into_py(py, async move { | ||
| let tile = ifd | ||
| .fetch_tile(x, y, reader.as_ref()) | ||
| .fetch_tile(x, y, reader.as_ref(), Default::default()) |
There was a problem hiding this comment.
Should the TIFF.fetch_tile() method also expose specific band fetching? Or limit it to be only on the IFD level?
There was a problem hiding this comment.
Yes it should expose band fetching; good catch
There was a problem hiding this comment.
I'm happy to just keep it at the IFD level to be honest. But I suppose you want to align with the Typescript implementation in deck.gl-raster?
There was a problem hiding this comment.
I think it makes sense to mirror the same options in both sets of fetch_tile. And it's a pretty simple option to pass along
And GDAL overview 0 is IFD level 1. I have internalized mental reminders to keep track. 🙏 |
@mdsumner you may be better served by using https://github.com/developmentseed/async-geotiff, if you're not already, because it automatically associates images with overviews and their masks, and makes it easy to access specific tiles in a geospatial context. |
What I am changing
ifd.fetch_tile(0, 0, &reader)toifd.fetch_tile(0, 0, None, &reader)How I did it
.fetch_tile().fetch_tiles()method.How you can test it
cargo testlocally.Related Issues