Skip to content

feat!: fetch only specific bands instead of all bands#251

Draft
weiji14 wants to merge 19 commits intomainfrom
specific_band_fetch
Draft

feat!: fetch only specific bands instead of all bands#251
weiji14 wants to merge 19 commits intomainfrom
specific_band_fetch

Conversation

@weiji14
Copy link
Copy Markdown
Member

@weiji14 weiji14 commented Feb 24, 2026

What I am changing

  • Enable fetching only specific bands from an IFD, instead of all bands, useful when data is band-interleaved and/or there are a large number of bands that don't need to be fetched
  • Considered a breaking change since call changes from ifd.fetch_tile(0, 0, &reader) to ifd.fetch_tile(0, 0, None, &reader)

How I did it

  • Initial implementation to fetch a range of bands by modifying .fetch_tile()
  • Refactor to allow fetching disjoint band indexes
  • Better handle errors when trying to fetch invalid bands (kinda done, but not ergonomic)
  • Propagate to .fetch_tiles() method.
  • Propagate to python bindings

How you can test it

  • Run cargo test locally.

Related Issues

Comment thread src/ifd.rs Outdated
&self,
x: usize,
y: usize,
bands: Option<Range<usize>>,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. maybe it makes sense to create an options object for cleaner forwards compatibility
  2. This should probably be a Vec<usize> instead of a Range. There's no requirement for the bands to be contiguous.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, agree with 2. Can you elaborate on 1? What's an options object?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just mean something like

struct FetchOptions {
    bands: Option<Vec<usize>>,
}

so that we can add more options in the future if necessary without changing the primary function signature here again.

Copy link
Copy Markdown
Member Author

@weiji14 weiji14 Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I was thinking something like that with a Default impl trait (that returns all bands by default) so Option<> won't be needed (unsure if it's possible since bands are runtime determined), or using a builder pattern like .fetch_tiles(x, y, ...).with_bands(0..3).build().

Ok to go with FetchOptions, since we can reuse it for .fetch_tiles() as well.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we can implement Default for FetchOptions. But right I think we'll still need to represent bands: None to support all bands when those band indices aren't known till later.

Yes we can reuse it for fetch_tiles and I think we can ignore the niche case where someone might want to fetch different bands for each tile in fetch_tiles.

Change from passing in a Range<usize> to a Vec<usize> so specific bands can be indexed.
Create a FetchOptions struct that allows passing specific parameters (currently only `bands`) when doing a fetch_tile operation. Keep the existing option of passing only None to do the default action of fetching of all bands.
With some special handling to handle 0-based indexing.
Use the same FetchOptions struct, included another unit test using the eox_cloudless.tif data as test case.
@weiji14 weiji14 force-pushed the specific_band_fetch branch from 23a18f6 to 7a3ca77 Compare March 2, 2026 05:30
Comment thread src/ifd.rs Outdated
&self,
x: usize,
y: usize,
fetch_options: Option<FetchOptions>,
Copy link
Copy Markdown
Member Author

@weiji14 weiji14 Mar 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuing from #251 (comment):

Yeah we can implement Default for FetchOptions. But right I think we'll still need to represent bands: None to support all bands when those band indices aren't known till later.

Yes we can reuse it for fetch_tiles and I think we can ignore the niche case where someone might want to fetch different bands for each tile in fetch_tiles.

I went with Option<FetchOptions> here, which makes it easier to pass just None if wanting the default all bands instead of instantiating something like FetchOptions {bands: None}, but not sure how I feel about this tbh.

Almost thinking if we should just go back to plain bands: Option<Vec<usize>>. Realistically speaking, are there other fetch related configuration options we might need to handle later to require future-proofing the API like this?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realistically speaking, are there other fetch related configuration options we might need to handle later to require future-proofing the API like this?

Rust doesn't have keyword arguments, only positional arguments. In Python, I have a set of heuristics for when I define positional vs keyword-only arguments. In particular, as much as possible (at least for public facing APIs), arguments that are allowed to be positional should be plainly obvious what they do. E.g. in fetch_tile(x, y) it can be reasonably understood what fetch_tile(1, 2) means without having to look up the function definition or documentation.

Once you get further away from that, though, it becomes hard to know what fetch_tile(1, 2, None) means, without going back to look at what the named parameter for None was.

My suggestion above is essentially applying this same heuristic to Rust. Even if we don't expect more options (which I'm not confident in), then for public APIs like this it may still be better to have an "options object" which essentially acts as Python keyword-only parameters.

I went with Option<FetchOptions> here, which makes it easier to pass just None if wanting the default all bands instead of instantiating something like FetchOptions {bands: None}, but not sure how I feel about this tbh.

I'd suggest against it. I'd rather have FetchOptions and the user can pass in Default::default() if they want default options. Then it's more readable at the callsite that you're getting the default options, instead of having to recall that None means "use default options".

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with Option<FetchOptions> here, which makes it easier to pass just None if wanting the default all bands instead of instantiating something like FetchOptions {bands: None}, but not sure how I feel about this tbh.

I'd suggest against it. I'd rather have FetchOptions and the user can pass in Default::default() if they want default options. Then it's more readable at the callsite that you're getting the default options, instead of having to recall that None means "use default options".

Gotcha. I'll refactor this after #261. Using Option<FetchOptions> did sound weird, I need to change my Python-way of thinking that None means default 😆

@kylebarron
Copy link
Copy Markdown
Member

I think after #261 this should become very simple and straightforward of just indexing into the ranges array for planar data.

@kylebarron
Copy link
Copy Markdown
Member

I went ahead and updated this PR given #261... but then I realized we have to make a choice... is our band indexing 0-based or 1-based 🙈

async-geotiff has adopted 1-based indexing to match GDAL/Rasterio, but I think it may be best here in async-tiff to use 0-based indexing, since we're trying to be generic for all TIFFs. And here the band numbers are used to index into a list of offsets, which for a general computer science person would obviously be 0-based.

@weiji14 do you have any thoughts?

@weiji14
Copy link
Copy Markdown
Member Author

weiji14 commented Mar 3, 2026

Thanks for handling the merge conflict and adding the Python-bindings!

async-geotiff has adopted 1-based indexing to match GDAL/Rasterio, but I think it may be best here in async-tiff to use 0-based indexing, since we're trying to be generic for all TIFFs. And here the band numbers are used to index into a list of offsets, which for a general computer science person would obviously be 0-based.

@weiji14 do you have any thoughts?

Yeah, would prefer 0-based here (math is easier that way). I secretly wanted to 👎 developmentseed/async-geotiff#14 but won't argue with the experts 🙈

@kylebarron
Copy link
Copy Markdown
Member

I think given the geospatial ecosystem in Python is already extremely reliant on 1-based band indexing, it would be more disruptive to switch to 0-based band indexing there. But here I'm open to it.

Comment thread python/src/tiff.rs
Comment on lines 86 to +101
fn fetch_tile<'py>(
&'py self,
py: Python<'py>,
x: usize,
y: usize,
z: usize,
) -> PyResult<Bound<'py, PyAny>> {
let reader = self.reader.clone();
let ifd = self
.ifds
.get(z)
.ok_or_else(|| PyIndexError::new_err(format!("No IFD found for z={z}")))?
.clone();
future_into_py(py, async move {
let tile = ifd
.fetch_tile(x, y, reader.as_ref())
.fetch_tile(x, y, reader.as_ref(), Default::default())
Copy link
Copy Markdown
Member Author

@weiji14 weiji14 Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the TIFF.fetch_tile() method also expose specific band fetching? Or limit it to be only on the IFD level?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it should expose band fetching; good catch

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to just keep it at the IFD level to be honest. But I suppose you want to align with the Typescript implementation in deck.gl-raster?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to mirror the same options in both sets of fetch_tile. And it's a pretty simple option to pass along

@mdsumner
Copy link
Copy Markdown

I think given the geospatial ecosystem in Python is already extremely reliant on 1-based band indexing, it would be more disruptive to switch to 0-based band indexing there. But here I'm open to it.

And GDAL overview 0 is IFD level 1. I have internalized mental reminders to keep track. 🙏

@kylebarron
Copy link
Copy Markdown
Member

I think given the geospatial ecosystem in Python is already extremely reliant on 1-based band indexing, it would be more disruptive to switch to 0-based band indexing there. But here I'm open to it.

And GDAL overview 0 is IFD level 1. I have internalized mental reminders to keep track. 🙏

@mdsumner you may be better served by using https://github.com/developmentseed/async-geotiff, if you're not already, because it automatically associates images with overviews and their masks, and makes it easy to access specific tiles in a geospatial context.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants