16 global spatial embedding #48
Open
meiertgrootes wants to merge 7 commits into
added 5 commits
May 28, 2026 01:24
…e choice to train.
This pull request adds fully sphere-aware geo-position and scale encoding for patches.
Geo position is encoded using real-valued spherical harmonics as the basis.
To create the embeddings, real-valued spherical harmonics are evaluated at the lat/lon positions of the input pixels (i.e. at native resolution) up to a user-defined order L. This results in an (L+1)^2-dimensional embedding vector per pixel. L ~ 10 (121 dimensions) should be sufficient.
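For reviewers who want to sanity-check the dimensionality, here is a minimal sketch of such an embedding. It is not the code in this PR: the function name `real_sph_harm_embedding`, the orthonormal normalisation, the omission of the Condon-Shortley phase, and the ordering of the components are all assumptions made for illustration.

```python
import math
import numpy as np

def real_sph_harm_embedding(lat_deg, lon_deg, L):
    """Real spherical harmonics up to order L at the given lat/lon (degrees).

    Returns an array of shape (..., (L + 1) ** 2). Hypothetical helper:
    orthonormal convention, no Condon-Shortley phase.
    """
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    x = np.sin(lat)  # cos(colatitude)
    s = np.cos(lat)  # sin(colatitude), non-negative for valid latitudes

    # Associated Legendre functions P_l^m(x) via the standard recursion.
    P = {}
    for m in range(L + 1):
        dfact = 1.0  # (2m - 1)!!
        for k in range(1, 2 * m, 2):
            dfact *= k
        P[(m, m)] = dfact * s ** m
        if m + 1 <= L:
            P[(m + 1, m)] = x * (2 * m + 1) * P[(m, m)]
        for l in range(m + 2, L + 1):
            P[(l, m)] = ((2 * l - 1) * x * P[(l - 1, m)]
                         - (l + m - 1) * P[(l - 2, m)]) / (l - m)

    feats = []
    for l in range(L + 1):
        for m in range(-l, l + 1):
            am = abs(m)
            norm = math.sqrt((2 * l + 1) / (4 * math.pi)
                             * math.factorial(l - am) / math.factorial(l + am))
            if m == 0:
                feats.append(norm * P[(l, 0)])
            elif m > 0:
                feats.append(math.sqrt(2) * norm * P[(l, am)] * np.cos(am * lon))
            else:
                feats.append(math.sqrt(2) * norm * P[(l, am)] * np.sin(am * lon))
    return np.stack(feats, axis=-1)
```

Calling `real_sph_harm_embedding(lat, lon, 10)` on arrays of pixel coordinates gives a trailing dimension of 121. A convention-independent check is the addition theorem: the squared components of each vector sum to (L+1)^2 / (4π).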
Subsequently, a sphere-aware, area-weighted PCA is performed on the native-resolution SH embedding grid, with the requirement that the target dimension of the PCA (`sh_embed_dim`) be smaller than (L+1)^2. The leading `sh_embed_dim` PCA components, ranked by explained variance, are retained as basis functions. The SH embedding vectors are then reprojected onto this basis and scaled to zero mean and unit variance, with tanh-based soft clipping at ~3 sigma to suppress pathological outliers in the high-order harmonics.
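The PCA step can be sketched as follows. This is an illustration under stated assumptions rather than the PR's implementation: pixel area weights are taken to be proportional to cos(latitude), as for a regular lat/lon grid, the soft clip is assumed to have the form `c * tanh(z / c)` with `c = 3`, and the name `area_weighted_pca` is hypothetical.

```python
import numpy as np

def area_weighted_pca(sh, lat_deg, n_components, clip_sigma=3.0, eps=1e-12):
    """Area-weighted PCA of per-pixel SH embeddings (illustrative sketch).

    sh:      (n_pixels, D) SH embedding vectors
    lat_deg: (n_pixels,) pixel latitudes in degrees
    Returns the standardised, soft-clipped projections (n_pixels, n_components)
    and the retained basis (D, n_components).
    """
    sh = np.asarray(sh, dtype=float)
    D = sh.shape[1]
    if n_components >= D:
        raise ValueError("n_components must be smaller than the SH dimension")

    # Assumed area weights for a regular lat/lon grid, normalised to sum to 1.
    w = np.cos(np.radians(np.asarray(lat_deg, dtype=float)))
    w = w / w.sum()

    mean = w @ sh                               # weighted mean, shape (D,)
    centered = sh - mean
    cov = (centered * w[:, None]).T @ centered  # weighted covariance, (D, D)

    evals, evecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_components]
    basis = evecs[:, order]

    # Projections have zero weighted mean by construction; their weighted
    # variance equals the corresponding eigenvalue.
    proj = centered @ basis
    z = proj / np.sqrt(np.maximum(evals[order], eps))

    # Soft clip: close to identity for |z| << clip_sigma, saturating beyond it.
    return clip_sigma * np.tanh(z / clip_sigma), basis
```

The tanh form keeps values near zero almost unchanged while bounding every component to the interval [-3, 3], which is one way to read "soft clipping at ~3 sigma".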
For each patch, a patch position embedding is then constructed as the area-weighted mean of the token/pixel embeddings within the patch.
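A minimal sketch of this pooling step, assuming a regular lat/lon grid with non-overlapping rectangular patches and cos(latitude) area weights. The function name and the array layout are assumptions, not taken from the PR.

```python
import numpy as np

def patch_position_embedding(pixel_emb, lat_deg, patch_size):
    """Area-weighted mean of pixel embeddings per patch (illustrative sketch).

    pixel_emb:  (H, W, D) per-pixel embeddings
    lat_deg:    (H,) latitude of each pixel row in degrees
    patch_size: (ph, pw), must divide (H, W) exactly
    Returns an array of shape (H // ph, W // pw, D).
    """
    pixel_emb = np.asarray(pixel_emb, dtype=float)
    H, W, D = pixel_emb.shape
    ph, pw = patch_size
    if H % ph or W % pw:
        raise ValueError("patch size must divide the grid size")

    # Assumed area weight per pixel: cos(latitude), constant along each row.
    w = np.cos(np.radians(np.asarray(lat_deg, dtype=float)))[:, None] * np.ones((1, W))

    emb = pixel_emb.reshape(H // ph, ph, W // pw, pw, D)
    wt = w.reshape(H // ph, ph, W // pw, pw, 1)
    return (emb * wt).sum(axis=(1, 3)) / wt.sum(axis=(1, 3))
```

For small patches away from the poles the weights barely vary and the result is close to a plain mean; the weighting matters for patches that span a large latitude range.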
In addition, a scale embedding is constructed for each patch, consisting of: the physical extent of the patch in the lat and lon directions, the patch area, the anisotropy of the patch extent, the pixel scale in physical units [m] in lat and lon, the pixel anisotropy and isotropized linear pixel scale, and finally the effective harmonic order cutoff in lat and lon.
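The ten quantities listed above could be computed along the following lines. The exact definitions are not spelled out in this description, so everything here is an assumption for illustration: anisotropy is taken as the lon/lat ratio, the isotropized scale as the geometric mean of the two pixel scales, and the harmonic cutoff as the Nyquist-like estimate π R / (pixel scale). The function name and the Earth radius constant are likewise hypothetical choices.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres (assumed constant)

def patch_scale_embedding(lat_min, lat_max, lon_min, lon_max, n_lat, n_lon):
    """10-dimensional scale descriptor for a lat/lon patch (illustrative sketch).

    Bounds are in degrees; n_lat and n_lon are the pixel counts per direction.
    """
    R = EARTH_RADIUS_M
    dlat = math.radians(lat_max - lat_min)
    dlon = math.radians(lon_max - lon_min)
    lat_c = math.radians(0.5 * (lat_min + lat_max))

    extent_lat = R * dlat                     # [m]
    extent_lon = R * math.cos(lat_c) * dlon   # [m], measured at the patch centre
    # Exact area of a lat/lon rectangle on a sphere [m^2].
    area = R ** 2 * dlon * (math.sin(math.radians(lat_max))
                            - math.sin(math.radians(lat_min)))
    patch_aniso = extent_lon / extent_lat     # assumed definition

    pix_lat = extent_lat / n_lat              # [m]
    pix_lon = extent_lon / n_lon              # [m]
    pix_aniso = pix_lon / pix_lat             # assumed definition
    pix_iso = math.sqrt(pix_lat * pix_lon)    # geometric mean, assumed definition

    # Assumed cutoff: shortest resolvable wavelength is two pixels.
    l_cut_lat = math.pi * R / pix_lat
    l_cut_lon = math.pi * R / pix_lon

    return [extent_lat, extent_lon, area, patch_aniso,
            pix_lat, pix_lon, pix_aniso, pix_iso, l_cut_lat, l_cut_lon]
```

These raw values span many orders of magnitude, so some normalisation (for example a log transform) before the projection MLP is probably worth considering; whether the PR does this is not stated here.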
Both embeddings are precalculated once for all patches.
The 10-dimensional scale embedding is concatenated with the geo position embedding for each patch. During training, a trainable MLP projects the concatenated embeddings into the desired embedding dimension so that they can be added to the patch embeddings.
Note: it makes sense to choose the hidden dimension of the projection larger than both the dimension of the concatenated embeddings and the target embedding dimension.
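To make the shapes concrete, here is a forward-pass sketch of such a projection in plain NumPy. It assumes a single hidden layer with ReLU and He-style initialisation; the actual module in this PR lives in the training framework and may differ. The names `init_projection_mlp` and `project_geo_embedding` are hypothetical.

```python
import numpy as np

def init_projection_mlp(in_dim, hidden_dim, out_dim, seed=0):
    """Randomly initialised two-layer MLP parameters (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(0.0, np.sqrt(2.0 / in_dim), (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, np.sqrt(2.0 / hidden_dim), (hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }

def project_geo_embedding(params, pos_emb, scale_emb):
    """Concatenate position and scale embeddings and project them.

    pos_emb:   (n_patches, sh_embed_dim)
    scale_emb: (n_patches, 10)
    Returns an array of shape (n_patches, out_dim).
    """
    x = np.concatenate([pos_emb, scale_emb], axis=-1)
    h = np.maximum(x @ params["w1"] + params["b1"], 0.0)  # ReLU (assumed)
    return h @ params["w2"] + params["b2"]
```

With, say, `sh_embed_dim = 16`, the input dimension is 26; a hidden dimension of 128 and a target dimension of 64 would satisfy the note above, and the result is added element-wise to the `(n_patches, 64)` patch embeddings.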