@dfulu dfulu commented Dec 23, 2025

Pull Request

Context

Previously, I ran an experiment which showed that giving the model a feature encoding whether t0 is HH:00 or HH:30 increased the model accuracy. The experiment was motivated by the fact that the NWP data is hourly, so the model may want to learn to use the NWP differently at HH:00 and HH:30 t0 times. The experiment write-up is here

PR description

This PR adds the option to include embeddings of the t0 time. I've tried to generalise the functionality so that we can choose the periodicity of the embedding features, i.e. we can encode how far through each hour t0 is. For 3-hourly NWP steps we can also encode how far through each 3-hour period t0 is. We can also use it to make time-of-day features (by adding "24h" in periods) and day-of-year features (by adding "1y" in periods).

I've made it configurable so that we can choose a representation for each periodicity. For example, for time-of-day we likely want to sin-cos embed the times ("cyclic" in the code), since 11:59pm is close to 00:00. But for time-past-the-hour features we likely want linear encodings where HH:00 -> 0 and HH:59 -> ~1. This time-past-the-hour feature is supposed to inform the network about how to interpolate the NWP data. We shift the NWP inputs by a full hour between HH:59 and (HH+1):00, so in this sense HH:00 is maximally dissimilar to HH:59, and we want our time feature to reflect this.
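
To make the difference between the two representations concrete, here is a minimal sketch. It is not the code in this PR: `sketch_t0_embedding()`, `period_seconds()`, the "linear" label and the hours-only period parsing are all assumptions made for illustration (only "cyclic" and the "24h"-style period strings come from the description above), and day-of-year ("1y") is left out because a year is not a fixed number of seconds.

```python
import math
from datetime import datetime

# Reference point for measuring elapsed time. It is midnight, so it lines up
# with every whole-hour period used below.
EPOCH = datetime(1970, 1, 1)


def period_seconds(period):
    """Parse a period such as "1h" or "24h" into seconds (hours only, hypothetical)."""
    if not period.endswith("h"):
        raise ValueError(f"unsupported period: {period!r}")
    return int(period[:-1]) * 3600


def sketch_t0_embedding(t0, periods, representations):
    """Hypothetical stand-in for illustration, not the PR's get_t0_embedding()."""
    elapsed = (t0 - EPOCH).total_seconds()
    features = []
    for period, representation in zip(periods, representations):
        length = period_seconds(period)
        # Fraction of the way through the current period, in [0, 1)
        frac = (elapsed % length) / length
        if representation == "linear":
            # Jumps from ~1 back to 0 at the period boundary
            features.append(frac)
        elif representation == "cyclic":
            # Sin-cos pair, continuous across the period boundary
            angle = 2 * math.pi * frac
            features.extend([math.sin(angle), math.cos(angle)])
        else:
            raise ValueError(f"unknown representation: {representation!r}")
    return features


t0 = datetime(2025, 12, 23, 10, 30)
embedding = sketch_t0_embedding(t0, ["1h", "24h"], ["linear", "cyclic"])
```

Here the first element of `embedding` is the linear time-past-the-hour feature (0.5 for HH:30) and the remaining two are the sin-cos time-of-day pair. The linear feature is deliberately discontinuous at the hour boundary, which matches the full-hour shift in the NWP inputs, while the cyclic pair keeps 11:59pm and 00:00 close together.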

Extra notes

I did a couple of extra semi-related clean-ups in this PR that I couldn't resist. I've noted them in the review comments below.

Future usage

I think it's likely that this function could replace the usage of encode_datetimes() entirely. Our PVNet models are trained to predict for a fixed set of horizons, so if we already have the time-of-day and day-of-year features of t0, there is no additional information in knowing the other forecast datetimes. Once we (or the model) know t0 we already know all forecast datetimes.
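
As a small illustration of that last point, the sketch below shows that with a fixed set of horizons every forecast datetime is just t0 plus a constant offset. The horizon values here (30-minute steps out to 2 hours) are made up for the example and are not taken from any PVNet config.

```python
from datetime import datetime, timedelta

t0 = datetime(2025, 12, 23, 10, 30)

# Hypothetical fixed forecast horizons: 30-minute steps out to 2 hours
horizons = [timedelta(minutes=30 * step) for step in range(1, 5)]

# Every forecast datetime is fully determined by t0 and the fixed horizons
forecast_datetimes = [t0 + horizon for horizon in horizons]
```

Since the offsets never change between samples, features of the forecast datetimes are a deterministic function of t0 and carry no extra information beyond the t0 features.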

I think we should try to sunset the encode_datetimes() function, but I've left it in for now. We'll need to do a little testing and I don't want to break backwards compatibility.

How Has This Been Tested?

I've added tests for the exact numeric values which get_t0_embedding() produces and checked that it works inside the PVNetDataset class.

Checklist:

  • My code follows OCF's coding style guidelines
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked my code and corrected any misspellings

# Add datetime features
datetimes = pd.DatetimeIndex(da_generation.time_utc.values)
datetime_features = encode_datetimes(datetimes=datetimes)
# Add datetime features

@dfulu dfulu Dec 23, 2025

This is slightly unrelated to the main PR here. I wanted to decouple the datetime features from the generation dataarray, the same way the solar features are decoupled.

This is really just a bit of a clean up

# Only add solar position if explicitly configured
if self.config.input_data.solar_position is not None:
    solar_config = self.config.input_data.solar_position
# Only add solar position if explicitly configured

@dfulu dfulu Dec 23, 2025

This is another semi-related clean-up I spotted along the way. Adding the solar features should not depend on the generation data being there.

@dfulu dfulu changed the title T0 features t0 embedding features Dec 23, 2025
@dfulu dfulu changed the title t0 embedding features Add t0 embedding features Dec 23, 2025
@dfulu dfulu changed the title Add t0 embedding features Add optional t0 embedding features Dec 23, 2025