generated from openclimatefix/ocf-template
-
-
Notifications
You must be signed in to change notification settings - Fork 47
Add optional t0 embedding features #385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
dfulu
commented
Dec 23, 2025
dfulu
commented
Dec 23, 2025
dfulu
commented
Dec 23, 2025
Sukh-P
reviewed
Jan 5, 2026
Sukh-P
reviewed
Jan 5, 2026
Sukh-P
reviewed
Jan 5, 2026
Sukh-P
approved these changes
Jan 5, 2026
Member
Sukh-P
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice to see this feature being added, great work!
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Pull Request
Context
Previously, I did an experiment which showed that giving the model a feature encoding whether t0 is HH:00 or HH:30 increased the model accuracy. The experiment was motivated by the fact the the NWP data is hourly and the model may want to learn to use the NWP differently at HH:00 and HH:00 t0 times. The experiment write up is here
PR description
This PR adds the option to include embeddings of the t0 time. I've tried to generalised the functionality so that we can choose the periodicity of the embedding features. i.e. we can encode how far through each hour t0 is. For 3-hourly NWP steps we can also generate hour far through each 3-hour period t0 is. We can also use it to make time-of-day (by adding "24h" in
periods) and day-of-year features (by adding "1y" inperiods).I've made it configurable so that we can choose a representation for each periodicity. For example, for time-of-day we likely want to sin-cos embed the times ("cyclic" in the code) since 11:59pm is close to 00:00am. But for time-past-the-hour features we likely likely want linear encodings where
HH:00 -> 0andHH:59 -> ~1. This time-past-the-hour feature is supposed to inform the network about how to interpolate the NWP data.We shift the NWP inputs by a full hour between HH:59 (HH+1):00. So in this way HH:00 is maximally dissimilar HH:59 - so we would want our time feature to reflect this.Extra notes
I did a couple of extra semi-related clean-ups in this PR that I couldn't resist. I've noted them
Future usage
I think it's likely that this function could replace the usage of
encode_datetimes()entirely. Our PVNet models are trained to predict for fixed set of horizons, so if we already have the time-of-day and day-of-year features of t0, there is no additional information in knowing the other forecast datetimes. Once we (or the model) know t0 we already know all forecast datetimes.I think we should try to move to sunset the
encode_datetimes()function but I've left it in for now. We'll need to do a little testing and I don't want to break backwards compatibility.How Has This Been Tested?
I've added tests for the exact numeric values which
get_t0_embedding()produced and checked that it works inside thePVNetDatasetclass.Checklist: