updating mse_ens and linting #2371

Open
wael-mika wants to merge 2 commits into ecmwf:develop from wael-mika:wm/mse_ens_loss

@wael-mika
Contributor

Description

Implemented mse_ens with the updated signature. It supports two options:
1- Apply the MSE to each ensemble member against the target, then take the mean over members (default).
2- Compute the mean over all ensemble members, then compute the MSE of that mean against the target (configurable).

Which option should be the default is to be decided after discussion.
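
The two options in the description can be illustrated with a minimal sketch. This is not the code from this PR: the function name `mse_ens_sketch`, the flag `use_ens_mean`, and the tensor layout `(num_ens, num_data_points, num_channels)` are assumptions made for illustration, and the `weights_channels` / `weights_points` arguments are left out for brevity.

```python
import torch


def mse_ens_sketch(
    preds: torch.Tensor,
    target: torch.Tensor,
    use_ens_mean: bool = False,  # hypothetical flag name
) -> torch.Tensor:
    """Illustrative ensemble MSE (assumed shapes, not the PR's implementation).

    preds  : shape (num_ens, num_data_points, num_channels)
    target : shape (num_data_points, num_channels), may contain NaNs
    """
    # Ignore entries where the target is missing.
    mask = ~torch.isnan(target)

    if use_ens_mean:
        # Option 2: average the members first, then compute one MSE.
        err = preds.mean(dim=0) - target
        return (err[mask] ** 2).mean()

    # Option 1 (default): MSE of each member against the target, then the mean.
    err = preds - target.unsqueeze(0)
    per_member = torch.stack([(e[mask] ** 2).mean() for e in err])
    return per_member.mean()


# Two members, two data points, one channel; the second target value is missing.
preds = torch.tensor([[[1.0], [2.0]], [[3.0], [2.0]]])
target = torch.tensor([[2.0], [float("nan")]])

loss_per_member = mse_ens_sketch(preds, target)
loss_ens_mean = mse_ens_sketch(preds, target, use_ens_mean=True)
```

In this toy case the two members sit symmetrically around the target, so the ensemble mean matches the target exactly while each individual member does not. The two options therefore return different values for the same input.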

Issue Number

Closes #2250

Is this PR a draft? Mark it as draft.

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a hedgedoc in the GitHub issue with all the configurations and runs for these experiments
  • I have informed and aligned with people impacted by my change:
    • for config changes: the MatterMost channels and/or a design doc
    • for changes of dependencies: the MatterMost software development channel

@github-actions github-actions bot added the labels "bug" (Something isn't working) and "model" (Related to model training or definition (not generic infra)) on May 18, 2026
@clessig
Collaborator

clessig commented May 18, 2026

I think option 1 should be the default since it is a well-defined, self-contained loss. Option 2 requires another loss term to control the higher-order moments.
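
One way to see how the two options relate is the standard bias-variance style identity below. The notation is an assumption introduced here, not taken from the PR: $x_m$ is the prediction of ensemble member $m$ out of $M$, $\bar{x}$ is the ensemble mean, and $y$ is the target at a single point and channel.

```latex
\frac{1}{M}\sum_{m=1}^{M} (x_m - y)^2
  \;=\; (\bar{x} - y)^2 \;+\; \frac{1}{M}\sum_{m=1}^{M} (x_m - \bar{x})^2,
\qquad
\bar{x} = \frac{1}{M}\sum_{m=1}^{M} x_m .
```

The left-hand side is the per-member loss (option 1) and the first term on the right is the ensemble-mean loss (option 2). The cross term vanishes because $\sum_m (x_m - \bar{x}) = 0$. The remaining term is the ensemble variance, which option 2 never sees, so the ensemble-mean loss alone places no constraint on the spread.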

Collaborator

@clessig clessig left a comment

Thanks for the implementation. I think we can make the code simpler by using the lp_loss or mse_loss functions.

weights_channels : shape (num_channels,) or None
weights_points : shape (num_data_points,) or None
"""
mask_nan = ~torch.isnan(target)
Collaborator

Please re-use the lp_loss or mse_loss functions below
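
A rough sketch of what such re-use could look like is given below. The repository's actual `lp_loss` and `mse_loss` signatures are not shown in this thread, so `mse_loss_standin` is a hypothetical stand-in and `mse_ens_reusing_helper` and `use_ens_mean` are illustrative names. The point is only that the NaN masking lives in one place and the ensemble logic reduces to a few lines.

```python
import torch


def mse_loss_standin(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for the repository's mse_loss (real signature unknown)."""
    # NaN handling is done once, inside the shared helper.
    mask = ~torch.isnan(target)
    return ((pred[mask] - target[mask]) ** 2).mean()


def mse_ens_reusing_helper(
    preds: torch.Tensor,
    target: torch.Tensor,
    use_ens_mean: bool = False,  # hypothetical flag name
) -> torch.Tensor:
    """Ensemble MSE that delegates to the shared helper.

    preds  : shape (num_ens, num_data_points, num_channels)  (assumed layout)
    target : shape (num_data_points, num_channels)
    """
    if use_ens_mean:
        # Loss of the ensemble mean against the target.
        return mse_loss_standin(preds.mean(dim=0), target)
    # Loss of each member against the target, averaged over members.
    return torch.stack([mse_loss_standin(p, target) for p in preds]).mean()


# Two members, two data points, one channel; the second target value is missing.
preds = torch.tensor([[[0.0], [5.0]], [[4.0], [5.0]]])
target = torch.tensor([[2.0], [float("nan")]])

loss_members = mse_ens_reusing_helper(preds, target)
loss_mean = mse_ens_reusing_helper(preds, target, use_ens_mean=True)
```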

Contributor Author

Sure, I will push an update

Contributor Author

What do you think is the correct way to control the spread when using option 2?

Labels

bug: Something isn't working
model: Related to model training or definition (not generic infra)

Development

Successfully merging this pull request may close these issues.

Bug report: mse_ens loss function has the old signature

2 participants