[DRAFT] Mh/jk/diffusion full pipeline forecast by moritzhauschulz · Pull Request #2396 · ecmwf/WeatherGenerator

moritzhauschulz · 2026-05-20T16:51:46Z

Description

DRAFT PR to assess diff between my conditioning branch and the current main diffusion branch.

Issue Number

Is this PR a draft? Mark it as draft.

Checklist before asking for review

I have performed a self-review of my code
My changes comply with basic sanity checks:
- I have fixed formatting issues with ./scripts/actions.sh lint
- I have run unit tests with ./scripts/actions.sh unit-test
- I have documented my code and I have updated the docstrings.
- I have added unit tests, if relevant
I have tried my changes with data and code:
- I have run the integration tests with ./scripts/actions.sh integration-test
- (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
- (bigger changes and experiments) I have shared a hedgedoc in the GitHub issue with all the configurations and runs for these experiments
I have informed and aligned with people impacted by my change:
- for config changes: the MatterMost channels and/or a design doc
- for changes of dependencies: the MatterMost software development channel

MatKbauer

Two small modifications, but looks good. Currently training some models to see convergence.

MatKbauer · 2026-05-21T16:10:19Z

            tokens, posteriors = self.encoder.encoder(model_params=model_params, batch=batch)
+            shape = (len(batch), batch.get_num_steps(), *tokens.shape[1:])
+            tokens_multi = tokens.reshape(shape)
+            tokens = tokens_multi[:, -1]


Let's revert this back again
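
For reference, the reshape in the hunk above can be sketched with NumPy as a stand-in for the tensor library. This assumes the encoder returns tokens with batch and time flattened into the leading axis in sample-major order, which the diff suggests but does not state. All sizes and variable names below are illustrative.

```python
import numpy as np

# Hypothetical sizes: B samples, T time steps per sample, N tokens, D channels.
B, T, N, D = 2, 3, 4, 5

# Assumed encoder output: batch and time flattened into the leading axis,
# ordered sample-major (all steps of sample 0, then all steps of sample 1, ...).
tokens = np.arange(B * T * N * D, dtype=np.float32).reshape(B * T, N, D)

# Mirror of `shape = (len(batch), batch.get_num_steps(), *tokens.shape[1:])`.
shape = (B, T, *tokens.shape[1:])
tokens_multi = tokens.reshape(shape)  # [B, T, N, D]

# Keep only the final time step of every sample.
last_step = tokens_multi[:, -1]  # [B, N, D]
```

Under that layout assumption, `last_step[i]` is the same block as row `i * T + (T - 1)` of the flattened input.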

MatKbauer · 2026-05-21T16:25:29Z

+        # Reshape tokens to [B, T, ...]
+        tokens = tokens.reshape(shape)
+
+        if self.cf.get("fe_diffusion_model", False):


To allow unconditional and non-forecast conditional training, this check should be:

if self.cf.get("fe_diffusion_model_conditioning", None) == "forecast":
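
A minimal sketch of how the suggested check separates the training modes, using a plain dict in place of the real config object (an assumption; the project's config class is not shown in this thread). The helper name `uses_forecast_conditioning` and the three example configs are hypothetical.

```python
def uses_forecast_conditioning(cf: dict) -> bool:
    """Return True only when forecast conditioning is explicitly requested."""
    return cf.get("fe_diffusion_model_conditioning", None) == "forecast"


# Three hypothetical configs, all with the diffusion model enabled.
unconditional = {"fe_diffusion_model": True}
date_time = {"fe_diffusion_model": True, "fe_diffusion_model_conditioning": "date_time"}
forecast = {"fe_diffusion_model": True, "fe_diffusion_model_conditioning": "forecast"}
configs = (unconditional, date_time, forecast)

# The original check is true for all three configs...
old = [cf.get("fe_diffusion_model", False) for cf in configs]
# ...while the suggested check singles out the forecast case.
new = [uses_forecast_conditioning(cf) for cf in configs]
```

With the original gate, the forecast-specific branch would also run for unconditional and date/time-conditioned training. The suggested gate restricts it to the forecast mode.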

MatKbauer

Some more suggestions to make code more robust

MatKbauer · 2026-05-22T05:30:52Z

        self.streams = cf.streams
        self.rank = cf.rank
        self.world_size = cf.world_size
+        self.diffusion_model_conditioning = cf.fe_diffusion_model_conditioning


cf.get("fe...", None)

MatKbauer · 2026-05-22T05:32:38Z

            embedding_dim=self.embedding_dim, frequency_embedding_dim=self.frequency_embedding_dim
        )
-        self.datetime_embedder = DateTimeEncoder()
+        self.conditioning = self.cf.fe_diffusion_model_conditioning


self.cf.get("fe_diffusion_model_conditioning", None)
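
The difference between attribute access and `get` with a default matters for configs written before the new key existed. Below is a sketch with a small hypothetical `Config` class standing in for the real config object. How the real object behaves on missing keys is an assumption here.

```python
class Config(dict):
    """Hypothetical stand-in: a dict whose keys are also readable as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError as err:
            raise AttributeError(name) from err


# An older config that predates the conditioning option.
cf = Config(fe_diffusion_model=True)

# `get` falls back to the default instead of failing.
conditioning = cf.get("fe_diffusion_model_conditioning", None)

# Direct attribute access raises for the same config.
try:
    cf.fe_diffusion_model_conditioning
    attribute_access_failed = False
except AttributeError:
    attribute_access_failed = True
```

In this sketch `conditioning` ends up as `None`, which downstream code can treat as the unconditional case.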

MatKbauer · 2026-05-22T05:34:10Z

+        if self.cf.fe_diffusion_model_conditioning in ["date_time", "date", "time"]:
+            c = meta_info["ERA5"].params["timestamp"]
+        elif self.cf.fe_diffusion_model_conditioning == "forecast":
+            c = meta_info["ERA5"].params["conditioning_tokens"]          # X_{t-1} as conditioning (model.py extracts last step as target, passes second-to-last here)


self.conditioning in both places
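
One way to read this suggestion: resolve the conditioning mode once and dispatch on the stored value. The sketch below uses a plain dict for `params` and a hypothetical `select_conditioning` helper. The key names follow the diff above; the values are placeholders and everything else is assumed.

```python
TIME_MODES = ("date_time", "date", "time")


def select_conditioning(conditioning, params):
    """Pick the conditioning input for the given mode, or None if unconditional."""
    if conditioning in TIME_MODES:
        return params["timestamp"]
    if conditioning == "forecast":
        return params["conditioning_tokens"]
    return None


# Placeholder values standing in for the real timestamp and token tensors.
params = {"timestamp": "placeholder-timestamp", "conditioning_tokens": [0.1, 0.2]}

c_time = select_conditioning("date", params)
c_forecast = select_conditioning("forecast", params)
c_none = select_conditioning(None, params)
```

Keeping the mode in one attribute means the training and inference paths cannot drift apart on which config key they read.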

MatKbauer · 2026-05-22T05:39:00Z

-        # Extract conditioning from meta_info (same as training_forward)
+        # Extract conditioning (mirrors training_forward).
        c = None
+        if self.cf.fe_diffusion_model_conditioning in ["date_time", "date", "time"]:


self.conditioning

MatKbauer · 2026-05-22T05:45:44Z

+                            qk_norm_type=self.cf.get("qk_norm_type", self.cf.norm_type),
+                            norm_eps=self.cf.norm_eps,
+                            attention_dtype=get_dtype(self.cf.attention_dtype),
+                            is_dit=self.cf.fe_diffusion_model,


Let's also make this a self.cf.get("fe_diffusion_model", False)

moritzhauschulz added 21 commits May 18, 2026 19:18

bug fix

366ae77

bug fix

dc6d82d

config change

aaeb073

plot config

bce064d

plot config

77f6f43

update stage handling in diffusion

140d1ca

re-implement conditioning, update adalayernorm and embedding function

825f841

remove debugging tool

dd97830

implement time only / day only conditioning

77d5e0a

date_time conditioning

11c9e6a

activate swiglu, xsa

b6ace25

initial commit with data flow for the forecast conditioning (conditioning not implemented)

4dca300

offset 1

d2e7b5d

bug fix from merge

fde1230

change ada_ln argument passing

3f89769

naive implementation of conditioning via concatenation

d74029d

remove CLAUDE.md

8a1b698

implemented cross-attn in fe engine

1844cbf

removed concatenation option

372bab4

date in config

13560fc

comment in config

66ff754

github-project-automation Bot added this to WeatherGen-dev May 20, 2026

github-actions Bot added the model Related to model training or definition (not generic infra) label May 20, 2026

moritzhauschulz added 6 commits May 20, 2026 19:03

minor improvements

7369439

assert offset zero

810987a

roll back data flow (not working)

58bf2d4

cleanup rollback

9b652f4

inter commit

27a3b1b

fixes – forecast + cross_attn should run now

8534dd2

MatKbauer reviewed May 21, 2026

moritzhauschulz added 2 commits May 21, 2026 23:37

apply PR review comments

fb57fb9

additional check for num_input_steps

5ec9446

MatKbauer reviewed May 22, 2026

moritzhauschulz wants to merge 29 commits into ecmwf:jk/develop/diffusion-full-pipeline from moritzhauschulz:mh/jk/diffusion-full-pipeline-forecast
