Hello all! I am a Senior Production Analyst at the National Weather Service Central Operations in the United States. My team uses ecFlow to manage our operational suite for all our numerical models and directly related products. Thank you for maintaining this software so we can effectively and reliably deliver our products.
I have a design problem that we run into frequently. We need to deliver forecast products while large, long-running forecast models are still in progress. Our current solution is to run a "manager job" that waits for a target file to arrive, then calls ecflow_client --event to trigger a job that performs postprocessing. This works, but because of our supercomputer scheduler configuration, the manager job reserves an entire compute node for a 1-core task. We would like to avoid using a manager job so that we can conserve compute resources.
Similarly, some of our data arrives from external sources, and our jobs have to wait for those files to arrive. We use time triggers in this case, but file arrival times vary, so a fixed trigger time does not always match when the file is actually available.
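To make the manager-job pattern described above concrete, here is a minimal Python sketch. It is illustrative only and not the production script: the function names, polling interval, and timeout are hypothetical, and it assumes it runs inside an ecFlow job so that the environment expected by ecflow_client child commands (ECF_NAME, ECF_PASS, and so on) is already set.

```python
import os
import subprocess
import time


def wait_for_file(path, poll_seconds=30.0, timeout_seconds=3600.0):
    """Poll until `path` exists; return True if it appeared before the timeout."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll_seconds)
    # One last check in case the file arrived during the final sleep.
    return os.path.exists(path)


def set_event(event_name, run=subprocess.run):
    """Set an ecFlow event with the child command `ecflow_client --event=<name>`.

    `run` is injectable so the sketch can be exercised without an ecFlow server.
    """
    run(["ecflow_client", f"--event={event_name}"], check=True)


def manage(path, event_name, run=subprocess.run, **wait_kwargs):
    """Wait for the target file, then set the event that triggers postprocessing."""
    if wait_for_file(path, **wait_kwargs):
        set_event(event_name, run=run)
        return True
    return False
```

A call such as manage("/path/to/target_file", "post_ready") (both values are placeholders) would hold its batch slot for the entire wait, which is the resource cost described above.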
Does ECMWF have a standard approach for managing data and task dependencies that is more efficient than our current approaches? I can appreciate that parts of this discussion may not be directly related to ecFlow itself, so I would be happy to take the discussion elsewhere if more appropriate. Thank you for your time!