[ana6, com8]: Add ana6Optimisation Module, apply changes in com8MoTPSA#1245
RolandFischbacher wants to merge 35 commits into
Analysis for project
| Tool | Category | Rule | Count | |
|---|---|---|---|---|
| black | Style | Incorrect formatting, autoformat by running qlty fmt. | 5 | ❌ |
| ruff | Lint | Local variable e is assigned to but never used | 3 | ❌ |
| ripgrep | Lint | # TODO: consider using cfgHandling/createInfoDF instead | 2 | ❌ |
| ruff | Lint | avaframe\.runScripts\.runPlotAreaRefDiffs\.createArealIndicatorPickle imported but unused | 1 | ❌ |
| qlty | Structure | Function with many parameters (count = 12): plotAreaDiff | 2 | |
@qltysh one-click actions:
- Auto-fix formatting (qlty fmt && git push)
f66f702 to 38e42ca
Squash of 20 commits from RF_com8MoTPSA branch including:
- com8MoTPSA workflow improvements (chunked multiprocessing, path handling)
- Bayesian optimisation integration (ana6Optimisation module)
- Morris sensitivity analysis scripts
- AIMEC runout reference implementation
- probAna pickle saving and bounds
- Plotting and config improvements
Coverage Impact ⬇️ Merging this pull request will decrease total coverage on Modified Components (1)
Modified Files with Diff Coverage (2)
- Add bounds to paramValuesD in createSamplesWithVariation (StandardParameters)
- Add writing of visualisation scenario and sampling method to com8MoTPSACfg.ini
awirb left a comment
Comments for the optimisation part and the plotting are still missing.
| cfgAIMEC = cfgUtils.getModuleConfig(ana3AIMEC) | ||
| rasterTransfo, resAnalysisDF, plotDict, _, pathDict = ana3AIMEC.fullAimecAnalysis(avalancheDir, cfgAIMEC) |
Instead of passing the module, consider loading the config in the runScript before calling the function and passing only the config; the override then also becomes easier (using ana3AIMEC_ana3AIMEC_override). Or is there a special reason for passing the module?
Passing the module into the calcArealIndicatorsAndAIMEC function is not necessary, since the AIMEC settings are not overridden. I think not passing the module and loading the config here should be sufficient?
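The suggestion above (load the config in the run script, apply an override, then pass only the config) could be sketched roughly as follows. This is a minimal illustration using only the standard library `configparser`, not AvaFrame's own config handling: `makeCfg`, `applyOverrideSection`, `calcIndicators`, the `AIMECSETUP` section contents and the path are all hypothetical; only the override section name `ana3AIMEC_ana3AIMEC_override` is taken from the comment.

```python
import configparser


def makeCfg(text):
    """Build a case-preserving ConfigParser from an ini-formatted string."""
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # keep camelCase keys such as resType
    cfg.read_string(text)
    return cfg


def applyOverrideSection(cfgModule, cfgCaller, overrideSection):
    """Hypothetical helper: copy keys from the caller's override section into cfgModule.

    Only keys that already exist in a section of cfgModule are replaced.
    """
    if not cfgCaller.has_section(overrideSection):
        return cfgModule
    for key, value in cfgCaller[overrideSection].items():
        for section in cfgModule.sections():
            if key in cfgModule[section]:
                cfgModule[section][key] = value
    return cfgModule


def calcIndicators(avalancheDir, cfgAIMEC):
    """Stand-in for the analysis function: it receives a ready-made config, not a module."""
    return {"avalancheDir": avalancheDir, "resType": cfgAIMEC["AIMECSETUP"]["resType"]}


# In the run script: load the module config first (here from a string, values are placeholders)
cfgAIMEC = makeCfg("[AIMECSETUP]\nresType = ppr\nthresholdValue = 1\n")
cfgOpt = makeCfg("[ana3AIMEC_ana3AIMEC_override]\nresType = pft\n")

# Apply the override in the caller, then pass only the finished config on
cfgAIMEC = applyOverrideSection(cfgAIMEC, cfgOpt, "ana3AIMEC_ana3AIMEC_override")
result = calcIndicators("data/avaExample", cfgAIMEC)
```

With this layout the called function no longer needs to know which module the settings came from, and the caller decides whether an override is applied at all.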
| ) | ||
| raise ValueError(message) | ||
|
|
||
| paramLossSubsetDF = paramLossDF.sort_values(by='Loss', ascending=True)[:N] |
Why is the [:N] needed? If len(DF) is N, does it not just run from the start to the end?
It defines how many of the best-ranked Morris samples are used for the statistics (e.g. the parameter distribution within these topN samples). I changed the name to topN.
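To make the role of topN concrete, here is a small sketch of the slicing on a made-up table. The column names `simName` and `mu` and all values are assumptions for illustration only; `Loss` and the sort call follow the snippet under review.

```python
import pandas as pd

# Hypothetical loss table: one row per Morris sample (values are made up for illustration)
paramLossDF = pd.DataFrame(
    {
        "simName": ["sim0", "sim1", "sim2", "sim3", "sim4"],
        "mu": [0.10, 0.25, 0.40, 0.15, 0.30],
        "Loss": [0.42, 0.17, 0.63, 0.08, 0.29],
    }
)

topN = 3  # number of best-ranked (lowest-loss) samples to keep

# Same result as sort_values(by="Loss", ascending=True)[:topN]
paramLossSubsetDF = paramLossDF.sort_values(by="Loss", ascending=True).head(topN)

# Example statistic of one parameter within the topN subset
muMeanTopN = paramLossSubsetDF["mu"].mean()

# If topN exceeds the number of rows, the slice simply returns the whole sorted table
allRowsDF = paramLossDF.sort_values(by="Loss", ascending=True)[:10]
```

So the slice only has no effect when topN is at least the number of samples; otherwise it restricts the statistics to the lowest-loss rows.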
|
|
||
| def createDFParameterLoss(df, paramSelected): | ||
| """ | ||
| Create DataFrames linking selected parameters with the loss function. |
Does selected mean the parameters that were used for the parameter variation with the Morris sampling method?
selected depends on the scenario: if Morris is not run beforehand, selected means all parameters that were varied; if Morris is run beforehand, selected means only the topN most important parameters.
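The two meanings of selected could be expressed along these lines. This is only a sketch under assumptions: `selectParameters`, the parameter names and the importance values are hypothetical, and the ranking is assumed to be a dict mapping each parameter to some Morris importance measure where larger means more influential.

```python
def selectParameters(variedParams, morrisRanking=None, topN=None):
    """Return the parameters to optimise.

    variedParams: names of all parameters that were varied
    morrisRanking: optional dict {paramName: importance}; None if Morris was not run beforehand
    topN: number of most important parameters to keep when a ranking is available
    """
    if morrisRanking is None:
        # No prior Morris screening: every varied parameter is selected
        return list(variedParams)
    # Prior Morris screening: keep only the topN most important parameters
    ranked = sorted(morrisRanking, key=morrisRanking.get, reverse=True)
    return ranked[:topN]


# Placeholder parameter names and importance values
allSelected = selectParameters(["paramA", "paramB", "paramC"])
topSelected = selectParameters(
    ["paramA", "paramB", "paramC"],
    morrisRanking={"paramA": 0.1, "paramB": 0.9, "paramC": 0.5},
    topN=2,
)
```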
… 2 (1 more important),
…consistent with runOptimisationCfg.ini) and add comments.
| - The top-N most influential parameters are selected for optimisation. | ||
|
|
||
| Scenario 2 (Manual definition): | ||
| - No prior Morris screening. |
If I understood correctly, a Morris analysis could have been performed beforehand to decide which parameters should be considered in the optimisation and which ones do not have a strong effect on the loss function and are therefore not considered? So scenario 2 just means that initial simulations either have to be performed to start the optimisation or are taken from the ana4Prob run?
So, in contrast, in scenario 1 the simulations performed for the Morris analysis (using the Morris sampling) are used directly?
Yes, the statement is correct.
| def loadVariationData(cfgOpt, outDir, avaDir): | ||
| """ | ||
| Load parameter bounds and selected parameters for optimisation. Two execution modes are supported, controlled via | ||
| cfgOpt['PARAM_BOUNDS']['scenario']: |
In the description of the scenarios below, both say that the parameter bounds are read either from sa_parameter_bounds.pkl (scenario 1) or from paramValuesD.pickle created in runAna4ProbAna (scenario 2), so cfgOpt['PARAM_BOUNDS'] is not used? Consider mentioning already here that this relies on previous simulation runs performed using the Morris analysis or ana4ProbAna.
cfgOpt['PARAM_BOUNDS'] is used to determine which file is read, either sa_parameter_bounds.pkl or paramValuesD.pickle.
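The file selection described in the reply could look roughly like this. It is a sketch, not the code in optimisationUtils: `getBoundsFile` is a hypothetical helper, the scenario values "1" and "2" are an assumption about how the ini option is encoded, and the directory is a placeholder; only the two file names and the `PARAM_BOUNDS` / `scenario` keys come from the discussion.

```python
import configparser
import pathlib

# Assumed mapping from scenario value to the bounds file named in the docstring
BOUNDS_FILES = {
    "1": "sa_parameter_bounds.pkl",  # scenario 1: written by a prior Morris analysis
    "2": "paramValuesD.pickle",  # scenario 2: written by a prior ana4ProbAna run
}


def getBoundsFile(cfgOpt, inDir):
    """Return the path of the bounds file that belongs to the configured scenario."""
    scenario = cfgOpt["PARAM_BOUNDS"]["scenario"].strip()
    if scenario not in BOUNDS_FILES:
        raise ValueError("Unknown scenario %r, expected one of %s" % (scenario, sorted(BOUNDS_FILES)))
    return pathlib.Path(inDir) / BOUNDS_FILES[scenario]


cfgOpt = configparser.ConfigParser()
cfgOpt.read_string("[PARAM_BOUNDS]\nscenario = 1\n")
boundsFile = getBoundsFile(cfgOpt, "Outputs/example")  # placeholder directory
```

Either way the bounds themselves come from a previous run; the config option only picks which of the two files is loaded.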
| paramBounds, paramSelected = optimisationUtils.loadVariationData(cfgOpt, inDir, avalancheDir) | ||
|
|
||
| # Calculate Areal indicators and AIMEC and save the results in Outputs/ana3AIMEC and Outputs/out1Peak | ||
| optimisationUtils.calcArealIndicatorsAndAimec(cfgOpt, avalancheDir, ana3AIMEC) |
So for the areal indicators the settings are read from cfgOpt, and for AIMEC from the aimecCfg. Here we could use the override config functionality.
…IMECcfg.ini settings and not pass AIMEC module for calcArealIndicatorsAndAIMEC
…st runoutLineDiff comparison and only for one simulation not all always
| inputsDF.reset_index().merge(configurationDF, on=["simName", "modelType"]).set_index("index") | ||
| ) | ||
| configFound = True | ||
| except (NotADirectoryError, FileNotFoundError) as e: |
| ) | ||
| configFound = True | ||
| except (NotADirectoryError, FileNotFoundError) as e: | ||
| configFound = False |
| from avaframe.com1DFA import com1DFA | ||
| from avaframe.ana3AIMEC import ana3AIMEC | ||
| from avaframe.ana4Stats import probAna | ||
| from avaframe.runScripts.runPlotAreaRefDiffs import runPlotAreaRefDiffs, createArealIndicatorPickle |
| # inDir = pathlib.Path(avalancheDir, ("Outputs/%s/configurationFiles" % modName)) | ||
| # Read parameterSetDF | ||
| # paramSetDF = readParamSetDF(inDir, varParList) | ||
| # TODO: consider using cfgHandling/createInfoDF instead |
| inputsDF = ( | ||
| inputsDF.reset_index().merge(configurationDF, on=["simName", "modelType"]).set_index("index") | ||
| ) | ||
| except (NotADirectoryError, FileNotFoundError) as e: |
| keepColumnsCleaned = [kC for kC in keepColumns if kC in inputsDF.columns] | ||
| infoDF = inputsDF[keepColumnsCleaned + ["parameterSet"]] | ||
|
|
||
| # TODO: this is required in ana6Optimisation |
| indicatorDict, | ||
| simName, | ||
| cropFile=None, | ||
| Tversky="", |
| alpha, | ||
| beta, | ||
| allResults, | ||
| resType, |

This PR introduces a new optimisation module ana6Optimisation for com8MoTPSA and updates the simulation workflow.
The module ana6Optimisation includes:
New files in ana6Optimisation:
- runMorrisSA.py (configuration: runMorrisSACfg.ini)
- runPlotMorrisConvergence.py (uses runMorrisSACfg.ini)
- runOptimisation.py (configuration: runOptimisationCfg.ini)
- optimisationUtils.py
- README_ana6.md (contains usage instructions)
New file in out3Plot:
- outAna6Plots.py
Changed workflow of running com8MoTPSA: