Skip to content

STM downstream simulation fcl updates#531

Merged
oksuzian merged 5 commits intoMu2e:mainfrom
YongyiBWu:main
Apr 2, 2026
Merged

STM downstream simulation fcl updates#531
oksuzian merged 5 commits intoMu2e:mainfrom
YongyiBWu:main

Conversation

@YongyiBWu
Copy link
Copy Markdown
Contributor

Addition of scripts to run the Score-Based Diffusion Model training and resampling for STM downstream simulations. Also fixed data product inclusion for the 2 data stream output propagating to VDs in the STM DS region.

YongyiBWu and others added 3 commits March 24, 2026 15:06
Add three new FCL job configuration files under JobConfig/pileup/STM to support training and generation for the VDResampler SBDM models.

Files added:
- VDResamplerConfigureTrainer.fcl: sets up VDResamplerConfigure analyzer (VirtualDetectorID 116) and training-related options (VDResamplerDir, fclDir, trainingThreshold, doROOTDump).
- VDResamplerGenerateFromModel.fcl: producer VDResamplerGenerateFromModel with placeholders for stage/all-at-once model files, useTwoStageModel, useHeun, diffusionSteps, and doROOTDump.
- VDResamplerGenerateMix.fcl: producer VDResamplerGenerateMix with ModelFileDir, placeholders for hitSummaryFiles/potsPerFile, useHeun, diffusionSteps, and doROOTDump.

All configs include standard prolog and services, reference the global constants file, and leave input/output file names nil for callers to populate.
@FNALbuild
Copy link
Copy Markdown
Collaborator

Hi @YongyiBWu,
You have proposed changes to files in these packages:

  • JobConfig

which require these tests: build.

@Mu2e/fnalbuild-users, @Mu2e/write have access to CI actions on main.

⌛ The following tests have been triggered for bd580c6: build (Build queue - API unavailable)

About FNALbuild. Code review on Mu2e/Offline.

@FNALbuild
Copy link
Copy Markdown
Collaborator

☀️ The build tests passed at bd580c6.

Test Result Details
test with Command did not list any other PRs to include
merge Merged bd580c6 at 9418fb0
build (prof) Log file. Build time: 04 min 11 sec
ceSimReco Log file.
g4test_03MT Log file.
transportOnly Log file.
POT Log file.
g4study Log file.
cosmicSimReco Log file.
cosmicOffSpill Log file.
ceSteps Log file.
ceDigi Log file.
muDauSteps Log file.
ceMix Log file.
rootOverlaps Log file.
g4surfaceCheck Log file.
FIXME, TODO TODO (0) FIXME (0) in 0 files
clang-tidy 0 errors 0 warnings

N.B. These results were obtained from a build of this Pull Request at bd580c6 after being merged into the base branch at 9418fb0.

For more information, please check the job page here.
Build artifacts are deleted after 5 days. If this is not desired, select Keep this build forever on the job page.

@oksuzian
Copy link
Copy Markdown
Collaborator

oksuzian commented Apr 1, 2026

I'll review this pull request to identify any issues.Here's my review of PR #531. The build CI passes ✅, but I found several issues worth flagging:


Issues Found

1. 🔴 Potentially Dangerous Wildcard Change in prolog.fcl (Data Product Filters)

# Before:
"keep mu2e::StepPointMCs_compressDetStepMCsSTM_*_*",
"keep mu2e::SimParticlemv_compressDetStepMCsSTM_*_*",

# After:
"keep mu2e::StepPointMCs_compressDetStepMCsSTM*_*_*",
"keep mu2e::SimParticlemv_compressDetStepMCsSTM*_*_*",

The underscore before the first * was removed, changing compressDetStepMCsSTM_ (exact module label match) to compressDetStepMCsSTM* (wildcard prefix match). This will now match any module label starting with compressDetStepMCsSTM, such as compressDetStepMCsSTMFoo, compressDetStepMCsSTM2, etc. While this is described as intentionally broadening to capture "2 data stream output," it could inadvertently keep unintended data products if any other modules share that prefix. It would be safer and more explicit to list the exact module labels being targeted.

2. 🟡 Empty/Placeholder Model File Paths in VDResamplerGenerateFromModel.fcl

stage1ModelFile : "" 
stage2ModelFile : "" 
allAtOnceModelFile : "" 

All three model file paths are empty strings. If someone runs this fcl without overriding these, it will either fail at runtime or silently produce bad results. Consider either:

  • Using @nil (like fileNames) to force an explicit override, or
  • Adding a comment warning that these must be set before use.

3. 🟡 Empty Lists in VDResamplerGenerateMix.fcl

hitSummaryFiles : []
potsPerFile : []

Similar to above — these are empty arrays that must be populated to be useful. Using @nil would be more consistent with the pattern used elsewhere in these files (e.g., fileNames: @nil) and would force the caller to provide values.

4. 🟡 Incorrect fclDir Path in VDResamplerConfigureTrainer.fcl

fclDir : "Production/pileup/STM"

The actual directory path in the repo is JobConfig/pileup/STM, not Production/pileup/STM. If this path is used to locate other fcl files relative to the repository root, it will fail. Verify whether this should be JobConfig/pileup/STM instead, or whether Production here refers to the package name used at runtime (in which case a clarifying comment would help).

5. 🟢 Minor: Copy-Pasted Comment Headers

All three new files share the same header comment:

# Provides a pass of the data and set up fcl's to train the SBDM models that 
# describe particle distributions at a selected VD.

This description only accurately describes VDResamplerConfigureTrainer.fcl. The other two files are for generation/resampling, not training setup. The comments should be updated to reflect each file's actual purpose.

6. 🟢 Minor: Commented-Out Parameters

VDResamplerGenerateFromModel.fcl and VDResamplerGenerateMix.fcl both have commented-out parameters (VDz0, VDr, VirtualDetectorID). If these are no longer needed, they should be removed. If they're optional overrides, a brief comment explaining when to uncomment them would be helpful.


Summary

Severity Issue File
🔴 High Wildcard broadening in keep rules may match unintended products prolog.fcl
🟡 Medium Empty model file paths should use @nil to force override VDResamplerGenerateFromModel.fcl
🟡 Medium Empty arrays should use @nil for consistency VDResamplerGenerateMix.fcl
🟡 Medium Possibly incorrect fclDir path VDResamplerConfigureTrainer.fcl
🟢 Low Inaccurate copy-pasted header comments All 3 new files
🟢 Low Commented-out parameters without explanation 2 new files

@YongyiBWu
Copy link
Copy Markdown
Contributor Author

YongyiBWu commented Apr 1, 2026

  1. The compressDetStepMCsSTM and compressDetStepMCsSTM116 (newly added) data products are specific to the STM simulations and are not used elsewhere. In each trigger path, by construction only one of the two is produced. In addition, there is a chance that additional trigger paths at other virtual detectors may be added; having a wildcard saves the potential problem of omitting those data products in the future. Hence I believe a wildcard is the best option here.
    2- 6. New commit pushed with added comments and suggested changes.

@oksuzian
Copy link
Copy Markdown
Collaborator

oksuzian commented Apr 2, 2026

Let me look at the PR details and the changes involved.Yes, there's a clear opportunity to simplify the three new .fcl files. They share a large amount of boilerplate — the source, services, and structural blocks are nearly identical. Here's what I'd suggest:

What's duplicated across all three files

All three new files (VDResamplerConfigureTrainer.fcl, VDResamplerGenerateFromModel.fcl, VDResamplerGenerateMix.fcl) repeat this identical block:

#include "Offline/STMMC/fcl/prolog.fcl"
#include "Offline/fcl/standardServices.fcl"

source : {
  module_type : RootInput
  fileNames: @nil
}
services : {
  message : @local::default_message
  GlobalConstantsService : {
    inputFile : "Offline/GlobalConstantsService/data/globalConstants_01.txt"
  }
}

# ... and at the bottom:
services.TFileService.fileName : @nil

How to simplify

1. Extract shared config into the existing prolog.fcl

Since JobConfig/pileup/STM/prolog.fcl already exists as a shared prolog for this directory, you could define reusable snippets there:

# Add these to the existing prolog:
VDResampler_source : {
  module_type : RootInput
  fileNames: @nil
}

VDResampler_services : {
  message : @local::default_message
  GlobalConstantsService : {
    inputFile : "Offline/GlobalConstantsService/data/globalConstants_01.txt"
  }
}

2. Then each job config becomes much shorter, for example:

#include "Production/JobConfig/pileup/STM/prolog.fcl"

process_name: VDResamplerConfigureTrainer

source   : @local::VDResampler_source
services : @local::VDResampler_services

physics: {
  analyzers : {
    VDResamplerTrainingSetup : {
      module_type : VDResamplerConfigure
      StepPointMCsTag : "compressDetStepMCsSTM:"
      SimParticlemvTag : "compressDetStepMCsSTM:"
      VirtualDetectorID : 116
      VDResamplerDir : "Offline/STMMC/VDResamplerFiles"
      fclDir : "Production/JobConfig/pileup/STM"
      dataSourceTag : MuBeam
      trainingThreshold : 100
      doROOTDump: true
    }
  }
  o1: [VDResamplerTrainingSetup]
  end_paths: [o1]
}

services.TFileService.fileName : @nil

The same pattern would apply to the other two files — replacing ~10 lines of boilerplate with 2 @local:: references each.

Additional observations

  • useHeun: true and diffusionSteps: 200 are shared defaults between VDResamplerGenerateFromModel.fcl and VDResamplerGenerateMix.fcl. These could also be factored into the prolog as a shared table if desired.
  • doROOTDump: true appears in all three files — if this is always the default, it could be part of a shared defaults table too.
  • The prolog.fcl change (the wildcard fix from compressDetStepMCsSTM_compressDetStepMCsSTM*_) is separate and should remain as-is.

This would reduce total added lines by roughly 30–40% and make future maintenance (e.g., changing the GlobalConstantsService path) a single-point edit.

@YongyiBWu
Copy link
Copy Markdown
Contributor Author

Source and services are edited as suggested. For now 'useHeun', 'diffusionSteps', and 'doROOTDump' are left in individual fcl's for easy testing purposes. At a later point a "VSResamplerSequence" method will be added as a common producer method for downstream use, which will provide a single point control for these flags.

@oksuzian oksuzian merged commit e4efa93 into Mu2e:main Apr 2, 2026
1 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants