[release/2.0] Updates for EPIC hosts #1856
…e-2.0' into feature/ursa-release-2.0
…e-2.0' into feature/gaea-release-2.0
…ease-2.0' into feature/derecho-release-2.0
climbfuji
left a comment
It's a bit unsatisfactory and confusing that these manual modifications to the modulepath / load order in the meta-modules are required for these platforms, but for none of the others (Acorn, NRL systems).
But ok, we can hopefully fix that for 2.1.0 so that no manual modifications are required.
Any particular reason this has the suffix DO NOT USE, whereas the one on Orion doesn't?
oneapi@2025.3.1 is not yet installed on hercules; that is expected to happen after the new year.
The pathing is all wrong (it's orion's pathing) but having the yaml file in place makes it easier to edit paths when oneapi@2025.3.1 is installed on hercules.
Why do you need to maintain two different gcc versions for Gaea C6 (and Derecho)?
We don't need to maintain two different versions on either host.
Because there was difficulty building envs with GNU on those hosts (and it's not a requirement on gaea-c6, however I was trying it out for the sake of confirming configurations), I was overly hopeful that one would work.
Not a problem to pick one version and remove the extraneous one.
Can you switch between this and the non-hpcx version with --compiler=oneapi-2025.2.1-hpcx and --compiler=oneapi-2025.2.1 ?
Indeed you can. 😄
Not sure I 100% understand the issue, but we've run into similar issues to what @rickgrubin-noaa described and resolved them at least in part by setting LMOD_TMOD_FIND_FIRST. On Acorn/WCOSS2 we also unset some paths to be sure to avoid loading the NCO-installed modules. Nobody (at EMC/NCO) ever agrees with me about adding hashes to the module versions to avoid this and other issues, but, it's an option :)
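To make the effect of LMOD_TMOD_FIND_FIRST concrete, here is a small Python sketch of the two lookup strategies Lmod documents for a bare module name: "find best" (highest version across every MODULEPATH directory) and "find first" (best version from the first directory that provides the name). It is a simplified model, not Lmod's implementation: it ignores default markers and assumes purely numeric dotted versions. The directory names and hdf5 versions are hypothetical.

```python
def version_key(version):
    """Turn '1.14.3' into (1, 14, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pick_module(name, modulepath, tree, find_first):
    """Return (directory, version) chosen for `name`, or None if absent.

    modulepath: list of directories, highest priority first.
    tree: hypothetical mapping {directory: {module name: [versions]}}.
    find_first: True models LMOD_TMOD_FIND_FIRST being set.
    """
    hits = [(d, tree[d][name]) for d in modulepath if name in tree.get(d, {})]
    if not hits:
        return None
    if find_first:
        # Only the first directory providing the name is considered.
        directory, versions = hits[0]
        return directory, max(versions, key=version_key)
    # Find best: highest version anywhere; earlier directories win ties.
    best = None
    for directory, versions in hits:
        for version in versions:
            if best is None or version_key(version) > version_key(best[1]):
                best = (directory, version)
    return best


# Hypothetical layout: the stack comes first in MODULEPATH,
# but the system tree carries a newer hdf5.
modulepath = ["/stack/modules", "/system/modules"]
tree = {
    "/stack/modules": {"hdf5": ["1.14.3"]},
    "/system/modules": {"hdf5": ["1.14.5"]},
}
```

Under these assumptions, the find-best lookup selects the system hdf5 even though the stack directory comes first, while the find-first lookup stays within the stack directory.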
We used to have |
I don't recall using |
Description
Updates to EPIC hosts (RDHPCS on premises, not yet CSPs) for spack-stack/release/2.0
Dependencies
None
Issues addressed
Working towards #1835
Applications affected
None
Systems affected
N.B. Once environments are installed on EPIC hosts, the following modulefiles require manual editing:
<env dir>/modules/Core/stack-<compiler>/<version>.lua
<env dir>/modules/<compiler>/stack-<mpi>/<version>.lua
In each module file, reverse the order of the following two stanzas:
-- spack compiler module hierarchy
-- prerequisite modules
RDHPCS hosts often (always?) provide modules for commonly used packages (e.g. hdf5, netcdf-c) that are also built within the stack. Loading system compiler / mpi modules after loading core stack components leads to confusion later on, as MODULEPATH will then necessarily prefer system-provided package modules rather than stack-provided package modules.
Testing
Checklist
doc/source/PreConfiguredSites.rst and doc/source/MaintainersSection.rst
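The manual stanza reordering described under "Systems affected" could be scripted along the following lines. This is only a sketch under stated assumptions, not part of the PR: it assumes each stanza is a blank-line-separated block whose first line is the marker comment, and that the generated files place the hierarchy stanza before the prerequisite stanza. The function name swap_stanzas and the command-line handling are hypothetical.

```python
import re
import sys
from pathlib import Path

HIERARCHY = "-- spack compiler module hierarchy"
PREREQS = "-- prerequisite modules"


def swap_stanzas(text):
    """Move the prerequisite stanza ahead of the hierarchy stanza.

    Assumption: stanzas are separated by blank lines and each one
    starts with its marker comment. If the prerequisite stanza already
    comes first, the block order is left unchanged.
    """
    blocks = re.split(r"\n\s*\n", text.strip("\n"))

    def find(marker):
        for index, block in enumerate(blocks):
            if block.lstrip().startswith(marker):
                return index
        raise ValueError(f"stanza not found: {marker}")

    hierarchy, prereqs = find(HIERARCHY), find(PREREQS)
    if hierarchy < prereqs:
        blocks[hierarchy], blocks[prereqs] = blocks[prereqs], blocks[hierarchy]
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Usage (hypothetical): python swap_stanzas.py path/to/modulefile.lua ...
    for name in sys.argv[1:]:
        path = Path(name)
        path.write_text(swap_stanzas(path.read_text()))
```

Because the swap only happens when the hierarchy stanza comes first, running the script twice on the same file does not undo the edit. Checking the result by hand on one modulefile before applying it to a whole environment would be prudent, since the real stanza layout may differ from the assumption above.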