19 changes: 15 additions & 4 deletions .github/workflows/gh-pages.yml
@@ -1,11 +1,19 @@
name: Deploy github pages
env:
  # `BASE_URL` determines the path the website is served from, including CSS & JS assets
  # You may need to change this to `BASE_URL: ''`
  BASE_URL: /${{ github.event.repository.name }}

on:
  # Runs on pushes targeting the default branch
  # Artifact generation runs on all push events, but deploy only on main or when manually requested
  push:
    branches: ["main"]
  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:
    inputs:
      deploy:
        description: "Force run the deploy step (ignore being on main)"
        default: false
        type: boolean

# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages
permissions:
@@ -28,7 +36,7 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        uses: actions/checkout@v6

      - uses: actions/setup-python@v5
        with:
@@ -44,7 +52,9 @@ jobs:

- name: Generate page content
shell: bash -l {0}
run: jupyter-book build docs
run: |
cd docs
jupyter-book build --html

- name: Setup Pages
uses: actions/configure-pages@v4
@@ -55,6 +65,7 @@
          path: docs/_build/html

      - name: Deploy to GitHub Pages
        if: ( github.ref == 'refs/heads/main' ) || ( github.event.inputs.deploy == 'true')
        id: deployment
        uses: actions/deploy-pages@v4
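
With the new `deploy` input in place, a deploy could also be forced from the command line via the GitHub CLI. This is only a sketch; the workflow file name is taken from this diff:

```
gh workflow run gh-pages.yml -f deploy=true
```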

5 changes: 4 additions & 1 deletion .gitignore
@@ -72,4 +72,7 @@ venv/
ENV/

# pytest
.pytest_cache/
.pytest_cache/
# MyST build outputs
_build
_site
34 changes: 0 additions & 34 deletions docs/_config.yml

This file was deleted.

6 changes: 0 additions & 6 deletions docs/_toc.yml

This file was deleted.

2 changes: 1 addition & 1 deletion docs/about.md
@@ -3,7 +3,7 @@
This webpage will provide technical information for the eX3 infrastructure, to facilitate usage.

Together with partners, Simula Research Laboratory has established eX3 to prepare researchers for exascale computing in Norway. The eX3 infrastructure has been funded through the [RCN program for national research infrastructures](https://prosjektbanken.forskningsradet.no/project/FORISS/270053?Kilde=FORISS&distribution=Ar&chart=bar&calcType=funding&Sprak=no&sortBy=score&sortOrder=desc&resultCount=30&offset=0&Fritekst=ex3).
In addition to the host institution Simula, the project consortium also counts the national HPC management body Sigma2, HPC research groups from the University of Tromsø, NTNU, the University of Bergen, and OsloMet, as well as the HPC technology providers Graphcore, Dolphin Interconnect Solutions, and Numascale.
In addition to the host institution Simula, the project consortium also counts the national HPC management body Sigma2, HPC research groups from the University of Tromsø, NTNU, the University of Bergen, and OsloMet, as well as the HPC technology providers Graphcore, Dolphin Interconnect Solutions, and Numascale.

The eX3 infrastructure is not an exascale computer by itself, but it is a carefully curated ecosystem of technology components that are crucial for embracing exascale computing. It allows HPC researchers throughout Norway and their collaborators to experiment hands-on with emerging HPC technologies – hardware as well as software.

13 changes: 13 additions & 0 deletions docs/authors.yml
@@ -0,0 +1,13 @@
version: 1
project:
  contributors:
    - id: 2maz
      name: Thomas M. Roehr
      orcid: 0000-0002-7715-7052
      email: roehr@simula.no
      github: 2maz
      affiliations:
        - id: SRL
  affiliations:
    - id: SRL
      name: Simula Research Laboratory
7 changes: 6 additions & 1 deletion docs/index.md
@@ -1,4 +1,9 @@
# eX3 - Experimental Infrastructure for Exploration of Exascale Computing
---
authors:
- 2maz
---

# A Guide for the Experimental Infrastructure for Exploration of Exascale Computing

This page contains information about eX3, hosted at Simula Research Laboratory

19 changes: 19 additions & 0 deletions docs/myst.yml
@@ -0,0 +1,19 @@
version: 1
extends:
  - ./authors.yml
project:
  title: eX3 - Experimental Infrastructure for Exploration of Exascale Computing
  copyright: '2025'
  github: simula/ex3.github.io
  toc:
    - file: index.md
    - file: usage.md
      children:
        - file: usage/environment_modules.md
        - file: usage/slurm.md
    - file: about.md
site:
  options:
    logo: _static/logo.png
    folders: true
  template: book-theme
68 changes: 67 additions & 1 deletion docs/usage.md
@@ -1,4 +1,70 @@
# Usage

This description intends to offer starting points for eX3 and how to use it.
We will continuously improved that documentation.
We will continuously improve this documentation.


## Prerequisites

To access eX3, a user first has to file an application and register. For that purpose,
please follow the registration process as documented [here](https://www.ex3.simula.no/access).

Once access has been granted, use the credentials and the login instructions you received to access the system.


## Logging in
Users can log in from within the Simula network or externally via dnat, which
will drop you on srl-login1:
```
ssh username@dnat.simula.no -p 60441
```
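
For convenience, the connection settings can be stored in `~/.ssh/config`. This is only a sketch; the host alias `ex3` is arbitrary and `username` is a placeholder:

```
Host ex3
    HostName dnat.simula.no
    Port 60441
    User username
```

Afterwards, `ssh ex3` is enough to log in.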

Within the Simula network, users can use one of the available login nodes:

- srl-login1.ex3.simula.no (priority)
- srl-login3.ex3.simula.no

In case srl-login1 is not available (for whatever reason), srl-login3 serves as a fallback.


```{admonition} Hint
:class: tip
Login nodes should only be used to launch jobs on other nodes - they are not suited for running heavy workloads.
```


## Filesystem

A user has access to the usual home directory on the current login node. In addition, a user
has access to shared storage on /global/D1/.
Project-specific paths can also be created upon request.

| Description | Path |
| ----------- | ---- |
| Home on global share | /global/D1/homes/\<username\> |
| Projects on global share | /global/D1/projects/\<projectname\> |

Since reading from the shared filesystem can be a bottleneck for data-heavy jobs, a local (faster) NVMe disk can be used.
For that purpose, when running a job, stage the data (copy it over from the global share) into /work/*\<username\>*/ on the node that requires it, for example as sketched below.
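
As a sketch, staging a dataset at the start of a job could look like this (the project and dataset paths are placeholders and need to be adapted):

```
# Hypothetical example: copy data from the global share onto the node-local NVMe disk
mkdir -p /work/$USER/my_dataset
rsync -a /global/D1/projects/my_project/my_dataset/ /work/$USER/my_dataset/
```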

## Getting an overview

To gain an overview of the available resources on eX3, you can navigate to https://naic-monitor.simula.no - as long as you can access the internal network.
Depending on how you access eX3, you might have to set up port forwarding (ssh ... -L 443:443), for example as sketched below.
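
A hypothetical sketch of such a tunnel through the dnat login host (assuming it can reach the monitoring server; local port 8443 is an arbitrary choice to avoid needing root for port 443):

```
# Forward the monitoring page through the dnat login host
ssh -p 60441 -L 8443:naic-monitor.simula.no:443 username@dnat.simula.no
```

The page would then be reachable at https://localhost:8443 while the tunnel is open.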

## Running an interactive job

Start an interactive job, e.g., requesting one GPU and a minimum of 2 CPUs:
```
srun -p dgx2q --gres=gpu:1 --mincpus=2 --pty /bin/bash
```

Once you have a shell on the allocated node, check that the GPU is visible, here for an NVIDIA GPU:
```
$> nvidia-smi -L
GPU 0: Tesla V100-SXM3-32GB (UUID: GPU-ad466f2f-575d-d949-35e0-9a7d912d974e)

$> echo $CUDA_VISIBLE_DEVICES
0
```

35 changes: 35 additions & 0 deletions docs/usage/environment_modules.md
@@ -0,0 +1,35 @@
# Environment Modules

eX3 uses [environment modules](https://envmodules.io/) to manage reusable versions of different software packages.
The [documentation](https://modules.readthedocs.io/en/latest/) is extensive, so advanced users are asked to consult it
for details. In particular, the [cookbook](https://modules.readthedocs.io/en/latest/cookbook.html) will be interesting for advanced users, e.g., for creating their own module files.

Here, we only highlight basic usage, eX3-specific elements, and recommended practices.

## Basic commands
To see what is already loaded:
```
$> module list
Currently Loaded Modulefiles:
1) slurm/slurm/21.08.8 2) rustc/1.85.1(default) 3) cuda12.9/toolkit/12.9.1 4) hwloc/gcc/2.12.2+cu129 5) hwloc/gcc/2.12.2-base
```

To search for available modules (by pattern, here 'cuda'):
```
$> module avail cuda
--------------------------------------------------------------------------------------------------------- /cm/shared/modulefiles ----------------------------------------------------------------------------------------------------------
cuda9.2.OLD/toolkit/9.2.148 cuda10.1/profiler/10.1.243 ...
...
cuda10.1/nsight/10.1.243 ...
```

To load a module:
```
$> module load julia/1.11.4
```

or likewise to unload:
```
$> module unload julia/1.11.4
```
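
Two further commands that often come in handy (the module name is only an example taken from above):

```
# Inspect what loading a module would change in the environment
$> module show julia/1.11.4

# Unload all currently loaded modules to start from a clean environment
$> module purge
```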
1 change: 1 addition & 0 deletions docs/usage/slurm.md
@@ -0,0 +1 @@
# SLURM
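
As a starting point, a minimal batch script could look like the following sketch; the partition, resource requests, module name, and workload are assumptions and need to be adapted to your job:

```
#!/bin/bash
#SBATCH --job-name=example          # job name shown in the queue
#SBATCH --partition=dgx2q           # partition, see `sinfo` for what is available
#SBATCH --gres=gpu:1                # request one GPU
#SBATCH --cpus-per-task=2           # request two CPU cores
#SBATCH --time=01:00:00             # wall-clock time limit
#SBATCH --output=%x-%j.out          # output file named after job name and job id

# Example module taken from the environment modules page
module load cuda12.9/toolkit/12.9.1

# Replace with your actual workload
srun nvidia-smi -L
```

Submit the script with `sbatch <script>` and monitor it with `squeue -u $USER`.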