Tom-Notch/Cloud-Computing-Repository-Template

Cloud Computing Repository Template

A GitHub repository template for HPC and cloud workloads with Docker, Singularity, and Slurm

pre-commit Docker Singularity License: MIT

A GitHub repository template for cloud computing and HPC workloads. Combines Docker (for local/cloud machines) with Singularity/Apptainer (for HPC clusters where Docker is unavailable), plus Slurm job scripts.

TLDR: Search the repository for todo and replace every occurrence with your own names. Docker and Singularity are optional if all dependencies can be installed directly in your shell on the HPC cluster.
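
The placeholder search can be sketched as a small shell helper. This is only an illustration: the function name list_todo_placeholders is hypothetical and not part of the template, and it assumes a grep that supports --exclude-dir (GNU and BSD grep both do).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the template): list every line that
# still contains a "todo" placeholder under a directory.
# Usage: list_todo_placeholders [directory]   (defaults to the current directory)
list_todo_placeholders() {
    local root="${1:-.}"
    # -r: recurse into subdirectories
    # -n: print line numbers
    # -I: skip binary files
    # --exclude-dir=.git: ignore Git's internal files
    grep -rnI --exclude-dir=.git "todo" "$root"
}
```

Running list_todo_placeholders . from the repository root prints file:line:text for each remaining placeholder. No output and a non-zero exit status mean nothing was found.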

Dependencies

Usage

Base Repository

  1. Change LICENSE if necessary
  2. Modify .pre-commit-config.yaml according to your needs
  3. Modify/add GitHub workflow status badges in README.md

Docker Config

Continue on a machine where you have Docker permission — HPC clusters usually restrict Docker access for security reasons.

  1. Fill in all todo-* placeholders directly in .env.example and commit — these are project-level constants, not secrets

    Placeholder        Description
    todo-docker-user   Your Docker Hub account username
    todo-base-image    Base image the Dockerfile builds from (e.g. nvidia/cuda:13.0.0-cudnn-devel-ubuntu24.04)
    todo-image-name    Name of the image you are building
    todo-image-user    Default user inside the image, used to determine the home folder
  2. Copy .env.example to .env and add any user-specific secrets or local overrides:

    cp .env.example .env

    .env is gitignored and will NOT be committed — it is the right place for secrets and per-user values. It is loaded automatically by docker compose.

  3. In docker-compose.yml, rename the service from todo-service-name to your own service name and add any extra volume mounts you need, such as dataset directories

  4. Update Dockerfile and .dockerignore — the existing Dockerfile includes screen & tmux config, oh-my-zsh, cmake, and other basic tools

  5. Run scripts to build, test, and push:

    Script                    Action
    build_docker_image.sh     Build and test the image locally (uses buildx for multi-arch)
    run_docker_container.sh   Run and test a built image (docker compose up -d also works)
    push_docker_image.sh      Push the multi-arch image to Docker Hub

    The service mounts the entire repository onto CODE_FOLDER inside the container, so changes made on either side are visible on the other. This is useful for VS Code remote development.
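
docker compose reads .env on its own, but a plain shell session does not. If you want the same values available in an interactive shell or an ad-hoc script, one common idiom is sketched below. The function name load_env_file is hypothetical, the template's own scripts may handle this differently, and the sketch assumes .env contains only simple KEY=value lines and # comments, because the file is sourced as shell code.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the template): export every
# KEY=value pair from a dotenv-style file into the current shell.
# Usage: load_env_file [path]   (defaults to ./.env)
load_env_file() {
    local env_file="${1:-.env}"
    # A bare file name would make "." search PATH first, so anchor it.
    case "$env_file" in
        */*) ;;
        *) env_file="./$env_file" ;;
    esac
    if [ ! -f "$env_file" ]; then
        echo "No such file: $env_file" >&2
        return 1
    fi
    set -a    # auto-export every variable assigned while sourcing
    # shellcheck disable=SC1090
    . "$env_file"
    set +a    # stop auto-exporting
}
```

After load_env_file .env, a variable defined in the file is visible both to the current shell and to any child process it starts.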

Singularity Config

Continue on the actual HPC cluster environment.

  1. Run pull_singularity_image.sh to build the Singularity image locally from the Docker image you pushed

    You should see todo-image-name_latest.sif after a successful build.

  2. Run run_singularity_instance.sh to test the image

    • To bind additional volumes (e.g. dataset directories), define them in .env, then export them in variables.sh, using resolve_host_path to convert relative paths to absolute ones
    • By default, Singularity instances are less isolated from the host environment than Docker containers; the additional flags shown in the script tighten the isolation
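
The relative-to-absolute conversion mentioned above can be illustrated with a small stand-in. This is not the template's resolve_host_path: both function names below are hypothetical, and the sketch only prefixes a base directory without normalising .. segments or following symlinks.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the template's resolve_host_path helper:
# turn a possibly relative path into an absolute one.
# Usage: resolve_host_path_sketch <path> [base_dir]   (base defaults to $PWD)
resolve_host_path_sketch() {
    local path="$1"
    local base="${2:-$PWD}"
    case "$path" in
        /*) printf '%s\n' "$path" ;;            # already absolute, keep as-is
        *)  printf '%s\n' "${base%/}/$path" ;;  # anchor relative path at base
    esac
}

# Build a "host_path:container_path" bind spec from a host path that may
# be relative. Usage: make_bind_spec <host_path> <container_path> [base_dir]
make_bind_spec() {
    local host_path
    host_path="$(resolve_host_path_sketch "$1" "${3:-$PWD}")"
    printf '%s:%s\n' "$host_path" "$2"
}
```

A spec built this way has the src:dest form that Singularity's --bind option accepts, so a relative dataset directory from .env could be passed on without depending on the directory the script was started from.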

Job Config

  1. Modify job specifications under jobs/

    Slurm tips
    • Query your cluster's partition layout with sinfo
    • Tie resources to tasks for easy scaling: --ntasks-per-node, --gpus-per-task, --cpus-per-task, --mem-per-gpu
    • All jobs use -l (login) in the shebang, so any command available in your login shell is also available inside a job
  2. Submit and monitor jobs:

    sbatch jobs/your-cluster/your-job.job

    Output logs appear as todo_your_job_name_<slurm_job_id>.out in the repository root.

  3. For job monitoring, turm is recommended: turm -u your-slurm-user
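
As a rough illustration of the tips above, a job file might look like the sketch below. It is not one of the files under jobs/: every value is a placeholder, the describe_allocation function is hypothetical, and the --output pattern is simply one way to get the todo_your_job_name_<slurm_job_id>.out naming (%x expands to the job name, %j to the job ID).

```shell
#!/bin/bash -l
# Hypothetical job file; every value below is a placeholder to adapt.
# Resources are tied to tasks, so scaling up means changing the task count.
#SBATCH --job-name=todo_your_job_name
#SBATCH --output=%x_%j.out
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
#SBATCH --gpus-per-task=1
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-gpu=16G

# Print a one-line summary of what Slurm granted, falling back to
# "unset" when the script runs outside a Slurm allocation.
describe_allocation() {
    printf 'job=%s tasks=%s cpus_per_task=%s\n' \
        "${SLURM_JOB_ID:-unset}" \
        "${SLURM_NTASKS:-unset}" \
        "${SLURM_CPUS_PER_TASK:-unset}"
}

describe_allocation
```

Because the #SBATCH lines are ordinary comments to bash, the file can also be run directly for a quick syntax check before submitting it with sbatch.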

Developer Quick Start

bash scripts/dev_setup.sh

Maintainer

Mukai (Tom Notch) Yu
