ML Tools for Precision Physics in HEP

Welcome to the experimental section of the Physics at Colliders 2024 PhD Course (Milano-Bicocca).

Table of contents

  • Dataset preparation
    • Feature scaling and normalization
    • Data manipulation and formatting
  • Transformers
    • Intro and architecture
    • Full particle regression with transformers
      • Best losses for full particle regression
      • Constrained optimization with MDMM
  • Normalizing flows
    • Intro and architecture
    • Example: conditional probability for event boost
    • Application: generative transformers for neutrino generation

Setup

Setup at CERN

# Open a connection to lxplus-gpu, forwarding port 8888 so that the Jupyter notebook can be viewed locally
ssh -L 8888:localhost:8888 lxplus-gpu.cern.ch
# optionally move to eos to have more disk space
# cd /eos/user/your/name

git clone git@github.com:valsdav/PhDCourse_MLForPrecisionPhysics_2024.git
cd PhDCourse_MLForPrecisionPhysics_2024

# Use tmux to keep the session open; note down your lxplus-gpu hostname so that you can reconnect to the same node
systemctl --user start tmux.service
tmux new -t course

# Start the apptainer shell
apptainer shell -B ${XDG_RUNTIME_DIR} \
          --nv -B /afs -B /cvmfs/cms.cern.ch \
          -B /eos/user/d/dvalsecc/PhDCourse_MLColliderPhysics2024 \
          --bind /etc/sysconfig/ngbauth-submit  \
          --env KRB5CCNAME=${XDG_RUNTIME_DIR}/krb5cc \
          /cvmfs/unpacked.cern.ch/registry.hub.docker.com/cmsml/cmsml:3.11-cuda

# From inside the container, create a virtual env to install some additional packages
python -m venv myenv --system-site-packages

# Activate the environment (needed in every new shell)
source myenv/bin/activate
# Install the packages (only needed once)
python -m pip install -r requirements.txt

# Make the virtualenv visible to jupyter lab
python -m ipykernel install --user --name=myenv

# Now we can start Jupyter Lab
jupyter lab
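
Since the environment has to be activated in every new shell, a quick check before launching Jupyter can save some confusion. The snippet below is a minimal sketch: `venv_is_active` is a hypothetical helper name, and it only relies on the `VIRTUAL_ENV` variable that the `activate` script of a Python venv exports.

```shell
# Hypothetical helper: succeeds if a Python virtual environment is active in this shell.
# It relies on VIRTUAL_ENV, which "source myenv/bin/activate" exports.
venv_is_active() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if venv_is_active; then
    echo "virtualenv active: ${VIRTUAL_ENV}"
else
    echo "no virtualenv active; run: source myenv/bin/activate"
fi
```

If the second message appears, activate the environment and start `jupyter lab` again from the same shell.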

Setup outside CERN

No special software is needed apart from torch (preferably with CUDA support).

You can use docker or apptainer to get a basic Python environment and then install the required packages on top.

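# Run from the cloned repository directory: it is mounted at /workspace and port 8888 is published
docker run --gpus=all -v "$(pwd)":/workspace -w /workspace -p 8888:8888 -ti pytorch/pytorch:2.4.1-cuda12.4-cudnn9-runtime bash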

# From inside the container, create a virtual env to install some additional packages
python -m venv myenv --system-site-packages

# Activate the environment (needed in every new shell)
source myenv/bin/activate
# Install the packages (only needed once)
python -m pip install -r requirements.txt

# Make the virtualenv visible to jupyter lab
python -m ipykernel install --user --name=myenv

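# Now we can start Jupyter Lab (inside docker: listen on all interfaces and allow running as root)
jupyter lab --ip=0.0.0.0 --port=8888 --no-browser --allow-root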
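
Because torch is the only hard requirement, it may be worth confirming that it is importable and whether it sees a GPU before opening the notebooks. The following is a sketch rather than part of the course material: `check_torch` is a hypothetical helper, and it assumes that `python` on the `PATH` is the interpreter of the environment you intend to use.

```shell
# Hypothetical helper: report the torch version and whether CUDA is visible.
# Falls back to a short hint if torch (or python) is not available.
check_torch() {
    python -c 'import torch; print("torch", torch.__version__, "CUDA available:", torch.cuda.is_available())' 2>/dev/null \
        || echo "torch is not importable in this environment"
}

check_torch
```

A result of `CUDA available: False` usually means the container was started without GPU access (`--nv` for apptainer, `--gpus=all` for docker) or that the installed torch build is CPU-only.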

Datasets

The training datasets are available on CERN EOS to the course students at /eos/user/d/dvalsecc/PhDCourse_MLColliderPhysics2024. They are also temporarily publicly available at https://dvalsecc.web.cern.ch/public/datasets/PhDCourse_MLColliderPhysics_2024/training_datasets.tar.gz.

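# Download the archive into the current directory (-O keeps the remote file name, -L follows redirects)
curl -L -O https://dvalsecc.web.cern.ch/public/datasets/PhDCourse_MLColliderPhysics_2024/training_datasets.tar.gz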
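
Once the archive has been saved to disk, it still has to be unpacked. The sketch below shows one way to do that while catching a truncated download first; `unpack_dataset` and the `datasets` destination directory are hypothetical names chosen for illustration, and the layout inside the archive is not assumed.

```shell
# Hypothetical helper: check a .tar.gz archive and unpack it into a destination directory.
unpack_dataset() {
    archive="$1"
    dest="$2"
    # Listing the contents (-t) fails on a corrupted or incomplete archive
    if ! tar -tzf "$archive" > /dev/null 2>&1; then
        echo "archive looks corrupted or incomplete: $archive" >&2
        return 1
    fi
    mkdir -p "$dest"
    tar -xzf "$archive" -C "$dest"
}

# Usage, assuming the archive is in the current directory:
# unpack_dataset training_datasets.tar.gz datasets
```

Listing before extracting costs one extra pass over the file, but it avoids leaving a half-extracted directory behind when the download was interrupted.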
