This project trains a DistilBERT model on the GLUE MRPC task with configurable hyperparameters and logs results to Weights & Biases (W&B).
- Prerequisites
- Getting Started
- Running with Docker (Recommended)
- Running Locally
- Configuration Options
## Prerequisites

Before you begin, ensure you have the following:

- Git (to clone the repository)
- A Weights & Biases account
  - Sign up at wandb.ai
  - Get your API key from wandb.ai/authorize
## Getting Started

Clone the repository:

```bash
git clone git@gitlab.com:i.ba_mlops.h25/hyperparameter-tuning.git
cd hyperparameter-tuning
```

You'll need your Weights & Biases API key to log training metrics. You can find it at wandb.ai/authorize.
Create a .env file from the example:
```bash
# On Windows (CMD)
copy .env.example .env

# On Windows (PowerShell) or macOS/Linux
cp .env.example .env
```

Edit the .env file and add your W&B API key:
```
WANDB_API_KEY=your_wandb_api_key_here
```
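Optionally, you can also authenticate the wandb CLI directly with the same key for runs outside Docker (the .env file remains what this project's tooling reads):

```bash
wandb login your_wandb_api_key_here
```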
## Running with Docker (Recommended)

Docker provides a consistent environment and handles all dependencies automatically. You'll also need:

- Docker: Install Docker Desktop for Windows, macOS, or Linux
Build and run with the default configuration:

```bash
docker compose up --build
```

Build and run with custom arguments:

```bash
docker compose run --rm train [-h] --wandb-project WANDB_PROJECT [--checkpoint-dir CHECKPOINT_DIR] [-bs BATCH_SIZE] [-lr LEARNING_RATE] [-ws WARMUP_STEPS] [-wd WEIGHT_DECAY] [-o {AdamW,Adam,NAdam,SGD}] [-adm-b ADAM_BETAS] [-adm-e ADAM_EPS] [-sgd-m SGD_MOMENTUM] [-sgd-d SGD_DAMPENING] [-sgd-n]
```

This command:
- Builds the Docker image (if not already built or if changes detected)
- Runs the training with default or custom configuration
- Shows output in your terminal
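For example, to start a run with a larger batch size and a different learning rate (the project name here is just an illustration):

```bash
docker compose run --rm train --wandb-project mrpc-tuning -bs 32 -lr 3e-5
```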
After training completes, view your results at wandb.ai in your specified project.
To persist model checkpoints on your local machine, uncomment the volumes section in docker-compose.yml:
```yaml
volumes:
  - ./models:/app/models
```

## Running Locally

If you prefer to run without Docker:
- Python 3.12
- uv package manager
Install the dependencies (CPU-only):

```bash
uv sync --extra cpu
```

For using your GPU with CUDA 12.9:

```bash
uv sync --extra cu129
```

Run the training script:

```bash
python train.py [-h] --wandb-project WANDB_PROJECT [--checkpoint-dir CHECKPOINT_DIR] [-bs BATCH_SIZE] [-lr LEARNING_RATE] [-ws WARMUP_STEPS] [-wd WEIGHT_DECAY] [-o {AdamW,Adam,NAdam,SGD}] [-adm-b ADAM_BETAS] [-adm-e ADAM_EPS] [-sgd-m SGD_MOMENTUM] [-sgd-d SGD_DAMPENING] [-sgd-n]
```

After training completes, view your results at wandb.ai in your specified project.
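For instance, a local run using SGD with Nesterov momentum might look like this (project name and values are illustrative):

```bash
python train.py --wandb-project mrpc-tuning -o SGD -lr 1e-3 -sgd-m 0.9 -sgd-n
```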
## Configuration Options

The training script supports various hyperparameters:
| Argument | Short | Type | Description |
|---|---|---|---|
| `--wandb-project` | - | string | Required. Name of Weights & Biases project |
| `--checkpoint-dir` | - | string | Directory to store checkpoints in |
| `--batch-size` | `-bs` | int | Training & evaluation batch size |
| `--learning-rate` | `-lr` | float | Optimizer learning rate |
| `--warmup-steps` | `-ws` | int | Number of warmup steps |
| `--weight-decay` | `-wd` | float | Weight decay (L2 regularization) |
| `--optimizer` | `-o` | choice | Optimizer to use: AdamW, Adam, NAdam, SGD |
| `--adam-betas` | `-adm-b` | tuple | Adam betas as 'beta1,beta2' (comma-separated) |
| `--adam-eps` | `-adm-e` | float | Adam epsilon |
| `--sgd-momentum` | `-sgd-m` | float | SGD momentum |
| `--sgd-dampening` | `-sgd-d` | float | SGD dampening |
| `--sgd-nesterov` | `-sgd-n` | flag | Enable Nesterov momentum |
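To make the optimizer flags concrete, here is a minimal sketch of how options like these typically map onto torch.optim constructors. This is an illustration only; the defaults and the exact wiring are assumptions, not taken from this project's train.py:

```python
# Illustrative sketch: mapping CLI flags onto torch.optim constructors.
# Defaults below are assumptions, not the project's actual values.
import argparse

import torch


def parse_betas(value: str) -> tuple[float, float]:
    """Parse 'beta1,beta2' (e.g. '0.9,0.999') into a float tuple."""
    beta1, beta2 = (float(v) for v in value.split(","))
    return beta1, beta2


parser = argparse.ArgumentParser()
parser.add_argument("--wandb-project", required=True)
parser.add_argument("--checkpoint-dir", default="models")
parser.add_argument("-bs", "--batch-size", type=int, default=16)
parser.add_argument("-lr", "--learning-rate", type=float, default=2e-5)
parser.add_argument("-ws", "--warmup-steps", type=int, default=0)
parser.add_argument("-wd", "--weight-decay", type=float, default=0.0)
parser.add_argument("-o", "--optimizer", choices=["AdamW", "Adam", "NAdam", "SGD"], default="AdamW")
parser.add_argument("-adm-b", "--adam-betas", type=parse_betas, default=(0.9, 0.999))
parser.add_argument("-adm-e", "--adam-eps", type=float, default=1e-8)
parser.add_argument("-sgd-m", "--sgd-momentum", type=float, default=0.0)
parser.add_argument("-sgd-d", "--sgd-dampening", type=float, default=0.0)
parser.add_argument("-sgd-n", "--sgd-nesterov", action="store_true")
args = parser.parse_args()

params = torch.nn.Linear(4, 2).parameters()  # toy parameters for illustration
if args.optimizer == "SGD":
    # SGD uses momentum/dampening/nesterov; the Adam-family flags are ignored
    optimizer = torch.optim.SGD(
        params, lr=args.learning_rate, momentum=args.sgd_momentum,
        dampening=args.sgd_dampening, weight_decay=args.weight_decay,
        nesterov=args.sgd_nesterov)
else:
    adam_cls = {"AdamW": torch.optim.AdamW, "Adam": torch.optim.Adam,
                "NAdam": torch.optim.NAdam}[args.optimizer]
    optimizer = adam_cls(
        params, lr=args.learning_rate, betas=args.adam_betas,
        eps=args.adam_eps, weight_decay=args.weight_decay)
```

Note that PyTorch rejects `nesterov=True` unless momentum is positive and dampening is zero, so `-sgd-n` should be combined with a nonzero `-sgd-m`.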