arch_eval is a high-level library for fast, efficient evaluation and comparison of machine learning architectures. It provides a unified interface for training, benchmarking, and hyperparameter optimization, with features such as distributed training, mixed precision, and real-time visualization.

This project is still under development.
- Unified Training Interface: Train single models with easy-to-use configuration options.
- Multi-Model Benchmarking: Compare multiple architectures sequentially or in parallel (thread- or process-based).
- Distributed Training: Built-in support for DataParallel, DistributedDataParallel (DDP), and FSDP.
- Advanced Mixed Precision: AMP with float16, bfloat16, and experimental FP8 support.
- Gradient Checkpointing: Reduce the memory footprint of large models.
- Rich Visualization: Real-time training windows, video recording of metrics, and publication-ready plots.
- Logging: Integration with Weights & Biases.
- Hyperparameter Optimization: Grid search and random search out of the box.
- Extensible Plugin System: Custom hooks and callbacks for maximum flexibility.
- Data Handling: Supports PyTorch Datasets, synthetic data, torchvision datasets, Hugging Face datasets, and streaming.
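As background on the gradient-checkpointing feature above: in plain PyTorch (this uses the standard `torch.utils.checkpoint` API, not arch_eval's own interface), the technique trades compute for memory by recomputing a block's activations during the backward pass instead of storing them. A minimal sketch:

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class CheckpointedMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.block1 = nn.Sequential(nn.Linear(128, 256), nn.GELU())
        self.block2 = nn.Linear(256, 64)

    def forward(self, x):
        # block1's activations are recomputed during backward, not stored
        x = checkpoint(self.block1, x, use_reentrant=False)
        return self.block2(x)

model = CheckpointedMLP()
x = torch.randn(16, 128, requires_grad=True)
loss = model(x).sum()
loss.backward()  # gradients still flow through the checkpointed block
```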
Install from the GitHub repository:

```bash
# Clone the repository
git clone --depth=1 https://github.com/lof310/arch_eval.git
cd arch_eval

# Install in development mode (recommended)
pip install -e .

# Install normally
pip install .
```
Train a single model:

```python
import torch.nn as nn

from arch_eval import Trainer, TrainingConfig

# Global configuration
n_samples, n_features, n_classes = 5000, 128, 64  # dataset
input_size, hidden = n_features, n_features * 2   # model
batch_size, num_epochs = 16, 4                    # training

# Define a simple model
class MLP(nn.Module):
    def __init__(self, input_size=128, hidden=256, num_classes=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_size, hidden),
            nn.GELU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, x):
        return self.net(x)

# Configure training
config = TrainingConfig(
    dataset="synthetic classification",
    dataset_params={"n_samples": n_samples, "n_features": n_features, "n_classes": n_classes},
    training_args={"num_epochs": num_epochs, "batch_size": batch_size},
    task="classification",
    realtime=True,
    save_plot=["loss", "accuracy"],
)

model = MLP(input_size, hidden, n_classes)
trainer = Trainer(model, config)
history = trainer.train()
```
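For intuition, a `"synthetic classification"` dataset like the one configured above can be approximated with random tensors. This is a hypothetical stand-in written in plain PyTorch, not arch_eval's actual generator:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def make_synthetic_classification(n_samples=5000, n_features=128, n_classes=64, seed=0):
    # Random features and integer labels, mirroring dataset_params above
    g = torch.Generator().manual_seed(seed)
    X = torch.randn(n_samples, n_features, generator=g)
    y = torch.randint(0, n_classes, (n_samples,), generator=g)
    return TensorDataset(X, y)

loader = DataLoader(make_synthetic_classification(), batch_size=16, shuffle=True)
xb, yb = next(iter(loader))  # one batch of features and labels
```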
Benchmark several architectures against each other:

```python
from arch_eval import Benchmark, BenchmarkConfig

models = [
    {"name": "Small MLP", "model": MLP(hidden=256)},
    {"name": "Large MLP", "model": MLP(hidden=512)},
]

config = BenchmarkConfig(
    dataset="synthetic classification",
    dataset_params={"n_samples": 10000, "n_features": 128, "n_classes": 64},
    compare_metrics=["accuracy", "loss"],
    parallel=True,
)

benchmark = Benchmark(models, config)
results = benchmark.run()
print(results)
```
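The `parallel=True` option runs the models concurrently. Conceptually, thread-based comparison looks like the following stdlib sketch, where `evaluate` is a hypothetical stand-in for training one model and collecting its metrics (not the library's internals):

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate(spec):
    # Hypothetical stand-in: "train this model, return its metrics"
    name, hidden = spec
    return {"name": name, "params": 128 * hidden + hidden * 64}

specs = [("Small MLP", 256), ("Large MLP", 512)]
with ThreadPoolExecutor(max_workers=2) as pool:
    # map preserves input order, so results line up with specs
    results = list(pool.map(evaluate, specs))
```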
Run a hyperparameter search:

```python
from arch_eval import HyperparameterOptimizer

def model_fn():
    return MLP()

base_config = TrainingConfig(
    dataset="synthetic classification",
    dataset_params={"n_samples": 1000, "n_features": 128, "n_classes": 64},
    training_args={"num_epochs": 3},
    task="classification",
    realtime=False,  # disable live plots during the search
)

param_grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "hidden": [10, 20, 50],
}

optimizer = HyperparameterOptimizer(
    model_fn, base_config, param_grid,
    search_type="grid", metric="val_accuracy", mode="max",
)

results = optimizer.run()
```

Documentation is under development.
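As background on the search above: `search_type="grid"` exhaustively evaluates every combination in the parameter grid. The core enumeration is just a Cartesian product, sketched here with the standard library and a hypothetical scoring function standing in for an actual training run:

```python
from itertools import product

param_grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "hidden": [10, 20, 50],
}

def score(params):
    # Hypothetical stand-in for "train, then read val_accuracy"
    return params["hidden"] / (1 + params["learning_rate"])

# Every combination of values, one dict per candidate (3 x 3 = 9 here)
keys = list(param_grid)
candidates = [dict(zip(keys, vals)) for vals in product(*param_grid.values())]
best = max(candidates, key=score)  # mode="max": keep the highest-scoring config
```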
Contributions are welcome!
Distributed under the Apache License 2.0. See LICENSE for more information.
If you use arch_eval in your research, please cite:
```bibtex
@software{arch_eval2026,
  author    = {Leinier Orama},
  title     = {arch_eval: High-level Library for Architecture Evaluation of ML Models},
  year      = {2026},
  publisher = {GitHub},
  url       = {https://github.com/lof310/arch_eval}
}
```