ConvNet

A convolutional neural network for image recognition based on the AlexNet architecture, implemented with TensorFlow 2.x and the Keras API.

Overview

This project implements an AlexNet-based CNN for multi-class image classification. The model classifies images into four categories: Cat, Tree, Horse, and Dog. It is built on TensorFlow 2.x with the Keras API.

Documentation

For detailed documentation, advanced topics, and in-depth guides, please visit the Wiki:

AlexNet Architecture Theory - Deep dive into the model architecture and theory
Advanced Training Configuration - Detailed guide on all training parameters
Data Preparation Guide - Complete guide to preparing your dataset
Hyperparameter Tuning - Strategies for optimizing model performance
Model Performance and Metrics - Understanding evaluation metrics
Advanced Troubleshooting - Solutions to common and advanced issues
TensorFlow 2.x Migration Notes - Migration details from TensorFlow 1.x

The README below covers the basics to get you started quickly. For comprehensive information, theory, and advanced usage, refer to the Wiki.

Model Architecture

The AlexNet model consists of:

5 Convolutional blocks with ReLU activation, max pooling, batch normalization, and dropout
2 Fully connected layers (4096 and 1024 neurons)
Output layer with 4 classes
Regularization: Dropout (0.8 for input, 0.5 for hidden layers) and Batch Normalization
Input size: 224x224x3 RGB images
Optimizer: Adam (default learning rate: 0.001, epsilon: 0.1)
Loss function: Categorical cross-entropy
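
To make the loss function concrete, here is a minimal, dependency-free sketch of softmax followed by categorical cross-entropy for a 4-class output. It only illustrates the formula; the project presumably relies on the Keras loss rather than code like this, and the function names and example logits are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Loss for one sample: -sum(y_i * log(p_i))."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(one_hot, probs))

# Hypothetical logits for one image; the true class is index 2 (Horse).
probs = softmax([0.5, -1.0, 3.0, 0.1])
loss = categorical_cross_entropy([0, 0, 1, 0], probs)
print(probs, loss)
```

With a one-hot target, the loss reduces to the negative log of the probability assigned to the true class, so confident correct predictions drive it toward zero.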

Requirements

Python 3.x (originally written for Python 2.7, since updated for Python 3.x compatibility)
TensorFlow 2.x
NumPy
Matplotlib
scikit-learn

Installation

pip install tensorflow numpy matplotlib scikit-learn

Dataset Structure

Organize your dataset in the following directory structure:

ConvNet/
├── dataset/              # Training images
│   ├── Cat/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── ...
│   ├── Tree/
│   ├── Horse/
│   └── Dog/
└── test_dataset/         # Test images
    ├── Cat/
    ├── Tree/
    ├── Horse/
    └── Dog/

Image Requirements:

Format: JPEG (.jpg or .jpeg)
Recommended size: 224x224 pixels (images will be automatically resized)
Color: RGB (3 channels)
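
Before preprocessing, it can help to check that the folder layout matches what is described above. The following is a rough sketch, not part of the project: `list_labeled_images` is a hypothetical helper, and the class order is taken from the Classes and Labels table in this README.

```python
from pathlib import Path

# Class order assumed from the "Classes and Labels" table in this README.
CLASS_NAMES = ["Cat", "Tree", "Horse", "Dog"]
JPEG_SUFFIXES = {".jpg", ".jpeg"}

def list_labeled_images(root):
    """Return (path, class_id) pairs for every JPEG under root/<class>/."""
    root = Path(root)
    samples = []
    for class_id, name in enumerate(CLASS_NAMES):
        class_dir = root / name
        if not class_dir.is_dir():
            raise FileNotFoundError(f"missing class folder: {class_dir}")
        # Sort for a reproducible order; skip anything that is not a JPEG file.
        for path in sorted(class_dir.iterdir()):
            if path.is_file() and path.suffix.lower() in JPEG_SUFFIXES:
                samples.append((path, class_id))
    return samples

# Example: samples = list_labeled_images("dataset")
```

A missing class folder raises an error early, which is easier to diagnose than an empty or mislabeled pickle file later on.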

Usage

1. Preprocessing Training Data

Convert your training images to a compressed pickle format:

python my_alexnet_cnn.py preprocessing_training -f images_dataset.pkl

Shuffle the training dataset (recommended for better training):

python my_alexnet_cnn.py preprocessing_training -f images_dataset.pkl --shuffle

This creates an images_shuffled.pkl file with randomly shuffled training data.
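
The idea behind the compressed pickle and the shuffle step can be sketched as follows. This is an illustration under assumptions: the actual layout of the `.pkl` files is not documented here, and the helper names below are hypothetical rather than functions from `Dataset.py`.

```python
import gzip
import pickle
import random

def save_compressed(obj, path):
    """Pickle obj and write it through gzip."""
    with gzip.open(path, "wb") as fh:
        pickle.dump(obj, fh, protocol=pickle.HIGHEST_PROTOCOL)

def load_compressed(path):
    """Read a gzip-compressed pickle back into memory."""
    with gzip.open(path, "rb") as fh:
        return pickle.load(fh)

def shuffle_in_unison(images, labels, seed=None):
    """Apply one permutation to both lists so each image keeps its label."""
    order = list(range(len(images)))
    random.Random(seed).shuffle(order)
    return [images[i] for i in order], [labels[i] for i in order]

# Hypothetical toy data standing in for decoded images and their class IDs.
images, labels = ["img0", "img1", "img2", "img3"], [0, 1, 2, 3]
shuffled_images, shuffled_labels = shuffle_in_unison(images, labels, seed=0)
print(shuffled_images, shuffled_labels)
```

Shuffling matters because images are read class by class from the folder tree; without it, each batch would contain mostly one class. Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.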

2. Preprocessing Test Data

Convert your test images to a compressed pickle format:

python my_alexnet_cnn.py preprocessing_test -t images_test_dataset.pkl

3. Training the Model

Train the model with default parameters:

python my_alexnet_cnn.py train

Training with custom parameters:

python my_alexnet_cnn.py train \
  --learning-rate 0.0001 \
  --max_epochs 50 \
  --display-step 5 \
  --dataset_training images_shuffled.pkl

Available training parameters:

-lr, --learning-rate: Learning rate (default: 0.001)
-e, --max_epochs: Maximum number of epochs (default: 100)
-ds, --display-step: Display step for logging (default: 10)
-dtr, --dataset_training: Training dataset file (default: 'images_shuffled.pkl')
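
For reference, the flags and defaults listed above could be declared with `argparse` roughly as shown here. How `my_alexnet_cnn.py` actually defines its command line is an assumption; only the flag names and defaults come from this README.

```python
import argparse

def build_parser():
    """Hypothetical parser mirroring the documented `train` options."""
    parser = argparse.ArgumentParser(prog="my_alexnet_cnn.py")
    sub = parser.add_subparsers(dest="command", required=True)
    train = sub.add_parser("train", help="train the model")
    train.add_argument("-lr", "--learning-rate", type=float, default=0.001)
    train.add_argument("-e", "--max_epochs", type=int, default=100)
    train.add_argument("-ds", "--display-step", type=int, default=10)
    train.add_argument("-dtr", "--dataset_training", default="images_shuffled.pkl")
    return parser

# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(
    ["train", "--learning-rate", "0.0001", "--max_epochs", "50"]
)
print(args.learning_rate, args.max_epochs, args.display_step, args.dataset_training)
```

Note that the documented flags mix hyphens and underscores (`--learning-rate` versus `--max_epochs`), so they must be typed exactly as listed.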

Training outputs:

Model checkpoint: ckpt_dir/model.ckpt
TensorBoard logs: ckpt_dir/
Training log: FileLog.log
ROC curve visualization (displayed after training)

4. Making Predictions

Run predictions on test data:

python my_alexnet_cnn.py predict --dataset_test images_test_dataset.pkl

The predict command will:

Load the trained model from ckpt_dir/model.ckpt
Process all test images
Output a classification report with precision, recall, and F1-score
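
The numbers in the classification report can be reproduced by hand. Since scikit-learn is listed as a requirement, the project presumably produces the report with it; the sketch below is a dependency-free illustration of the same per-class metrics, using a hypothetical helper and made-up labels.

```python
def per_class_report(y_true, y_pred, class_names):
    """Compute precision, recall, and F1 for each class from integer labels."""
    report = {}
    pairs = list(zip(y_true, y_pred))
    for class_id, name in enumerate(class_names):
        tp = sum(1 for t, p in pairs if t == class_id and p == class_id)
        fp = sum(1 for t, p in pairs if t != class_id and p == class_id)
        fn = sum(1 for t, p in pairs if t == class_id and p != class_id)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[name] = {"precision": precision, "recall": recall, "f1": f1}
    return report

# Hypothetical labels for six test images (0=Cat, 1=Tree, 2=Horse, 3=Dog).
y_true = [0, 0, 1, 2, 3, 3]
y_pred = [0, 3, 1, 2, 3, 0]
report = per_class_report(y_true, y_pred, ["Cat", "Tree", "Horse", "Dog"])
for name, scores in report.items():
    print(name, scores)
```

In this toy example one Cat is mistaken for a Dog and one Dog for a Cat, so both classes score 0.5 on every metric while Tree and Horse score 1.0.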

Examples

Complete Workflow Example

# Step 1: Prepare training data
python my_alexnet_cnn.py preprocessing_training -f images_dataset.pkl --shuffle

# Step 2: Prepare test data
python my_alexnet_cnn.py preprocessing_test -t images_test_dataset.pkl

# Step 3: Train the model
python my_alexnet_cnn.py train \
  --learning-rate 0.001 \
  --max_epochs 100 \
  --display-step 10 \
  --dataset_training images_shuffled.pkl

# Step 4: Evaluate on test set
python my_alexnet_cnn.py predict --dataset_test images_test_dataset.pkl

Quick Training Example

For a quick test with fewer epochs:

python my_alexnet_cnn.py train --max_epochs 10 --display-step 2

Output and Metrics

After training, the model provides:

Training Metrics:
- Training accuracy and loss per batch
- Validation accuracy

Classification Report:
- Precision, Recall, and F1-score for each class
- Overall accuracy

Visualizations:
- ROC curve for model performance

TensorBoard Support:
```
tensorboard --logdir=ckpt_dir/
```

File Structure

ConvNet/
├── my_alexnet_cnn.py       # Main script with AlexNet model and training logic
├── Dataset.py              # Dataset preprocessing utilities
├── README.md               # This file
├── LICENSE                 # MIT License
├── dataset/                # Training images directory
├── test_dataset/           # Test images directory
├── ckpt_dir/               # Model checkpoints and TensorBoard logs
├── FileLog.log             # Training and prediction logs
├── images_dataset.pkl      # Preprocessed training data
├── images_shuffled.pkl     # Shuffled training data
└── images_test_dataset.pkl # Preprocessed test data

Classes and Labels

The model classifies images into 4 categories:

Class ID	Label
0	Cat
1	Tree
2	Horse
3	Dog
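
As an illustration of how the table is used, the snippet below maps a vector of class probabilities back to a label. `decode_prediction` is a hypothetical helper, and the probabilities are made up; the script's own output format may differ.

```python
CLASS_LABELS = {0: "Cat", 1: "Tree", 2: "Horse", 3: "Dog"}

def decode_prediction(probabilities):
    """Return (label, confidence) for the highest-scoring class."""
    if len(probabilities) != len(CLASS_LABELS):
        raise ValueError("expected one probability per class")
    class_id = max(range(len(probabilities)), key=probabilities.__getitem__)
    return CLASS_LABELS[class_id], probabilities[class_id]

print(decode_prediction([0.05, 0.10, 0.80, 0.05]))  # ('Horse', 0.8)
```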

Technical Details

Batch Size: 64
Image Processing: Images are resized to 224x224 using crop/pad operations
Data Format: Compressed pickle files (.pkl) with gzip compression
Checkpointing: Model weights saved in TensorFlow checkpoint format (.ckpt)
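
The crop/pad step can be pictured with a small pure-Python sketch. It assumes centered cropping and zero padding, which is how TensorFlow's `tf.image.resize_with_crop_or_pad` behaves; whether the project calls that function is an assumption, and `crop_or_pad` below is a hypothetical stand-in that works on nested lists.

```python
def crop_or_pad(image, target_h, target_w, fill=0):
    """Center-crop or center-pad a 2D list of pixels to target_h x target_w."""
    def fit(rows, target, make_fill):
        size = len(rows)
        if size > target:  # too large: keep the central slice
            start = (size - target) // 2
            return list(rows[start:start + target])
        before = (target - size) // 2  # too small: pad both sides
        after = target - size - before
        return ([make_fill() for _ in range(before)]
                + list(rows)
                + [make_fill() for _ in range(after)])

    # Fit the width of every row first, then fit the number of rows.
    fitted_rows = [fit(row, target_w, lambda: fill) for row in image]
    return fit(fitted_rows, target_h, lambda: [fill] * target_w)

# A 2x3 image is cropped to width 1 and padded to height 4.
print(crop_or_pad([[1, 2, 3], [4, 5, 6]], 4, 1))  # [[0], [2], [5], [0]]
```

Unlike interpolation-based resizing, this keeps pixel values unchanged but discards the borders of large images, so subjects near the edge of a photo may be cut off.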

Troubleshooting

Issue: No model checkpoint found to restore - ERROR

Solution: Train the model before running predictions; the predict command loads the checkpoint from ckpt_dir/model.ckpt

Issue: Memory errors during training

Solution: Reduce batch size in my_alexnet_cnn.py (BATCH_SIZE variable)

Issue: Images not loading

Solution: Ensure images are in JPEG format (.jpg or .jpeg) and organized in the correct directory structure

License

This project is licensed under the MIT License - see the LICENSE file for details.

Credits

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

PayPal: fci1908@gmail.com
