-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathFaceDetectionModelNotesAndTimeline.txt
More file actions
152 lines (88 loc) · 4.3 KB
/
FaceDetectionModelNotesAndTimeline.txt
File metadata and controls
152 lines (88 loc) · 4.3 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
Face Detection Model Tutorial Outline
Goal:
Build a face detection model from scratch using Python and TensorFlow, based on a deep object detection architecture.
Model can detect various object types, not just faces
Includes collecting images, annotating them, and building a deep learning model
Part 1 – Collect Images & Annotate
Collect data and label it for face detection using bounding boxes
Use cases: facial sentiment analysis, facial verification
Breakdown Board
Train on images with annotated bounding boxes
Use data augmentation (albumentations) to expand the dataset
Model consists of a classification head and a regression head
Losses: binary cross-entropy for classification; localization loss for regression
Setting Up & Getting Data
Install dependencies: labelme, tensorflow, opencv, matplotlib, albumentations
Use os, time, uuid, and opencv to organize files, handle timing, create unique IDs, and capture images
Folder structure:
data/images/ for raw images
data/labels/ for JSON annotations
Collecting and Labeling Images
Capture images in batches (e.g., 30 at a time) with delays to add variation
Label with labelme:
Draw bounding boxes around faces
Assign the label “face”
Save annotations as JSON files
Review and edit as needed
Part 2 – Partition & Augment Data
Split data into training, testing, and validation sets
Apply augmentation (random crop, flips, brightness/contrast/gamma adjustments) using albumentations
Review Dataset & Build Loading Function
Import: tensorflow, json, numpy, matplotlib
Limit GPU memory growth; check GPU availability
Use tf.data.Dataset.list_files and map to load images and labels
Visualize samples with matplotlib
Partition Unaugmented Data
Manually move images/labels into train/, test/, and val/ folders
Typical split: 63 images train, 14 test, 13 validation
Apply Image Augmentation
Define augmentation pipeline with six transformations
Specify bounding-box format (Pascal VOC, COCO, or YOLO)
Generate ~60 augmented images per original image
Save augmented data into an org_data/ directory structure
Load Augmented Data into TensorFlow
Create tf.data.Dataset pipelines for train, test, and validation
Resize images to 120×120 and scale pixel values to [0,1]
Shuffle, batch, and prefetch for performance
Prepare & Load Labels
Write a load_labels function to parse JSON files into class and coordinate arrays
Map label-loading function over label files in the dataset
Combine Images & Labels
Zip image and label datasets to form complete samples
Confirm dataset sizes: e.g., 3720 training, 840 testing, 720 validation samples
Part 3 – Build & Train the Model
Model Architecture
Base: VGG16 (pretrained, trimmed of its top layers)
Add two heads:
Classification output (sigmoid activation)
Regression output for bounding boxes (sigmoid activation)
Total parameters: ~16.8 million
Custom Losses & Optimizer
Classification loss: binary cross-entropy
Localization loss: sum of squared differences for coordinates and sizes
Test loss functions on sample data to verify correctness
Optimizer: Adam with learning-rate decay based on number of batches
Custom Training Loop
Create a FaceTracker Keras model subclass with:
__init__, compile, train_step, test_step, and call methods
In train_step, compute both losses and apply gradients
Compile & Fit
Compile model with optimizer and both losses
Train with model.fit, supplying training and validation datasets
Monitor training via TensorBoard
Part 4 – Test & Real‑Time Detection
Prepare a test data iterator
Run predictions and visualize results; debug any issues with architecture or data
Retrain with adjusted parameters if needed
Save the final model
Real‑Time Face Detection
Capture video frames, resize and preprocess for the model
Make predictions, draw bounding boxes, and render on the frame
Adjust rectangle size and styling as needed
Final Results
Successfully built a custom TensorFlow object detector for faces
Works under varied lighting and backgrounds (e.g., green screen)
Performance can improve further with more data
Code & Resources:
Final project code and details are available on GitHub.
Final validation regression loss: 0.025; total loss: 0.065.