Various improvements have been made to make FastestDet even lighter and faster.
This custom fork aims to improve on the official implementation. A huge thank you to the authors of FastestDet.
| Inference image 96x96 | → | Original image 640x425 |
|---|---|---|
| ![]() | → | ![]() |
- Optimizer: SGD/AdamW
- Scheduler: Step/Cosine
- Supports Distributed Data Parallel (DDP) training
- EMA
- Distillation
- Resume
- Toggle pyramid levels: `P1 only`, `P2 only`, `P3 only`, `P1 + P2`, `P1 + P3`, `P2 + P3`, `P1 + P2 + P3`. For very low resolution models, including P2 and P3 in the architecture significantly reduces accuracy.
- Added stride adjustment via the `--stride-half` and `--stride-quarter` flags. These change the stride to 1/2 or 1/4 of the default, which significantly improves accuracy at the expense of increased computational cost.
- Backbone ShuffleNetv2 improvements:
  - Shuffle-free ShuffleNet.
  - SE (Squeeze-and-Excitation) block / eSE (efficient Squeeze-and-Excitation) block.
  - Skip-Residual block.
- Added `--use-spp-separate-1x1` to use a separate 1x1 convolution for each SPP branch, which is important for minimizing quantization error.
- Free expansion of the number of channels with `--stage-out-channels`.
- Free expansion of the number of stages with `--stage-repeats`.
- Added `--multi-label-robust-mode`, which switches the class head to multi-label mode (sigmoid + BCEWithLogits, multi-hot target in the same cell) when specified.
- Params and FLOPs profiler: `profile_onnx.py`.
  ```
  uv run python profile_onnx.py model.onnx
  Params: 72,861 (72.861 Kparams)
  FLOPs: 10,908,672 (0.011 GFLOPs)
  ```
- Replace pycocotools with faster-coco-eval
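To make the "sigmoid + BCEWithLogits, multi-hot target" wording of `--multi-label-robust-mode` concrete, here is a minimal sketch in plain Python. It is not the repository's code (which presumably relies on PyTorch's `BCEWithLogitsLoss`); the helper names `multi_hot` and `bce_with_logits` are hypothetical and only illustrate the idea that several classes can be positive in the same grid cell.

```python
import math


def multi_hot(class_ids, num_classes):
    """Build a multi-hot target: 1.0 for every class present in the cell."""
    target = [0.0] * num_classes
    for cid in class_ids:
        target[cid] = 1.0
    return target


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy on raw logits (numerically stable form).

    Per element: max(x, 0) - x * t + log(1 + exp(-|x|)).
    """
    total = 0.0
    for x, t in zip(logits, targets):
        total += max(x, 0.0) - x * t + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


# Two objects of classes 0 and 2 fall into the same grid cell.
target = multi_hot([0, 2], num_classes=4)

# Logits that fire on both classes are penalized less than logits
# that fire on only one of them.
good = bce_with_logits([4.0, -4.0, 4.0, -4.0], target)
bad = bce_with_logits([4.0, -4.0, -4.0, -4.0], target)
print(target, good < bad)
```

With a softmax head the class probabilities must sum to 1, so two classes in one cell compete with each other; independent sigmoids remove that competition.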
2022.7.14: Optimized the loss by adopting an IoU-aware term based on smooth L1, which increased AP by 0.7.
- Faster! Stronger! Simpler!
- It has better performance and simpler feature map post-processing than Yolo-fastest
- The performance is 10% higher than Yolo-fastest
- On COCO, mAP@0.5 is 1.2 points higher than Yolo-fastestv2 (25.3% vs. 24.1%)
- Algorithm intro: https://zhuanlan.zhihu.com/p/536500269
- QQ group: 1062122604
| Network | mAPval 0.5 | mAPval 0.5:0.95 | Resolution | Run Time(4xCore) | Run Time(1xCore) | Params(M) |
|---|---|---|---|---|---|---|
| yolov5s | 56.8% | 37.4% | 640X640 | 395.31ms | 1139.16ms | 7.2M |
| yolov6n | - | 30.8% | 416X416 | 109.24ms | 445.44ms | 4.3M |
| yolox-nano | - | 25.8% | 416X416 | 76.31ms | 191.16ms | 0.91M |
| nanodet_m | - | 20.6% | 320X320 | 49.24ms | 160.35ms | 0.95M |
| yolo-fastestv1.1 | 24.40% | - | 320X320 | 26.60ms | 75.74ms | 0.35M |
| yolo-fastestv2 | 24.10% | - | 352X352 | 23.8ms | 68.9ms | 0.25M |
| FastestDet | 25.3% | 13.0% | 352X352 | 23.51ms | 70.62ms | 0.24M |
- Test platform: Radxa Rock3A (RK3568, ARM Cortex-A55 CPU), based on NCNN
- CPU frequency locked at 2.0 GHz
- Anchor-Free
- Single scale detector head
- Cross grid multiple candidate targets
- Dynamic positive and negative sample allocation
| Equipment | Computing backend | System | Framework | Run time(Single core) | Run time(Multi core) |
|---|---|---|---|---|---|
| Radxa rock3a | RK3568(arm-cpu) | Linux(aarch64) | ncnn | 70.62ms | 23.51ms |
| Radxa rock3a | RK3568(NPU) | Linux(aarch64) | rknn | 28ms | - |
| Qualcomm | Snapdragon 835(arm-cpu) | Android(aarch64) | ncnn | 32.34ms | 16.24ms |
| Intel | i7-8700(X86-cpu) | Linux(amd64) | ncnn | 4.51ms | 4.33ms |
```bash
git clone https://github.com/PINTO0309/FastestDetNext.git && cd FastestDetNext
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync
source .venv/bin/activate
```

- Picture test
  ```bash
  uv run python test.py \
    --yaml configs/coco.yaml \
    --weight weights/weight_AP05:0.253207_280-epoch.pth \
    --img data/3.jpg
  ```
- The dataset format is the same as Darknet Yolo's: each image has a corresponding .txt label file. Each label line also follows the Darknet Yolo format `category cx cy w h`, where `category` is the class index, `cx, cy` are the normalized coordinates of the box center, and `w, h` are the normalized width and height of the box. Example .txt label file content:
  ```
  11 0.344192634561 0.611 0.416430594901 0.262
  14 0.509915014164 0.51 0.974504249292 0.972
  ```
- The image and its corresponding label file have the same name and are stored in the same directory. The data file structure is as follows:
  ```
  dataset
  ├── train
  │   ├── 000001.jpg
  │   ├── 000001.txt
  │   ├── 000002.jpg
  │   ├── 000002.txt
  │   ├── 000003.jpg
  │   └── 000003.txt
  └── val
      ├── 000043.jpg
      ├── 000043.txt
      ├── 000057.jpg
      ├── 000057.txt
      ├── 000070.jpg
      └── 000070.txt
  ```
- Generate the dataset path .txt files. Example content:

  train.txt
  ```
  dataset/train/000001.jpg
  dataset/train/000002.jpg
  dataset/train/000003.jpg
  ```
  val.txt
  ```
  dataset/val/000070.jpg
  dataset/val/000043.jpg
  dataset/val/000057.jpg
  ```
- Generate the .names category label file. Example content:

  category.names
  ```
  person
  bicycle
  car
  motorbike
  ...
  ```
- The directory structure of the final training dataset is as follows:
  ```
  dataset
  ├── category.names        # .names category label file
  ├── train                 # train dataset
  │   ├── 000001.jpg
  │   ├── 000001.txt
  │   ├── 000002.jpg
  │   ├── 000002.txt
  │   ├── 000003.jpg
  │   └── 000003.txt
  ├── train.txt             # train dataset path .txt file
  ├── val                   # val dataset
  │   ├── 000043.jpg
  │   ├── 000043.txt
  │   ├── 000057.jpg
  │   ├── 000057.txt
  │   ├── 000070.jpg
  │   └── 000070.txt
  └── val.txt               # val dataset path .txt file
  ```
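As a rough illustration of the dataset steps above, the following sketch parses one label line, converts the normalized box to pixel corners, and collects the image paths for a path list file. The function names (`parse_label_line`, `to_pixel_box`, `list_images`) and the 640x425 image size are assumptions made for this example; they are not part of the repository.

```python
from pathlib import Path


def parse_label_line(line):
    """Parse one 'category cx cy w h' line into (int, float, float, float, float)."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    category = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:])
    return category, cx, cy, w, h


def to_pixel_box(cx, cy, w, h, img_w, img_h):
    """Convert a normalized center-format box to pixel corners (x1, y1, x2, y2)."""
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return x1, y1, x2, y2


def list_images(split_dir, suffix=".jpg"):
    """Return sorted image paths that have a same-named .txt label next to them."""
    paths = []
    for img in sorted(Path(split_dir).glob(f"*{suffix}")):
        if img.with_suffix(".txt").exists():
            paths.append(img.as_posix())
    return paths


category, cx, cy, w, h = parse_label_line("11 0.344192634561 0.611 0.416430594901 0.262")
print(category, to_pixel_box(cx, cy, w, h, img_w=640, img_h=425))
```

A path list such as `train.txt` could then be written with `Path("dataset/train.txt").write_text("\n".join(list_images("dataset/train")) + "\n")`.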
- Reference: `./configs/coco.yaml`
  ```yaml
  DATASET:
    TRAIN: "/home/qiuqiu/Desktop/coco2017/train2017.txt"  # Train dataset path .txt file
    VAL: "/home/qiuqiu/Desktop/coco2017/val2017.txt"      # Val dataset path .txt file
    NAMES: "dataset/coco128/coco.names"                   # .names category label file
  MODEL:
    INPUT_WIDTH: 352    # The width of the model input image
    INPUT_HEIGHT: 352   # The height of the model input image
  TRAIN:
    LR: 0.001           # Training learning rate
    WARMUP: true        # Turn on warm-up
    BATCH_SIZE: 64      # Batch size
    END_EPOCH: 350      # Number of training epochs
    MILESTIONES:        # Epochs at which the learning rate is decayed
      - 150
      - 250
      - 300
  ```
- Number of classes is derived from `TRAIN.CLASSES` (if set) or the number of lines in `DATASET.NAMES`.
- `--classes` overrides `TRAIN.CLASSES` and updates the derived class count.
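The precedence described in the last two bullets can be sketched as follows. This is an illustration only: `resolve_num_classes` is a hypothetical helper, and skipping blank lines in the names file is an assumption rather than documented behavior.

```python
def resolve_num_classes(names_lines, train_classes=None, cli_classes=None):
    """Pick the class count: --classes first, then TRAIN.CLASSES, then the names file."""
    if cli_classes is not None:
        return cli_classes
    if train_classes is not None:
        return train_classes
    # Assumption: blank lines in the .names file do not count as classes.
    return sum(1 for line in names_lines if line.strip())


names = ["person", "bicycle", "car", "motorbike"]
print(resolve_num_classes(names))                                    # names file only
print(resolve_num_classes(names, train_classes=9))                   # TRAIN.CLASSES set
print(resolve_num_classes(names, train_classes=9, cli_classes=2))    # --classes given
```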
- Backbone downsampling: `first_conv` stride 2, `maxpool` stride 2, and the first block of each stage (stage2/3/4) uses stride 2.
- Feature map strides (relative to input):
  - `P1` (stage2 output): stride 8
  - `P2` (stage3 output): stride 16
  - `P3` (stage4 output): stride 32
- Backbone stride presets (requires re-training; not compatible with existing checkpoints):
  - `--stride-half` → P1/P2/P3 = 4/8/16
  - `--stride-quarter` → P1/P2/P3 = 2/4/8
- If neither flag is set, the default strides are used: P1/P2/P3 = 8/16/32.
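To see what the stride presets mean for the output grid, the sketch below computes the per-level strides and the resulting feature map size. It assumes the two flags are mutually exclusive and that every stride-2 layer rounds up (output size `ceil(n / 2)`), which is common for padded 3x3 layers but is not confirmed by this README; `level_strides` and `feature_map_size` are hypothetical names.

```python
import math

DEFAULT_STRIDES = {"P1": 8, "P2": 16, "P3": 32}


def level_strides(stride_half=False, stride_quarter=False):
    """Return the stride of each pyramid level for the chosen preset."""
    if stride_half and stride_quarter:
        # Assumption: only one preset can be active at a time.
        raise ValueError("choose at most one stride preset")
    divisor = 2 if stride_half else 4 if stride_quarter else 1
    return {level: s // divisor for level, s in DEFAULT_STRIDES.items()}


def feature_map_size(img_h, img_w, stride):
    """Feature map (H, W), assuming each downsampling step rounds up."""
    return math.ceil(img_h / stride), math.ceil(img_w / stride)


presets = [
    ("default", {}),
    ("half", {"stride_half": True}),
    ("quarter", {"stride_quarter": True}),
]
for name, kwargs in presets:
    p1 = level_strides(**kwargs)["P1"]
    print(name, "P1 stride:", p1, "grid for 96x96 input:", feature_map_size(96, 96, p1))
```

Each halving of the stride quadruples the number of grid cells, which is where the extra computational cost of the presets comes from.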
- Comma-separated list of pyramid levels to fuse: `P1,P2,P3` (default) or any subset (e.g. `P1,P3`, `P2`).
- When fusing multiple levels, features are aligned to `P2` resolution: `P1` is downsampled (avg pool stride 2), `P3` is upsampled (x2).
- When using a single level, its native stride is used (P1=8, P2=16, P3=32).
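The alignment to `P2` resolution can be pictured with a single-channel toy example. The sketch assumes a 2x2 averaging window and nearest-neighbor upsampling; the README only states "avg pool stride 2" and "x2", so both choices, as well as the function names, are assumptions.

```python
def avg_pool_2x2(fm):
    """Average-pool a 2D map with a 2x2 window and stride 2 (even sizes assumed)."""
    h, w = len(fm), len(fm[0])
    return [
        [(fm[y][x] + fm[y][x + 1] + fm[y + 1][x] + fm[y + 1][x + 1]) / 4.0
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]


def upsample_2x_nearest(fm):
    """Upsample a 2D map by 2 in both directions by repeating each value."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


# Toy maps for a 32x32 input at default strides: P1 is 4x4, P2 is 2x2, P3 is 1x1.
p1 = [[1, 1, 3, 3], [1, 1, 3, 3], [5, 5, 7, 7], [5, 5, 7, 7]]
p2 = [[10, 20], [30, 40]]
p3 = [[9]]

aligned = [avg_pool_2x2(p1), p2, upsample_2x_nearest(p3)]
for fm in aligned:
    print(len(fm), "x", len(fm[0]), fm)
```

After alignment all three maps share the `P2` grid, so they can be combined per location.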
- Output shape is `float32[B, 1 + 4 + NC, H, W]`.
  - `B`: batch size
  - `NC`: number of classes (derived from `TRAIN.CLASSES` or `DATASET.NAMES`)
  - `H, W`: feature map size (depends on input size and selected stride)
- Channel layout per spatial location:
  - `0`: objectness (sigmoid)
  - `1-4`: box regression (tx, ty, tw, th)
  - `5..(4+NC)`: class probabilities (softmax)
- Final score per box is computed as: `score = (objectness ** 0.6) * (max_class_prob ** 0.4)`
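The channel layout and score formula above can be exercised on a single grid cell. The sketch assumes the values have already passed through the sigmoid/softmax named in the layout; `split_cell` and `final_score` are hypothetical helpers, and box decoding is left out because the README does not specify it.

```python
def split_cell(values, num_classes):
    """Split one cell's 1 + 4 + NC channel values into (objectness, box, class_probs)."""
    if len(values) != 1 + 4 + num_classes:
        raise ValueError("expected 1 + 4 + NC values per cell")
    return values[0], values[1:5], values[5:]


def final_score(objectness, class_probs):
    """Return (score, class_index) using objectness**0.6 * max_class_prob**0.4."""
    best = max(range(len(class_probs)), key=class_probs.__getitem__)
    score = (objectness ** 0.6) * (class_probs[best] ** 0.4)
    return score, best


# One cell of a 3-class model: objectness, (tx, ty, tw, th), class probabilities.
cell = [0.5, 0.1, 0.2, 0.3, 0.4, 0.1, 0.8, 0.1]
objectness, box, class_probs = split_cell(cell, num_classes=3)
score, class_index = final_score(objectness, class_probs)
print(round(score, 4), class_index)
```

The exponents weight objectness slightly more than the class probability while keeping the score in the range 0 to 1.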
- Perform training tasks
  ```bash
  uv run python train.py \
    --yaml configs/coco.yaml
  ```
```bash
SIZE=64x64
uv run python train.py \
  --exp-name exp_x3_00_x1_00_${SIZE}_lr0.01000_P1_half_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --lr 0.01000 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 3.00 \
  --stage-repeats 1.00 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-half \
  --val-interval 20

SIZE=60x80
uv run python train.py \
  --exp-name exp_x3_00_x1_00_${SIZE}_lr0.01000_P1_half_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --lr 0.01000 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 3.00 \
  --stage-repeats 1.00 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-half \
  --val-interval 20

SIZE=96x96
uv run python train.py \
  --exp-name exp_x3_00_x1_00_${SIZE}_lr0.01000_P1_half_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --lr 0.01000 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 3.00 \
  --stage-repeats 1.00 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-half \
  --val-interval 20

SIZE=90x120
uv run python train.py \
  --exp-name exp_x3_00_x1_00_${SIZE}_lr0.01000_P1_half_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --lr 0.01000 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 3.00 \
  --stage-repeats 1.00 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-half \
  --val-interval 20

SIZE=120x160
uv run python train.py \
  --exp-name exp_x3_00_x1_00_${SIZE}_lr0.01000_P1_half_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --batch-size 32 \
  --lr 0.01000 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 3.00 \
  --stage-repeats 1.00 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-half \
  --val-interval 20
```
```bash
SIZE=64x64
uv run python train.py \
  --exp-name exp_x3_00_x1_00_${SIZE}_lr0.01000_P1_quarter_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --lr 0.01000 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 3.00 \
  --stage-repeats 1.00 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-quarter \
  --val-interval 20

SIZE=60x80
uv run python train.py \
  --exp-name exp_x3_00_x1_00_${SIZE}_lr0.01000_P1_quarter_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --lr 0.01000 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 3.00 \
  --stage-repeats 1.00 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-quarter \
  --val-interval 20

SIZE=96x96
uv run python train.py \
  --exp-name exp_x3_00_x1_00_${SIZE}_lr0.01000_P1_quarter_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --lr 0.01000 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 3.00 \
  --stage-repeats 1.00 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-quarter \
  --val-interval 20

SIZE=90x120
uv run python train.py \
  --exp-name exp_x3_00_x1_00_${SIZE}_lr0.01000_P1_quarter_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --lr 0.01000 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 3.00 \
  --stage-repeats 1.00 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-quarter \
  --val-interval 20

SIZE=120x160
uv run python train.py \
  --exp-name exp_x3_00_x1_00_${SIZE}_lr0.01000_P1_quarter_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --batch-size 32 \
  --lr 0.01000 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 3.00 \
  --stage-repeats 1.00 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-quarter \
  --val-interval 20
```
```bash
SIZE=64x64
python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_half_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00100 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-half \
  --val-interval 50

SIZE=60x80
python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_half_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00100 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-half \
  --val-interval 50

SIZE=96x96
python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_half_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00100 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-half \
  --val-interval 50

SIZE=90x120
python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_half_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00100 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-half \
  --val-interval 50

SIZE=120x160
python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_half_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00100 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-half \
  --val-interval 50

###################

SIZE=64x64
python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_quarter_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00100 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-quarter \
  --val-interval 50

SIZE=60x80
python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_quarter_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00100 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-quarter \
  --val-interval 50

SIZE=96x96
python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_quarter_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00100 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-quarter \
  --val-interval 50

SIZE=90x120
python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_quarter_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00100 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-quarter \
  --val-interval 50

SIZE=120x160
python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_quarter_sppsep_09cls_mlrm \
  --yaml configs/uhd09.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00100 \
  --epoch 300 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-quarter \
  --val-interval 50
```
```bash
SIZE=64x64
uv run python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_half_sppsep_09cls_mlrm \
  --yaml configs/uhd09ft.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00010 \
  --epoch 100 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-half \
  --val-interval 50 \
  --teacher-weight runs/exp_x50_00_x1_25_640x640_lr0.00010_skipred_noema_noamp_P1_09cls_mlrm_distill/last_0100_0.799208.pth \
  --distill-weight-max 1.0 \
  --distill-temperature 1.5

SIZE=60x80
uv run python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_half_sppsep_09cls_mlrm \
  --yaml configs/uhd09ft.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00010 \
  --epoch 100 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-half \
  --val-interval 50 \
  --teacher-weight runs/exp_x50_00_x1_25_640x640_lr0.00010_skipred_noema_noamp_P1_09cls_mlrm_distill/last_0100_0.799208.pth \
  --distill-weight-max 1.0 \
  --distill-temperature 1.5

SIZE=96x96
uv run python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_half_sppsep_09cls_mlrm \
  --yaml configs/uhd09ft.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00010 \
  --epoch 100 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-half \
  --val-interval 50 \
  --teacher-weight runs/exp_x50_00_x1_25_640x640_lr0.00010_skipred_noema_noamp_P1_09cls_mlrm_distill/last_0100_0.799208.pth \
  --distill-weight-max 1.0 \
  --distill-temperature 1.5

SIZE=90x120
uv run python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_half_sppsep_09cls_mlrm \
  --yaml configs/uhd09ft.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00010 \
  --epoch 100 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-half \
  --val-interval 50 \
  --teacher-weight runs/exp_x50_00_x1_25_640x640_lr0.00010_skipred_noema_noamp_P1_09cls_mlrm_distill/last_0100_0.799208.pth \
  --distill-weight-max 1.0 \
  --distill-temperature 1.5

SIZE=120x160
uv run python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_half_sppsep_09cls_mlrm \
  --yaml configs/uhd09ft.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00010 \
  --epoch 100 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-half \
  --val-interval 50 \
  --teacher-weight runs/exp_x50_00_x1_25_640x640_lr0.00010_skipred_noema_noamp_P1_09cls_mlrm_distill/last_0100_0.799208.pth \
  --distill-weight-max 1.0 \
  --distill-temperature 1.5

###################

SIZE=64x64
uv run python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_quarter_sppsep_09cls_mlrm \
  --yaml configs/uhd09ft.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00010 \
  --epoch 100 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-quarter \
  --val-interval 50 \
  --teacher-weight runs/exp_x50_00_x1_25_640x640_lr0.00010_skipred_noema_noamp_P1_09cls_mlrm_distill/last_0100_0.799208.pth \
  --distill-weight-max 1.0 \
  --distill-temperature 1.5

SIZE=60x80
uv run python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_quarter_sppsep_09cls_mlrm \
  --yaml configs/uhd09ft.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00010 \
  --epoch 100 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-quarter \
  --val-interval 50 \
  --teacher-weight runs/exp_x50_00_x1_25_640x640_lr0.00010_skipred_noema_noamp_P1_09cls_mlrm_distill/last_0100_0.799208.pth \
  --distill-weight-max 1.0 \
  --distill-temperature 1.5

SIZE=96x96
uv run python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_quarter_sppsep_09cls_mlrm \
  --yaml configs/uhd09ft.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00010 \
  --epoch 100 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-quarter \
  --val-interval 50 \
  --teacher-weight runs/exp_x50_00_x1_25_640x640_lr0.00010_skipred_noema_noamp_P1_09cls_mlrm_distill/last_0100_0.799208.pth \
  --distill-weight-max 1.0 \
  --distill-temperature 1.5

SIZE=90x120
uv run python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_quarter_sppsep_09cls_mlrm \
  --yaml configs/uhd09ft.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00010 \
  --epoch 100 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-quarter \
  --val-interval 50 \
  --teacher-weight runs/exp_x50_00_x1_25_640x640_lr0.00010_skipred_noema_noamp_P1_09cls_mlrm_distill/last_0100_0.799208.pth \
  --distill-weight-max 1.0 \
  --distill-temperature 1.5

SIZE=120x160
uv run python train.py \
  --exp-name exp_x50_00_x1_25_${SIZE}_lr0.001000_P1_quarter_sppsep_09cls_mlrm \
  --yaml configs/uhd09ft.yaml \
  --batch-size 60 \
  --num-workers 60 \
  --lr 0.00010 \
  --epoch 100 \
  --img-size ${SIZE} \
  --opencv_inter_nearest \
  --stage-out-channels 50.00 \
  --stage-repeats 1.25 \
  --use-skip-residual \
  --use-spp-separate-1x1 \
  --pyramid-levels P1 \
  --multi-label-robust-mode \
  --stride-quarter \
  --val-interval 50 \
  --teacher-weight runs/exp_x50_00_x1_25_640x640_lr0.00010_skipred_noema_noamp_P1_09cls_mlrm_distill/last_0100_0.799208.pth \
  --distill-weight-max 1.0 \
  --distill-temperature 1.5
```
- Run multi-GPU training and distributed evaluation with `torchrun`:
  ```bash
  SIZE=640x640
  torchrun --nproc_per_node=8 train.py \
    --exp-name exp_x50_00_x1_25_${SIZE}_lr0.00100_skipred_noema_noamp_P1_09cls_mlrm \
    --yaml configs/uhd09.yaml \
    --batch-size 8 \
    --num-workers 8 \
    --lr 0.00800 \
    --epoch 300 \
    --img-size ${SIZE} \
    --opencv_inter_nearest \
    --stage-out-channels 50.00 \
    --stage-repeats 1.25 \
    --use-skip-residual \
    --pyramid-levels P1 \
    --multi-label-robust-mode \
    --val-interval 10

  SIZE=640x640
  torchrun --nproc_per_node=8 train.py \
    --exp-name exp_x50_00_x1_25_${SIZE}_lr0.00010_skipred_noema_noamp_P1_09cls_mlrm_distill \
    --weight runs/exp_x50_00_x1_25_128x128_lr0.00100_skipred_noema_noamp_P1_09cls_mlrm/best_0295_0.421461.pth \
    --yaml configs/uhd09ft.yaml \
    --batch-size 5 \
    --num-workers 5 \
    --lr 0.00080 \
    --epoch 100 \
    --img-size ${SIZE} \
    --opencv_inter_nearest \
    --stage-out-channels 50.00 \
    --stage-repeats 1.25 \
    --use-skip-residual \
    --pyramid-levels P1 \
    --multi-label-robust-mode \
    --teacher-weight runs/exp_x50_00_x1_25_128x128_lr0.00100_skipred_noema_noamp_P1_09cls_mlrm/best_0295_0.421461.pth \
    --distill-weight-max 1.0 \
    --distill-temperature 1.5 \
    --val-interval 20
  ```
- `BATCH_SIZE` in the yaml is per GPU, so the global batch size is `BATCH_SIZE * nproc_per_node`.
- `--nproc_per_node` sets the number of GPU processes to launch on the node (typically the number of GPUs to use).
- Checkpoints, TensorBoard logs, ONNX export, and preview images are written by rank0 only.
- Generally, it is recommended to set the learning rate to `--nproc_per_node` times the default value.
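The batch-size and learning-rate rules above amount to simple arithmetic, sketched below. `ddp_settings` is a hypothetical helper, and the linear learning-rate scaling is just the recommendation from the notes, not something the training script is known to apply automatically.

```python
def ddp_settings(per_gpu_batch, base_lr, nproc_per_node):
    """Return (global_batch_size, suggested_lr) for a single-node DDP run."""
    global_batch = per_gpu_batch * nproc_per_node
    suggested_lr = base_lr * nproc_per_node  # linear scaling with the GPU count
    return global_batch, suggested_lr


# 8 GPUs, 8 images per GPU, single-GPU learning rate of 0.001.
global_batch, lr = ddp_settings(per_gpu_batch=8, base_lr=0.001, nproc_per_node=8)
print(global_batch, f"{lr:.5f}")
```

These inputs mirror the first `torchrun` example, which passes `--batch-size 8` and `--lr 0.00800` with `--nproc_per_node=8`.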
- Calculate mAP evaluation
  ```bash
  uv run python eval.py \
    --yaml configs/uhd09.yaml \
    --weight runs/distill/exp_x3_00_x1_00_640x640_lr0.00010_skipred_noema_noamp_P1_09cls_mlrm_distill/best_0100_0.486152.pth \
    --img-size 640x640 \
    --batch-size 1

  uv run python eval.py \
    --yaml configs/uhd09.yaml \
    --weight runs/distill/exp_x3_00_x1_00_640x640_lr0.00010_skipred_noema_noamp_P1_09cls_mlrm_distill/best_0100_0.486152.pth \
    --img-size 320x320 \
    --batch-size 1
  ```
- COCO2017 evaluation
  ```
  creating index...
  index created!
  creating index...
  index created!
  Running per image evaluation...
  Evaluate annotation type *bbox*
  DONE (t=30.85s).
  Accumulating evaluation results...
  DONE (t=4.97s).
   Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.130
   Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.253
   Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.119
   Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.021
   Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.129
   Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.237
   Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.142
   Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.208
   Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.214
   Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.043
   Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.236
   Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.372
  ```
```bash
STAGE_OUT_CHANNELS=3_00
STAGE_REPEATS=1_00
IMAGE_SIZE=96x96
RESIZE_MODE=opencv_inter_nearest
NUM_CLASSES=09
CKPT=runs/exp_x3_00_x1_00_64x64_lr0.01000_P1_half_sppsep_09cls_mlrm/best_0300_0.151091.pth
uv run python export_onnx.py \
  --weight ${CKPT} \
  --onnx-out fastestdetnext_x${STAGE_OUT_CHANNELS}_x${STAGE_REPEATS}_${IMAGE_SIZE}_${RESIZE_MODE}_cls${NUM_CLASSES}.onnx \
  --opset 17
```

```bash
uv run python quantize_onnx_model_for_esp32.py \
  --list-path dataset/wholebody34/train.txt \
  --onnx-model fastestdetnext_x3_00_x1_00_64x64_opencv_inter_nearest_cls09.onnx \
  --espdl-model fastestdetnext_x3_00_x1_00_64x64_opencv_inter_nearest_cls09.espdl
```

```bash
uv run python quantize_onnx_model_for_esp32.py \
  --list-path dataset/wholebody34/train.txt \
  --onnx-model fastestdetnext_x3_00_x1_00_64x64_opencv_inter_nearest_cls09.onnx \
  --espdl-model fastestdetnext_x3_00_x1_00_64x64_opencv_inter_nearest_cls09.espdl \
  --int16-op-pattern /SPP/output3/output3.0/Conv
```

- If you find this project useful in your research, please consider citing it:
  ```
  @misc{FastestDet,
    title={FastestDet: Ultra lightweight anchor-free real-time object detection algorithm.},
    author={xuehao.ma},
    howpublished = {\url{https://github.com/dog-qiuqiu/FastestDet}},
    year={2022}
  }
  ```



