NEWS: The code of our subsequent work EPro-PnP (CVPR 2022 Best Student Paper) has been released here!
MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation. CVPR 2021. [paper]
Hansheng Chen, Yuyao Huang, Wei Tian*, Zhong Gao, Lu Xiong. (*Corresponding author: Wei Tian.)
This repository is the PyTorch implementation of MonoRUn. The code is based on MMDetection and MMDetection3D, although we use our own data formats. The PnP C++ code is modified from PVNet.
For installation, please refer to INSTALL.md.
Download the official KITTI 3D object dataset, including left color images, calibration files and training labels.
Download the train/val/test image lists [Google Drive | Baidu Pan, password: cj4u]. For training with LiDAR supervision, download the preprocessed object coordinate maps [Google Drive | Baidu Pan, password: fp3h].
Extract the downloaded archives according to the following folder structure. It is recommended to symlink the dataset root to $MonoRUn_ROOT/data. If your folder structure is different, you may need to change the corresponding paths in config files.
$MonoRUn_ROOT
├── configs
├── monorun
├── tools
├── data
│   ├── kitti
│   │   ├── testing
│   │   │   ├── calib
│   │   │   ├── image_2
│   │   │   └── test_list.txt
│   │   └── training
│   │       ├── calib
│   │       ├── image_2
│   │       ├── label_2
│   │       ├── obj_crd
│   │       ├── mono3dsplit_train_list.txt
│   │       ├── mono3dsplit_val_list.txt
│   │       └── trainval_list.txt
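Before running the preparation script, it can help to confirm that the extracted archives match this layout. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and the path list is simply copied from the folder structure above (with `obj_crd` treated as required only for LiDAR supervision).

```python
from pathlib import Path

# Paths copied from the folder structure above, relative to the dataset root
# (the `data` directory). This list is an illustration, not a repository API.
EXPECTED = [
    "kitti/testing/calib",
    "kitti/testing/image_2",
    "kitti/testing/test_list.txt",
    "kitti/training/calib",
    "kitti/training/image_2",
    "kitti/training/label_2",
    "kitti/training/mono3dsplit_train_list.txt",
    "kitti/training/mono3dsplit_val_list.txt",
    "kitti/training/trainval_list.txt",
]
# The object coordinate maps are only needed for training with LiDAR supervision.
LIDAR_ONLY = ["kitti/training/obj_crd"]


def missing_entries(data_root, lidar_supervision=False):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    wanted = EXPECTED + (LIDAR_ONLY if lidar_supervision else [])
    return [p for p in wanted if not (root / p).exists()]
```

For example, `missing_entries("data", lidar_supervision=True)` called from $MonoRUn_ROOT returns an empty list when everything is in place, and otherwise the entries that still need to be extracted or symlinked.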
Run the preparation script to generate image metas:
cd $MonoRUn_ROOT
python tools/prepare_kitti.py
cd $MonoRUn_ROOT
To train without LiDAR supervision:
python train.py configs/kitti_multiclass.py --gpu-ids 0 1
where --gpu-ids 0 1 specifies the GPU IDs. In the paper we use two GPUs for distributed training. The number of GPUs affects the mini-batch size. You may change the samples_per_gpu option in the config file to vary the number of images per GPU. If you run out of GPU memory, add the arguments --seed 0 --deterministic to save GPU memory.
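To make the relationship between GPU count and mini-batch size concrete, here is a small arithmetic sketch. It assumes the usual MMDetection convention that the total mini-batch is the number of GPUs times samples_per_gpu. The helper names are hypothetical, and the numbers used with them are illustrative, not the values in the released configs.

```python
def total_batch_size(num_gpus, samples_per_gpu):
    """Images processed per optimizer step across all GPUs."""
    if num_gpus < 1 or samples_per_gpu < 1:
        raise ValueError("num_gpus and samples_per_gpu must be positive")
    return num_gpus * samples_per_gpu


def samples_per_gpu_for(target_batch_size, num_gpus):
    """samples_per_gpu that keeps the total batch size fixed for a new GPU count."""
    if num_gpus < 1 or target_batch_size % num_gpus != 0:
        raise ValueError("target batch size must be divisible by num_gpus")
    return target_batch_size // num_gpus
```

If a config was tuned for two GPUs and you train on one, `samples_per_gpu_for(total_batch_size(2, n), 1)` gives the per-GPU value that preserves the total batch size, where `n` is whatever samples_per_gpu your config currently sets. Whether that fits in memory on a single GPU is a separate question.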
To train with LiDAR supervision:
python train.py configs/kitti_multiclass_lidar_supv.py --gpu-ids 0 1
To view other training options:
python train.py -h
By default, logs and checkpoints will be saved to $MonoRUn_ROOT/work_dirs. You can run TensorBoard to plot the logs:
tensorboard --logdir $MonoRUn_ROOT/work_dirs
The above configs use the 3712-image split for training and the other split for validation. If you want to train on the full training set (train-val), use the config files with the _trainval suffix.
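A quick way to sanity-check the split lists is to count their entries and confirm that the training and validation lists do not overlap. The sketch below assumes each list file holds one sample ID per line, which this README does not state; the helper names `read_ids` and `split_summary` are hypothetical.

```python
from pathlib import Path


def read_ids(list_path):
    """Read the non-empty, whitespace-stripped lines of a split list file."""
    text = Path(list_path).read_text()
    return [line.strip() for line in text.splitlines() if line.strip()]


def split_summary(train_list, val_list):
    """Count entries in each list and the number of IDs present in both."""
    train, val = read_ids(train_list), read_ids(val_list)
    overlap = set(train) & set(val)
    return {"train": len(train), "val": len(val), "overlap": len(overlap)}
```

Called on `data/kitti/training/mono3dsplit_train_list.txt` and `data/kitti/training/mono3dsplit_val_list.txt`, the summary should show 3712 training entries and an overlap of zero if the format assumption holds.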
You can download the pretrained models:
kitti_multiclass.pth [Google Drive | Baidu Pan, password: 6bih], trained on the KITTI training split
kitti_multiclass_lidar_supv.pth [Google Drive | Baidu Pan, password: nmdb], trained on the KITTI training split
kitti_multiclass_lidar_supv_trainval.pth [Google Drive | Baidu Pan, password: hg2r], trained on KITTI train-val
To test and evaluate on the validation set using config at $CONFIG_PATH and checkpoint at $CPT_PATH:
python test.py $CONFIG_PATH $CPT_PATH --val-set --gpu-ids 0
To test on the test set and save detection results to $RESULT_DIR:
python test.py $CONFIG_PATH $CPT_PATH --result-dir $RESULT_DIR --gpu-ids 0
You can append the argument --show-dir $SHOW_DIR to save visualized results.
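To inspect the detection results saved to $RESULT_DIR programmatically, a small parser can be handy. The sketch below assumes the results are written as text in the standard KITTI object detection format (16 whitespace-separated fields per line, ending with a score), which this README does not confirm. The `KittiDetection` class and `parse_kitti_line` function are hypothetical names, not part of this repository.

```python
from dataclasses import dataclass


@dataclass
class KittiDetection:
    type: str           # object class, e.g. "Car"
    alpha: float        # observation angle in radians
    bbox: tuple         # (left, top, right, bottom) in pixels
    dimensions: tuple   # (height, width, length) in meters
    location: tuple     # (x, y, z) in camera coordinates, meters
    rotation_y: float   # yaw around the camera Y axis in radians
    score: float        # detection confidence


def parse_kitti_line(line):
    """Parse one line of a KITTI-format detection result file."""
    fields = line.split()
    if len(fields) != 16:
        raise ValueError(f"expected 16 fields, got {len(fields)}")
    # values[0] and values[1] are truncation and occlusion, unused here.
    values = [float(x) for x in fields[1:]]
    return KittiDetection(
        type=fields[0],
        alpha=values[2],
        bbox=tuple(values[3:7]),
        dimensions=tuple(values[7:10]),
        location=tuple(values[10:13]),
        rotation_y=values[13],
        score=values[14],
    )
```

Ground-truth label files in `label_2` use the same layout without the trailing score, so they have 15 fields and would need the length check relaxed.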
To view other testing options:
python test.py -h
Note: the training and testing scripts in the root directory are wrappers for the original scripts taken from MMDetection, which can be found in $MonoRUn_ROOT/tools. For advanced usage, please refer to the official MMDetection docs.
We provide a demo script to perform inference on images in a directory and save the visualized results. Example:
python demo/infer_imgs.py $KITTI_RAW_DIR/2011_09_30/2011_09_30_drive_0027_sync/image_02/data configs/kitti_multiclass_lidar_supv_trainval.py checkpoints/kitti_multiclass_lidar_supv_trainval.pth --calib demo/calib.csv --show-dir show/2011_09_30_drive_0027
If you find this project useful in your research, please consider citing:
@inproceedings{monorun2021,
author = {Hansheng Chen and Yuyao Huang and Wei Tian and Zhong Gao and Lu Xiong},
title = {MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2021}
}
