YOLTv4 builds upon YOLT and SIMRDWN, and updates these frameworks to use the most performant version of YOLO, YOLOv4. YOLTv4 is designed to detect objects in aerial or satellite imagery in arbitrarily large images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks.
This repository is built upon the impressive work of AlexeyAB's YOLOv4 implementation, which improves both speed and detection performance compared to YOLOv3 (which is implemented in SIMRDWN). We use YOLOv4 instead of "YOLOv5", since YOLOv4 is endorsed by the original creators of YOLO, whereas "YOLOv5" is not; furthermore, YOLOv4 appears to have superior performance.
Below, we provide examples of how to use this repository with the open-source Rareplanes dataset.
YOLTv4 is built to execute within a Docker container on a GPU-enabled machine. The Docker build command creates an Ubuntu 16.04 image with CUDA 9.2, Python 3.6, and conda.
-
Clone this repository (e.g. to /yoltv4/).
-
Download model weights to yoltv4/darknet/weights. See:
https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.conv.137
https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.weights
https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.weights
https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-csp.conv.142
-
Install nvidia-docker.
-
Build the Docker image.
nvidia-docker build -t yoltv4_image /yoltv4/docker
-
Spin up the docker container (see the docker docs for options).
NV_GPU=0 nvidia-docker run -it -v /local_data:/local_data -v /yoltv4:/yoltv4 -ti --ipc=host --name yoltv4_gpu0 yoltv4_image
-
Compile the Darknet C program.
First set GPU=1, CUDNN=1, CUDNN_HALF=1, and OPENCV=1 in /yoltv4/darknet/Makefile, then make:
cd /yoltv4/darknet
make
-
Make YOLO images and labels (see yoltv4/notebooks/prep_data.ipynb for further details).
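As a rough illustration of the label format Darknet expects (one line per object: class index followed by the box center, width, and height, all normalized by the image dimensions), the conversion from pixel-space boxes can be sketched as below. The function names are hypothetical and are not taken from prep_data.ipynb, which remains the reference for the actual preparation steps.

```python
def bbox_to_yolo(xmin, ymin, xmax, ymax, im_w, im_h):
    """Convert a pixel-space box to normalized (x_center, y_center, width, height)."""
    x_c = (xmin + xmax) / 2.0 / im_w
    y_c = (ymin + ymax) / 2.0 / im_h
    w = (xmax - xmin) / im_w
    h = (ymax - ymin) / im_h
    return x_c, y_c, w, h


def yolo_label_line(class_id, box, im_w, im_h):
    """Format one label line: '<class> <x_center> <y_center> <width> <height>'."""
    x_c, y_c, w, h = bbox_to_yolo(*box, im_w, im_h)
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # A 208x104 pixel box inside a 416x416 image chip, class index 3.
    print(yolo_label_line(3, (104, 52, 312, 156), 416, 416))
```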
-
Create a txt file listing the training images.
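One possible way to build such a list is sketched below. The helper name, the set of file extensions, and the choice to write absolute paths are assumptions for illustration rather than requirements stated by this repository.

```python
from pathlib import Path


def write_image_list(image_dir, out_txt, exts=(".png", ".jpg", ".tif")):
    """Write the absolute path of every image in image_dir to out_txt, one per line."""
    paths = sorted(
        str(p.resolve())
        for p in Path(image_dir).iterdir()
        if p.suffix.lower() in exts
    )
    # One path per line, each terminated by a newline.
    Path(out_txt).write_text("".join(p + "\n" for p in paths))
    return paths
```

For example, `write_image_list("/path/to/images", "/path/to/train.txt")` would produce a list file whose path can then be used in the `train =` entry of the data file; both paths here are placeholders.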
-
Create the file obj.names with each desired object name on its own line.
-
Create the file obj.data in the directory yoltv4/darknet/data listing the necessary file paths. For example:
/yoltv4/darknet/data/rareplanes_train.data
classes = 30
train = /local_data/cosmiq/wdata/rareplanes/train/txt/train.txt
valid = /local_data/cosmiq/wdata/rareplanes/train/txt/valid.txt
names = /yoltv4/darknet/data/rareplanes.name
backup = backup/
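The names file and the data file can also be written together, so that the `classes` count always matches the number of names. The following is a minimal sketch; `write_darknet_data` is a hypothetical helper, not part of this repository, and it only reproduces the key/value layout shown in the example.

```python
from pathlib import Path


def write_darknet_data(data_path, names_path, class_names,
                       train_txt, valid_txt, backup="backup/"):
    """Write the class-names file and the matching .data file; return the .data text."""
    # One class name per line, in class-index order.
    Path(names_path).write_text("".join(name + "\n" for name in class_names))
    lines = [
        f"classes = {len(class_names)}",
        f"train = {train_txt}",
        f"valid = {valid_txt}",
        f"names = {names_path}",
        f"backup = {backup}",
    ]
    text = "\n".join(lines) + "\n"
    Path(data_path).write_text(text)
    return text
```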
-
Prepare config files.
See instructions here, or tweak /yoltv4/darknet/cfg/yoltv4_rareplanes.cfg.
-
Execute.
cd /yoltv4/darknet
time ./darknet detector train data/rareplanes_train.data cfg/yoltv4_rareplanes.cfg weights/yolov4.conv.137 -dont_show -mjpeg_port 8090 -map
-
Review progress (plotted at: /yoltv4/darknet/chart_yoltv4_rareplanes.png).
-
Make sliced images (see yoltv4/notebooks/prep_data.ipynb for further details).
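To give a sense of what slicing involves, the sketch below computes overlapping window coordinates that cover a large image. The 416-pixel window matches the `--slice_size=416` used during post-processing; the 20% overlap and the handling of the image edges are assumptions made for illustration, and the notebook remains the reference for the actual procedure.

```python
def slice_windows(im_w, im_h, slice_size=416, overlap=0.2):
    """Return (x0, y0, x1, y1) windows of side slice_size that cover the image."""
    stride = max(1, int(slice_size * (1 - overlap)))

    def starts(length):
        # A single window suffices when the image is no larger than one slice.
        if length <= slice_size:
            return [0]
        s = list(range(0, length - slice_size, stride))
        s.append(length - slice_size)  # final window sits flush with the far edge
        return s

    return [
        (x0, y0, min(x0 + slice_size, im_w), min(y0 + slice_size, im_h))
        for y0 in starts(im_h)
        for x0 in starts(im_w)
    ]


if __name__ == "__main__":
    for window in slice_windows(1000, 416):
        print(window)
```

Overlapping windows mean that an object cut by one window boundary is likely to appear whole in a neighboring window, at the cost of duplicate detections that must be merged later.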
-
Create a txt file listing the test images.
-
Create the file obj.data in the directory yoltv4/darknet/data listing the necessary file paths. For example:
/yoltv4/darknet/data/rareplanes_test.data
classes = 30
train =
valid = /local_data/cosmiq/wdata/rareplanes/test/txt/test.txt
names = /yoltv4/darknet/data/rareplanes.name
backup = backup/
-
Execute (proceeds at >80 frames per second on a Tesla P100):
cd /yoltv4/darknet
time ./darknet detector valid data/rareplanes_test.data cfg/yoltv4_rareplanes.cfg backup/yoltv4_rareplanes_best.weights
-
Post-process detections:
A. Move detections into results directory
mkdir /yoltv4/darknet/results/rareplanes_preds_v0
mkdir /yoltv4/darknet/results/rareplanes_preds_v0/orig_txt
mv /yoltv4/darknet/results/comp4_det_test_* /yoltv4/darknet/results/rareplanes_preds_v0/orig_txt/
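For a quick look at the raw detections before stitching, the per-class text files can be read with a few lines of Python. This sketch assumes each line holds an image identifier, a confidence, and the box corners (`xmin ymin xmax ymax`) separated by whitespace, and that the class name is the part of the file name after `comp4_det_test_`; check a file by hand before relying on either assumption. The parser is hypothetical and is not part of post_process.py.

```python
from pathlib import Path

PREFIX = "comp4_det_test_"


def parse_detection_file(path):
    """Read one per-class detection file into a list of dicts."""
    path = Path(path)
    stem = path.stem
    class_name = stem[len(PREFIX):] if stem.startswith(PREFIX) else stem
    detections = []
    for line in path.read_text().splitlines():
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        conf, xmin, ymin, xmax, ymax = map(float, parts[1:])
        detections.append({
            "image_id": parts[0],
            "class": class_name,
            "conf": conf,
            "box": (xmin, ymin, xmax, ymax),
        })
    return detections
```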
B. Stitch detections back together and make plots
time python /yoltv4/yoltv4/post_process.py \
    --pred_dir=/yoltv4/darknet/results/rareplanes_preds_v0/orig_txt/ \
    --raw_im_dir=/local_data/cosmiq/wdata/rareplanes/test/images/ \
    --sliced_im_dir=/local_data/cosmiq/wdata/rareplanes/test/yoltv4/images_slice/ \
    --out_dir=/yoltv4/darknet/results/rareplanes_preds_v0 \
    --detection_thresh=0.25 \
    --slice_size=416 \
    --n_plots=8
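Conceptually, stitching shifts each slice-local box by the slice's offset within the original image and then removes the duplicates that arise where slices overlap. The sketch below illustrates that idea with a simple class-agnostic non-maximum suppression. It is not the implementation in post_process.py: the function names and the `iou_thresh` value are assumptions, and only the 0.25 confidence threshold is taken from the command above.

```python
def to_global(box, x0, y0):
    """Shift a slice-local (xmin, ymin, xmax, ymax) box by the slice origin."""
    xmin, ymin, xmax, ymax = box
    return (xmin + x0, ymin + y0, xmax + x0, ymax + y0)


def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def stitch(dets, detection_thresh=0.25, iou_thresh=0.5):
    """dets: iterable of (conf, slice_local_box, x0, y0). Returns kept (conf, global_box)."""
    candidates = sorted(
        ((conf, to_global(box, x0, y0))
         for conf, box, x0, y0 in dets if conf >= detection_thresh),
        key=lambda d: d[0],
        reverse=True,
    )
    kept = []
    for conf, box in candidates:
        # Keep a box only if it does not heavily overlap a higher-confidence one.
        if all(iou(box, k[1]) <= iou_thresh for k in kept):
            kept.append((conf, box))
    return kept
```

In this sketch an object seen in two overlapping slices yields two candidates with nearly identical global coordinates, and only the higher-confidence one survives.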
Outputs will look something like the figures below: