2018-03-27 update: 1. I’ve written a new post about the latest YOLOv3, “YOLOv3 on Jetson TX2”; 2. Updated YOLOv2 related web links to reflect changes on the darknet web site.
I have been working extensively on deep-learning based object detection techniques in the past few weeks. While I have spent quite some time learning, training and testing NVIDIA DetectNet and Fater RCNN models, I did also check out how YOLOv2 performs on Jetson TX2. Here are the results.
YOLOv2 is an improved version of “You Only Look Once (YOLO)”. To gain an understanding about how YOLO/YOLOv2 works, I recommend reading the orginal papers along with the following video presentations by the orginal author.
- “You Only Look Once: Unified, Real-Time Object Detection”
- “YOLO9000: Better, Faster, Stronger”, aka YOLOv2
- Video: Presentation of YOLO at CVPR 2016
- Video: Presentation of YOLO 9000
In a nutshell, YOLOv2 incorporates the following improvements over the original YOLO to achieve an impressive 15.2 points of increase in mAP on Pascal VOC 2007 dataset.
Running pre-trained YOLOv2 models on Jetson TX2 is pretty straightforward. I mainly just followed instructions on the official YOLOv2 (Darknet) website:
Here is how I’ve done it:
Clone the latest darknet source code from GitHub.
$ cd ~/project $ git clone https://github.com/pjreddie/darknet yolov2 $ cd yolov2
Modify the first few lines of
Makefileas follows. Note that CUDA architecture of TX2 is “62”, while TX1 “53”.
GPU=1 CUDNN=1 OPENCV=1 ...... ARCH= -gencode arch=compute_53,code=[sm_53,compute_53] \ -gencode arch=compute_62,code=[sm_62,compute_62]
Build the code.
To test YOLOv2 with live video feed, I used a USB webcam (
/dev/video1). I saw YOLOv2 processed 7.0~7.1 frames per second. This was with the regular (larger) YOLOv2 model. And note that I had set Jetson TX2 to maximum performance mode with
$ wget https://pjreddie.com/media/files/yolov2.weights $ ./darknet detector demo cfg/coco.data cfg/yolov2.cfg yolov2.weights -c 1
I also tested the Tiny YOLO model. The result was much better. I was able to get 16~17 frames per second processed and displayed!
$ wget https://pjreddie.com/media/files/yolov2-tiny-voc.weights $ ./darknet detector demo cfg/voc.data cfg/yolov2-tiny-voc.cfg yolov2-tiny-voc.weights -c 1
Overall I think the speed of Tiny YOLOv2 is very good comparing to Faster RCNN. However, YOLOv2 is not implemented with Caffe, and it’s difficult for me to evaluate its accuracy/mAP (for the datasets/tasks I really care about). So for now I’m still sticking to Faster RCNN for my work.