A while ago I wrote a post about how to set up and run Faster RCNN on the Jetson TX2. In this post I demonstrate how to use a faster CNN feature extractor to speed up Faster RCNN while maintaining its object detection accuracy (mAP). More specifically, I replaced the VGG16 layers with GoogLeNet in Faster RCNN and was able to reduce model inference time by roughly half.
I extended my previous tegra-cam.py example by hooking a Caffe image classification model into the video pipeline. The resulting code should be good for quickly verifying a newly trained Caffe image classification model, for prototyping, or for building Caffe demo programs with live camera input.
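As a rough illustration of what hooking a classifier into a video pipeline involves, here is a minimal sketch of the per-frame steps. It is not the code from the post. The helper names (`frame_to_blob`, `top_k`, `classify_stream`), the mean values, and the `predict` callable are all hypothetical; `predict` stands in for whatever wraps the Caffe forward pass. The sketch also assumes each frame already has the network's input size.

```python
import numpy as np

def frame_to_blob(frame_bgr, mean_bgr=(104.0, 117.0, 123.0)):
    """Turn an HxWx3 uint8 BGR frame into a 1x3xHxW float32 blob.

    mean_bgr is a placeholder; use the mean your model was trained with.
    """
    blob = frame_bgr.astype(np.float32) - np.asarray(mean_bgr, dtype=np.float32)
    blob = blob.transpose(2, 0, 1)   # HWC -> CHW, the layout Caffe expects
    return blob[np.newaxis, ...]     # add the batch dimension

def top_k(probs, labels, k=3):
    """Return the k most probable (label, probability) pairs."""
    order = np.argsort(probs)[::-1][:k]
    return [(labels[i], float(probs[i])) for i in order]

def classify_stream(frames, predict, labels, k=3):
    """Classify each frame from an iterable of frames.

    predict is a hypothetical callable: blob -> 1-D array of class probabilities.
    """
    for frame in frames:
        probs = predict(frame_to_blob(frame))
        yield top_k(probs, labels, k)
```

In a live demo, `frames` would be a generator reading from the camera, and the top result would typically be drawn onto each frame before it is displayed.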
I implemented a video capture module in Torch7. This module calls the V4L2 API (in C code) to fetch raw video data from the underlying video capture device. It then uses the Lua FFI interface to pass the data to Lua/Torch7. The resulting data are Torch Tensors and can be displayed on screen with Torch7's image API.