One of the challenges in training CNN models on a large image dataset is building an efficient data ingestion pipeline. Without one, the GPUs can be constantly starved for data, and training slows down. In this post, I share my experience training Keras image classification models with TensorFlow's TFRecords and tf.data API. In my experience, training this way is much more efficient than reading the original JPEG files from the file system.
A while ago I wrote a post about how to set up and run Faster RCNN on Jetson TX2. In this post I demonstrate how to use a faster CNN feature extractor to speed up Faster RCNN while maintaining its object detection accuracy (mAP). More specifically, I replaced the VGG16 layers in Faster RCNN with GoogLeNet and was able to cut model inference time roughly in half.
I extended my previous tegra-cam.py example by hooking up a Caffe image classification model into the video pipeline. The resulting code should be good for quickly verifying a newly trained Caffe image classification model, for prototyping, or for building Caffe demo programs with live camera input.
I implemented a video capture module in Torch7. The module calls the V4L2 API (in C code) to fetch raw video data from the underlying video capture device, then uses the Lua FFI to pass the data to Lua/Torch7. The resulting data are Torch Tensors and can be displayed on screen with Torch7's image API.