
Object Detection using Convolutional Neural Networks
This module uses convolutional neural networks to detect objects in an image.
Dependencies
- OpenCV dnn module
- Google Protobuf
Building this module
Run the following command to build this module:
cmake -DOPENCV_EXTRA_MODULES_PATH=<opencv_contrib>/modules -DBUILD_opencv_dnn_objdetect=ON <opencv_source_dir>
Models
Two trained models are provided.
SqueezeNet model trained for Image Classification.
- This model was trained for 1500000 iterations with a batch size of 16
- Size of Model: 4.9MB
- Top-1 Accuracy on ImageNet 2012 DataSet: 56.10%
- Top-5 Accuracy on ImageNet 2012 DataSet: 79.54%
- Link to trained weights: here (copy)
SqueezeDet model trained for Object Detection
- This model was trained for 180000 iterations with a batch size of 16
- Size of the Model: 14.2MB
- Link to the trained weights: here (copy)
Usage
With Caffe
For details pertaining to the usage of the model, have a look at this repository
You can, in fact, train your own object detection models with the loss function implemented there.
Without Caffe, using OpenCV's dnn module
tutorials/core_detect.cpp gives an example of how to use the model to predict bounding boxes.
tutorials/image_classification.cpp gives an example of how to use the model to classify an image.
Here is a brief summary of the examples. For detailed usage and testing, refer to the tutorials directory.
Examples:
Image Classification
// Read the net along with its trained weights
cv::dnn::Net net = cv::dnn::readNetFromCaffe(model_defn, model_weights);
// Read an image
cv::Mat image = cv::imread(image_file);
// Convert the image into a blob
cv::Mat image_blob = cv::dnn::blobFromImage(image);
// Feed the blob to the network
net.setInput(image_blob);
// Get the output of the "predictions" layer
cv::Mat probs = net.forward("predictions");
probs is a 4-d tensor of shape [1, 1000, 1, 1], obtained after applying the softmax activation.
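Because the blob holds one softmax score per class, the predicted label is the index of the largest entry. The sketch below shows one way to pick the k highest-scoring class indices. It works on a plain std::vector<float> so that it stays independent of OpenCV, and top_k_indices is a hypothetical helper name, not part of this module.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of dnn_objdetect): returns the indices of the
// k largest scores, ordered from the highest score to the lowest.
std::vector<std::size_t> top_k_indices(const std::vector<float>& scores,
                                       std::size_t k) {
  // Precondition: cannot ask for more classes than there are scores.
  assert(k <= scores.size());

  // Start with the identity permutation 0, 1, 2, ...
  std::vector<std::size_t> idx(scores.size());
  std::iota(idx.begin(), idx.end(), std::size_t{0});

  // Move the k best indices to the front, sorted by descending score.
  std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
                    [&scores](std::size_t a, std::size_t b) {
                      return scores[a] > scores[b];
                    });
  idx.resize(k);
  return idx;
}
```

Assuming the blob is a continuous CV_32F matrix, the scores could be copied out with std::vector<float> scores(probs.ptr<float>(), probs.ptr<float>() + probs.total()); calling top_k_indices(scores, 5) would then give the candidates relevant to a Top-5 evaluation.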
Object Detection
// Reading the network and weights and converting the image to a blob are the same as in the Image Classification example.
// Forward through the network and collect blob data
std::vector<cv::Mat> slice_outputs;
net.forward(slice_outputs, "slice");
cv::Mat delta_bbox = slice_outputs[0];
cv::Mat conf_scores = net.forward("softmax");
cv::Mat class_scores = net.forward("sigmoid");
The three blobs, delta_bbox, conf_scores, and class_scores, are post-processed by the cv::dnn_objdetect::InferBbox class to produce the predicted bounding boxes.
InferBbox infer(delta_bbox, class_scores, conf_scores);
infer.filter();
infer.filter() returns a vector of cv::dnn_objdetect::object predictions. Here, cv::dnn_objdetect::object is a structure containing the following elements.
typedef struct {
int xmin, xmax;
int ymin, ymax;
int class_idx;
std::string label_name;
double class_prob;
} object;
For further details on post-processing, refer to this detailed blog post.
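The structures produced by post-processing can be filtered further by the caller. As an illustration only, the sketch below drops detections under a probability threshold and then applies a greedy, per-class non-maximum suppression based on intersection-over-union. The Detection struct mirrors the numeric fields listed above so that the example compiles without OpenCV. The helper names iou and suppress, and the thresholds, are assumptions made for this sketch and do not describe how InferBbox itself is implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for cv::dnn_objdetect::object (label_name omitted for brevity).
struct Detection {
  int xmin, xmax;
  int ymin, ymax;
  int class_idx;
  double class_prob;
};

// Intersection-over-union of two axis-aligned boxes, in the range [0, 1].
double iou(const Detection& a, const Detection& b) {
  const int iw = std::min(a.xmax, b.xmax) - std::max(a.xmin, b.xmin);
  const int ih = std::min(a.ymax, b.ymax) - std::max(a.ymin, b.ymin);
  if (iw <= 0 || ih <= 0) return 0.0;  // boxes do not overlap
  const double inter = static_cast<double>(iw) * ih;
  const double area_a = static_cast<double>(a.xmax - a.xmin) * (a.ymax - a.ymin);
  const double area_b = static_cast<double>(b.xmax - b.xmin) * (b.ymax - b.ymin);
  return inter / (area_a + area_b - inter);
}

// Hypothetical helper: keep detections with class_prob >= min_prob, then
// greedily discard any box that overlaps a stronger box of the same class
// by more than max_iou.
std::vector<Detection> suppress(std::vector<Detection> dets,
                                double min_prob, double max_iou) {
  assert(max_iou >= 0.0 && max_iou <= 1.0);

  dets.erase(std::remove_if(dets.begin(), dets.end(),
                            [min_prob](const Detection& d) {
                              return d.class_prob < min_prob;
                            }),
             dets.end());

  // Strongest detections first, so they win over weaker overlapping ones.
  std::sort(dets.begin(), dets.end(),
            [](const Detection& a, const Detection& b) {
              return a.class_prob > b.class_prob;
            });

  std::vector<Detection> kept;
  for (const Detection& d : dets) {
    bool overlaps = false;
    for (const Detection& k : kept) {
      if (k.class_idx == d.class_idx && iou(k, d) > max_iou) {
        overlaps = true;
        break;
      }
    }
    if (!overlaps) kept.push_back(d);
  }
  return kept;
}
```

Suppressing only within the same class_idx is a design choice of this sketch: it lets overlapping objects of different classes both survive.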
Results from Object Detection
Refer to the tutorials directory for results.