mirror of https://github.com/opencv/opencv_contrib.git synced 2025-10-18 17:24:28 +08:00
opencv_contrib/modules/dnn_objdetect/README.md
Kv Manohar 41a5a5eaf5 Merge pull request #1253 from kvmanohar22:GSoC17_dnn_objdetect
GSoC'17 Learning compact models for object detection (#1253)

2018-01-29 12:08:32 +03:00


# Object Detection using Convolutional Neural Networks
This module uses Convolutional Neural Networks to detect objects in an image.
## Dependencies
- OpenCV `dnn` module
- Google Protobuf
## Building this module
Run the following command to configure the OpenCV build with this module enabled, then build as usual:
```sh
cmake -DOPENCV_EXTRA_MODULES_PATH=<opencv_contrib>/modules -DBUILD_opencv_dnn_objdetect=ON <opencv_source_dir>
```
## Models
Two trained models are provided.
#### SqueezeNet model trained for Image Classification
- This model was trained for 1,500,000 iterations with a batch size of 16
- Size of the Model: 4.9MB
- Top-1 Accuracy on the ImageNet 2012 dataset: 56.10%
- Top-5 Accuracy on the ImageNet 2012 dataset: 79.54%
- Link to trained weights: [here](https://github.com/kvmanohar22/caffe/blob/obj_detect_loss/proto/SqueezeNet.caffemodel) ([copy](https://github.com/opencv/opencv_3rdparty/tree/dnn_objdetect_20170827))
#### SqueezeDet model trained for Object Detection
- This model was trained for 180,000 iterations with a batch size of 16
- Size of the Model: 14.2MB
- Link to the trained weights: [here](https://github.com/kvmanohar22/caffe/blob/obj_detect_loss/proto/SqueezeDet.caffemodel) ([copy](https://github.com/opencv/opencv_3rdparty/tree/dnn_objdetect_20170827))
## Usage
#### With Caffe
For details on using the model with Caffe, have a look at [this repository](https://github.com/kvmanohar22/caffe).
You can in fact train your own object detection models with the loss function implemented there.
#### Without Caffe, using OpenCV's `dnn` module
`tutorials/core_detect.cpp` gives an example of how to use the model to predict the bounding boxes.
`tutorials/image_classification.cpp` gives an example of how to use the model to classify an image.
Here is a brief summary of the examples. For detailed usage and testing, refer to the `tutorials` directory.
## Examples
### Image Classification
```c++
// Read the net along with its trained weights
cv::dnn::Net net = cv::dnn::readNetFromCaffe(model_defn, model_weights);
// Read an image
cv::Mat image = cv::imread(image_file);
// Convert the image into a blob
cv::Mat image_blob = cv::dnn::blobFromImage(image);
// Feed the blob to the network as its input
net.setInput(image_blob);
// Get the output of the "predictions" layer
cv::Mat probs = net.forward("predictions");
```
`probs` is a 4-d tensor of shape `[1, 1000, 1, 1]` which is obtained after the application of `softmax` activation.
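Since `probs` holds one softmax probability per class, the predicted label is the index of the largest value, and a Top-5 prediction is the five largest. The following is a minimal sketch of that selection step, assuming the 1000 values have already been copied into a `std::vector<float>`. The helper name `top_k_classes` is hypothetical and not part of this module, and the sketch uses C++11 for brevity.

```c++
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of dnn_objdetect): return the indices of the
// k largest probabilities, ordered from most to least probable.
// `probs` stands in for the flattened contents of the "predictions" blob.
std::vector<std::size_t> top_k_classes(const std::vector<float>& probs, std::size_t k) {
    // Never ask for more classes than there are probabilities.
    k = std::min(k, probs.size());

    // Start with the identity permutation 0, 1, ..., N-1.
    std::vector<std::size_t> idx(probs.size());
    std::iota(idx.begin(), idx.end(), std::size_t(0));

    // Move the k most probable class indices to the front, sorted by probability.
    std::partial_sort(idx.begin(), idx.begin() + static_cast<std::ptrdiff_t>(k), idx.end(),
                      [&probs](std::size_t a, std::size_t b) { return probs[a] > probs[b]; });

    idx.resize(k);
    return idx;
}
```

Calling `top_k_classes(values, 1)` gives the Top-1 class index and `top_k_classes(values, 5)` the Top-5 indices. Mapping an index to a human-readable label requires the class list the model was trained with.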
### Object Detection
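```c++
// Reading the network and weights and converting the image to a blob are the same as in the Image Classification example.
// Forward through the network and collect blob data.
// The "slice" layer has several outputs, so collect them all and pick the bounding box deltas.
std::vector<cv::Mat> slice_outputs;
net.forward(slice_outputs, "slice");
cv::Mat delta_bbox = slice_outputs[0];
cv::Mat conf_scores = net.forward("softmax");
cv::Mat class_scores = net.forward("sigmoid");
```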
The three blobs, `delta_bbox`, `conf_scores` and `class_scores`, are post-processed by the `cv::dnn_objdetect::InferBbox` class to predict the bounding boxes.
```c++
InferBbox infer(delta_bbox, class_scores, conf_scores);
infer.filter();
```
`infer.filter()` returns a vector of predictions of type `cv::dnn_objdetect::object`. Here `cv::dnn_objdetect::object` is a structure containing the following elements.
```c++
typedef struct {
  int xmin, xmax;
  int ymin, ymax;
  int class_idx;
  std::string label_name;
  double class_prob;
} object;
```
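To illustrate how these fields might be consumed, the sketch below keeps only the predictions whose class probability reaches a chosen cut-off and computes the pixel area of a box. It is self-contained and does not use OpenCV: `Detection` is a hypothetical stand-in that mirrors the fields listed above, and `keep_confident` and `box_area` are illustrative helpers, not functions of this module.

```c++
#include <string>
#include <vector>

// Hypothetical stand-in mirroring the fields of cv::dnn_objdetect::object.
struct Detection {
    int xmin, xmax;
    int ymin, ymax;
    int class_idx;
    std::string label_name;
    double class_prob;
};

// Area of the bounding box in pixels; degenerate boxes count as zero.
int box_area(const Detection& d) {
    const int width = d.xmax - d.xmin;
    const int height = d.ymax - d.ymin;
    return (width > 0 && height > 0) ? width * height : 0;
}

// Keep only the detections whose class probability is at least min_prob.
std::vector<Detection> keep_confident(const std::vector<Detection>& detections,
                                      double min_prob) {
    std::vector<Detection> kept;
    for (const Detection& d : detections) {
        if (d.class_prob >= min_prob) {
            kept.push_back(d);
        }
    }
    return kept;
}
```

The same loop shape applies to the real structure, since only the field names `xmin`, `xmax`, `ymin`, `ymax` and `class_prob` are used.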
For further details on post-processing, refer to this detailed [blog-post](https://kvmanohar22.github.io/GSoC/).
## Results from Object Detection
Refer to the `tutorials` directory for results.