【Deep Learning】Review: Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks


Review of

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

 


 

1.      Summary of the Paper

Unlike a traditional CNN, Fast R-CNN feeds the whole image into the network together with multiple regions of interest (RoIs). Each RoI is pooled into a fixed-size feature map, which is then mapped to a corresponding feature vector. This paper presents an even faster system whose main change is how the region proposals are obtained. The usual way to get proposals is selective search. Although the region-based CNN itself runs on the GPU, the region proposal methods are implemented on the CPU, which slows down the whole system. The authors therefore propose obtaining region proposals directly from the feature map produced by the convolutional layers. A Region Proposal Network (RPN) can be implemented with two convolutional layers: one maps each position of the feature map to a feature vector, and the other outputs k region proposals at that position. In this way, object detection can approach real-time speed.
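To make the RoI pooling step concrete, here is a minimal sketch in plain Python. It is not the Fast R-CNN implementation: it max-pools one RoI of a single-channel feature map into a fixed grid. The function name roi_max_pool, the inclusive (x1, y1, x2, y2) cell coordinates, and the 2 x 2 output size are assumptions chosen for illustration.

```python
def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool the region `roi` of a 2-D feature map into an out_h x out_w grid.

    feature_map: list of rows (lists of numbers), one channel only.
    roi: (x1, y1, x2, y2) in feature-map cells, inclusive on both ends.
    """
    x1, y1, x2, y2 = roi
    roi_h = y2 - y1 + 1
    roi_w = x2 - x1 + 1
    pooled = []
    for i in range(out_h):
        # Row range [r0, r1) of bin i; floor for the start, ceil for the end,
        # and every bin is forced to cover at least one cell.
        r0 = y1 + (i * roi_h) // out_h
        r1 = max(r0 + 1, y1 + ((i + 1) * roi_h + out_h - 1) // out_h)
        row = []
        for j in range(out_w):
            c0 = x1 + (j * roi_w) // out_w
            c1 = max(c0 + 1, x1 + ((j + 1) * roi_w + out_w - 1) // out_w)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# A toy 4 x 4 feature map with values 0..15.
fm = [[0, 1, 2, 3],
      [4, 5, 6, 7],
      [8, 9, 10, 11],
      [12, 13, 14, 15]]
full = roi_max_pool(fm, (0, 0, 3, 3))    # RoI covering the whole map
small = roi_max_pool(fm, (1, 1, 3, 2))   # a 3-wide, 2-high RoI
```

Both calls return a 2 x 2 grid even though the RoIs differ in size, which is what lets RoIs of any shape feed the same fixed-size layers afterwards.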

Figure 1. The process of Faster R-CNN

 

2.      Main Contributions

1)      Introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals.

Figure 2. The feature vector is passed to the box-regression layer and the box-classification layer

2)      They further merge RPN and Fast R-CNN into a single network by sharing their convolutional features. In the recently popular terminology of neural networks with “attention” mechanisms, the RPN component tells the unified network where to look, which greatly speeds up the whole system.
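As a rough illustration of what the two RPN output layers produce, the sketch below follows the paper's description: for k anchors per position, the classification layer gives 2k scores and the regression layer gives 4k coordinates, and each group of four regression outputs (t_x, t_y, t_w, t_h) is read relative to its anchor. The function names decode_box and rpn_output_sizes are hypothetical, and boxes are assumed to be given as (center x, center y, width, height).

```python
import math


def rpn_output_sizes(k):
    """Number of outputs per sliding-window position for k anchors."""
    return {"cls_scores": 2 * k, "box_coords": 4 * k}


def decode_box(anchor, deltas):
    """Turn regression outputs into a box, relative to one anchor.

    anchor: (x_a, y_a, w_a, h_a) as center x, center y, width, height.
    deltas: (t_x, t_y, t_w, t_h) predicted by the box-regression layer.
    """
    xa, ya, wa, ha = anchor
    tx, ty, tw, th = deltas
    x = xa + tx * wa          # shift the center by a fraction of the anchor width
    y = ya + ty * ha          # shift the center by a fraction of the anchor height
    w = wa * math.exp(tw)     # scale width in log space
    h = ha * math.exp(th)     # scale height in log space
    return (x, y, w, h)


sizes = rpn_output_sizes(9)
box = decode_box((100.0, 100.0, 128.0, 64.0), (0.5, -0.25, math.log(2.0), 0.0))
```

With all four deltas equal to zero the decoded box is the anchor itself, so the network only has to learn corrections to the anchors rather than absolute coordinates.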

 

3.      Positive and negative points

Positive Points:

(i) Most obviously, the Region Proposal Network. By sharing the convolutional layers with the detection network, the system becomes much faster.

(ii) The anchors are translation-invariant, which also reduces the model size.

(iii) By accelerating region proposal, the object detection system can run in near real time, and the accuracy also improves.
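The translation invariance of the anchors can be sketched as follows. The paper uses 3 scales (128, 256, 512 pixels) and 3 aspect ratios, giving k = 9 anchors per position, on a feature map with a total stride of 16 pixels. The function names base_anchors and anchors_at, the height/width reading of the ratio, and the choice to center anchors in the middle of each stride cell are assumptions made for this illustration.

```python
import math


def base_anchors(scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return k = len(scales) * len(ratios) anchor shapes as (width, height).

    Each shape keeps an area of about scale**2; ratio is taken as height / width.
    """
    shapes = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            shapes.append((w, h))
    return shapes


def anchors_at(col, row, stride=16, shapes=None):
    """Anchors (x1, y1, x2, y2) in image pixels for one feature-map position."""
    if shapes is None:
        shapes = base_anchors()
    # Assumed convention: the anchor center is the middle of the stride cell.
    cx = col * stride + stride / 2
    cy = row * stride + stride / 2
    return [(cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2) for w, h in shapes]


shapes = base_anchors()
a0 = anchors_at(0, 0)
a1 = anchors_at(3, 5)
```

The same nine shapes are reused at every position, so a1 is just a0 shifted by (3 * 16, 5 * 16) pixels. The output layers therefore only need parameters for k anchors rather than for every location, which is where the smaller model size comes from.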

Negative Points:

(i) After reading another paper, You Only Look Once: Unified, Real-Time Object Detection, I found that the bottleneck of R-CNN is that it converts the detection problem into a region classification problem, which may lose the object context information in the picture.

 

4.      How strong is the evaluation

The Faster R-CNN system increases the mAP from 41.5%/21.2% (VGG-16) to 48.4%/27.2% (ResNet-101) on the COCO val set (each pair is mAP@0.5 / mAP@[.5, .95]). Combined with other improvements orthogonal to Faster R-CNN, a single-model result of 55.7%/34.9% and an ensemble result of 59.0%/37.4% were obtained on the COCO test-dev set.

When the model trained on COCO is evaluated directly on PASCAL VOC, without fine-tuning on any PASCAL VOC data, the mAP is 76.1% on the PASCAL VOC 2007 test set. This is better than the model trained on VOC07+12 (73.2%) by a good margin, even though the PASCAL VOC data are not exploited.

In all, the accuracy is very high: better than R-CNN and almost double that of a plain CNN.
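The paired COCO numbers above are reported in the paper as mAP at an IoU threshold of 0.5 and mAP averaged over IoU thresholds from 0.5 to 0.95. The sketch below only shows the overlap test behind those thresholds, not a full mAP computation; the names iou, COCO_THRESHOLDS and thresholds_passed are hypothetical, and boxes are assumed to be (x1, y1, x2, y2) corner coordinates.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# IoU thresholds 0.50, 0.55, ..., 0.95 (rounded to avoid float drift).
COCO_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, gt):
    """Thresholds at which `pred` would count as a match for `gt`."""
    overlap = iou(pred, gt)
    return [t for t in COCO_THRESHOLDS if overlap >= t]


passed = thresholds_passed((0, 0, 10, 10), (2, 0, 12, 10))  # IoU = 80 / 120
```

A box with an IoU of about 0.67 counts as correct at the single 0.5 threshold but fails the stricter thresholds from 0.7 upward, which is why the second number in each pair is much lower than the first.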

5.      Possible direction for future work

Further exploit the object context in the picture with a regression-based method, as YOLO does.

 

 
