Object Detection: Accurate Single Stage Detector Using Recurrent Rolling Convolution


Accurate Single Stage Detector Using Recurrent Rolling Convolution
A CVPR 2017 paper on object detection from SenseTime

Code: https://github.com/xiaohaoChen/rrc_detection

Intuitively, this paper lets SSD automatically find suitable contextual information in order to improve object detection performance.

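Here, single stage detectors are those that complete detection in one pass, as opposed to the two stages of the R-CNN family: region proposal extraction plus classification.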

The paper first uses mathematical formulas to analyze the problems with SSD detection:
SSD heavily relies on a strong assumption to perform well
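The strong assumption is that each feature layer is, on its own, sophisticated enough for the objects being detected and carries sufficient information.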
every Φ, by itself, has to be sophisticated enough to support the detection and the accurate localization of the objects of interest
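Here, sophistication means satisfying the following three points:
1) The feature map is large enough to contain sufficient detail of the object.
2) The feature extraction function that maps the original input image to the current feature map is deep enough to yield a suitable high level abstraction of the object.
3) The feature layer contains suitable contextual information, on the basis of which overlapping objects, occluded objects, small objects, and blurred or saturated objects can be handled.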

The contextual information in Φ̂(·) means differently for different objects of interest. For instance, when detecting small objects it means Φ̂(·) should return feature maps contain higher resolution features of this object to represent the missing details. When detecting occluded objects, Φ̂(·) should return feature maps contain robust abstraction of such object so that the feature is relatively invariant to occlusion. When detecting overlapping objects, Φ̂(·) should return feature maps contain both the details of the boundary and the high level abstraction to distinguish different objects.
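
For reference, the notation used in the quoted passages can be written out roughly as follows. This is a reconstruction from memory of the paper's formulation and should be checked against the paper itself: I is the input image, f_n is the n-th layer function, τ_n maps a feature map to per-layer detection outputs, D aggregates them into final detections, and H is the set of feature maps that Φ̂ is allowed to draw context from.

```latex
\begin{align}
\Phi_n &= f_n(\Phi_{n-1}) = f_n(f_{n-1}(\cdots f_1(I))) \\
\text{Detection} &= D\big(\tau_n(\Phi_n), \ldots, \tau_{n-k}(\Phi_{n-k})\big), \quad n > k > 0 \\
\text{Detection} &= \hat{D}\big(\tau_n(\hat{\Phi}_n(H)), \ldots, \tau_{n-k}(\hat{\Phi}_{n-k}(H))\big), \quad H = \{\Phi_n, \ldots, \Phi_{n-k}\}
\end{align}
```

The second line is plain SSD, where each Φ must be sophisticated enough by itself. The third line is the relaxed form, where each Φ̂ can use every feature map in H.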

The paper proposes Recurrent Rolling Convolution (RRC) to progressively carry out the task of finding suitable contextual information:
RRC is a recurrent process in which each iteration gathers and aggregates relevant features for detection. As we discussed before, these relevant features contain contextual information which is critical for detecting challenging objects.
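
To make the "gather and aggregate" idea concrete, here is a minimal NumPy sketch of one possible rolling step. It is a toy illustration under my own assumptions, not the authors' implementation (see the linked repository for that). The function names, the channel counts C and R, average pooling, and nearest-neighbour upsampling (standing in for a learned deconvolution) are all hypothetical choices. The only points it tries to capture are that each scale pulls in its finer and coarser neighbours, and that the same weights are reused at every iteration.

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)

def downsample2x(x):
    """2x2 average pooling, bringing a finer map down to the next scale."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample2x(x):
    """Nearest-neighbour upsampling (assumed stand-in for deconvolution)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def rolling_step(feats, reduce_w, fuse_ws):
    """One iteration: every scale aggregates features from its neighbours."""
    out = []
    for i, f in enumerate(feats):
        parts = [f]
        if i > 0:  # finer neighbour contributes detail
            parts.append(downsample2x(conv1x1(feats[i - 1], reduce_w)))
        if i < len(feats) - 1:  # coarser neighbour contributes abstraction
            parts.append(upsample2x(conv1x1(feats[i + 1], reduce_w)))
        stacked = np.concatenate(parts, axis=0)
        # Fuse back to the original channel count, followed by ReLU.
        out.append(np.maximum(conv1x1(stacked, fuse_ws[i]), 0.0))
    return out

def rrc(feats, reduce_w, fuse_ws, num_iters=3):
    """Apply the rolling step repeatedly with shared weights (the recurrence)."""
    history = [feats]
    for _ in range(num_iters):
        feats = rolling_step(feats, reduce_w, fuse_ws)
        history.append(feats)
    return history

# Toy setup: three scales, C channels each, neighbours reduced to R channels.
rng = np.random.default_rng(0)
C, R = 8, 2
sizes = [16, 8, 4]
feats = [rng.standard_normal((C, s, s)) for s in sizes]
reduce_w = rng.standard_normal((R, C)) * 0.1
fuse_ws = []
for i in range(len(sizes)):
    n_neighbors = (i > 0) + (i < len(sizes) - 1)
    fuse_ws.append(rng.standard_normal((C, C + R * n_neighbors)) * 0.1)

history = rrc(feats, reduce_w, fuse_ws, num_iters=3)
print([f.shape for f in history[-1]])
```

Because every step preserves the shapes of the feature maps, a detection head could in principle be attached to the maps from any iteration in `history`, which is what makes the process recurrent rather than just a deeper network.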


Detection results on the KITTI test set: [figure]

Although it is a single stage detector, this network is probably not very fast and likely cannot do real-time detection.

1 1