Changing the anchor scales for py-Faster RCNN end2end mode under Caffe

Source: Internet | Editor: 程序博客网 | Time: 2024/06/07 06:42

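I have recently been using Faster RCNN for object detection. The objects in my dataset are small, and training with the default scales did not give good enough results, so I changed the anchor scales used for proposals to improve accuracy. I am recording the steps here. The Faster RCNN paper and source code:
Faster RCNN paper: click here
Faster RCNN source code on GitHub: click here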

Five files need to be modified in total, as follows:
1. The file lib/rpn/proposal_layer.py under the source root

def setup(self, bottom, top):
    # parse the layer parameter string, which must be valid YAML
    layer_params = yaml.load(self.param_str_)

    self._feat_stride = layer_params['feat_stride']
    anchor_scales = layer_params.get('scales', (8, 16, 32))
    self._anchors = generate_anchors(scales=np.array(anchor_scales))
    self._num_anchors = self._anchors.shape[0]

    if DEBUG:
        print 'feat_stride: {}'.format(self._feat_stride)
        print 'anchors:'
        print self._anchors

    # rois blob: holds R regions of interest, each is a 5-tuple
    # (n, x1, y1, x2, y2) specifying an image batch index n and a
    # rectangle (x1, y1, x2, y2)
    top[0].reshape(1, 5)

    # scores blob: holds scores for R regions of interest
    if len(top) > 1:
        top[1].reshape(1, 1, 1, 1)

Around line 29 of the file:

anchor_scales = layer_params.get('scales', (8, 16, 32))

The numbers in the parentheses are the anchor scales used for proposals. Change them as needed. I changed the line to:

anchor_scales = layer_params.get('scales', (2, 4, 8, 16, 32))
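
To see what the extra scales do, here is a minimal NumPy sketch of how anchors are enumerated from a base size, a set of ratios and a set of scales. It is a simplified stand-in for the repository's generate_anchors, not a copy of it: it skips rounding, so coordinates can differ slightly from the real ones. The name sketch_anchors is made up for this illustration, and the default ratios (0.5, 1, 2) are assumed to be unchanged.

```python
import numpy as np


def sketch_anchors(base_size=16, ratios=(0.5, 1, 2), scales=(8, 16, 32)):
    """Return a (len(ratios) * len(scales), 4) array of (x1, y1, x2, y2) boxes.

    Simplified illustration: no rounding, boxes centred on the base cell.
    """
    center = (base_size - 1) / 2.0
    boxes = []
    for ratio in ratios:              # ratio = height / width
        for scale in scales:
            area = float(base_size * scale) ** 2
            w = np.sqrt(area / ratio)
            h = w * ratio
            boxes.append([center - (w - 1) / 2.0, center - (h - 1) / 2.0,
                          center + (w - 1) / 2.0, center + (h - 1) / 2.0])
    return np.array(boxes)


default_anchors = sketch_anchors()
small_anchors = sketch_anchors(scales=(2, 4, 8, 16, 32))
print(default_anchors.shape)   # (9, 4)
print(small_anchors.shape)     # (15, 4)
```

With five scales and three ratios there are 15 anchors per feature-map location instead of 9, which is why the channel counts in the prototxt files have to change as well.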

2. The file lib/rpn/anchor_target_layer.py under the source root

def setup(self, bottom, top):
    layer_params = yaml.load(self.param_str_)
    anchor_scales = layer_params.get('scales', (8, 16, 32))
    self._anchors = generate_anchors(scales=np.array(anchor_scales))
    self._num_anchors = self._anchors.shape[0]
    self._feat_stride = layer_params['feat_stride']

    if DEBUG:
        print 'anchors:'
        print self._anchors
        print 'anchor shapes:'
        print np.hstack((
            self._anchors[:, 2::4] - self._anchors[:, 0::4],
            self._anchors[:, 3::4] - self._anchors[:, 1::4],
        ))
        self._counts = cfg.EPS
        self._sums = np.zeros((1, 4))
        self._squared_sums = np.zeros((1, 4))
        self._fg_sum = 0
        self._bg_sum = 0
        self._count = 0

    # allow boxes to sit over the edge by a small amount
    self._allowed_border = layer_params.get('allowed_border', 0)

    height, width = bottom[0].data.shape[-2:]
    if DEBUG:
        print 'AnchorTargetLayer: height', height, 'width', width

    A = self._num_anchors
    # labels
    top[0].reshape(1, 1, A * height, width)
    # bbox_targets
    top[1].reshape(1, A * 4, height, width)
    # bbox_inside_weights
    top[2].reshape(1, A * 4, height, width)
    # bbox_outside_weights
    top[3].reshape(1, A * 4, height, width)

Around line 27 of the file:

anchor_scales = layer_params.get('scales', (8, 16, 32))

This line is the same as the one in proposal_layer.py, and the change is the same too:

anchor_scales = layer_params.get('scales', (2, 4, 8, 16, 32))

3. The file models/pascal_voc/VGG16/faster_rcnn_end2end/train.prototxt under the source root

Note: VGG16 here is the network model used for training. Substitute whichever one you use, such as ZF or ResNet.

The rpn_cls_score layer:

layer {
  name: "rpn_cls_score"
  type: "Convolution"
  bottom: "rpn/output"
  top: "rpn_cls_score"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  convolution_param {
    num_output:18   # 2(bg/fg) * 9(anchors)
    kernel_size: 1 pad: 0 stride: 1
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}

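Here num_output is 2 × the number of anchors. The original anchor count is 9 (3 anchor scales × 3 anchor ratios), which gives 18. Change it to match the new count: with 5 scales × 3 ratios I have 15 anchors, so num_output becomes 30.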

num_output:18   # 2(bg/fg) * 9(anchors)

becomes

num_output:30   # 2(bg/fg) * 15(anchors)
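
The same arithmetic drives every anchor-dependent number in the prototxt files (this layer, rpn_bbox_pred and rpn_cls_prob_reshape). A small sketch, assuming the three anchor ratios are left at their defaults. The helper name rpn_channel_counts and its dictionary keys are hypothetical and are not part of the Faster RCNN code.

```python
def rpn_channel_counts(scales, ratios=(0.5, 1, 2)):
    """Channel counts that depend on the number of anchors per location."""
    num_anchors = len(scales) * len(ratios)
    return {
        "num_anchors": num_anchors,
        "rpn_cls_score_num_output": 2 * num_anchors,   # bg/fg score per anchor
        "rpn_bbox_pred_num_output": 4 * num_anchors,   # 4 box values per anchor
        "rpn_cls_prob_reshape_dim": 2 * num_anchors,   # second dim of the reshape
    }


counts = rpn_channel_counts((2, 4, 8, 16, 32))
print(counts["num_anchors"])                # 15
print(counts["rpn_cls_score_num_output"])   # 30
print(counts["rpn_bbox_pred_num_output"])   # 60
print(counts["rpn_cls_prob_reshape_dim"])   # 30
```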

The rpn_bbox_pred layer:

layer {
  name: "rpn_bbox_pred"
  type: "Convolution"
  bottom: "rpn/output"
  top: "rpn_bbox_pred"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  convolution_param {
    num_output: 36   # 4 * 9(anchors)
    kernel_size: 1 pad: 0 stride: 1
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}

Here num_output is 4 × the number of anchors, because each anchor gets four box regression values. Change it the same way as before:

num_output: 36   # 4 * 9(anchors)

becomes:

num_output: 60   # 4 * 15(anchors)

The rpn_cls_prob_reshape layer:

layer {
  name: 'rpn_cls_prob_reshape'
  type: 'Reshape'
  bottom: 'rpn_cls_prob'
  top: 'rpn_cls_prob_reshape'
  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }
}

In this line

reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }

change the 18 to 2 × the number of anchors, which is 30 here:

reshape_param { shape { dim: 0 dim: 30 dim: -1 dim: 0 } }

4. The file models/pascal_voc/VGG16/faster_rcnn_end2end/test.prototxt under the source root
The changes here are the same as in train.prototxt.
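
Since the same three numbers have to be changed in more than one prototxt file, the edit can be scripted. The following is only a sketch: it treats the prototxt as plain text and relies on the layer names and layout shown above, so check the result by hand. The function names are made up for this illustration, and the stale "9(anchors)" comments in the file are left untouched.

```python
import re


def _set_num_output(text, layer_name, value):
    # Match from the layer's name line up to its first num_output value.
    pattern = r'(name:\s*["\']%s["\'].*?num_output:\s*)\d+' % re.escape(layer_name)
    return re.sub(pattern, lambda m: m.group(1) + str(value), text,
                  count=1, flags=re.S)


def patch_rpn_prototxt(text, num_anchors):
    """Rewrite the three anchor-dependent numbers in a prototxt string."""
    text = _set_num_output(text, "rpn_cls_score", 2 * num_anchors)
    text = _set_num_output(text, "rpn_bbox_pred", 4 * num_anchors)
    # rpn_cls_prob_reshape: the second reshape dim is 2 * num_anchors.
    pattern = r'(name:\s*["\']rpn_cls_prob_reshape["\'].*?dim:\s*0\s+dim:\s*)\d+'
    return re.sub(pattern, lambda m: m.group(1) + str(2 * num_anchors), text,
                  count=1, flags=re.S)


sample = '''
layer {
  name: "rpn_cls_score"
  convolution_param {
    num_output: 18   # 2(bg/fg) * 9(anchors)
  }
}
layer {
  name: "rpn_bbox_pred"
  convolution_param {
    num_output: 36   # 4 * 9(anchors)
  }
}
layer {
  name: 'rpn_cls_prob_reshape'
  reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }
}
'''

patched = patch_rpn_prototxt(sample, num_anchors=15)
print(patched)
```

To use it on a real file, read the file into a string, pass it through patch_rpn_prototxt, and write the result back out.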

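5. The file models/pascal_voc/VGG16/faster_rcnn_alt_opt/faster_rcnn_test.pt under the source root
Even in end2end mode this file in the faster_rcnn_alt_opt directory still has to be changed, otherwise testing fails with an error.
The change is simple: copy the test.prototxt modified in step 4 into this directory and rename it to faster_rcnn_test.pt.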

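Those are all the files that need to be changed. Note that a scale value refers to the size after the image has been resized and then downsampled by the pooling layers with a stride of 16. On the resized image, the default scales (8, 16, 32) therefore correspond to anchor sizes of (128, 256, 512). Adjust them to suit your own data.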
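
A short sketch of that relationship, which can help when choosing scales for small objects. Two assumptions are made here: the feature stride is 16, and the image is resized with the default rule of shorter side 600, capped at 1000 on the longer side. If your config uses other values, pass them in. Both function names are hypothetical.

```python
def anchor_side_lengths(scales, feat_stride=16):
    """Side length, in pixels of the resized image, of the square anchor per scale."""
    return [scale * feat_stride for scale in scales]


def resize_factor(height, width, target_size=600, max_size=1000):
    """Factor by which an image is enlarged before it enters the network."""
    factor = float(target_size) / min(height, width)
    if round(factor * max(height, width)) > max_size:
        factor = float(max_size) / max(height, width)
    return factor


print(anchor_side_lengths((8, 16, 32)))         # [128, 256, 512]
print(anchor_side_lengths((2, 4, 8, 16, 32)))   # [32, 64, 128, 256, 512]

# A 20-pixel object in a 375 x 500 image is about 32 pixels after resizing,
# which matches the smallest anchor of the extended scale set.
factor = resize_factor(375, 500)
print(20 * factor)
```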
