Anchors in SSD


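SSD predicts bboxes in essentially the same way as RPN. The difference is that it predicts on multiple layers, so that objects can be detected better across multiple scales.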

Predicting bboxes on multiple layers

The original SSD300 predicts on the following layers:

      conv4  ==> 38 x 38
      conv7  ==> 19 x 19
      conv8  ==> 10 x 10
      conv9  ==> 5 x 5
      conv10 ==> 3 x 3
      conv11 ==> 1 x 1

The number after each layer is the size of the feature map that layer outputs.
SSD512:

      conv4  ==> 64 x 64
      conv7  ==> 32 x 32
      conv8  ==> 16 x 16
      conv9  ==> 8 x 8
      conv10 ==> 4 x 4
      conv11 ==> 2 x 2
      conv12 ==> 1 x 1

You can also choose which layers to predict on yourself, depending on your situation.
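To get a feel for how the prediction layers compare, the feature-map sizes listed above can be turned into a count of grid cells per model. This is only a small sketch: the dictionary and function names are made up for illustration, and the number of default boxes per cell is left out because it depends on how each layer is configured.

```python
# Feature-map sizes (height, width) per prediction layer, copied from the lists above.
# The names SSD300_FEAT_SHAPES / SSD512_FEAT_SHAPES are illustrative only.
SSD300_FEAT_SHAPES = {
    "conv4": (38, 38), "conv7": (19, 19), "conv8": (10, 10),
    "conv9": (5, 5), "conv10": (3, 3), "conv11": (1, 1),
}
SSD512_FEAT_SHAPES = {
    "conv4": (64, 64), "conv7": (32, 32), "conv8": (16, 16),
    "conv9": (8, 8), "conv10": (4, 4), "conv11": (2, 2), "conv12": (1, 1),
}


def count_cells(feat_shapes):
    """Total number of feature-map cells over all prediction layers."""
    return sum(h * w for h, w in feat_shapes.values())


print(count_cells(SSD300_FEAT_SHAPES))  # 1940
print(count_cells(SSD512_FEAT_SHAPES))  # 5461
```

Multiplying each layer's cell count by its number of default boxes per cell would give the total number of default boxes for that configuration.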

Size of reference box

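Take SSD512 as an example.
As in RPN, each layer has a reference box of a specific size, from which that layer's default boxes (anchors) are computed. The reference box is a square whose size is determined by the scale parameter. The authors compute it as: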
$$ s_k = s_{min} + \frac{s_{max} - s_{min}}{m - 1}(k - 1), \quad k \in [1, m] $$
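where m is the number of layers used for prediction, s_min = 0.2 and s_max = 0.95. This means the reference box of the lowest layer (conv4) has relative width w_bbox = s_min, and the highest layer likewise has w_bbox = s_max.
During training, every input image is resized to 512 x 512, so the reference box on conv4 is a square with an actual side length of 512 * 0.2 = 102.4 pixels.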
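The scale rule can be sketched in a few lines of NumPy. This assumes the scale is interpolated linearly between s_min and s_max over the m prediction layers, as in the SSD paper, and uses the values quoted above (s_min = 0.2, s_max = 0.95, m = 7 for SSD512). The function name `reference_scales` is made up for this example.

```python
import numpy as np


def reference_scales(m, s_min=0.2, s_max=0.95):
    """Relative reference-box sizes for m prediction layers, lowest layer first.

    Assumes m >= 2 and linear spacing between s_min and s_max.
    """
    k = np.arange(1, m + 1)
    return s_min + (s_max - s_min) / (m - 1) * (k - 1)


scales = reference_scales(7)   # SSD512 predicts on 7 layers (conv4 .. conv12)
sizes = scales * 512           # side length in pixels for a 512 x 512 input

print(np.round(scales, 3))     # first entry is s_min, last entry is s_max
print(np.round(sizes, 1))      # conv4 reference box: 102.4 pixels
```

With these values the step between consecutive layers is 0.125, so the reference boxes grow from 102.4 pixels on conv4 to 486.4 pixels on conv12.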

Aspect Ratios for default boxes

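Once a layer's reference box is known, the aspect ratios configured for that layer give several default boxes (also called anchors), one per aspect ratio.
The following parameters are used to illustrate how an anchor is computed: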

size = 0.2 * 512  # = 102.4, size of the reference box in pixels
aspect_ratio = [2]
feat_shape = [8, 8]
img_shape = [512, 512] #[h, w]
The w and h of the i-th anchor are:

s = aspect_ratio[i]
w = size / img_shape[1] * np.sqrt(s)
h = size / img_shape[0] / np.sqrt(s)
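The same computation can be done for all aspect ratios of a layer at once. The sketch below is one possible vectorized form, not the original implementation. The helper name `anchor_sizes` and the aspect-ratio list `[1, 2, 0.5]` are assumptions chosen for illustration; the example above uses only `[2]`.

```python
import numpy as np


def anchor_sizes(size, aspect_ratios, img_shape):
    """Relative (w, h) of one anchor per aspect ratio.

    size: side length of the reference box in pixels.
    img_shape: [h, w] of the input image in pixels.
    Returns an array of shape (len(aspect_ratios), 2).
    """
    img_h, img_w = img_shape
    s = np.asarray(aspect_ratios, dtype=np.float64)
    w = size / img_w * np.sqrt(s)   # relative width
    h = size / img_h / np.sqrt(s)   # relative height
    return np.stack([w, h], axis=1)


# Aspect ratios here are illustrative, not taken from a real SSD config.
wh = anchor_sizes(size=0.2 * 512, aspect_ratios=[1, 2, 0.5], img_shape=[512, 512])
print(wh)  # one row per aspect ratio: [relative w, relative h]
```

For aspect ratio 1 the anchor is the reference box itself (w = h = 0.2); for ratio 2 the width is scaled by sqrt(2) and the height by 1 / sqrt(2), which leaves the box area unchanged.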

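w and h are now both relative values between 0 and 1. When coding, this step is all that is needed. What follows analyzes the true aspect ratio of the resulting anchor.
First compute the absolute lengths of w and h: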

w = w * img_shape[1]
h = h * img_shape[0]

Its true aspect ratio is:

$$ \frac{w}{h} = \frac{size / W_{img} \cdot \sqrt{s} \cdot W_{img}}{size / H_{img} / \sqrt{s} \cdot H_{img}} = s $$

So whether in training, where H_img == W_img, or at test time, where H_img generally differs from W_img, the anchor's aspect ratio is always the one we specified.
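This conclusion is easy to check numerically. The sketch below repeats the relative-then-absolute computation for a square image and a non-square one. The function name `true_aspect_ratio` and the 375 x 500 image shape are arbitrary choices for this example.

```python
import numpy as np


def true_aspect_ratio(size, s, img_shape):
    """Aspect ratio of the anchor in pixels, for img_shape = [h, w]."""
    img_h, img_w = img_shape
    w = size / img_w * np.sqrt(s)   # relative width
    h = size / img_h / np.sqrt(s)   # relative height
    # Convert back to absolute pixel lengths before taking the ratio.
    return (w * img_w) / (h * img_h)


# One square shape (as in training) and one non-square shape (as at test time).
for img_shape in ([512, 512], [375, 500]):
    print(img_shape, true_aspect_ratio(102.4, 2, img_shape))
```

Both cases give a ratio of 2 up to floating-point rounding. The ratio of the relative values, by contrast, is s * H_img / W_img, so it equals s only when the image is square.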
