Spatial Pyramid Pooling in Deep Convolutional --- Spp_net
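A 2015 paper from Microsoft Research Asia. Its main advantage is that the network accepts input images of arbitrary size.
Main ideas:
(1) Spatial Pyramid Pooling Layer. This layer is what lets Spp_net accept images of any size and still produce a fixed-length feature vector.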
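Computing the stride and window: for an a×a feature map pooled into n×n bins, window = ceil(a/n) and stride = floor(a/n).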
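The window and stride rule can be checked with a few lines of Python. This is only a minimal sketch: the helper names `spp_pool_params` and `spp_output_length`, the 13×13 feature map, the 256 channels and the pyramid levels [1, 2, 3, 6] are assumptions chosen for illustration, not values taken from this post.

```python
import math


def spp_pool_params(a, bins):
    """Return one (window, stride) pair per pyramid level for an a x a feature map."""
    params = []
    for n in bins:
        window = int(math.ceil(a / float(n)))   # window = ceil(a / n)
        stride = int(math.floor(a / float(n)))  # stride = floor(a / n)
        params.append((window, stride))
    return params


def spp_output_length(channels, bins):
    """Length of the concatenated vector: every level contributes channels * n * n values."""
    return channels * sum(n * n for n in bins)


print(spp_pool_params(13, [1, 2, 3, 6]))     # [(13, 13), (7, 6), (5, 4), (3, 2)]
print(spp_output_length(256, [1, 2, 3, 6]))  # 12800
```

The output length depends only on the channel count and the bin configuration, not on a, which is why the layers after the pooling step can keep a fixed input size.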
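(2) Mapping a Window to Feature Maps. Once the original image has been fed into Spp_net, a point in the image can be mapped to the corresponding position on the feature map, which lays the groundwork for object detection.
Main code (based on theano/keras):
(1) spp_layer: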
# imports assumed for the old Theano / Keras 0.x API that this code targets
import math

import theano.tensor as T
from theano.tensor.signal import downsample
from keras.layers.core import Layer


class SppLayer(Layer):
    def __init__(self, bins, feature_map_size=0):
        super(SppLayer, self).__init__()
        self.a = feature_map_size  # side length of the (square) feature map
        self.bins = bins           # bins per side at each pyramid level, e.g. [1, 2, 4]
        self.num_bins = len(bins)
        # Computed once here so repeated get_output() calls do not keep appending.
        # float() forces true division; with integer division ceil and floor coincide.
        self.strides = [int(math.floor(float(self.a) / b)) for b in bins]
        self.windows = [int(math.ceil(float(self.a) / b)) for b in bins]

    def get_output(self, train):
        self.input = self.get_input(train)
        self.pooled_out = []
        for window, stride in zip(self.windows, self.strides):
            pooled = downsample.max_pool_2d(input=self.input,
                                            ds=(window, window),
                                            st=(stride, stride),
                                            ignore_border=False)
            # flatten each level to (batch_size, channels * n * n)
            self.pooled_out.append(pooled.flatten(2))
        # concatenate every pyramid level, not only the first three
        self.output = T.concatenate(self.pooled_out, axis=1)
        return self.output
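To see the fixed-length property without Theano, the pooling can be sketched in plain NumPy. Note that this is not the layer above: instead of sliding a window with a stride, it splits each axis into n proportional bins (bin i covers indices floor(i*h/n) up to ceil((i+1)*h/n)), so it also works for non-square maps. The function name `spp_forward` and the shapes used are hypothetical.

```python
import math

import numpy as np


def spp_forward(feature_map, bins):
    """Max-pool a (channels, height, width) array into a 1-D vector
    of length channels * sum(n * n for n in bins)."""
    channels, height, width = feature_map.shape
    pooled = []
    for n in bins:
        level = np.empty((channels, n, n), dtype=feature_map.dtype)
        for i in range(n):
            r0 = (i * height) // n
            r1 = int(math.ceil((i + 1) * height / float(n)))
            for j in range(n):
                c0 = (j * width) // n
                c1 = int(math.ceil((j + 1) * width / float(n)))
                # max over the spatial bin, separately for every channel
                level[:, i, j] = feature_map[:, r0:r1, c0:c1].max(axis=(1, 2))
        pooled.append(level.reshape(-1))
    return np.concatenate(pooled)


rng = np.random.RandomState(0)
small = spp_forward(rng.rand(8, 13, 13), [1, 2, 4])
large = spp_forward(rng.rand(8, 20, 17), [1, 2, 4])
print(small.shape, large.shape)  # both (168,), i.e. 8 * (1 + 4 + 16)
```

Both inputs give a vector of the same length even though their spatial sizes differ.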
(2) Mapping a Window to Feature Maps:
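import math

def window_to_feature_map(window_point_x1, window_point_y1, window_point_x2, window_point_y2,
                          window_size_x, window_size_y, map_size_x, map_size_y):
    # Scale from image coordinates to feature-map coordinates. float() forces true
    # division, and ceil is applied to the scaled coordinate rather than to the ratio
    # alone, which collapses to a constant when the map is smaller than the image.
    scale_x = float(map_size_x) / window_size_x
    scale_y = float(map_size_y) / window_size_y
    map_point_x1 = int(math.ceil(window_point_x1 * scale_x)) - 1
    map_point_y1 = int(math.ceil(window_point_y1 * scale_y)) - 1
    map_point_x2 = int(math.ceil(window_point_x2 * scale_x)) - 1
    map_point_y2 = int(math.ceil(window_point_y2 * scale_y)) - 1
    return map_point_x1, map_point_y1, map_point_x2, map_point_y2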
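For comparison, the appendix of the SPP-net paper projects window corners using the total stride S of the layers before the feature map: x' = floor(x/S) + 1 for the left/top boundary and x' = ceil(x/S) - 1 for the right/bottom boundary. The following is a minimal sketch of that rule, not the function above. The name `project_window`, the value S = 16 and the window coordinates are assumptions for illustration.

```python
import math


def project_window(x1, y1, x2, y2, stride):
    """Project an image-space window onto the feature map.

    stride is the product of the strides of all layers before the feature map.
    """
    left = int(math.floor(x1 / float(stride))) + 1   # left/top: floor(x / S) + 1
    top = int(math.floor(y1 / float(stride))) + 1
    right = int(math.ceil(x2 / float(stride))) - 1   # right/bottom: ceil(x / S) - 1
    bottom = int(math.ceil(y2 / float(stride))) - 1
    return left, top, right, bottom


print(project_window(32, 48, 200, 180, 16))  # (3, 4, 12, 11)
```

Rounding the top-left corner up and the bottom-right corner down keeps the projected window inside the region covered by the original window.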