Paper notes: Going Deeper with Convolutions
There are still many details I don't fully understand after reading, but the main idea of this paper is the Inception module, which has two benefits. First, the Inception module reduces the dimensionality of the data during computation, making it possible to build deeper, more complex network structures (GoogLeNet has 22 layers) within a limited computational budget. Second, the Inception module combines features at multiple scales, which works better. The original paper puts it this way:
One of the main beneficial aspects of this architecture is that it allows for increasing the number of units at each stage significantly without an uncontrolled blow-up in computational complexity. The ubiquitous use of dimension reduction allows for shielding the large number of input filters of the last stage to the next layer, first reducing their dimension before convolving over them with a large patch size. Another practically useful aspect of this design is that it aligns with the intuition that visual information should be processed at various scales and then aggregated so that the next stage can abstract features from different scales simultaneously.
Let's look at the structure of the Inception module:
As the figure shows, the module concatenates features from four scales. The three yellow 1x1 convolution layers all serve as dimension reductions; the authors argue that most of the original information survives the reduction. The previous layer's output passes in parallel through 1x1, 3x3, and 5x5 convolutions and a pooling branch, and the results are concatenated to form the module's output.
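To make the four-branch layout concrete, here is a minimal NumPy sketch of an Inception-style forward pass. This is an illustration, not the paper's implementation: ReLU activations and biases are omitted, the weight shapes are made up for the demo, and the naive loops are far too slow for real inputs.

```python
import numpy as np

def conv2d(x, w):
    """Naive stride-1 'same'-padding convolution.
    x: (c_in, h, w), w: (c_out, c_in, k, k) -> (c_out, h, w)."""
    c_out, c_in, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((c_out, x.shape[1], x.shape[2]))
    for i in range(x.shape[1]):
        for j in range(x.shape[2]):
            # Contract the (c_in, k, k) patch against every output filter.
            out[:, i, j] = np.tensordot(w, xp[:, i:i + k, j:j + k], axes=3)
    return out

def max_pool_3x3(x):
    """3x3 max pooling, stride 1, 'same' padding."""
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)), constant_values=-np.inf)
    out = np.empty_like(x)
    for i in range(x.shape[1]):
        for j in range(x.shape[2]):
            out[:, i, j] = xp[:, i:i + 3, j:j + 3].max(axis=(1, 2))
    return out

def inception(x, w1, w3r, w3, w5r, w5, wp):
    """Four parallel branches, concatenated along the channel axis."""
    b1 = conv2d(x, w1)                 # 1x1 conv
    b3 = conv2d(conv2d(x, w3r), w3)    # 1x1 reduce, then 3x3 conv
    b5 = conv2d(conv2d(x, w5r), w5)    # 1x1 reduce, then 5x5 conv
    bp = conv2d(max_pool_3x3(x), wp)   # 3x3 max pool, then 1x1 projection
    return np.concatenate([b1, b3, b5, bp], axis=0)

# Tiny made-up sizes, just to show the shapes working out:
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 6, 6))
out = inception(x,
                w1=rng.standard_normal((4, 8, 1, 1)),
                w3r=rng.standard_normal((3, 8, 1, 1)),
                w3=rng.standard_normal((6, 3, 3, 3)),
                w5r=rng.standard_normal((2, 8, 1, 1)),
                w5=rng.standard_normal((5, 2, 5, 5)),
                wp=rng.standard_normal((4, 8, 1, 1)))
print(out.shape)  # (19, 6, 6): 4 + 6 + 5 + 4 channels, spatial size unchanged
```

Note how every branch keeps the spatial size unchanged (stride 1, 'same' padding), which is what makes the channel-wise concatenation legal.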
To further explain the module, look at GoogLeNet's overall structure, taking inception (3a) as an example. Its input is 28x28x192, i.e. 192 channels. The 1x1 conv branch outputs only 64 channels, reducing dimensionality while keeping most of the information; the 3x3 and 5x5 convs likewise run after their own 1x1 reductions, and the pooling branch is also projected down to only 32 channels. The final output is 28x28x256, so the channel count grows only slightly. This is what lets the Inception module make CNNs so much deeper.
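The channel bookkeeping for inception (3a) is easy to check by hand. A small script (branch sizes as listed in the paper's architecture table; `conv_madds` is a helper I made up to count multiply-adds) also shows how much the 1x1 reduction saves on the 5x5 branch:

```python
def conv_madds(h, w, c_in, c_out, k):
    """Multiply-adds of a stride-1 'same'-padding k x k convolution."""
    return h * w * c_out * c_in * k * k

# inception (3a): 28 x 28 x 192 in; branch outputs are 64 + 128 + 32 + 32.
print(64 + 128 + 32 + 32)  # 256 output channels, barely more than the 192 coming in

# Cost of the 5x5 branch with and without the 1x1 reduction (192 -> 16 channels):
direct  = conv_madds(28, 28, 192, 32, 5)
reduced = conv_madds(28, 28, 192, 16, 1) + conv_madds(28, 28, 16, 32, 5)
print(direct, reduced)  # 120422400 12443648 -- nearly a 10x saving
```

The same arithmetic applies to the 3x3 branch, which is why the paper can stack nine such modules without the cost blowing up.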
Here is what the paper says about GoogLeNet's final layers:
The use of average pooling before the classifier is based on [12], although our implementation differs in that we use an extra linear layer. This enables adapting and fine-tuning our networks for other label sets easily, but it is mostly convenience and we do not expect it to have a major effect. It was found that a move from fully connected layers to average pooling improved the top-1 accuracy by about 0.6%, however the use of dropout remained essential even after removing the fully connected layers.
Roughly, it says that replacing the fully connected layer with average pooling works better. The extra linear layer exists to remap the features when using other label sets: for example, when we train a face verification model with more than 10,000 classes, this layer maps the 1024-dimensional input to those 10,000-plus output dimensions. The dropout layer sets each input to zero with some probability (40% here), which helps avoid overfitting; essentially every modern CNN uses it.
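As a refresher, dropout can be sketched in a few lines of NumPy. The 1/(1-p) rescaling of the survivors ("inverted dropout") is a common convention of my sketch, not something the paper specifies; p=0.4 matches the rate mentioned above.

```python
import numpy as np

def dropout(x, p=0.4, training=True, rng=None):
    """Zero each activation with probability p; scale survivors by 1/(1-p)
    so the expected activation is unchanged and test time needs no rescaling."""
    if not training:
        return x  # dropout is a no-op at inference time
    if rng is None:
        rng = np.random.default_rng()
    mask = rng.random(x.shape) >= p  # True = keep this activation
    return x * mask / (1.0 - p)

x = np.ones((4, 1024))
y = dropout(x, p=0.4, rng=np.random.default_rng(0))
print(round((y == 0).mean(), 2))  # about 0.4 of the activations are zeroed
```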
One thing that is still unclear to me: the paper talks about clustering based on an analysis of statistical correlations. How is that clustering actually done, and how does its result relate to the Inception module? (For instance, is the previous layer's output first clustered via statistical analysis, and then the 1x1, 3x3, and 5x5 convolutions extract multi-scale features on top of that clustering result? A friend said the clustering here is just pooling, which I don't quite follow.) Something to revisit when I get the chance, or perhaps reference [2] explains it.