Training Very Deep Networks: Highway Networks (paper notes)
There are rumors online that Microsoft's deep residual learning copies Highway Networks and is merely a special case of it. Highway Networks was indeed published first.
Project page, with open-source code:
http://people.idsia.ch/~rupesh/very_deep_learning/
reference:
http://blog.csdn.net/cv_family_z/article/details/50349436
http://blog.csdn.net/l494926429/article/details/51737883
Our Highway Networks take inspiration from Long Short Term Memory (LSTM) and allow training of deep, efficient networks (even with hundreds of layers) with conventional gradient-based methods. Even when large depths are not required, highway layers can be used instead of traditional neural layers to allow the network to adaptively copy or transform representations.
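In other words, highway networks are inspired by LSTM and make it possible to train deep networks (hundreds of layers) with conventional gradient-based methods. Even when great depth is not needed, highway layers can replace traditional layers so that the network adaptively chooses between copying and transforming its representations.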
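As neural networks have developed, they have grown steadily deeper (more layers and smaller receptive fields improve classification accuracy (Szegedy et al., 2014; Simonyan & Zisserman, 2014)), and training them has become correspondingly harder. Highway Networks is an architecture designed to ease the training of very deep networks. The following papers show that optimizing deep neural networks is difficult (noted here because they will be useful when writing papers): (Glorot & Bengio, 2010; Saxe et al., 2013; He et al., 2015), (Simonyan & Zisserman, 2014; Romero et al., 2014), (Szegedy et al., 2014; Lee et al., 2015).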
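Highway Networks: a learnable gating mechanism under which some information flows through network layers without attenuation. Networks built this way can be trained directly with SGD.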
2 Highway Networks
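A plain feedforward neural network typically consists of L layers, each of which applies a nonlinear transform to its input. One layer can be written as

$$ y = H(x, W_H) $$

where H is the nonlinear function, W_H the weights, x the input, and y the output.
H is usually an affine transform followed by a nonlinear activation function, but it may take other forms, for example convolutional or recurrent.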
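For a highway network, a layer is defined as

$$ y = H(x, W_H) \cdot T(x, W_T) + x \cdot C(x, W_C) $$

We refer to T as the transform gate and C as the carry gate.
T and C express how much of the output comes from transforming the input and how much from carrying the input through unchanged, respectively.
The paper sets C = 1 - T, which gives

$$ y = H(x, W_H) \cdot T(x, W_T) + x \cdot (1 - T(x, W_T)) $$

The dimensions in this equation must agree: x, y, H(x, W_H) and T(x, W_T) must all have the same dimensionality. If x is too small, it is zero-padded.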
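The layer definition above can be sketched in a few lines of NumPy. This is only a minimal illustration, not the authors' implementation: the choice of tanh for H, the sigmoid form of the gate (the form the paper uses in Section 2.2), and the names `highway_layer`, `W_H`, `b_H`, `W_T`, `b_T` are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway_layer(x, W_H, b_H, W_T, b_T):
    """One fully connected highway layer with C = 1 - T.

    x has shape (batch, d); W_H and W_T have shape (d, d), so that
    x, H, T and y all share the same dimensionality d.
    """
    H = np.tanh(x @ W_H + b_H)    # candidate transform H(x, W_H); tanh is an assumption
    T = sigmoid(x @ W_T + b_T)    # transform gate T(x, W_T), values in (0, 1)
    return H * T + x * (1.0 - T)  # elementwise blend of transform and carry

rng = np.random.default_rng(0)
d = 4
x = rng.normal(size=(2, d))
W_H = rng.normal(scale=0.1, size=(d, d))
W_T = rng.normal(scale=0.1, size=(d, d))
y = highway_layer(x, W_H, np.zeros(d), W_T, np.full(d, -2.0))
print(y.shape)  # (2, 4)
```

Because T lies in (0, 1), each output component is a convex combination of the corresponding components of H and x, which is why all four quantities must have the same shape.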
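Observe that for particular values of T:

$$ y = \begin{cases} x, & \text{if } T(x, W_T) = 0 \\ H(x, W_H), & \text{if } T(x, W_T) = 1 \end{cases} $$

and for the Jacobian of the layer transform:

$$ \frac{dy}{dx} = \begin{cases} I, & \text{if } T(x, W_T) = 0 \\ H'(x, W_H), & \text{if } T(x, W_T) = 1 \end{cases} $$

Thus, depending on the output of the transform gates, a highway layer can smoothly vary its behavior between that of H and that of a layer which simply passes its inputs through.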
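The two limiting cases can be checked numerically. The sketch below assumes a sigmoid gate and a tanh transform, and drives the gate toward 0 or 1 purely through a large negative or positive bias (the gate weights are set to zero for clarity). The function name `highway` and all parameter values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway(x, W_H, W_T, b_T):
    """Return the highway output y and the plain transform H for comparison."""
    H = np.tanh(x @ W_H)
    T = sigmoid(x @ W_T + b_T)
    return H * T + x * (1.0 - T), H

rng = np.random.default_rng(1)
x = rng.normal(size=(3, 5))
W_H = rng.normal(size=(5, 5))
W_T = np.zeros((5, 5))  # gate depends only on the bias in this example

y_carry, H = highway(x, W_H, W_T, b_T=-30.0)     # T is close to 0
y_transform, _ = highway(x, W_H, W_T, b_T=30.0)  # T is close to 1

print(np.allclose(y_carry, x))      # True: the layer passes its input through
print(np.allclose(y_transform, H))  # True: the layer behaves like a plain layer
```

For intermediate gate values the output lies between these two extremes, component by component.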
2.1 Constructing Highway Networks
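If the dimensions of x, y, H and T do not match, they can be made to match. To change the size of the representation, the paper suggests replacing x with a sub-sampled or zero-padded version of x, or using a plain layer (without highways) to change the dimensionality and then continuing to stack highway layers.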
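A rough sketch of these ways to change the feature dimension for fully connected layers is shown below. The helper names `zero_pad`, `subsample` and `plain_layer` are hypothetical, and strided slicing is just one simple way to sub-sample; the paper does not prescribe a specific scheme.

```python
import numpy as np

def zero_pad(x, d_out):
    """Append zeros along the feature axis until x has d_out features."""
    d_in = x.shape[-1]
    if d_out < d_in:
        raise ValueError("zero_pad can only increase the dimension")
    pad = np.zeros(x.shape[:-1] + (d_out - d_in,), dtype=x.dtype)
    return np.concatenate([x, pad], axis=-1)

def subsample(x, stride):
    """Keep every stride-th feature (one simple sub-sampling choice)."""
    return x[..., ::stride]

def plain_layer(x, W, b):
    """Ordinary (non-highway) layer used only to change dimensionality."""
    return np.tanh(x @ W + b)

x = np.arange(12, dtype=float).reshape(2, 6)
print(zero_pad(x, 8).shape)                                  # (2, 8)
print(subsample(x, 2).shape)                                 # (2, 3)
print(plain_layer(x, np.zeros((6, 4)), np.zeros(4)).shape)   # (2, 4)
```

After any of these steps, the resized representation can be fed into highway layers whose H and T produce outputs of the new size.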
2.2 Training Deep Highway Networks
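The transform gate is defined as

$$ T(x) = \sigma(W_T^{\top} x + b_T) $$

where W_T is the weight matrix, b_T is the bias vector, and σ is the sigmoid function.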
This suggests a simple initialization scheme which is independent of the nature of H: b_T can be initialized with a negative value (e.g. -1, -3 etc.) such that the network is initially biased towards carry behavior. This scheme is strongly inspired by the proposal [30] to initially bias the gates in an LSTM network, to help bridge long-term temporal dependencies early in learning.
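At initialization, b_T can be set to a negative value, so that the network initially favors carry behavior, that is, it passes its input through largely unchanged. This idea is mainly inspired by reference [30]. The experiments in the paper support this choice.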
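To see what a negative gate bias does at the start of training, the sketch below evaluates a sigmoid gate for a few bias values and then for a gate with small random weights. The helper `init_transform_gate`, the weight scale 0.01 and the layer width 8 are assumptions chosen for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Gate value when the weighted input is zero, for several bias choices.
for b_T in (0.0, -1.0, -3.0):
    print(b_T, round(float(sigmoid(b_T)), 3))
# 0.0 0.5
# -1.0 0.269
# -3.0 0.047

def init_transform_gate(d, bias_init=-3.0, scale=0.01, seed=0):
    """Hypothetical initializer: small random weights, constant negative bias."""
    rng = np.random.default_rng(seed)
    W_T = rng.normal(scale=scale, size=(d, d))
    b_T = np.full(d, bias_init)
    return W_T, b_T

W_T, b_T = init_transform_gate(8)
x = np.random.default_rng(1).normal(size=(16, 8))
T = sigmoid(x @ W_T + b_T)
mean_gate = float(T.mean())
print(mean_gate < 0.1)  # the gate is mostly closed, so the layer mostly carries x
```

With a bias of -3 the gate starts near 0.05, so each layer initially outputs roughly 95% of its input and only a small fraction of H, which matches the carry-biased start described above.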