[Deep Learning Paper Notes][Image Classification] Identity Mappings in Deep Residual Networks
Source: Internet | Published: spin.js | Editor: 程序博客网 | Date: 2024/05/16 00:45
1 Identity Mapping
[Original Residual Unit and Proposed Residual Unit] See Fig. 10.
[Forward Pass]
The feature x_L of any deeper unit L can be written as the feature x_l of any shallower unit l plus a sum of residual functions: x_L = x_l + Σ_{i=l}^{L−1} F(x_i, W_i).
[Backward Pass] By the chain rule, the gradient at any shallow unit l decomposes into a term propagated directly from the deep unit L through the identity shortcuts and a term propagated through the weight layers: ∂ε/∂x_l = (∂ε/∂x_L) · (1 + ∂(Σ_{i=l}^{L−1} F(x_i, W_i))/∂x_l). The additive 1 means the gradient from x_L is carried back without attenuation, regardless of the weight layers.
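This additivity is easy to verify numerically. Below is a minimal numpy sketch, where `residual_branch` is a toy stand-in for the unit's weight layers (not the paper's architecture): stacking units with identity shortcuts makes the final feature equal the shallow feature plus the sum of all residual outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_branch(x, w):
    # toy residual function F(x; w), a stand-in for the two weight layers
    return w * x

x_l = rng.normal(size=4)                 # feature of a shallow unit l
weights = rng.normal(scale=0.1, size=5)  # one toy weight per unit

# forward through 5 units with identity shortcuts: x_{i+1} = x_i + F(x_i)
x = x_l.copy()
residual_sum = np.zeros_like(x_l)
for w in weights:
    f = residual_branch(x, w)
    residual_sum += f
    x = x + f

# x_L equals x_l plus the sum of all residual branch outputs
assert np.allclose(x, x_l + residual_sum)
```

Because the shortcut contributes x_l additively rather than through a product of weight matrices, the backward pass likewise contains a term that bypasses all weight layers, which is the "1" in the gradient formula above.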
2 Experiment on Skip Connections
Replacing the identity shortcut with a gating mechanism or a 1 × 1 convolution should, in principle, give stronger representational ability than the identity shortcut: shortcut-only gating and 1 × 1 convolution both cover the solution space of identity shortcuts. However, their training error is higher than that of identity shortcuts, indicating that the degradation of these models is caused by optimization issues, not by representational ability.
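The "covers the solution space" claim can be made concrete with a small numpy sketch (a hypothetical toy branch output `F`, not the paper's code): a 1 × 1-conv shortcut with its matrix set to the identity, or a shortcut-only gate driven to fully open by a large negative bias, reduces to the plain identity shortcut.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=3)
F = 0.1 * x                        # toy output of the residual branch

out_identity = x + F               # identity shortcut

# 1x1 convolution acts as a per-position matrix multiply; W = I
# reproduces the identity shortcut exactly
W = np.eye(3)
out_conv = W @ x + F
assert np.allclose(out_conv, out_identity)

# shortcut-only gating: shortcut scaled by 1 - g, g = sigmoid(bias);
# bias -> -inf drives g -> 0, recovering the identity shortcut
b = -20.0
g = 1.0 / (1.0 + np.exp(-b))
out_gated = (1.0 - g) * x + F
assert np.allclose(out_gated, out_identity, atol=1e-6)
```

So the richer shortcuts can represent everything the identity shortcut can; their worse training error must come from being harder to optimize.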
3 On the Usage of Activation Functions
There are various usages of activations, see Fig. 11.
[Original] After the addition, ReLU truncates the signal whenever it is negative. This truncation is not severe when the ResNet has fewer layers: after some training, the weights adjust so that x + F(x) is more frequently above zero and ReLU rarely truncates it (x is always non-negative because of the previous ReLU, so x + F(x) drops below zero only when F(x) is strongly negative). With 1000 layers, however, the truncation is much more frequent.
[BN After Addition] The BN layer alters the signal that passes through the shortcut and impedes information propagation, as reflected by the difficulty in reducing the training loss at the beginning of training.
[ReLU Before Addition] This forces the transform F to produce a non-negative output, while intuitively a "residual" function should take values in (−∞, +∞). As a result, the forward-propagated signal is monotonically increasing. This may harm representational ability, and the result is worse than the baseline.
[ReLU-Only Pre-activation] This ReLU layer is not used in conjunction with a BN layer, and may not enjoy the benefits of BN.
[Full Pre-activation] It reaches slightly higher training loss at convergence, but produces lower test error. This is presumably caused by BN's regularization effect. In the original Residual Unit, although BN normalizes the signal, the result is immediately added to the shortcut, so the merged signal is not normalized; this unnormalized signal is then fed into the next weight layer. In contrast, in the pre-activation version, the inputs to all weight layers have been normalized.
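The ordering difference can be sketched with 1-D toy stand-ins (a minimal sketch, not the paper's implementation: `bn` here is a simplified normalization without learned scale/shift, and `weight` is a scalar multiply standing in for a conv layer). With the residual branch weights zeroed, the pre-activation unit passes x through unchanged, while the original unit still truncates at the final ReLU, showing that its shortcut path is not a pure identity mapping.

```python
import numpy as np

def bn(x):
    # simplified batch norm: zero mean, unit variance, no learned gamma/beta
    return (x - x.mean()) / (x.std() + 1e-5)

def relu(x):
    return np.maximum(x, 0.0)

def weight(x, w):
    # stand-in for a conv/weight layer
    return w * x

def original_unit(x, w1, w2):
    # weight -> BN -> ReLU -> weight -> BN -> add -> ReLU
    f = bn(weight(relu(bn(weight(x, w1))), w2))
    return relu(x + f)

def preact_unit(x, w1, w2):
    # BN -> ReLU -> weight -> BN -> ReLU -> weight -> add (shortcut untouched)
    f = weight(relu(bn(weight(relu(bn(x)), w1))), w2)
    return x + f

x = np.array([-1.0, 0.5, 2.0])
# zero weights make the residual branch vanish: the pre-activation unit is an
# exact identity, while the original unit clips negatives at the final ReLU
assert np.allclose(preact_unit(x, 0.0, 0.0), x)
assert np.allclose(original_unit(x, 0.0, 0.0), np.maximum(x, 0.0))
```

This is the same asymmetry the notes describe: only in the pre-activation arrangement do both the forward signal and the backward gradient travel through a genuinely clean shortcut.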