[Deep Learning Paper Notes][Weight Initialization] Understanding the difficulty of training deep feedforward neural networks
Glorot, Xavier, and Yoshua Bengio. "Understanding the difficulty of training deep feedforward neural networks." AISTATS, Vol. 9, 2010. [Citations: 722].
1 Effect of the Cost Function

Plateaus are less prevalent with the softmax (cross-entropy) cost function, while more severe plateaus occur with the quadratic cost.
2 Xavier Initialization
[Motivation] Ensure that all neurons in the network initially have approximately the same output distribution; empirically, this improves the rate of convergence.

[Forward Pass] Consider a linear activation function (or assume we are operating in the linear regime at initialization).
We want the variance of the activations to be preserved from layer to layer,

∀ l, Var[z^(l+1)] = Var[z^l],

then, assuming zero-mean, i.i.d. weights and inputs, Var[z^(l+1)] = n_in Var[W] Var[z^l], so

n_in Var[W] = 1, i.e., Var[W] = 1 / n_in.
[Backward Pass]
We want the variance of the back-propagated gradients to be preserved from layer to layer,

∀ l, Var[∂Cost/∂z^l] = Var[∂Cost/∂z^(l+1)],

then, by the same argument applied to the transposed weights,

n_out Var[W] = 1, i.e., Var[W] = 1 / n_out.
[Xavier Initialization] As a compromise between these two constraints, we might want to have

Var[W] = 2 / (n_in + n_out).
Recall the variance of the uniform distribution U(−c, c) is

Var = c² / 3.

Let

c² / 3 = 2 / (n_in + n_out),

then

c = √(6 / (n_in + n_out)).

I.e., the Xavier initialization is

W ~ U(−√6 / √(n_in + n_out), √6 / √(n_in + n_out)).
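The derivation above can be sketched in a few lines of NumPy; the function name `xavier_uniform` and the (n_out, n_in) weight-matrix layout are assumptions for illustration, not from the paper:

```python
import numpy as np

def xavier_uniform(n_in, n_out, rng=None):
    """Draw W ~ U(-c, c) with c = sqrt(6 / (n_in + n_out)),
    so that Var[W] = c^2 / 3 = 2 / (n_in + n_out), as derived above."""
    rng = np.random.default_rng() if rng is None else rng
    c = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-c, c, size=(n_out, n_in))

W = xavier_uniform(256, 128)
print(W.shape)  # (128, 256)
# The sample variance should be close to 2 / (256 + 128)
print(W.var(), 2.0 / (256 + 128))
```

With many weights, the empirical variance lands very close to the target 2 / (n_in + n_out), which is what keeps activation and gradient magnitudes roughly stable across layers at initialization.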
3 Xavier Initialization in Caffe
XavierFiller
- By default (FillerParameter_VarianceNorm_FAN_IN), n = n_in, and weights are drawn from U(−√(3/n), √(3/n)), so Var[W] = 1/n.
- If FillerParameter_VarianceNorm_FAN_OUT, n = n_out.
- If FillerParameter_VarianceNorm_AVERAGE, n = (n_in + n_out) / 2, which recovers the 2 / (n_in + n_out) variance from the paper.
4 References
[1]. shuzfan. http://blog.csdn.net/shuzfan/article/details/51338178.
[2]. F.-F. Li, A. Karpathy, and J. Johnson. http://cs231n.github.io/neural-networks-2/.
[3]. Caffe. https://github.com/BVLC/caffe/blob/master/include/caffe/filler.hpp.