[Deep Learning Paper Notes][Weight Initialization] All you need is a good init

Mishkin, Dmytro, and Jiri Matas. “All you need is a good init.” arXiv preprint arXiv:1511.06422 (2015). [Citations: 19].


1 Layer-Sequential Unit-Variance Initialization

[Idea]
• Pre-initialize the weights of each convolutional or fully-connected layer with orthonormal matrices.
• Normalize the variance of each layer's output to one, proceeding layer by layer from the first to the last.
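The orthonormal pre-initialization can be sketched as follows. This is a minimal illustration, not the paper's code: the function name and shape handling are mine, and conv kernels are assumed to be flattened to 2-D before the QR step.

```python
import numpy as np

def orthonormal_init(shape, rng=None):
    """Illustrative orthonormal init: QR-decompose a Gaussian matrix
    and keep the orthonormal factor (conv kernels flattened to 2-D)."""
    rng = rng or np.random.default_rng(0)
    rows, cols = shape[0], int(np.prod(shape[1:]))
    # QR of a tall Gaussian matrix yields orthonormal columns
    a = rng.standard_normal((max(rows, cols), min(rows, cols)))
    q, _ = np.linalg.qr(a)
    # Orient so the *rows* of the returned matrix are orthonormal
    # when rows < cols, then restore the original kernel shape.
    w = q if rows >= cols else q.T
    return w[:rows, :cols].reshape(shape)
```

With rows ≤ cols, the result satisfies `W @ W.T ≈ I`, so each output unit starts with a unit-norm, mutually orthogonal weight vector.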


[Algorithm] See Alg. 3.


[Hyper-parameters ε, T] A tolerance ε and an iteration cap T are needed because, owing to the variation of the data, it is often not possible to normalize the output variance to the desired precision; the per-layer rescaling stops once the variance is within ε of one or after T iterations.
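The per-layer variance normalization with ε and T can be sketched like this. A hedged illustration under my own assumptions: a plain linear layer stands in for a conv/fc layer, and `eps`/`t_max` play the roles of ε and T.

```python
import numpy as np

def lsuv_scale(W, x, eps=0.1, t_max=10):
    """Sketch of the per-layer LSUV step: rescale W until the output
    variance on a batch x is within eps of 1, or t_max tries elapse."""
    for _ in range(t_max):
        var = (x @ W.T).var()        # output variance on the batch
        if abs(var - 1.0) < eps:     # within tolerance: stop early
            break
        W = W / np.sqrt(var)         # rescale weights toward unit variance
    return W
```

For a linear layer this converges in one step, since dividing W by √var divides the output variance by var; in a deep nonlinear network the loop may need several iterations, which is why the cap T exists.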
