BN and Caffe BN
Source: Internet | Editor: 程序博客网 | Date: 2024/05/21 11:48
Caffe Scale layer:
http://stackoverflow.com/questions/37410996/scale-layer-in-caffe
layer {
  bottom: "res2b_branch2b"
  top: "res2b_branch2b"
  name: "scale2b_branch2b"
  type: "Scale"
  scale_param { bias_term: true }
}

layer {
  name: "scaleToUnitInt"
  type: "Scale"
  bottom: "bot"
  top: "scaled"
  param {
    lr_mult: 0
    decay_mult: 0
  }
  param {
    lr_mult: 0
  }
  scale_param {
    filler { value: 0.5 }
    bias_term: true
    bias_filler { value: -2 }
  }
}
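In Caffe, a "BatchNorm" layer only normalizes; the learned scale and shift (γ, β) come from a following "Scale" layer with bias_term: true. A minimal sketch of the usual pairing (layer and blob names here are illustrative, borrowed from the ResNet naming above):

```
layer {
  name: "bn2b_branch2b"
  type: "BatchNorm"
  bottom: "res2b_branch2b"
  top: "res2b_branch2b"
  # use_global_stats: false during training (mini-batch statistics);
  # true at inference (moving averages).
  batch_norm_param { use_global_stats: false }
}
layer {
  name: "scale2b_branch2b"
  type: "Scale"
  bottom: "res2b_branch2b"
  top: "res2b_branch2b"
  scale_param { bias_term: true }
}
```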
A Quora answer on some properties of BN:
https://www.quora.com/Why-does-batch-normalization-help
Batch Normalization solves this problem under some additional assumptions. The following are properties of the mini-batch version of Batch Normalization, which uses a per-mini-batch mean and variance:
- Faster learning: the learning rate can be increased compared to a non-batch-normalized version.
- Higher accuracy: flexibility in the mean and variance of every dimension in every hidden layer allows better learning, and hence better accuracy.
- Normalization (whitening) of the inputs to each layer: zero mean and unit variance, though the dimensions are not decorrelated.
- Removal of the ill effect of internal covariate shift: successive transformations can make activations too large or too small, drifting each layer's input distribution away from normalized; BN counters this.
- Not getting stuck in saturation regimes, even when ReLU is not used.
- Whitening integrated into gradient-descent optimization: if whitening is decoupled from training (normalization parameters computed outside the gradient-descent step and the network modified directly), the next update can undo the normalization and the model can blow up.
- Full whitening within gradient descent would require the inverse square root of the covariance matrix, as well as its derivative for backpropagation, which is expensive.
- Per-dimension normalization: each dimension of a hidden layer is normalized independently rather than via the joint covariance, so features are not decorrelated.
- Mini-batch normalization: the mean and variance are estimated from each mini-batch rather than the entire training set. The joint covariance is ignored anyway, since with so few training samples per mini-batch relative to the high dimensionality of the hidden layer it would yield singular covariance matrices.
- Learned scale and shift for every dimension: the scaled and shifted values are passed to the next layer, while the mean and variance are computed from all mini-batch activations of the current layer; the forward pass therefore proceeds layer-wise over all samples in the mini-batch. Backpropagation computes gradients for the weights as well as for the scale (variance) and shift (mean) parameters.
- Inference: moving averages of the mean and variance accumulated during mini-batch training are used.
- Convolutional neural networks: whitening of intermediate layers, before or after the nonlinearity, opens many new innovation pathways [11-15].
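The per-dimension training/inference behavior described above can be sketched in NumPy (a minimal illustration of the technique, not Caffe's implementation; the names `gamma`, `beta`, and `momentum` follow the BN paper's conventions):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, running_mean, running_var,
                       training=True, momentum=0.9, eps=1e-5):
    """Batch-normalize x of shape (N, D): one mean/variance per dimension."""
    if training:
        mu = x.mean(axis=0)    # per-dimension mini-batch mean
        var = x.var(axis=0)    # per-dimension variance (no joint covariance)
        # Moving averages, used later at inference time.
        running_mean = momentum * running_mean + (1 - momentum) * mu
        running_var = momentum * running_var + (1 - momentum) * var
    else:
        mu, var = running_mean, running_var
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalize: zero mean, unit variance
    y = gamma * x_hat + beta               # learned scale and shift per dimension
    return y, running_mean, running_var
```

Note how inference simply swaps the mini-batch statistics for the moving averages, so the output no longer depends on the other samples in the batch.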
[10] batch normalization
[11] How do I apply Batch Normalization to the convolutional layer of a CNN?
[12] How does batch normalization behave differently at training time and test time?
[13] How does a person choose the best size of mini-batch in the test when the model is using batch normalization?
[14] How does a person choose the best size of mini-batch in the test when the model is using batch normalization?
[15] What is local response normalization?
A blog introducing BN:
https://standardfrancis.wordpress.com/2015/04/16/batch-normalization/