Guidelines for setting the SGD solver's learning rate α and momentum μ
SGD
Stochastic gradient descent (type: "SGD") updates the weights W by a linear combination of the negative gradient ∇L(W) and the previous weight update V_t. The learning rate α is the weight of the negative gradient. The momentum μ is the weight of the previous update.

Formally, we have the following formulas to compute the update value V_{t+1} and the updated weights W_{t+1} at iteration t+1, given the previous weight update V_t and current weights W_t:

V_{t+1} = μ V_t − α ∇L(W_t)
W_{t+1} = W_t + V_{t+1}

The learning "hyperparameters" (α and μ) might require a bit of tuning for best results. If you're not sure where to start, take a look at the "Rules of thumb" below, and for further information you might refer to Leon Bottou's Stochastic Gradient Descent Tricks [1].
[1] L. Bottou. Stochastic Gradient Descent Tricks. Neural Networks: Tricks of the Trade: Springer, 2012.
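As a minimal sketch of this SGD-with-momentum update (V_{t+1} = μ V_t − α ∇L(W_t), then W_{t+1} = W_t + V_{t+1}), here it is applied to a toy quadratic loss L(W) = ½‖W‖², whose gradient is simply W; the loss function is an illustration only, not part of the original text:

```python
import numpy as np

def sgd_momentum_step(W, V, grad, lr=0.01, momentum=0.9):
    """One SGD-with-momentum step:
    V_{t+1} = mu * V_t - alpha * grad;  W_{t+1} = W_t + V_{t+1}."""
    V = momentum * V - lr * grad
    W = W + V
    return W, V

# Toy quadratic loss L(W) = 0.5 * ||W||^2, so grad L(W) = W.
W = np.array([1.0, -2.0])
V = np.zeros_like(W)
for _ in range(200):
    W, V = sgd_momentum_step(W, V, grad=W)
# After many steps, W has been driven close to the minimizer at the origin.
```

With μ = 0.9 the iterates oscillate slightly while shrinking toward zero, which is the smoothing-plus-acceleration behavior momentum is meant to provide.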
Rules of thumb for setting the learning rate α and momentum μ
A good strategy for deep learning with SGD is to initialize the learning rate α to a value around α ≈ 0.01 = 10⁻², drop it by a constant factor (e.g., 10) throughout training when the loss begins to reach an apparent "plateau", and repeat this several times. Generally, you probably want to use a momentum μ = 0.9 or a similar value. By smoothing the weight updates across iterations, momentum tends to make deep learning with SGD both more stable and faster.
This was the strategy used by Krizhevsky et al. [1] in their famously winning CNN entry to the ILSVRC-2012 competition, and Caffe makes this strategy easy to implement in a SolverParameter, as in our reproduction of [1] at ./examples/imagenet/alexnet_solver.prototxt.
To use a learning rate policy like this, you can put the following lines somewhere in your solver prototxt file:
base_lr: 0.01     # begin training at a learning rate of 0.01 = 1e-2
lr_policy: "step" # learning rate policy: drop the learning rate in "steps"
                  # by a factor of gamma every stepsize iterations
gamma: 0.1        # drop the learning rate by a factor of 10
                  # (i.e., multiply it by a factor of gamma = 0.1)
stepsize: 100000  # drop the learning rate every 100K iterations
max_iter: 350000  # train for 350K iterations total
momentum: 0.9
Under the above settings, we'll always use momentum μ = 0.9. We'll begin training at a base_lr of α = 0.01 = 10⁻² for the first 100,000 iterations, then multiply the learning rate by a factor of gamma (γ) and train at α′ = αγ = (0.01)(0.1) = 0.001 = 10⁻³ for iterations 100K–200K, then at α″ = 10⁻⁴ for iterations 200K–300K, and finally train until iteration 350K (since we have max_iter: 350000) at α‴ = 10⁻⁵.

Note that the momentum setting μ effectively multiplies the size of your updates by a factor of 1/(1−μ) after many iterations of training, so if you increase μ, it may be a good idea to decrease α accordingly (and vice versa). For example, with μ = 0.9, we have an effective update size multiplier of 10. If we increased the momentum to μ = 0.99, we've increased our update size multiplier to 100, so we should drop α (base_lr) by a factor of 10.
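Both the step schedule and the 1/(1−μ) multiplier can be checked numerically. A small sketch, with parameter values taken from the solver snippet above (the function name step_lr is ours, not a Caffe API):

```python
def step_lr(iteration, base_lr=0.01, gamma=0.1, stepsize=100000):
    """The "step" policy: lr = base_lr * gamma^floor(iteration / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)

# Learning rate in each training phase (up to floating-point rounding):
for it in (0, 150000, 250000, 349999):
    print(it, step_lr(it))  # 1e-2, then 1e-3, 1e-4, 1e-5

# Effective long-run update-size multiplier contributed by momentum:
mu = 0.9
print(1.0 / (1.0 - mu))  # roughly 10; with mu = 0.99 it would be roughly 100
```

This makes the trade-off concrete: raising μ from 0.9 to 0.99 multiplies the effective update size by another 10, which is why base_lr should drop by 10 in compensation.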
Note also that the above settings are merely guidelines, and they're definitely not guaranteed to be optimal (or even work at all!) in every situation. If learning diverges (e.g., you start to see very large, NaN, or inf loss values or outputs), try dropping the base_lr (e.g., base_lr: 0.001) and re-training, repeating this until you find a base_lr value that works.
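The "drop base_lr and re-train" loop can be automated. A sketch under stated assumptions: train_once is a hypothetical stand-in for a function that runs a short training trial at a given base_lr and returns the final loss; the 1e8 threshold for "very large" is our arbitrary choice:

```python
import math

def diverged(loss):
    """Divergence symptoms described above: NaN, inf, or a huge loss value.
    The 1e8 cutoff is an assumed heuristic, not from Caffe."""
    return math.isnan(loss) or math.isinf(loss) or abs(loss) > 1e8

def find_working_base_lr(train_once, base_lr=0.01, min_lr=1e-8):
    """Drop base_lr by 10x and re-train until training no longer diverges.
    `train_once` is a hypothetical callable: base_lr -> final loss."""
    while base_lr >= min_lr:
        if not diverged(train_once(base_lr)):
            return base_lr
        base_lr /= 10.0  # drop the learning rate and retry
    raise RuntimeError("no stable base_lr found above min_lr")
```

For instance, with a trainer that blows up for any base_lr above 0.002, the search stops at the first stable value in the 10x ladder.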