Caffe's learning rate policy

Source: Internet  Editor: 程序博客网  Time: 2024/06/06 07:04

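The learning rate is an important hyperparameter, and there are many ways to set it. In the Caffe source, the comment on the GetLearningRate function in caffe-master\src\caffe\solvers\sgd_solver.cpp describes the available policies: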

// policies are as follows:
//    - fixed: always return base_lr.
//    - step: return base_lr * gamma ^ (floor(iter / step))
//    - exp: return base_lr * gamma ^ iter
//    - inv: return base_lr * (1 + gamma * iter) ^ (- power)
//    - multistep: similar to step but it allows non uniform steps defined by
//      stepvalue
//    - poly: the effective learning rate follows a polynomial decay, to be
//      zero by the max_iter. return base_lr (1 - iter/max_iter) ^ (power)
//    - sigmoid: the effective learning rate follows a sigmod decay
//      return base_lr ( 1/(1 + exp(-gamma * (iter - stepsize))))
//
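The formulas in this comment can be written out as one small function. The following is a minimal Python sketch, not Caffe's actual implementation: the function name `learning_rate`, its argument names, and the default values are assumptions made for illustration. For multistep, it assumes the exponent is the number of `stepvalues` entries that the current iteration has already reached.

```python
import math


def learning_rate(policy, it, base_lr=0.01, gamma=0.1, power=0.75,
                  stepsize=10000, stepvalues=(), max_iter=50000):
    """Learning rate at iteration `it`, following the quoted comment.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if policy == "fixed":
        return base_lr
    if policy == "step":
        # Integer division plays the role of floor(iter / step).
        return base_lr * gamma ** (it // stepsize)
    if policy == "exp":
        return base_lr * gamma ** it
    if policy == "inv":
        return base_lr * (1 + gamma * it) ** (-power)
    if policy == "multistep":
        # Assumption: one decay per stepvalue that `it` has reached.
        passed = sum(1 for v in stepvalues if it >= v)
        return base_lr * gamma ** passed
    if policy == "poly":
        # Only meaningful for it <= max_iter.
        return base_lr * (1 - it / max_iter) ** power
    if policy == "sigmoid":
        return base_lr / (1 + math.exp(-gamma * (it - stepsize)))
    raise ValueError("unknown policy: %s" % policy)
```

For example, `learning_rate("multistep", 25, stepvalues=(10, 30))` applies a single decay, because only the first step value has been reached by iteration 25.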

Without further ado, plotting the policies makes the differences clear at a glance:


The corresponding MATLAB code is:

max_iter=50000;
iter=1:max_iter;
base_lr=0.01;
gamma=0.0001;
power=0.75;
step_size=5000;

% - fixed: always return base_lr.
lr=base_lr*ones(1,max_iter);
subplot(2,3,1)
plot(lr)
title('fixed')

% - step: return base_lr * gamma ^ (floor(iter / step))
% (the step length is hard-coded to 10000 here, not step_size)
lr=base_lr .* gamma.^(floor(iter./10000));
subplot(2,3,2)
plot(lr)
title('step')

% - exp: return base_lr * gamma ^ iter
lr=base_lr * gamma .^ iter;
subplot(2,3,3)
plot(lr)
title('exp')

% - inv: return base_lr * (1 + gamma * iter) ^ (- power)
lr=base_lr.*(1./(1+gamma.*iter).^power);
subplot(2,3,4)
plot(lr)
title('inv')

% - multistep: similar to step but it allows non uniform steps defined by
% stepvalue
% (multistep is not plotted in this script)

% - poly: the effective learning rate follows a polynomial decay, to be
% zero by the max_iter. return base_lr (1 - iter/max_iter) ^ (power)
lr=base_lr *(1 - iter./max_iter) .^ (power);
subplot(2,3,5)
plot(lr)
title('poly')

% - sigmoid: the effective learning rate follows a sigmod decay
% return base_lr ( 1/(1 + exp(-gamma * (iter - stepsize))))
lr=base_lr *( 1./(1 + exp(-gamma * (iter - step_size))));
subplot(2,3,6)
plot(lr)
title('sigmoid')
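Because the MATLAB script shares a single gamma of 0.0001 across all policies, it can help to check a few points on each curve numerically. The Python sketch below copies the script's parameter values and evaluates each formula at three iterations. The dictionary `curves` and the chosen sample iterations are assumptions made for illustration.

```python
import math

# Parameter values copied from the MATLAB script above.
base_lr, gamma, power = 0.01, 0.0001, 0.75
max_iter, step_size = 50000, 5000

# One formula per plotted panel (hypothetical helper dictionary).
curves = {
    # The MATLAB script hard-codes a step length of 10000 for this panel.
    "step": lambda it: base_lr * gamma ** (it // 10000),
    "exp": lambda it: base_lr * gamma ** it,
    "inv": lambda it: base_lr * (1 + gamma * it) ** (-power),
    "poly": lambda it: base_lr * (1 - it / max_iter) ** power,
    "sigmoid": lambda it: base_lr / (1 + math.exp(-gamma * (it - step_size))),
}

# Print each curve at the start, at the first step boundary, and at the end.
for name, f in curves.items():
    samples = [f(it) for it in (1, 10000, max_iter)]
    print(name, ["%.3g" % s for s in samples])
```

With these settings the exp curve is already about 1e-6 at iteration 1, and the step curve drops by a factor of 10000 at each boundary, so both panels look like they collapse to zero. The sigmoid curve rises over the iterations because gamma is positive; with this formula a negative gamma is needed for the rate to decay.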

What is `lr_policy` in Caffe?


