Setting a dynamic learning rate in mxnet

  If the learning rate is too large, the algorithm will bounce back and forth around a local optimum and fail to converge;
  if it is too small, each update moves only a short distance, so convergence becomes very slow.
  So we can start with a relatively large learning rate and lower it gradually as the number of iterations grows. mxnet provides ready-made scheduler classes that we can use directly.
  There are three schedulers in mxnet.lr_scheduler.
  The first is:

mxnet.lr_scheduler.FactorScheduler(step, factor=1, stop_factor_lr=1e-08)
# Reduce the learning rate by a factor for every n steps.
# It returns a new learning rate by:
#     base_lr * pow(factor, floor(num_update/step))
# Parameters:
#     step (int) – Changes the learning rate for every n updates.
#     factor (float, optional) – The factor to change the learning rate.
#     stop_factor_lr (float, optional) – Stop updating the learning rate
#         if it is less than this value.

  For example:

lr_sch = mxnet.lr_scheduler.FactorScheduler(step=500, factor=0.9)
model.fit(
    train_iter,
    eval_data=val_iter,
    optimizer='sgd',
    optimizer_params={'learning_rate': 0.1, 'lr_scheduler': lr_sch},
    eval_metric=metric,
    num_epoch=num_epoch,
)

  This means: the initial learning rate is 0.1. After 500 parameter updates the learning rate becomes 0.1×0.9, and after 1000 parameter updates it becomes 0.1×0.9×0.9.
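
  You can also sanity-check a schedule by calling the scheduler object directly, without running any training. A minimal sketch (the probe points 1, 600, 1100, 1600 are arbitrary; note that the optimizer normally copies learning_rate into the scheduler's base_lr, so here we set it by hand):

import mxnet as mx

lr_sch = mx.lr_scheduler.FactorScheduler(step=500, factor=0.9)
lr_sch.base_lr = 0.1  # normally set by the optimizer from learning_rate
for num_update in (1, 600, 1100, 1600):
    # a scheduler is called with the current update count and returns the lr
    print(num_update, lr_sch(num_update))
# expected: 0.1, 0.09, 0.081, 0.0729
# (exact behaviour right at a step boundary may differ between versions)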

  The second is:

class mxnet.lr_scheduler.LRScheduler(base_lr=0.01)
# Base class of a learning rate scheduler.
# A scheduler returns a new learning rate based on the number of updates
# that have been performed.
# Parameters:
#     base_lr (float, optional) – The initial learning rate.
#
# __call__(num_update)
# Return a new learning rate.
# num_update is the upper bound of the number of updates applied to every
# weight. Assume the optimizer has updated the i-th weight k_i times, i.e.
# optimizer.update(i, weight_i) has been called k_i times. Then:
#     num_update = max([k_i for all i])
# Parameters:
#     num_update (int) – the maximal number of updates applied to a weight.
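
  LRScheduler itself is only a base class: to get a schedule that the built-in classes do not cover, you can subclass it and override __call__. A minimal sketch, assuming a purely illustrative exponential-decay rule (ExponentialScheduler and its decay parameter are my own names, not part of mxnet):

import math
import mxnet as mx

class ExponentialScheduler(mx.lr_scheduler.LRScheduler):
    # illustrative schedule: lr = base_lr * exp(-decay * num_update)
    def __init__(self, base_lr=0.01, decay=0.0001):
        super(ExponentialScheduler, self).__init__(base_lr)
        self.decay = decay

    def __call__(self, num_update):
        # the optimizer calls this with the current update count
        return self.base_lr * math.exp(-self.decay * num_update)

  Such a scheduler plugs into model.fit exactly like FactorScheduler above, via optimizer_params={'lr_scheduler': ...}.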

  The third is:

class mxnet.lr_scheduler.MultiFactorScheduler(step, factor=1)
# Reduce the learning rate according to a given list of steps.
# Assume there exists k such that:
#     step[k] <= num_update < step[k+1]
# Then calculate the new learning rate by:
#     base_lr * pow(factor, k+1)
# Parameters:
#     step (list of int) – The list of steps at which to change the learning rate.
#     factor (float) – The factor to change the learning rate.
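
  This is handy when you want the drops at hand-picked points rather than at a fixed interval. A minimal sketch (the step values 3000 and 6000 and the probe points are arbitrary examples):

import mxnet as mx

# halve the learning rate after update 3000 and again after update 6000
lr_sch = mx.lr_scheduler.MultiFactorScheduler(step=[3000, 6000], factor=0.5)
lr_sch.base_lr = 0.1   # normally set by the optimizer from learning_rate
print(lr_sch(1000))    # 0.1
print(lr_sch(4000))    # 0.05
print(lr_sch(7000))    # 0.025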

Reference: https://mxnet.incubator.apache.org/api/python/optimization/optimization.html#mxnet.lr_scheduler.LRScheduler
