Caffe Study Notes: The Solver Layer
Source: Internet. Published by: 软件测试核心期刊. Editor: 程序博客网. Time: 2024/05/16 09:04
With the original comments and the deprecated fields stripped out, the solver definition in Caffe's protobuf file is:

message SolverParameter {
  optional string net = 24; // path to the net definition
  optional NetParameter net_param = 25;
  optional string train_net = 1;
  repeated string test_net = 2;
  optional NetParameter train_net_param = 21;
  repeated NetParameter test_net_param = 22;
  optional NetState train_state = 26;
  repeated NetState test_state = 27;
  repeated int32 test_iter = 3; // number of batches run per test pass; images tested = test_iter * batch_size, so it should be chosen to match the size of the test set
  optional int32 test_interval = 4 [default = 0]; // run a test pass every test_interval iterations
  optional bool test_compute_loss = 19 [default = false];
  optional bool test_initialization = 32 [default = true]; // run one test pass before training starts
  optional float base_lr = 5; // initial learning rate
  optional int32 display = 6; // number of iterations between log outputs
  optional int32 average_loss = 33 [default = 1];
  optional int32 max_iter = 7; // maximum number of iterations
  optional int32 iter_size = 36 [default = 1]; // gradients are accumulated over iter_size * batch_size examples before each update
  optional string lr_policy = 8; // learning rate policy
  optional float gamma = 9;
  optional float power = 10;
  optional float momentum = 11; // momentum, typically 0.9
  optional float weight_decay = 12; // weight decay, typically 0.0005
  optional string regularization_type = 29 [default = "L2"]; // regularization type, one of {"L1", "L2"}
  optional int32 stepsize = 13; // parameter of the "step" policy
  repeated int32 stepvalue = 34; // parameter of the "multistep" policy
  optional float clip_gradients = 35 [default = -1];
  optional int32 snapshot = 14 [default = 0]; // iterations between snapshots; 0 means no intermediate state is saved
  optional string snapshot_prefix = 15; // filename prefix for saved snapshots
  optional bool snapshot_diff = 16 [default = false]; // whether to save gradients as well; helps debugging but makes the saved files larger
  enum SnapshotFormat {
    HDF5 = 0;
    BINARYPROTO = 1;
  }
  optional SnapshotFormat snapshot_format = 37 [default = BINARYPROTO]; // snapshot file format
  enum SolverMode {
    CPU = 0;
    GPU = 1;
  }
  optional SolverMode solver_mode = 17 [default = GPU];
  optional int32 device_id = 18 [default = 0];
  optional int64 random_seed = 20 [default = -1];
  optional string type = 40 [default = "SGD"]; // optimizer type, one of {"SGD", "Nesterov", "AdaGrad", "RMSProp", "AdaDelta", "Adam"}
  optional float delta = 31 [default = 1e-8];
  optional float momentum2 = 39 [default = 0.999];
  optional float rms_decay = 38 [default = 0.99];
  optional bool debug_info = 23 [default = false]; // if true, print information about the net; useful for debugging
  optional bool snapshot_after_train = 28 [default = true]; // if false, no snapshot is taken when training finishes
  optional bool layer_wise_reduce = 41 [default = true]; // overlap computation and communication in data-parallel training
}
Other related messages are listed below.
NetState
message NetState {
  optional Phase phase = 1 [default = TEST]; // one of {TRAIN, TEST}
  optional int32 level = 2 [default = 0];
  repeated string stage = 3;
}
NetParameter
message NetParameter {
  optional string name = 1; // name of the net
  optional bool force_backward = 5 [default = false]; // whether a layer runs backward is normally decided automatically from the net architecture and learning status; if true, backward computation is forced
  optional NetState state = 6;
  optional bool debug_info = 7 [default = false]; // print debugging information during the net's forward, backward, and update steps
  // The layers that make up the net. Each of their configurations, including
  // connectivity and behavior, is specified as a LayerParameter.
  repeated LayerParameter layer = 100; // ID 100 so layers are printed last.
}
Phase
enum Phase {
  TRAIN = 0;
  TEST = 1;
}
The learning rate is updated according to one of the following policies:

// The learning rate decay policy. The currently implemented learning rate
// policies are as follows:
//    - fixed: always return base_lr.
//    - step: return base_lr * gamma ^ (floor(iter / step))
//    - exp: return base_lr * gamma ^ iter
//    - inv: return base_lr * (1 + gamma * iter) ^ (- power)
//    - multistep: similar to step but it allows non uniform steps defined by
//      stepvalue
//    - poly: the effective learning rate follows a polynomial decay, to be
//      zero by the max_iter. return base_lr (1 - iter/max_iter) ^ (power)
//    - sigmoid: the effective learning rate follows a sigmoid decay
//      return base_lr ( 1/(1 + exp(-gamma * (iter - stepsize))))
//
// where base_lr, max_iter, gamma, step, stepvalue and power are defined
// in the solver parameter protocol buffer, and iter is the current iteration.
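The policies above translate almost line for line into code. The sketch below is an illustration of the formulas in the comment, not Caffe's own implementation: the function name and signature are hypothetical, "step" in the comment is taken to mean the stepsize field, and multistep is read as "count how many stepvalue entries the current iteration has reached".

```python
import math


def learning_rate(policy, it, base_lr, gamma=0.0, power=0.0,
                  stepsize=1, stepvalues=(), max_iter=1):
    """Return the learning rate at iteration `it` for the given lr_policy."""
    if policy == "fixed":
        return base_lr
    if policy == "step":
        # Drop by a factor of gamma every `stepsize` iterations.
        return base_lr * gamma ** (it // stepsize)
    if policy == "exp":
        return base_lr * gamma ** it
    if policy == "inv":
        return base_lr * (1 + gamma * it) ** (-power)
    if policy == "multistep":
        # Assumption: one gamma factor per stepvalue already reached.
        steps_passed = sum(1 for v in stepvalues if it >= v)
        return base_lr * gamma ** steps_passed
    if policy == "poly":
        # Decays to zero at max_iter.
        return base_lr * (1 - it / max_iter) ** power
    if policy == "sigmoid":
        return base_lr * (1 / (1 + math.exp(-gamma * (it - stepsize))))
    raise ValueError(f"unknown lr_policy: {policy!r}")


if __name__ == "__main__":
    # Hypothetical schedule: base_lr 0.01, divided by 10 every 100000 iterations.
    for it in (0, 100000, 250000):
        print(it, learning_rate("step", it, 0.01, gamma=0.1, stepsize=100000))
```

For extreme values of gamma * (iter - stepsize) the sigmoid branch can overflow in math.exp; a production version would need to guard against that.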
Caffe's six optimizers are introduced in 优化方法概述 (an overview of optimization methods).
An example solver file, taken from AlexNet in Caffe:
net: "models/bvlc_alexnet/train_val.prototxt"
test_iter: 1000
test_interval: 1000
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 100000
display: 20
max_iter: 450000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "models/bvlc_alexnet/caffe_alexnet_train"
solver_mode: GPU
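To see what this configuration implies over a full training run, the flat "key: value" lines can be read into a dictionary and a few quantities derived from them. The parser below is a hypothetical helper written for this example only: it handles one scalar field per line, would overwrite repeated fields, and does not handle nested messages or a "#" inside a quoted string. For real use, the protobuf text format parser together with Caffe's generated SolverParameter class is likely the more robust route.

```python
ALEXNET_SOLVER = """\
net: "models/bvlc_alexnet/train_val.prototxt"
test_iter: 1000
test_interval: 1000
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 100000
display: 20
max_iter: 450000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "models/bvlc_alexnet/caffe_alexnet_train"
solver_mode: GPU
"""


def parse_flat_solver(text):
    """Parse flat `key: value` solver lines into a dict (illustrative only)."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            params[key] = value[1:-1]          # quoted string
            continue
        for cast in (int, float):
            try:
                params[key] = cast(value)      # numeric field
                break
            except ValueError:
                pass
        else:
            params[key] = value                # enum name such as GPU
    return params


if __name__ == "__main__":
    p = parse_flat_solver(ALEXNET_SOLVER)
    # How many multiples of the snapshot interval fit into the run.
    print("snapshot intervals:", p["max_iter"] // p["snapshot"])
    # How many times the "step" policy has multiplied the rate by gamma by the last iteration.
    drops = (p["max_iter"] - 1) // p["stepsize"]
    print("lr drops:", drops, "final lr:", p["base_lr"] * p["gamma"] ** drops)
```

With these values the learning rate starts at 0.01 and is multiplied by 0.1 every 100000 iterations, so four such drops occur before training stops at iteration 450000.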