Caffe Source Code Analysis 9: Caffe Models

Source: Internet. Published by: 通用编程器报价. Edited by: 程序博客网. Time: 2024/06/05 12:42

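The two core aspects of a complete deep learning system are data and models. A deep learning model usually consists of three groups of parameters: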

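Learnable parameters, also called trainable parameters, network weight coefficients, or simply weights. Their values are determined by the model's initialization settings and by error backpropagation, and are generally not adjusted by hand.
Architecture parameters, such as the number of convolutional, fully connected, and downsampling layers, the number of convolution kernels, and the kernel size. They describe the network structure and, once set, cannot be changed during training.
Training hyperparameters, which control how network training converges. They can be adjusted automatically or manually during training to get better results, and are not needed at prediction time.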

In Caffe, these three groups of parameters are defined and implemented by different modules:

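Learnable parameters are held in memory by Blob objects. When needed, they are serialized to a binary Protocol Buffers file (*.caffemodel) and stored on disk, which makes further fine-tuning, sharing (for example through a Parameter Server, PS), and benchmarking possible.
Architecture parameters are described in Protocol Buffers text format (*.prototxt). At network initialization, this description file is used to build the Net object and the Layer objects, which together form a directed acyclic graph. Between one Layer and the next, and between the Net's inputs and outputs, sit Blob objects that hold the data and intermediate results.
Training hyperparameters are also described in Protocol Buffers text format (*.prototxt). During training, this description file is used to build a Solver object, which automatically adjusts these hyperparameter values according to fixed rules while the network is trained.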

lenet_lr.prototxt:

name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "./examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "./examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "ip"
  type: "InnerProduct"
  bottom: "data"
  top: "ip"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip"
  bottom: "label"
  top: "loss"
}
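To make "learnable parameters" concrete for this net, here is a rough sketch that counts the values held by the single InnerProduct layer. It assumes the standard MNIST input of 1 x 28 x 28 pixels, which the prototxt itself does not state. The helper name is hypothetical and not part of Caffe.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a Caffe API): number of learnable values in a
// fully connected layer that has a bias term.
// Weights: num_output x num_input. Biases: num_output.
std::size_t InnerProductParamCount(std::size_t num_input,
                                   std::size_t num_output) {
  return num_input * num_output + num_output;
}

// Assumption: each MNIST image is 1 x 28 x 28, so the "data" blob carries
// 784 values per image into the "ip" layer.
constexpr std::size_t kMnistInputSize = 1 * 28 * 28;

// The transform_param scale in the prototxt is exactly 1/256, which maps
// 8-bit pixel values into the range [0, 1).
constexpr double kScale = 1.0 / 256.0;
```

With num_output: 10 this comes to 784 x 10 + 10 = 7850 values. The two param blocks in the "ip" layer apply to the weights (lr_mult: 1) and the biases (lr_mult: 2) respectively.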

lenet_lr_solver.prototxt

# The train/test net protocol buffer definition
net: "examples/mnist/lenet_lr.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# solver mode: CPU or GPU
solver_mode: CPU
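The solver file is where the "automatically adjusted" hyperparameters come from. As a small illustration, the sketch below evaluates the "inv" learning rate policy, which Caffe documents as base_lr * (1 + gamma * iter)^(-power), and the test coverage arithmetic from the comments above. The function names are hypothetical; only the formula and the numbers come from Caffe and the solver file.

```cpp
#include <cassert>
#include <cmath>

// Sketch of the "inv" policy: lr = base_lr * (1 + gamma * iter)^(-power).
double InvLearningRate(double base_lr, double gamma, double power, int iter) {
  return base_lr * std::pow(1.0 + gamma * iter, -power);
}

// Number of test images seen in one test pass:
// test_iter forward passes, each over one test batch.
int TestImagesPerPass(int test_iter, int test_batch_size) {
  return test_iter * test_batch_size;
}
```

With base_lr: 0.01, gamma: 0.0001, and power: 0.75, the rate starts at 0.01 at iteration 0 and has decayed to roughly 0.0059 by max_iter: 10000. With test_iter: 100 and a test batch size of 100, one test pass covers 10,000 images.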

Run the following command:

D:\VS2012\Projects\caffe-windows>D:\VS2012\Projects\caffe-windows\Build\x64\Release\caffe train --solver=D:\VS2012\Projects\caffe-windows\examples\mnist\lenet_lr_solver.prototxt >log\lenet_lr_train.log 2>&1

The detailed log, lenet_lr_train.log, can then be found in the log directory.

I0103 16:31:52.100195  7080 solver.cpp:91] Creating training net from net file: examples/mnist/lenet_lr.prototxt
I0103 16:31:52.100195  7080 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer mnist
I0103 16:31:52.100195  7080 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I0103 16:31:52.100195  7080 net.cpp:58] Initializing net from parameters:

Line 91 of solver.cpp, which produced the first log line, is implemented as follows:

  if (param_.has_net()) {
    LOG_IF(INFO, Caffe::root_solver())
        << "Creating training net from net file: " << param_.net();
    ReadNetParamsFromTextFileOrDie(param_.net(), &net_param);
  }
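The two net.cpp:322 lines in the log excerpt above show that, once the net definition is read, layers whose include rule names a different phase are dropped from the training net. Phase 0 is TRAIN and phase 1 is TEST. The following is a simplified sketch of that idea only. LayerSpec and FilterByPhase are hypothetical names, and the real filtering in net.cpp handles more kinds of rules than a single include phase.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum Phase { TRAIN = 0, TEST = 1 };  // same numeric values as in the log

// Hypothetical, simplified stand-in for a layer definition.
struct LayerSpec {
  std::string name;
  std::string type;
  bool has_include;     // true if the layer has an include { phase: ... } rule
  Phase include_phase;  // only meaningful when has_include is true
};

// Keep layers with no include rule, or whose rule matches the given phase.
std::vector<LayerSpec> FilterByPhase(const std::vector<LayerSpec>& layers,
                                     Phase phase) {
  std::vector<LayerSpec> kept;
  for (const LayerSpec& layer : layers) {
    if (!layer.has_include || layer.include_phase == phase) {
      kept.push_back(layer);
    }
  }
  return kept;
}

// The five layers of lenet_lr.prototxt, reduced to name, type, and phase rule.
std::vector<LayerSpec> LenetLrLayers() {
  return {
      {"mnist", "Data", true, TRAIN},
      {"mnist", "Data", true, TEST},
      {"ip", "InnerProduct", false, TRAIN},
      {"accuracy", "Accuracy", true, TEST},
      {"loss", "SoftmaxWithLoss", false, TRAIN},
  };
}
```

Filtering for TRAIN keeps the training "mnist" layer, "ip", and "loss". The TEST-only "mnist" and "accuracy" layers are the two that the log reports as skipped.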

Further down in the log:

I0103 16:31:55.859853  7080 solver.cpp:454] Snapshotting to binary proto file examples/mnist/lenet_iter_5000.caffemodel
I0103 16:31:55.861356  7080 sgd_solver.cpp:273] Snapshotting solver state to binary proto file examples/mnist/lenet_iter_5000.solverstate

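The .caffemodel file is a binary file saved at fixed intervals during training (every 5000 iterations with the solver above) and contains the current weights of every layer in the network. The .solverstate file is a binary file produced together with the .caffemodel and contains the information needed to resume training from the point where it last stopped.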

The log message at line 454 of solver.cpp comes from this function:

template <typename Dtype>
string Solver<Dtype>::SnapshotToBinaryProto() {
  string model_filename = SnapshotFilename(".caffemodel");  // build the model file name
  LOG(INFO) << "Snapshotting to binary proto file " << model_filename;
  NetParameter net_param;
  net_->ToProto(&net_param, param_.snapshot_diff());  // convert net_ into a NetParameter
  // write the binary Protocol Buffers file, here lenet_iter_5000.caffemodel
  WriteProtoToBinaryFile(net_param, model_filename);
  return model_filename;
}
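The file names in the log follow the pattern snapshot_prefix, then "_iter_", then the iteration number, then the extension. Here is a minimal sketch of that naming, inferred from the log output and the snapshot_prefix setting. MakeSnapshotFilename is a hypothetical stand-in and not the body of Caffe's SnapshotFilename.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper reproducing the naming visible in the log:
// <snapshot_prefix>_iter_<iteration><extension>
std::string MakeSnapshotFilename(const std::string& snapshot_prefix, int iter,
                                 const std::string& extension) {
  return snapshot_prefix + "_iter_" + std::to_string(iter) + extension;
}
```

With snapshot_prefix: "examples/mnist/lenet" and iteration 5000, this yields examples/mnist/lenet_iter_5000.caffemodel and examples/mnist/lenet_iter_5000.solverstate, matching the two log lines.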

The log message at line 273 of sgd_solver.cpp comes from this function:

template <typename Dtype>
void SGDSolver<Dtype>::SnapshotSolverStateToBinaryProto(
    const string& model_filename) {
  SolverState state;  // create the message to be serialized
  state.set_iter(this->iter_);  // record the current iteration count
  state.set_learned_net(model_filename);  // record the .caffemodel file that holds the learned net
  state.set_current_step(this->current_step_);  // record the current learning-rate step
  state.clear_history();  // clear the container before adding new content
  for (int i = 0; i < history_.size(); ++i) {
    // Add history: record the solver's history blob for each parameter
    BlobProto* history_blob = state.add_history();
    history_[i]->ToProto(history_blob);
  }
  string snapshot_filename = Solver<Dtype>::SnapshotFilename(".solverstate");
  LOG(INFO)
    << "Snapshotting solver state to binary proto file " << snapshot_filename;
  WriteProtoToBinaryFile(state, snapshot_filename.c_str());
}
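Why the history must be saved alongside the weights is easiest to see with a toy example. The sketch below assumes the usual momentum SGD update (history = momentum * history + lr * grad, then weight -= history) on a single scalar parameter. SgdState, MomentumStep, ToyGradient, and Run are hypothetical names, and the quadratic loss is chosen purely for illustration.

```cpp
#include <cassert>
#include <cmath>

// Simplified scalar stand-in for one parameter and its solver history.
struct SgdState {
  double weight;
  double history;  // previous update, the kind of value history_ keeps
};

// One momentum SGD step:
// history = momentum * history + lr * grad; weight -= history.
void MomentumStep(SgdState* s, double grad, double lr, double momentum) {
  s->history = momentum * s->history + lr * grad;
  s->weight -= s->history;
}

// Gradient of the toy loss 0.5 * w^2 (illustration only).
double ToyGradient(double w) { return w; }

// Runs `steps` updates starting from a copy of `s`; returns the final weight.
double Run(SgdState s, int steps, double lr, double momentum) {
  for (int i = 0; i < steps; ++i) {
    MomentumStep(&s, ToyGradient(s.weight), lr, momentum);
  }
  return s.weight;
}
```

In this sketch, resuming from a saved (weight, history) pair continues on the same trajectory as an uninterrupted run, while resuming from the weight alone with history reset to zero ends up at a different value. That is the role the .solverstate file plays next to the .caffemodel file.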

0 0