Caffe Source Code Analysis 9: The Caffe Model
Source: Internet | Editor: 程序博客网 | Time: 2024/06/05 12:42
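The two core aspects of a complete deep learning system are the data and the model. A deep learning model usually consists of three groups of parameters:
- Learnable parameters, also called trainable parameters, neural network weight coefficients, or simply weights. Their values are determined by the model's initialization parameters and by error backpropagation, and are generally not set by hand.
- Architecture parameters, which describe the network structure: the number of convolutional, fully connected, and downsampling layers, the number of convolution kernels, the kernel size, and so on. Once set, they cannot be changed during training.
- Training hyperparameters, which control how network training converges. They can be tuned automatically or manually during training to get better results, and are not needed at prediction time.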
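In Caffe, these three groups of parameters are defined and implemented by different modules:
- Learnable parameters are held in memory in Blob objects and, when necessary, serialized to disk as a binary ProtoBuffer file (*.caffemodel) for further fine-tuning, sharing (for example via a Parameter Server, PS), or benchmarking.
- Architecture parameters are described in a ProtoBuffer text-format file (*.prototxt). At network initialization this description file is used to construct the Net object and the Layer objects, which form a directed acyclic graph. Between Layers, and between the Net's inputs and outputs, sit Blob objects that hold the data and intermediate results.
- Training hyperparameters are likewise described in ProtoBuffer text format (*.prototxt). During training this description file is used to build a Solver object, which automatically adjusts these hyperparameter values according to set rules while the network trains.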
lenet_lr.prototxt:
    name: "LeNet"
    layer {
      name: "mnist"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }
      transform_param {
        scale: 0.00390625
      }
      data_param {
        source: "./examples/mnist/mnist_train_lmdb"
        batch_size: 64
        backend: LMDB
      }
    }
    layer {
      name: "mnist"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TEST
      }
      transform_param {
        scale: 0.00390625
      }
      data_param {
        source: "./examples/mnist/mnist_test_lmdb"
        batch_size: 100
        backend: LMDB
      }
    }
    layer {
      name: "ip"
      type: "InnerProduct"
      bottom: "data"
      top: "ip"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      inner_product_param {
        num_output: 10
        weight_filler {
          type: "xavier"
        }
        bias_filler {
          type: "constant"
        }
      }
    }
    layer {
      name: "accuracy"
      type: "Accuracy"
      bottom: "ip"
      bottom: "label"
      top: "accuracy"
      include {
        phase: TEST
      }
    }
    layer {
      name: "loss"
      type: "SoftmaxWithLoss"
      bottom: "ip"
      bottom: "label"
      top: "loss"
    }
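To make the link between this structure file and the learnable parameters concrete, here is a minimal sketch that counts the values the "ip" layer has to learn and shows how lr_mult scales the learning rate of each parameter blob. It assumes the flattened MNIST input has 28 x 28 = 784 values, which the prototxt does not state, and both function names are hypothetical helpers rather than Caffe APIs.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of learnable values in a fully connected
// (InnerProduct) layer, i.e. one weight per input/output pair plus one
// bias per output.
std::size_t inner_product_param_count(std::size_t num_input,
                                      std::size_t num_output) {
  assert(num_input > 0 && num_output > 0);
  return num_input * num_output + num_output;
}

// Hypothetical helper: effective learning rate of one parameter blob,
// the solver-wide rate scaled by that blob's lr_mult.
double local_learning_rate(double base_lr, double lr_mult) {
  assert(base_lr >= 0.0 && lr_mult >= 0.0);
  return base_lr * lr_mult;
}
```

Under the 784-input assumption, inner_product_param_count(784, 10) gives the size of the weight and bias blobs combined. With a solver base rate of 0.01, the two param entries (lr_mult 1 and 2) mean the bias is updated with twice the learning rate of the weights.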
lenet_lr_solver.prototxt:
    # The train/test net protocol buffer definition
    net: "examples/mnist/lenet_lr.prototxt"
    # test_iter specifies how many forward passes the test should carry out.
    # In the case of MNIST, we have test batch size 100 and 100 test iterations,
    # covering the full 10,000 testing images.
    test_iter: 100
    # Carry out testing every 500 training iterations.
    test_interval: 500
    # The base learning rate, momentum and the weight decay of the network.
    base_lr: 0.01
    momentum: 0.9
    weight_decay: 0.0005
    # The learning rate policy
    lr_policy: "inv"
    gamma: 0.0001
    power: 0.75
    # Display every 100 iterations
    display: 100
    # The maximum number of iterations
    max_iter: 10000
    # snapshot intermediate results
    snapshot: 5000
    snapshot_prefix: "examples/mnist/lenet"
    # solver mode: CPU or GPU
    solver_mode: CPU
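The solver file only names the learning rate policy, so a small worked example may help. The sketch below evaluates Caffe's documented "inv" rule, lr = base_lr * (1 + gamma * iter)^(-power), and the test coverage arithmetic from the comments in the file. The function names are hypothetical and this is standalone code, not the solver's own implementation.

```cpp
#include <cassert>
#include <cmath>

// "inv" policy: the rate decays smoothly as the iteration count grows.
double inv_learning_rate(double base_lr, double gamma, double power, int iter) {
  assert(iter >= 0);
  return base_lr * std::pow(1.0 + gamma * iter, -power);
}

// Number of test images visited in one test pass:
// test_iter forward passes of test_batch_size images each.
long test_images_covered(int test_iter, int test_batch_size) {
  assert(test_iter >= 0 && test_batch_size >= 0);
  return static_cast<long>(test_iter) * test_batch_size;
}
```

With the values above (base_lr 0.01, gamma 0.0001, power 0.75), the rate starts at 0.01 and has dropped to roughly 0.0059 by max_iter = 10000, since 1 + gamma * iter is then 2 and 2^(-0.75) is about 0.59.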
Start training with the following command, run from D:\VS2012\Projects\caffe-windows:
    D:\VS2012\Projects\caffe-windows\Build\x64\Release\caffe train --solver=D:\VS2012\Projects\caffe-windows\examples\mnist\lenet_lr_solver.prototxt >log\lenet_lr_train.log 2>&1
The detailed output is written to lenet_lr_train.log in the log directory:
    I0103 16:31:52.100195 7080 solver.cpp:91] Creating training net from net file: examples/mnist/lenet_lr.prototxt
    I0103 16:31:52.100195 7080 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer mnist
    I0103 16:31:52.100195 7080 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
    I0103 16:31:52.100195 7080 net.cpp:58] Initializing net from parameters:
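The two net.cpp messages in this log record that layers whose include rule names phase 1 (TEST) are dropped while the training net, which is in phase 0 (TRAIN), is built. The sketch below mimics that filtering on the five layers of lenet_lr.prototxt. LayerSpec, filter_layers, and lenet_lr_layers are hypothetical simplifications for illustration; the real logic in net.cpp handles more kinds of rules than a single phase.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same numeric values as the phases printed in the log message.
enum Phase { TRAIN = 0, TEST = 1 };

// Hypothetical, simplified layer description: a name plus an optional
// "include { phase: ... }" rule.
struct LayerSpec {
  std::string name;
  bool has_phase_rule;
  Phase phase;  // only meaningful when has_phase_rule is true
};

// Keep a layer if it has no phase rule or if its rule matches the net's phase.
std::vector<std::string> filter_layers(const std::vector<LayerSpec>& layers,
                                       Phase net_phase) {
  std::vector<std::string> kept;
  for (const LayerSpec& layer : layers) {
    if (!layer.has_phase_rule || layer.phase == net_phase) {
      kept.push_back(layer.name);
    }
  }
  return kept;
}

// The five layers of lenet_lr.prototxt, in file order.
std::vector<LayerSpec> lenet_lr_layers() {
  return {
      {"mnist", true, TRAIN},
      {"mnist", true, TEST},
      {"ip", false, TRAIN},
      {"accuracy", true, TEST},
      {"loss", false, TRAIN},
  };
}
```

Calling filter_layers(lenet_lr_layers(), TRAIN) leaves the training data layer, ip, and loss. The TEST-phase mnist layer and the accuracy layer are the two that the log reports as skipped.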
Line 91 of solver.cpp, where the first log message comes from, reads as follows:
    if (param_.has_net()) {
      LOG_IF(INFO, Caffe::root_solver())
          << "Creating training net from net file: " << param_.net();
      ReadNetParamsFromTextFileOrDie(param_.net(), &net_param);
    }
Further down in the log:
    I0103 16:31:55.859853 7080 solver.cpp:454] Snapshotting to binary proto file examples/mnist/lenet_iter_5000.caffemodel
    I0103 16:31:55.861356 7080 sgd_solver.cpp:273] Snapshotting solver state to binary proto file examples/mnist/lenet_iter_5000.solverstate
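The .caffemodel file is a binary file saved at set intervals during training and contains the current weights of every layer in the network. The .solverstate file is a binary file produced together with the .caffemodel and contains the information needed to resume training from the point where it last stopped.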
Line 454 of solver.cpp sits inside this function:
    template <typename Dtype>
    string Solver<Dtype>::SnapshotToBinaryProto() {
      // Build the model file name.
      string model_filename = SnapshotFilename(".caffemodel");
      LOG(INFO) << "Snapshotting to binary proto file " << model_filename;
      NetParameter net_param;
      // Convert net_ into a NetParameter message.
      net_->ToProto(&net_param, param_.snapshot_diff());
      // Write the binary ProtoBuffer file, here lenet_iter_5000.caffemodel.
      WriteProtoToBinaryFile(net_param, model_filename);
      return model_filename;
    }
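The snapshot names in the log follow a visible pattern: the snapshot_prefix from the solver file, then "_iter_", the iteration number, and the extension. The sketch below reproduces that pattern as a standalone function. snapshot_filename is a hypothetical stand-in inferred from the log output, not a copy of Caffe's SnapshotFilename, whose details may differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build a snapshot file name from the solver's
// snapshot_prefix, the current iteration, and a file extension.
std::string snapshot_filename(const std::string& prefix, int iter,
                              const std::string& extension) {
  assert(iter >= 0);
  return prefix + "_iter_" + std::to_string(iter) + extension;
}
```

With snapshot_prefix "examples/mnist/lenet" and snapshot: 5000, this yields the two names seen in the log at iteration 5000, one ending in .caffemodel and one in .solverstate.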
Line 273 of sgd_solver.cpp sits inside this function:
    template <typename Dtype>
    void SGDSolver<Dtype>::SnapshotSolverStateToBinaryProto(
        const string& model_filename) {
      SolverState state;                      // create the message to serialize
      state.set_iter(this->iter_);            // record the current iteration
      state.set_learned_net(model_filename);  // record the file holding the learned weights (.caffemodel)
      state.set_current_step(this->current_step_);  // record the current step
      state.clear_history();                  // empty the container before refilling it
      for (int i = 0; i < history_.size(); ++i) {
        // Add history: record the update history of each parameter blob
        BlobProto* history_blob = state.add_history();
        history_[i]->ToProto(history_blob);
      }
      string snapshot_filename = Solver<Dtype>::SnapshotFilename(".solverstate");
      LOG(INFO)
          << "Snapshotting solver state to binary proto file " << snapshot_filename;
      WriteProtoToBinaryFile(state, snapshot_filename.c_str());
    }
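The history saved here matters because momentum SGD carries state from one iteration to the next. The sketch below is a simplified, standalone version of a momentum update, using the momentum hyperparameter from the solver file. It shows why resuming without the saved history would change the next step. sgd_momentum_step is a hypothetical function, and weight decay and lr_mult are deliberately left out, so this is not Caffe's actual update code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One momentum SGD step over a flat parameter vector:
//   history = momentum * history + lr * grad
//   weights = weights - history
// The history vector plays the role of the blobs stored in the solver state.
void sgd_momentum_step(std::vector<double>& weights,
                       std::vector<double>& history,
                       const std::vector<double>& grad,
                       double lr, double momentum) {
  assert(weights.size() == grad.size());
  assert(history.size() == grad.size());
  for (std::size_t i = 0; i < weights.size(); ++i) {
    history[i] = momentum * history[i] + lr * grad[i];
    weights[i] -= history[i];
  }
}
```

If two runs start from the same weights after one step, the one that kept its history takes a larger second step than the one that restarted with a zeroed history. That difference is why the .solverstate file accompanies the .caffemodel.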