Notes on an Approach to Studying the Caffe Code
I have spent the past few days working through the code structure of the deep learning framework Caffe, and I want to record some of what I learned.
Start from the prototxt
In my understanding, viewed as a whole, Caffe is a data-driven system rather than a program-driven one. If an analogy helps, it is similar to the Spring framework used in Java applications. My knowledge of Spring is also quite superficial, but I do know that Java applications built on the Spring framework are controlled by XML configuration files, which define in detail every class the application uses and its parameters.
In this respect Caffe resembles Spring: both the definition of a neural network and the training/testing parameters are specified through prototxt files in Protocol Buffers text format.
For a training run, for example, the net definition, the parameters of every layer in the net, the sources of the training/testing data, and the training hyper-parameters are all defined in .prototxt files:
lenet_solver.prototxt
```
# net: the network definition file used for training/testing
net: "examples/mnist/lenet_train_test.prototxt"
############ training parameters #############
test_iter: 100
test_interval: 500
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
######## training-process control parameters ########
# print log output every 100 iterations
display: 100
# maximum number of iterations
max_iter: 10000
# snapshot save interval (save once every 5000 iterations)
snapshot: 5000
# prefix/location for saved snapshots
snapshot_prefix: "examples/mnist/lenet"
# Caffe run mode: CPU/GPU
solver_mode: GPU
```
This hyper-parameter file describes every parameter needed for one training run. The one that deserves the most attention is the first, net: following it leads us straight to the network definition file.
So when you start studying the overall structure of the Caffe code, there is no need to rush into the cpp/h files. Read the prototxt files first: the solver.prototxt and the net's prototxt give you the main threads and a global view of how the code is organized.
One more file deserves mention: src/caffe/proto/caffe.proto. Almost all of Caffe's data type definitions live there. Take the lenet_solver.prototxt file above as an example: after Caffe reads this file, it is ultimately parsed into a data object called SolverParameter. Open caffe.proto and you will find it. The definition follows (simplified, with some irrelevant or deprecated fields removed; see caffe.proto for the complete definition).
```protobuf
message SolverParameter {
  // network definition file in prototxt format
  optional string net = 24;
  // The number of iterations for each test net.
  repeated int32 test_iter = 3;
  // The number of iterations between two testing phases.
  optional int32 test_interval = 4 [default = 0];
  optional float base_lr = 5;  // The base learning rate
  // the number of iterations between displaying info. If display = 0, no info
  // will be displayed.
  optional int32 display = 6;
  optional int32 max_iter = 7;  // the maximum number of iterations
  optional string lr_policy = 8;
  optional float gamma = 9;   // The parameter to compute the learning rate.
  optional float power = 10;  // The parameter to compute the learning rate.
  optional float momentum = 11;      // The momentum value.
  optional float weight_decay = 12;  // The weight decay.
  // the stepsize for learning rate policy "step"
  optional int32 stepsize = 13;
  // the stepsize for learning rate policy "multistep"
  repeated int32 stepvalue = 34;
  // Set clip_gradients to >= 0 to clip parameter gradients to that L2 norm,
  // whenever their actual L2 norm is larger.
  optional float clip_gradients = 35 [default = -1];
  optional int32 snapshot = 14 [default = 0];  // The snapshot interval
  optional string snapshot_prefix = 15;  // The prefix for the snapshot.
  optional SnapshotFormat snapshot_format = 37 [default = BINARYPROTO];
  // computation mode definition
  enum SolverMode {
    CPU = 0;
    GPU = 1;
  }
  // run mode: CPU/GPU
  optional SolverMode solver_mode = 17 [default = GPU];
}
```
By the same reasoning, we can infer that the net file lenet_train_test.prototxt referenced in lenet_solver.prototxt is parsed by Caffe into a NetParameter data object. Its definition follows (again simplified, with some irrelevant or deprecated fields removed; see caffe.proto for the complete definition):
```protobuf
message NetParameter {
  // the network name, e.g. LeNet
  optional string name = 1;
  repeated int32 input_dim = 4;
  optional bool force_backward = 5 [default = false];
  optional NetState state = 6;
  optional bool debug_info = 7 [default = false];
  // array holding all layer objects; every layer in the prototxt ends up here
  repeated LayerParameter layer = 100;  // ID 100 so layers are printed last.
}
```
Now look at lenet_train_test.prototxt again: its name and layer entries correspond directly to the fields of NetParameter.
```
name: "LeNet"
# each layer is one element of NetParameter's layer array
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
# ... layers omitted ...
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
```
Going one step further, the layer field of NetParameter is an array of type LayerParameter. Find LayerParameter in caffe.proto and you get every one of its fields together with its description:
```protobuf
message LayerParameter {
  optional string name = 1;  // the layer name
  optional string type = 2;  // the layer type
  repeated string bottom = 3;  // the name of each bottom blob
  repeated string top = 4;     // the name of each top blob
  // The train / test phase for computation.
  optional Phase phase = 10;
  // The amount of weight to assign each top blob in the objective.
  // Each layer assigns a default value, usually of either 0 or 1,
  // to each top blob.
  repeated float loss_weight = 5;
  // ... fields omitted ...
  optional MILParameter mil_param = 0x004d494c;  // "MIL"
}
```
...And so on: you can find the definition and documentation of every data object this way, and from there locate each data object's cpp/h files.
layer factory
The type field in the LayerParameter definition is a string naming the layer's type. C++ has no equivalent of Java's Class.forName() for instantiating a class directly from its name, so how does the name in this type field get connected to an actual C++ object? That is a question I had long wanted to understand.
Caffe maintains a map from each name to a function pointer that creates the corresponding layer object, so given the name in the type field it can create the matching layer. See caffe/layer_factory.hpp and caffe/layer_factory.cpp for the implementation.
caffe/layer_factory.hpp (explanatory comments added by the author of this article):
```cpp
#ifndef CAFFE_LAYER_FACTORY_H_
#define CAFFE_LAYER_FACTORY_H_

#include <map>
#include <string>
#include <vector>

#include "caffe/common.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"

namespace caffe {

template <typename Dtype>
class Layer;

template <typename Dtype>
class LayerRegistry {
 public:
  // function-pointer type for functions that create layer objects
  typedef shared_ptr<Layer<Dtype> > (*Creator)(const LayerParameter&);
  // map type: type name -> layer-creating function pointer
  typedef std::map<string, Creator> CreatorRegistry;

  static CreatorRegistry& Registry() {
    // global static variable (the map instance)
    static CreatorRegistry* g_registry_ = new CreatorRegistry();
    return *g_registry_;
  }

  // Adds a creator: inserts one mapping into the map
  static void AddCreator(const string& type, Creator creator) {
    CreatorRegistry& registry = Registry();
    CHECK_EQ(registry.count(type), 0)
        << "Layer type " << type << " already registered.";
    registry[type] = creator;
  }

  // Get a layer using a LayerParameter.
  static shared_ptr<Layer<Dtype> > CreateLayer(const LayerParameter& param) {
    if (Caffe::root_solver()) {
      LOG(INFO) << "Creating layer " << param.name();
    }
    const string& type = param.type();
    CreatorRegistry& registry = Registry();
    CHECK_EQ(registry.count(type), 1) << "Unknown layer type: " << type
        << " (known types: " << LayerTypeListString() << ")";
    return registry[type](param);
  }

  static vector<string> LayerTypeList() {
    CreatorRegistry& registry = Registry();
    vector<string> layer_types;
    for (typename CreatorRegistry::iterator iter = registry.begin();
         iter != registry.end(); ++iter) {
      layer_types.push_back(iter->first);
    }
    return layer_types;
  }

 private:
  // Layer registry should never be instantiated - everything is done with its
  // static variables.
  LayerRegistry() {}

  static string LayerTypeListString() {
    vector<string> layer_types = LayerTypeList();
    string layer_types_str;
    for (vector<string>::iterator iter = layer_types.begin();
         iter != layer_types.end(); ++iter) {
      if (iter != layer_types.begin()) {
        layer_types_str += ", ";
      }
      layer_types_str += *iter;
    }
    return layer_types_str;
  }
};

template <typename Dtype>
class LayerRegisterer {
 public:
  LayerRegisterer(const string& type,
                  shared_ptr<Layer<Dtype> > (*creator)(const LayerParameter&)) {
    // LOG(INFO) << "Registering layer type: " << type;
    // add the layer-creating function pointer to the map
    LayerRegistry<Dtype>::AddCreator(type, creator);
  }
};

// macros that register a layer-creating function pointer in the map
#define REGISTER_LAYER_CREATOR(type, creator)                                  \
  static LayerRegisterer<float> g_creator_f_##type(#type, creator<float>);     \
  static LayerRegisterer<double> g_creator_d_##type(#type, creator<double>)    \

#define REGISTER_LAYER_CLASS(type)                                             \
  template <typename Dtype>                                                    \
  shared_ptr<Layer<Dtype> > Creator_##type##Layer(const LayerParameter& param) \
  {                                                                            \
    return shared_ptr<Layer<Dtype> >(new type##Layer<Dtype>(param));           \
  }                                                                            \
  REGISTER_LAYER_CREATOR(type, Creator_##type##Layer)

}  // namespace caffe

#endif  // CAFFE_LAYER_FACTORY_H_
```
caffe/layer_factory.cpp (explanatory comments added by the author of this article):
```cpp
// Make sure we include Python.h before any system header
// to avoid _POSIX_C_SOURCE redefinition
#ifdef WITH_PYTHON_LAYER
#include <boost/python.hpp>
#endif
#include <string>

#include "caffe/layer.hpp"
#include "caffe/layer_factory.hpp"
#include "caffe/layers/conv_layer.hpp"
#include "caffe/layers/lrn_layer.hpp"
#include "caffe/layers/pooling_layer.hpp"
#include "caffe/layers/relu_layer.hpp"
#include "caffe/layers/sigmoid_layer.hpp"
#include "caffe/layers/softmax_layer.hpp"
#include "caffe/layers/tanh_layer.hpp"
#include "caffe/proto/caffe.pb.h"

#ifdef USE_CUDNN
#include "caffe/layers/cudnn_conv_layer.hpp"
#include "caffe/layers/cudnn_lcn_layer.hpp"
#include "caffe/layers/cudnn_lrn_layer.hpp"
#include "caffe/layers/cudnn_pooling_layer.hpp"
#include "caffe/layers/cudnn_relu_layer.hpp"
#include "caffe/layers/cudnn_sigmoid_layer.hpp"
#include "caffe/layers/cudnn_softmax_layer.hpp"
#include "caffe/layers/cudnn_tanh_layer.hpp"
#endif

#ifdef WITH_PYTHON_LAYER
#include "caffe/layers/python_layer.hpp"
#endif

namespace caffe {

// Get convolution layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetConvolutionLayer(const LayerParameter& param) {
  ConvolutionParameter conv_param = param.convolution_param();
  ConvolutionParameter_Engine engine = conv_param.engine();
#ifdef USE_CUDNN
  bool use_dilation = false;
  for (int i = 0; i < conv_param.dilation_size(); ++i) {
    if (conv_param.dilation(i) > 1) {
      use_dilation = true;
    }
  }
#endif
  if (engine == ConvolutionParameter_Engine_DEFAULT) {
    engine = ConvolutionParameter_Engine_CAFFE;
#ifdef USE_CUDNN
    if (!use_dilation) {
      engine = ConvolutionParameter_Engine_CUDNN;
    }
#endif
  }
  if (engine == ConvolutionParameter_Engine_CAFFE) {
    return shared_ptr<Layer<Dtype> >(new ConvolutionLayer<Dtype>(param));
#ifdef USE_CUDNN
  } else if (engine == ConvolutionParameter_Engine_CUDNN) {
    if (use_dilation) {
      LOG(FATAL) << "CuDNN doesn't support the dilated convolution at Layer "
                 << param.name();
    }
    return shared_ptr<Layer<Dtype> >(new CuDNNConvolutionLayer<Dtype>(param));
#endif
  } else {
    LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
  }
}

// register "Convolution" via the REGISTER_LAYER_CREATOR macro from layer_factory.hpp
REGISTER_LAYER_CREATOR(Convolution, GetConvolutionLayer);

// Get pooling layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetPoolingLayer(const LayerParameter& param) {
  PoolingParameter_Engine engine = param.pooling_param().engine();
  if (engine == PoolingParameter_Engine_DEFAULT) {
    engine = PoolingParameter_Engine_CAFFE;
#ifdef USE_CUDNN
    engine = PoolingParameter_Engine_CUDNN;
#endif
  }
  if (engine == PoolingParameter_Engine_CAFFE) {
    return shared_ptr<Layer<Dtype> >(new PoolingLayer<Dtype>(param));
#ifdef USE_CUDNN
  } else if (engine == PoolingParameter_Engine_CUDNN) {
    if (param.top_size() > 1) {
      LOG(INFO) << "cuDNN does not support multiple tops. "
                << "Using Caffe's own pooling layer.";
      return shared_ptr<Layer<Dtype> >(new PoolingLayer<Dtype>(param));
    }
    // CuDNN assumes layers are not being modified in place, thus
    // breaking our index tracking for updates in some cases in Caffe.
    // Until there is a workaround in Caffe (index management) or
    // cuDNN, use Caffe layer to max pooling, or don't use in place
    // layers after max pooling layers
    if (param.pooling_param().pool() == PoolingParameter_PoolMethod_MAX) {
      return shared_ptr<Layer<Dtype> >(new PoolingLayer<Dtype>(param));
    } else {
      return shared_ptr<Layer<Dtype> >(new CuDNNPoolingLayer<Dtype>(param));
    }
#endif
  } else {
    LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
  }
}

// register "Pooling" via the REGISTER_LAYER_CREATOR macro
REGISTER_LAYER_CREATOR(Pooling, GetPoolingLayer);

// Get LRN layer according to engine
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetLRNLayer(const LayerParameter& param) {
  LRNParameter_Engine engine = param.lrn_param().engine();
  if (engine == LRNParameter_Engine_DEFAULT) {
#ifdef USE_CUDNN
    engine = LRNParameter_Engine_CUDNN;
#else
    engine = LRNParameter_Engine_CAFFE;
#endif
  }
  if (engine == LRNParameter_Engine_CAFFE) {
    return shared_ptr<Layer<Dtype> >(new LRNLayer<Dtype>(param));
#ifdef USE_CUDNN
  } else if (engine == LRNParameter_Engine_CUDNN) {
    LRNParameter lrn_param = param.lrn_param();
    if (lrn_param.norm_region() == LRNParameter_NormRegion_WITHIN_CHANNEL) {
      return shared_ptr<Layer<Dtype> >(new CuDNNLCNLayer<Dtype>(param));
    } else {
      // local size is too big to be handled through cuDNN
      if (param.lrn_param().local_size() > CUDNN_LRN_MAX_N) {
        return shared_ptr<Layer<Dtype> >(new LRNLayer<Dtype>(param));
      } else {
        return shared_ptr<Layer<Dtype> >(new CuDNNLRNLayer<Dtype>(param));
      }
    }
#endif
  } else {
    LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
  }
}

// register "LRN" via the REGISTER_LAYER_CREATOR macro
REGISTER_LAYER_CREATOR(LRN, GetLRNLayer);

// Get relu layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetReLULayer(const LayerParameter& param) {
  ReLUParameter_Engine engine = param.relu_param().engine();
  if (engine == ReLUParameter_Engine_DEFAULT) {
    engine = ReLUParameter_Engine_CAFFE;
#ifdef USE_CUDNN
    engine = ReLUParameter_Engine_CUDNN;
#endif
  }
  if (engine == ReLUParameter_Engine_CAFFE) {
    return shared_ptr<Layer<Dtype> >(new ReLULayer<Dtype>(param));
#ifdef USE_CUDNN
  } else if (engine == ReLUParameter_Engine_CUDNN) {
    return shared_ptr<Layer<Dtype> >(new CuDNNReLULayer<Dtype>(param));
#endif
  } else {
    LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
  }
}

// register "ReLU" via the REGISTER_LAYER_CREATOR macro
REGISTER_LAYER_CREATOR(ReLU, GetReLULayer);

// Get sigmoid layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetSigmoidLayer(const LayerParameter& param) {
  SigmoidParameter_Engine engine = param.sigmoid_param().engine();
  if (engine == SigmoidParameter_Engine_DEFAULT) {
    engine = SigmoidParameter_Engine_CAFFE;
#ifdef USE_CUDNN
    engine = SigmoidParameter_Engine_CUDNN;
#endif
  }
  if (engine == SigmoidParameter_Engine_CAFFE) {
    return shared_ptr<Layer<Dtype> >(new SigmoidLayer<Dtype>(param));
#ifdef USE_CUDNN
  } else if (engine == SigmoidParameter_Engine_CUDNN) {
    return shared_ptr<Layer<Dtype> >(new CuDNNSigmoidLayer<Dtype>(param));
#endif
  } else {
    LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
  }
}

// register "Sigmoid" via the REGISTER_LAYER_CREATOR macro
REGISTER_LAYER_CREATOR(Sigmoid, GetSigmoidLayer);

// Get softmax layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetSoftmaxLayer(const LayerParameter& param) {
  SoftmaxParameter_Engine engine = param.softmax_param().engine();
  if (engine == SoftmaxParameter_Engine_DEFAULT) {
    engine = SoftmaxParameter_Engine_CAFFE;
#ifdef USE_CUDNN
    engine = SoftmaxParameter_Engine_CUDNN;
#endif
  }
  if (engine == SoftmaxParameter_Engine_CAFFE) {
    return shared_ptr<Layer<Dtype> >(new SoftmaxLayer<Dtype>(param));
#ifdef USE_CUDNN
  } else if (engine == SoftmaxParameter_Engine_CUDNN) {
    return shared_ptr<Layer<Dtype> >(new CuDNNSoftmaxLayer<Dtype>(param));
#endif
  } else {
    LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
  }
}

// register "Softmax" via the REGISTER_LAYER_CREATOR macro
REGISTER_LAYER_CREATOR(Softmax, GetSoftmaxLayer);

// Get tanh layer according to engine.
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetTanHLayer(const LayerParameter& param) {
  TanHParameter_Engine engine = param.tanh_param().engine();
  if (engine == TanHParameter_Engine_DEFAULT) {
    engine = TanHParameter_Engine_CAFFE;
#ifdef USE_CUDNN
    engine = TanHParameter_Engine_CUDNN;
#endif
  }
  if (engine == TanHParameter_Engine_CAFFE) {
    return shared_ptr<Layer<Dtype> >(new TanHLayer<Dtype>(param));
#ifdef USE_CUDNN
  } else if (engine == TanHParameter_Engine_CUDNN) {
    return shared_ptr<Layer<Dtype> >(new CuDNNTanHLayer<Dtype>(param));
#endif
  } else {
    LOG(FATAL) << "Layer " << param.name() << " has unknown engine.";
  }
}

// register "TanH" via the REGISTER_LAYER_CREATOR macro
REGISTER_LAYER_CREATOR(TanH, GetTanHLayer);

#ifdef WITH_PYTHON_LAYER
template <typename Dtype>
shared_ptr<Layer<Dtype> > GetPythonLayer(const LayerParameter& param) {
  Py_Initialize();
  try {
    bp::object module = bp::import(param.python_param().module().c_str());
    bp::object layer = module.attr(param.python_param().layer().c_str())(param);
    return bp::extract<shared_ptr<PythonLayer<Dtype> > >(layer)();
  } catch (bp::error_already_set) {
    PyErr_Print();
    throw;
  }
}

// register "Python" via the REGISTER_LAYER_CREATOR macro
REGISTER_LAYER_CREATOR(Python, GetPythonLayer);
#endif

// Layers that use their constructor as their default creator should be
// registered in their corresponding cpp files. Do not register them here.

}  // namespace caffe
```
The main job of this file is to add the layer types used by nets to the map. Once you understand this mechanism, you can create your own layer as needed, register it in the map in the same way, and then use the custom layer in your own network definitions.