Blobs, Layers, and Nets: anatomy of a Caffe model


Blobs

As one of Caffe's four major modules, Blob handles CPU/GPU memory allocation, synchronization, and data persistence mapping. All data storage and communication inside Caffe go through Blobs: a Blob provides a unified storage interface and can hold training data, model parameters, and so on.

In fact, Blob delegates to the SyncedMemory class. SyncedMemory encapsulates CPU/GPU memory allocation, synchronization, and release, so it is SyncedMemory that performs the actual memory operations.

A Blob is a wrapper over the actual data being processed and passed along by Caffe, and also under the hood provides synchronization capability between the CPU and the GPU. Mathematically, a blob is an N-dimensional array stored in a C-contiguous fashion.

Caffe stores and communicates data using blobs. Blobs provide a unified memory interface holding data;
e.g., batches of images, model parameters, and derivatives for optimization.

A Blob is thus a wrapper around SyncedMemory holders, serving as the basic computational unit through which Layer, Net, and Solver interact.

As we are often interested in the values as well as the gradients of the blob, a Blob stores two chunks of memory, data and diff.
The former is the normal data that we pass along, and the latter is the gradient computed by the network.

In other words, a Blob mainly contains two storage objects, data_ and diff_: the former holds the data passed forward, the latter the gradients passed backward.

Blob is mainly used to store data and network parameters, and it also synchronizes data between the CPU and GPU. A Blob was originally represented in Caffe as a 4-D array (num x channel x height x width); it now uses a C++ vector to represent an N-dimensional array, whose maximum number of axes is fixed by the constant kMaxBlobAxes (blob.hpp currently sets const int kMaxBlobAxes = 32). The Blob class code lives mainly in blob.hpp and blob.cpp.
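
As a quick illustration, here is a minimal sketch of constructing Blobs through both interfaces (the shapes are arbitrary):

#include <vector>
#include "caffe/blob.hpp"

void construct_demo() {
  // Legacy 4-D constructor: num x channels x height x width.
  caffe::Blob<float> images(32, 3, 28, 28);
  // images.count() == 32 * 3 * 28 * 28 == 75264

  // The general N-D constructor takes a shape vector.
  std::vector<int> shape(2);
  shape[0] = 100;  // e.g., output dimension of a fully connected layer
  shape[1] = 64;   // e.g., input dimension
  caffe::Blob<float> weights(shape);
}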

Note:
Reshaping an input blob and immediately calling Net::Backward is an error; either Net::Forward or Net::Reshape needs to be called to propagate the new input shape to higher layers.
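
A hedged sketch of the correct order of operations when resizing a net's input at runtime (the net is assumed to be already initialized; resize_input is a name invented here):

#include "caffe/net.hpp"

void resize_input(caffe::Net<float>& net, int n, int c, int h, int w) {
  net.input_blobs()[0]->Reshape(n, c, h, w);  // new input geometry
  net.Reshape();  // propagate shapes to higher layers (Forward also does this)
  // Only now is it safe to call net.Forward() / net.Backward().
}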

Blob code walkthrough

Blob constructors

The default constructor performs only the most basic initialization; the two explicit constructors call the Reshape function, which allocates the SyncedMemory objects backing data_ and diff_.

// in blob.hpp
Blob() : data_(), diff_(), count_(0), capacity_(0) {}

/// @brief Deprecated; use <code>Blob(const vector<int>& shape)</code>.
explicit Blob(const int num, const int channels, const int height,
    const int width);
explicit Blob(const vector<int>& shape);

So storage allocation is actually carried out by the Reshape function.

// in blob.cpp
// Records the blob shape in shape_, computes the total size count_,
// and requests storage of a suitable size capacity_.
template <typename Dtype>
void Blob<Dtype>::Reshape(const vector<int>& shape) {
  CHECK_LE(shape.size(), kMaxBlobAxes);
  count_ = 1;
  shape_.resize(shape.size());
  for (int i = 0; i < shape.size(); ++i) {
    CHECK_GE(shape[i], 0);
    count_ *= shape[i];
    shape_[i] = shape[i];
  }
  if (count_ > capacity_) {
    capacity_ = count_;
    // Only the SyncedMemory objects are constructed here; no host or device
    // memory is actually allocated until the data is first accessed.
    data_.reset(new SyncedMemory(capacity_ * sizeof(Dtype)));
    diff_.reset(new SyncedMemory(capacity_ * sizeof(Dtype)));
  }
}

Next come Blob's data access functions.
Blob defines two access modes: const access, which is read-only and does not allow the data to be modified, and mutable access, which does (access to diff_ works the same way).

// in blob.hpp
const Dtype* cpu_data() const;
const Dtype* gpu_data() const;
Dtype* mutable_cpu_data();
Dtype* mutable_gpu_data();

Taking cpu_data() as an example, here is how a data access is carried out:

// in blob.cpp
template <typename Dtype>
const Dtype* Blob<Dtype>::cpu_data() const {
  CHECK(data_);
  // Delegates to SyncedMemory's data access function cpu_data().
  return (const Dtype*)data_->cpu_data();
}

// in syncedmem.cpp
const void* SyncedMemory::cpu_data() {
  to_cpu();  // Synchronize first; on first access this allocates the storage.
  return (const void*)cpu_ptr_;
}

As the code above shows, Blob does not concern itself with the details of accessing data_: it calls SyncedMemory's access function cpu_data(), which synchronizes the data and returns the data pointer.
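
A short, hedged sketch of the two access modes in use (the values are arbitrary, and access_demo is a name invented here):

#include "caffe/blob.hpp"

void access_demo() {
  caffe::Blob<float> blob(1, 1, 2, 2);

  // Mutable access: marks the CPU copy as the freshest one.
  float* out = blob.mutable_cpu_data();
  for (int i = 0; i < blob.count(); ++i) {
    out[i] = static_cast<float>(i);
  }

  // Const access: read-only; the data is synchronized to the CPU if needed.
  const float* in = blob.cpu_data();
  // Calling blob.gpu_data() at this point would trigger a CPU-to-GPU copy.
}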

Serialization

Blob data can be serialized with Protobuf; ToProto and FromProto carry out the corresponding serialization and deserialization.

Blobs store the network's intermediate results and its parameters, and that data eventually has to be written to disk or read back into memory, so let's finish by looking at how Blob's persistence functions read and write data on disk. Remember Google Protocol Buffers, mentioned earlier? Right: that serialization and persistence library is what does the work. The Caffe source tree contains a file caffe.proto, whose Blob-related messages are BlobShape, BlobProto, and BlobProtoVector; BlobShape corresponds to shape_, and BlobProto is the serialized form of a Blob.

Here is the Blob-related part of caffe.proto:

// Specifies the shape (dimensions) of a Blob.
message BlobShape {
  repeated int64 dim = 1 [packed = true];
}

message BlobProto {
  optional BlobShape shape = 7;
  repeated float data = 5 [packed = true];
  repeated float diff = 6 [packed = true];

  // 4D dimensions -- deprecated.  Use "shape" instead.
  optional int32 num = 1 [default = 0];
  optional int32 channels = 2 [default = 0];
  optional int32 height = 3 [default = 0];
  optional int32 width = 4 [default = 0];
}

// The BlobProtoVector is simply a way to pass multiple blobproto instances
// around.
message BlobProtoVector {
  repeated BlobProto blobs = 1;
}

Blob's serialization functions:

// in blob.hpp
void FromProto(const BlobProto& proto, bool reshape = true);
void ToProto(BlobProto* proto, bool write_diff = false) const;

  • ToProto copies the Blob's shape_, data_, and diff_ into the BlobProto's shape, data, and diff fields, completing serialization.
  • FromProto copies the BlobProto's shape, data, and diff back into the Blob's shape_, data_, and diff_, completing deserialization.

  • Finally, persistence itself is implemented with Protocol Buffers utilities (see io.hpp); data can be persisted in either text or binary format:

// in io.hpp
bool ReadProtoFromTextFile(const char* filename, Message* proto);
bool ReadProtoFromBinaryFile(const char* filename, Message* proto);
void WriteProtoToTextFile(const Message& proto, const char* filename);
void WriteProtoToBinaryFile(const Message& proto, const char* filename);
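
Putting these pieces together, a hedged sketch of round-tripping a Blob through disk (the file name and save_and_load are placeholders invented here):

#include "caffe/blob.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/util/io.hpp"

void save_and_load(const caffe::Blob<float>& src, caffe::Blob<float>* dst) {
  caffe::BlobProto proto;
  src.ToProto(&proto);  // serialize data only (write_diff defaults to false)
  caffe::WriteProtoToBinaryFile(proto, "blob.binaryproto");

  caffe::BlobProto loaded;
  caffe::ReadProtoFromBinaryFile("blob.binaryproto", &loaded);
  dst->FromProto(loaded);  // reshape defaults to true: restores shape and data
}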

Parameter update: Update()

Blob has one more important function, Update. It is called on the Blobs that store a network's parameters, and it carries out the parameter update step of gradient descent. The core operation is simply data(k+1) = data(k) − diff(k), i.e. subtracting the stored gradient from the stored data.

In the blob.cpp implementation, the most important function is Update. If the freshest copy of the data is on the GPU, Update calls caffe_gpu_axpy; otherwise it calls caffe_axpy. Here is the definition of Update:

// in blob.cpp
template <typename Dtype>
void Blob<Dtype>::Update() {
  // We will perform update based on where the data is located.
  switch (data_->head()) {
  case SyncedMemory::HEAD_AT_CPU:
    // Perform computation on CPU.
    // Parameter update: new data_ = old data_ - diff_,
    // via caffe_axpy: Y = alpha * X + Y, with alpha = -1.
    caffe_axpy<Dtype>(count_, Dtype(-1),
        static_cast<const Dtype*>(diff_->cpu_data()),
        static_cast<Dtype*>(data_->mutable_cpu_data()));
    break;
  case SyncedMemory::HEAD_AT_GPU:
  case SyncedMemory::SYNCED:
#ifndef CPU_ONLY
    // Perform computation on GPU.
    caffe_gpu_axpy<Dtype>(count_, Dtype(-1),
        static_cast<const Dtype*>(diff_->gpu_data()),
        static_cast<Dtype*>(data_->mutable_gpu_data()));
#else
    NO_GPU;
#endif
    break;
  default:
    LOG(FATAL) << "Syncedmem not initialized.";
  }
}
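
As a quick, hedged illustration of the semantics (the numbers are arbitrary, and the diff is assumed to already contain the scaled gradient):

#include <vector>
#include "caffe/blob.hpp"

void update_demo() {
  caffe::Blob<float> param(std::vector<int>(1, 3));  // a 1-D blob of size 3
  float* w = param.mutable_cpu_data();
  float* g = param.mutable_cpu_diff();
  for (int i = 0; i < param.count(); ++i) {
    w[i] = 1.0f;   // current parameter value
    g[i] = 0.1f;   // gradient, assumed already scaled by the learning rate
  }
  param.Update();  // afterwards every w[i] == 1.0f - 0.1f == 0.9f
}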

Blob summary

Through SyncedMemory and Blob, Caffe encapsulates its underlying data and gives the other components of the framework their most basic data abstraction: Layer parameters, Net parameters, and Solver state are all Blob data. Understanding how Blob abstracts and manages data therefore helps with the rest of the Caffe source, and it is the natural first step in reading it.

Note: much of this part draws on the blog of Bin Wang.


Layer

The layer is the essence of a model and the fundamental unit of computation. Layers convolve filters, pool, take inner products, apply nonlinearities like rectified-linear and sigmoid and other elementwise transformations, normalize, load data, and compute losses like softmax and hinge. See the layer catalogue for all operations. Most of the types needed for state-of-the-art deep learning tasks are there.

Each layer type defines three critical computations: setup, forward, and backward.

  • Setup: initialize the layer and its connections once at model initialization.
  • Forward: given input from bottom, compute the output and send it to the top.
  • Backward: given the gradient w.r.t. the top output, compute the gradient w.r.t. the input and send it to the bottom. A layer with parameters computes the gradient w.r.t. its parameters and stores it internally.

Layers have two key responsibilities for the operation of the network as a whole:

  • a forward pass that takes the inputs and produces the outputs;
  • a backward pass that takes the gradient with respect to the output, and computes the gradients with respect to the parameters and to the inputs, which are in turn back-propagated to earlier layers.

These passes are simply the composition of each layer's forward and backward.
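
In code, both passes surface through the Net interface; a minimal, hedged sketch (the net is assumed to be initialized elsewhere, and one_pass is a name invented here):

#include "caffe/net.hpp"

void one_pass(caffe::Net<float>& net) {
  float loss = 0;
  net.Forward(&loss);  // compose every layer's forward; fills the output blobs
  net.Backward();      // compose every layer's backward; fills the diffs
  // A solver would now apply the update to the learnable parameter blobs.
}
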
Layer registration

Caffe's layer registry is just a set of key-value pairs, where the key is a layer's type name and the value is the corresponding factory function. The following two macros drive layer registration:

#define REGISTER_LAYER_CREATOR(type, creator)                                  \
  static LayerRegisterer<float> g_creator_f_##type(#type, creator<float>);     \
  static LayerRegisterer<double> g_creator_d_##type(#type, creator<double>)    \

#define REGISTER_LAYER_CLASS(type)                                             \
  template <typename Dtype>                                                    \
  shared_ptr<Layer<Dtype> > Creator_##type##Layer(const LayerParameter& param) \
  {                                                                            \
    return shared_ptr<Layer<Dtype> >(new type##Layer<Dtype>(param));           \
  }                                                                            \
  REGISTER_LAYER_CREATOR(type, Creator_##type##Layer)

The REGISTER_LAYER_CLASS macro registers a given layer type in the global registry: it first defines a factory function that produces the layer object, then calls REGISTER_LAYER_CREATOR to register that factory under the layer's type name. Only the float and double instantiations are registered, since these are the types actually used for network data. The two static variables, one for float and one for double, do the work: their initialization, that is, their constructors, is what actually performs the registration.
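
For instance, a hedged sketch of registering a hypothetical custom layer (MyLayer is invented here for illustration):

#include <boost/shared_ptr.hpp>
#include "caffe/layer.hpp"
#include "caffe/layer_factory.hpp"

// Assumes MyLayer<Dtype> derives from Layer<Dtype> and is defined elsewhere;
// the string "My" becomes the layer's type name in prototxt files.
REGISTER_LAYER_CLASS(My);

// The net builder can then instantiate it through the registry:
void create_demo() {
  caffe::LayerParameter param;
  param.set_type("My");
  boost::shared_ptr<caffe::Layer<float> > layer =
      caffe::LayerRegistry<float>::CreateLayer(param);
}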

Normal layers

1. Vision Layers
   a. Convolution
   b. Pooling
   c. Local Response Normalization (LRN)
   d. im2col
2. Loss Layers
   a. Softmax
   b. Sum-of-Squares / Euclidean
   c. Hinge / Margin
   d. Sigmoid Cross-Entropy
   e. Infogain
   f. Accuracy and Top-k
3. Activation / Neuron Layers
   a. ReLU / Rectified-Linear and Leaky-ReLU
   b. Sigmoid
   c. TanH / Hyperbolic Tangent
   d. Absolute Value
   e. Power
   f. BNLL
4. Data Layers
   a. Database
   b. In-Memory
   c. HDF5 Input
   d. HDF5 Output
   e. Images
   f. Windows
   g. Dummy
5. Common Layers
   a. Inner Product
   b. Splitting
   c. Flattening
   d. Reshape
   e. Concatenation
   f. Slicing
   g. Elementwise Operations
   h. Argmax
   i. Softmax
   j. Mean-Variance Normalization

Net

The net jointly defines a function and its gradient by composition and auto-differentiation. The composition of every layer's output computes the function to do a given task, and the composition of every layer's backward computes the gradient from the loss to learn the task. Caffe models are end-to-end machine learning engines.

The net is a set of layers connected in a computation graph, a directed acyclic graph (DAG) to be exact. Caffe does all the bookkeeping for any DAG of layers to ensure correctness of the forward and backward passes. A typical net begins with a data layer that loads from disk and ends with a loss layer that computes the objective for a task such as classification or reconstruction.

The net is defined as a set of layers and their connections in a plaintext modeling language; a simple logistic regression classifier, for example, needs only a data layer, an inner product layer, and a loss layer, as sketched below.
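
A hedged prototxt sketch of such a net, along the lines of the logistic regression example in the Caffe tutorial (the database source and sizes here are illustrative):

name: "LogReg"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  data_param {
    source: "input_leveldb"
    batch_size: 64
  }
}
layer {
  name: "ip"
  type: "InnerProduct"
  bottom: "data"
  top: "ip"
  inner_product_param {
    num_output: 2
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip"
  bottom: "label"
  top: "loss"
}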

Model initialization is handled by Net::Init(). The initialization mainly does two things:

  • scaffolding the overall DAG by creating the blobs and layers (for C++ geeks: the network will retain ownership of the blobs and layers during its lifetime),
  • and calling the layers' SetUp() functions.

It also does a set of other bookkeeping, such as validating the correctness of the overall network architecture, and during initialization the Net explains its setup by logging to INFO as it goes.

Model format

The models are defined in plaintext protocol buffer schema (prototxt) while the learned models are serialized as binary protocol buffer (binaryproto) .caffemodel files.

The model format is defined by the protobuf schema in caffe.proto. The source file is mostly self-explanatory so one is encouraged to check it out.

Caffe speaks Google Protocol Buffer for the following strengths: minimal-size binary strings when serialized, efficient serialization, a human-readable text format compatible with the binary version, and efficient interface implementations in multiple languages, most notably C++ and Python. This all contributes to the flexibility and extensibility of modeling in Caffe.
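
For completeness, a hedged sketch of how the two formats are typically consumed together (the file names are placeholders, and load_demo is a name invented here):

#include "caffe/net.hpp"

void load_demo() {
  // Build the architecture from the plaintext prototxt, then fill the
  // learnable blobs from the binary .caffemodel file.
  caffe::Net<float> net("deploy.prototxt", caffe::TEST);
  net.CopyTrainedLayersFrom("weights.caffemodel");
}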

Article references:

  • http://spiritsaway.info/code/under-the-hood-caffe.html#a7d6d5
  • http://imbinwang.github.io/blog/inside-caffe-code-blob
  • http://lepsucd.com/?p=453
