Caffe Notes (2) --- A Brief Introduction to the Data Structures


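Note: this post is fairly long, so I suggest glancing at the table of contents in the upper-left corner first to get a rough idea of its structure.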

1. Blob

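From the code's point of view, Blob is a template class. It wraps the data used at run time (it stores, exchanges, and processes the data and derivatives of the forward and backward passes) and can synchronize that data between the CPU and the GPU.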

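For image processing, a Blob is a 4D array (N, C, H, W), where N is the number of images, C the number of channels, H the image height, and W the image width. Blobs can also hold non-image data. A traditional multilayer perceptron, for instance, is a fairly simple fully connected network: 2D Blobs passed through InnerProduct layers are all it needs.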
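To make the (N, C, H, W) layout concrete, here is a minimal Python sketch of how a 4D index maps to a position in flat memory. It assumes the row-major ordering described in Caffe's own tutorial, where W varies fastest; the function name `blob_offset` and the example shape are mine, not Caffe's API.

```python
def blob_offset(shape, n, c, h, w):
    """Flat index of element (n, c, h, w) in a row-major 4D blob."""
    N, C, H, W = shape
    assert 0 <= n < N and 0 <= c < C and 0 <= h < H and 0 <= w < W
    return ((n * C + c) * H + h) * W + w


# Hypothetical batch: 2 images, 3 channels, 4 rows, 5 columns.
shape = (2, 3, 4, 5)
print(blob_offset(shape, 0, 0, 0, 1))  # neighbouring pixels in a row are adjacent
print(blob_offset(shape, 0, 1, 0, 0))  # stepping one channel skips H*W elements
print(blob_offset(shape, 1, 2, 3, 4))  # the last element, N*C*H*W - 1
```

A 2D Blob for a fully connected network follows the same rule with the H and W dimensions dropped.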

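The parameters of a model are also stored and manipulated as Blobs, whose dimensions depend on the kind of parameter. For example, a convolution layer that takes a 3-channel image and has 96 kernels of size 11*11 has a weight Blob of 96*3*11*11. A fully connected layer with 1024 input channels and 1000 outputs has a weight Blob of 1000*1024.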
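The two shapes in that paragraph can be checked with a few lines of Python. This is only arithmetic on the numbers given in the text; the helper names are hypothetical and bias Blobs are left out.

```python
from math import prod


def conv_weight_shape(num_output, in_channels, kernel_h, kernel_w):
    # One kernel per output channel, each spanning all input channels.
    return (num_output, in_channels, kernel_h, kernel_w)


def fc_weight_shape(num_output, num_input):
    # One row of weights per output value.
    return (num_output, num_input)


conv = conv_weight_shape(96, 3, 11, 11)
fc = fc_weight_shape(1000, 1024)
print(conv, prod(conv))  # shape and number of weights of the convolution layer
print(fc, prod(fc))      # shape and number of weights of the fully connected layer
```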

The Blob description in caffe.proto

// Specifies the shape (dimensions) of a Blob.
// Describes the shape of a Blob.
message BlobShape {
  // Just a list of int64 values, one per Blob dimension; packed means the values are stored contiguously.
  repeated int64 dim = 1 [packed = true];
}

// The form a Blob takes once it has been serialized to disk.
message BlobProto {
  optional BlobShape shape = 7;                     // optional; holds one BlobShape
  repeated float data = 5 [packed = true];          // repeated floats holding the data or weights; the element count is determined by shape or by (num, channels, height, width)
  repeated float diff = 6 [packed = true];          // repeated floats holding the gradient (diff); same dimensions as data
  repeated double double_data = 8 [packed = true];  // counterpart of data, of type double
  repeated double double_diff = 9 [packed = true];  // counterpart of diff, of type double

  // 4D dimensions -- deprecated.  Use "shape" instead.
  // Optional legacy dimensions; newer versions of Caffe recommend shape instead.
  optional int32 num = 1 [default = 0];
  optional int32 channels = 2 [default = 0];
  optional int32 height = 3 [default = 0];
  optional int32 width = 4 [default = 0];
}

// The BlobProtoVector is simply a way to pass multiple blobproto instances around.
// Holds several BlobProto instances so they are easy to reference.
message BlobProtoVector {
  repeated BlobProto blobs = 1;
}
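The comment on data says the element count is determined either by shape or by the legacy (num, channels, height, width) fields. The sketch below illustrates that relationship with a plain Python stand-in for the message. It is not the generated protobuf class and not Caffe's loader; in particular, preferring shape whenever it is set is my simplification.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List, Optional


@dataclass
class BlobProtoSketch:
    """Hypothetical stand-in for BlobProto, holding only what the example needs."""
    shape: Optional[List[int]] = None          # plays the role of BlobShape.dim
    data: List[float] = field(default_factory=list)
    num: int = 0                               # legacy 4D dimensions
    channels: int = 0
    height: int = 0
    width: int = 0


def blob_dims(proto):
    # Assumption: use shape when present, otherwise the legacy 4D fields.
    if proto.shape is not None:
        return list(proto.shape)
    return [proto.num, proto.channels, proto.height, proto.width]


def check_blob(proto):
    dims = blob_dims(proto)
    if len(proto.data) != prod(dims):
        raise ValueError("data holds %d values, shape %s needs %d"
                         % (len(proto.data), dims, prod(dims)))
    return dims


new_style = BlobProtoSketch(shape=[2, 3], data=[0.0] * 6)
old_style = BlobProtoSketch(num=1, channels=1, height=2, width=2, data=[0.0] * 4)
print(check_blob(new_style), check_blob(old_style))
```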

2. Layer

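Layers are the essence of a Caffe model and its basic unit of computation. A layer normally takes one or more input Blobs (bottom Blobs) and produces one or more output Blobs (top Blobs); data layers are the exception, with tops only. Some layers also carry weights and biases.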

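Every layer type defines three key computations: setup, forward, and backward.

●setup: creates and initializes the layer and sets up its connections within the overall model.

●forward: takes input data from bottom, computes, and sends the result to top as output.

●backward: takes the gradient of the data from the layer's output end (top), computes this layer's gradients, and sends the result to bottom, that is, back toward the earlier layers.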
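The three computations can be sketched with a toy layer in Python. The class below is hypothetical (it is not Caffe's C++ Layer interface) and computes top = weight * bottom element-wise, which is enough to show what each step reads and writes.

```python
class ScaleLayerSketch:
    """Hypothetical layer computing top = weight * bottom, element-wise."""

    def setup(self, bottom):
        # One-time initialization: choose the weight and reset its gradient.
        self.weight = 2.0
        self.weight_diff = 0.0

    def forward(self, bottom):
        # Read the input from bottom, compute, and hand the result to top.
        self.bottom = list(bottom)
        return [self.weight * x for x in self.bottom]

    def backward(self, top_diff):
        # Given the gradient w.r.t. top, compute the parameter gradient
        # and return the gradient w.r.t. bottom for the previous layer.
        self.weight_diff = sum(g * x for g, x in zip(top_diff, self.bottom))
        return [self.weight * g for g in top_diff]


layer = ScaleLayerSketch()
layer.setup([1.0, 2.0, 3.0])
top = layer.forward([1.0, 2.0, 3.0])
bottom_diff = layer.backward([1.0, 1.0, 1.0])
print(top, bottom_diff, layer.weight_diff)
```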

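Ⅰ. The Layer description in caffe.proto

This is the foundation of every Layer. If it does not make sense on first reading, go through the Layer examples later in this post and then come back to it.

Comments marked ## are my own notes.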

// NOTE
// Update the next available ID when you add a new LayerParameter field.
//
// LayerParameter next available layer-specific ID: 151 (last added: box_annotator_ohem_param)
// ## The most recently added field is box_annotator_ohem_param, the one numbered 150 below.
// ## Layer parameters {name, type, bottoms, tops, phase, loss weights, global multipliers, ...}
message LayerParameter {
  optional string name = 1;   // the layer name
  optional string type = 2;   // the layer type
  repeated string bottom = 3; // the name of each bottom blob  ## the input Blobs
  repeated string top = 4;    // the name of each top blob     ## the output Blobs

  // The train / test phase for computation.
  optional Phase phase = 10;

  // The amount of weight to assign each top blob in the objective.
  // Each layer assigns a default value, usually of either 0 or 1,
  // to each top blob.
  // ## Weight of each top Blob in the loss function; the per-layer default is 0 or 1.
  repeated float loss_weight = 5;

  // Specifies training parameters (multipliers on global learning constants,
  // and the name and other settings used for weight sharing).
  // ## For example lr_mult, which is multiplied by base_lr from the solver.
  repeated ParamSpec param = 6;

  // The blobs containing the numeric parameters of the layer.
  repeated BlobProto blobs = 7;

  // Specifies whether to backpropagate to each bottom. If unspecified,
  // Caffe will automatically infer whether each input needs backpropagation
  // to compute parameter gradients. If set to true for some inputs,
  // backpropagation to those inputs is forced; if set false for some inputs,
  // backpropagation to those inputs is skipped.
  //
  // The size must be either 0 or equal to the number of bottoms.
  // ## The number of entries must be 0 or equal to the number of bottom Blobs.
  repeated bool propagate_down = 11;

  // Rules controlling whether and when a layer is included in the network,
  // based on the current NetState.  You may specify a non-zero number of rules
  // to include OR exclude, but not both.  If no include or exclude rules are
  // specified, the layer is always included.  If the current NetState meets
  // ANY (i.e., one or more) of the specified rules, the layer is
  // included/excluded.
  // ## You may give one or more rules (see message NetStateRule in the proto)
  // ## for include or for exclude, but not for both at once.
  repeated NetStateRule include = 8;
  repeated NetStateRule exclude = 9;

  // Parameters for data pre-processing.
  optional TransformationParameter transform_param = 100;

  // Parameters shared by loss layers.
  optional LossParameter loss_param = 101;

  // Layer type-specific parameters.
  //
  // Note: certain layers may have more than one computational engine
  // for their implementation. These layers include an Engine type and
  // engine parameter for selecting the implementation.
  // The default for the engine is set by the ENGINE switch at compile-time.
  optional AccuracyParameter accuracy_param = 102;
  optional ArgMaxParameter argmax_param = 103;
  optional BatchNormParameter batch_norm_param = 139;
  optional BoxAnnotatorOHEMParameter box_annotator_ohem_param = 150;
  optional BiasParameter bias_param = 141;
  optional ConcatParameter concat_param = 104;
  optional ContrastiveLossParameter contrastive_loss_param = 105;
  optional ConvolutionParameter convolution_param = 106;
  optional CropParameter crop_param = 144;
  optional DataParameter data_param = 107;
  optional DropoutParameter dropout_param = 108;
  optional DummyDataParameter dummy_data_param = 109;
  optional EltwiseParameter eltwise_param = 110;
  optional ELUParameter elu_param = 140;
  optional EmbedParameter embed_param = 137;
  optional ExpParameter exp_param = 111;
  optional FlattenParameter flatten_param = 135;
  optional HDF5DataParameter hdf5_data_param = 112;
  optional HDF5OutputParameter hdf5_output_param = 113;
  optional HingeLossParameter hinge_loss_param = 114;
  optional ImageDataParameter image_data_param = 115;
  optional InfogainLossParameter infogain_loss_param = 116;
  optional InnerProductParameter inner_product_param = 117;
  optional InputParameter input_param = 143;
  optional LogParameter log_param = 134;
  optional LRNParameter lrn_param = 118;
  optional MemoryDataParameter memory_data_param = 119;
  optional MVNParameter mvn_param = 120;
  optional ParameterParameter parameter_param = 145;
  optional PoolingParameter pooling_param = 121;
  optional PowerParameter power_param = 122;
  optional PReLUParameter prelu_param = 131;
  optional PSROIPoolingParameter psroi_pooling_param = 149;
  optional PythonParameter python_param = 130;
  optional RecurrentParameter recurrent_param = 146;
  optional ReductionParameter reduction_param = 136;
  optional ReLUParameter relu_param = 123;
  optional ReshapeParameter reshape_param = 133;
  optional ROIPoolingParameter roi_pooling_param = 147;
  optional ScaleParameter scale_param = 142;
  optional SigmoidParameter sigmoid_param = 124;
  optional SmoothL1LossParameter smooth_l1_loss_param = 148;
  optional SoftmaxParameter softmax_param = 125;
  optional SPPParameter spp_param = 132;
  optional SliceParameter slice_param = 126;
  optional TanHParameter tanh_param = 127;
  optional ThresholdParameter threshold_param = 128;
  optional TileParameter tile_param = 138;
  optional WindowDataParameter window_data_param = 129;
  optional MILDataParameter mil_data_param = 0x004d4944; //"MID"
  optional MILParameter mil_param = 0x004d494c; //"MIL"
}
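The include/exclude comment in this message is easy to misread, so here is a small Python sketch of the decision it describes. It is a simplification under stated assumptions: rules are plain dicts that test only the phase (real NetStateRule messages can also constrain level and stage), and the function names are mine.

```python
def meets_rule(state_phase, rule):
    # Assumption: a rule that names no phase matches every phase.
    return rule.get("phase") in (None, state_phase)


def layer_is_included(state_phase, include=(), exclude=()):
    if include and exclude:
        raise ValueError("specify include rules or exclude rules, not both")
    if include:
        # Included when ANY include rule is met.
        return any(meets_rule(state_phase, rule) for rule in include)
    if exclude:
        # Excluded when ANY exclude rule is met.
        return not any(meets_rule(state_phase, rule) for rule in exclude)
    return True  # no rules at all: the layer is always included


print(layer_is_included("TRAIN", include=[{"phase": "TRAIN"}]))
print(layer_is_included("TEST", include=[{"phase": "TRAIN"}]))
print(layer_is_included("TEST", exclude=[{"phase": "TEST"}]))
print(layer_is_included("TEST"))
```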

Ⅱ. Examples of some commonly used layers:

①Data Layers

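Data layers sit at the very bottom of every model and are its entry point. They supply the input data and can also convert Blobs into other formats for saving as output. Data preprocessing (mean subtraction, scaling, cropping, mirroring, and so on) is usually configured in this layer as well.

Data can come from efficient databases (LevelDB or LMDB), directly from memory, or from HDF5 files and image files on disk.

A. Data from a database

Example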

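layer {
  type: "Data"    # layer type; "Data" means the data comes from LevelDB or LMDB
  top: "data"     # top names the Blob the data is written to
  top: "label"    # a Data layer has tops only and no bottom
  include {
    phase: TRAIN  # this layer is active in the TRAIN phase; others may be active in TEST only
  }
  transform_param {     # image transformation parameters
    scale: 0.00390625   # i.e. 1/256, which maps input values from 0-255 into the range 0-1
    mean_file: "examples/cifar10/mean.binaryproto"  # file holding the mean used for mean subtraction
    mirror: 1           # 1 enables mirroring, 0 disables it; true and false also work
    crop_size: 227      # crop a 227*227 patch: random crop in training, center crop in testing
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"  # path to the database, required
    batch_size: 64      # number of inputs fed to the network at a time, required
    backend: LMDB       # LevelDB or LMDB; optional, defaults to LevelDB
  }
}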
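What transform_param does to the pixels can be sketched in a few lines of Python. This is a simplified illustration on nested lists, assuming the order crop, then mirror, then (value - mean) * scale; the helper names are hypothetical and it handles a single channel only.

```python
def center_crop(image, crop_size):
    # Test-phase behaviour: take the patch from the middle of the image.
    h, w = len(image), len(image[0])
    top = (h - crop_size) // 2
    left = (w - crop_size) // 2
    return [row[left:left + crop_size] for row in image[top:top + crop_size]]


def mirror(image):
    # Horizontal flip.
    return [row[::-1] for row in image]


def transform_pixel(value, mean=0.0, scale=1.0):
    # Subtract the mean first, then scale.
    return (value - mean) * scale


image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
patch = mirror(center_crop(image, 2))
print(patch)
print(transform_pixel(255, scale=0.00390625))  # 255/256, just below 1
```

In the training phase the crop position would be drawn at random instead of centered.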

B. Data from memory

Example

layer {
  top: "data"
  top: "label"
  name: "memory_data"
  type: "MemoryData"
  memory_data_param {  # parameters specific to this layer; each read fetches a block of batch_size items
    # all four parameters are required
    batch_size: 2
    height: 100
    width: 100
    channels: 1
  }
  transform_param {
    scale: 0.0078125
    mean_file: "mean.proto"
    mirror: false
  }
}

C. Data from HDF5

Example

layer {
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {  # both parameters here are required as well
    source: "examples/hdf5_classification/data/train.txt"
    batch_size: 10
  }
}

D. Data from images

layer {
  type: "ImageData"
  top: "data"
  top: "label"
  transform_param {
    mirror: false
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  image_data_param {
    # path of a text file in which each line holds an image path and its label, required
    source: "examples/_temp/file_list.txt"
    batch_size: 50   # number of images processed at a time, required
    new_height: 256  # if set, input images are resized to this height, optional
    new_width: 256   # if set, input images are resized to this width, optional
  }
}

E. Data from windows (WindowData)

layer {
  type: "WindowData"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  window_data_param {
    source: "examples/finetune_pascal_detection/window_file_2007_trainval.txt"  # required
    batch_size: 128  # required
    fg_threshold: 0.5
    bg_threshold: 0.5
    fg_fraction: 0.25
    context_pad: 16
    crop_mode: "warp"
  }
}

②Convolution Layers

The convolution layer is the core layer of a convolutional neural network (CNN).

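layer {
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1     # learning-rate multiplier for the weights w; the effective rate is this value times base_lr in solver.prototxt
    decay_mult: 1  # weight-decay multiplier (optional)
  }
  param {
    lr_mult: 2     # learning-rate multiplier for the bias b, usually set to twice that of the weights
    decay_mult: 0  # weight-decay multiplier (optional)
  }
  convolution_param {
    num_output: 20   # number of kernels, required
    pad: 2           # padding, defaults to 0
    kernel_size: 5   # height and width of the kernel, required
    stride: 1        # stride
    group: 1         # number of groups for grouped convolution, defaults to 1
    weight_filler {  # weight initialization
      type: "xavier" # initialize with the "xavier" algorithm; "gaussian" is another option
    }
    bias_filler {    # bias initialization
      type: "constant"  # initialize the bias to a constant, 0 by default
      value: 0
    }
  }
}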
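The interplay of pad, kernel_size, and stride is easiest to see through the output size they produce. The Python sketch below uses the usual convolution arithmetic with the result rounded down, which is my understanding of how Caffe sizes convolution outputs; the function names and the 28*28 input are assumptions for illustration.

```python
def conv_output_size(in_size, kernel_size, pad=0, stride=1):
    # Positions the kernel can take along one spatial dimension (rounded down).
    return (in_size + 2 * pad - kernel_size) // stride + 1


def conv_output_shape(bottom_shape, num_output, kernel_size, pad=0, stride=1):
    n, _, h, w = bottom_shape
    return (n,
            num_output,
            conv_output_size(h, kernel_size, pad, stride),
            conv_output_size(w, kernel_size, pad, stride))


# The settings from the example: 20 kernels of size 5 with pad 2 and stride 1.
# The batch of 64 single-channel 28*28 inputs is hypothetical.
print(conv_output_shape((64, 1, 28, 28), num_output=20, kernel_size=5, pad=2, stride=1))
# Without padding the spatial size shrinks by kernel_size - 1.
print(conv_output_shape((64, 1, 28, 28), num_output=20, kernel_size=5))
```

With kernel_size 5, a pad of 2 is exactly what keeps the spatial size unchanged at stride 1.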

③ReLU Layer

layer {
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"  # a ReLU layer usually reads from and writes to the same Blob
}

④Pooling Layers

The pooling layer.

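layer {
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX       # pooling method: MAX (max pooling), AVE (average pooling), or STOCHASTIC (stochastic pooling)
    kernel_size: 3  # height and width of the pooling window, required
    stride: 2       # stride, defaults to 1
  }
}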
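A pooling layer's output size follows from kernel_size and stride much like a convolution's, with one difference worth knowing: to my understanding Caffe rounds up here rather than down. The sketch below encodes that assumption, including the adjustment that keeps the last window from starting inside the padding; treat it as an illustration, not as the library's code.

```python
from math import ceil


def pool_output_size(in_size, kernel_size, pad=0, stride=1):
    # Rounded up, so a partial window at the border still yields an output.
    out = ceil((in_size + 2 * pad - kernel_size) / stride) + 1
    # With padding, drop a last window that would start beyond the image.
    if pad > 0 and (out - 1) * stride >= in_size + pad:
        out -= 1
    return out


# kernel_size 3 and stride 2 as in the example, on hypothetical input sizes.
print(pool_output_size(55, kernel_size=3, stride=2))
print(pool_output_size(32, kernel_size=3, stride=2))  # rounding down would give one less
```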

⑤InnerProduct Layers

The fully connected layer treats its input as a vector and outputs a plain vector as well (the width and height of the input Blob both become 1).

layer {
  type: "InnerProduct"
  bottom: "pool1"
  top: "ip1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 500  # number of output nodes (filters) of the fully connected layer, required
    weight_filler {
      type: "gaussian"  # weight initialization scheme; the default is "constant" (all zeros), and "xavier" or "gaussian" can be used instead
      std: 0.01
    }
    bias_filler {
      type: "constant"
    }
  }
}

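A fully connected layer is really a convolution layer whose kernel is the same size as its input, so its parameters are essentially the same as a convolution layer's.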
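A tiny Python sketch may help here: flatten the C*H*W input of one sample into a vector, then take one dot product per output. The function names and the numbers are hypothetical, and a real layer would process a whole batch at once.

```python
def flatten(blob_chw):
    # Turn one sample of shape (C, H, W) into a vector of length C*H*W.
    return [value for channel in blob_chw for row in channel for value in row]


def inner_product(bottom, weights, bias):
    # weights has num_output rows, each as long as the flattened input.
    return [sum(w * x for w, x in zip(row, bottom)) + b
            for row, b in zip(weights, bias)]


sample = [[[1.0, 2.0],
           [3.0, 4.0]]]              # C=1, H=2, W=2
weights = [[1.0, 0.0, 0.0, 0.0],     # num_output = 2, so the shape is 2 * 4
           [1.0, 1.0, 1.0, 1.0]]
bias = [0.5, 0.0]
print(inner_product(flatten(sample), weights, bias))
```

Each weight row covers the entire input, which is why the layer can be read as a convolution whose kernel is as large as the input.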

⑥Dropout Layers

This is dropout, the method used in AlexNet to prevent overfitting.

layer {
  type: "Dropout"
  bottom: "fc1"
  top: "fc1"
  dropout_param {
    dropout_ratio: 0.5  # optional, defaults to 0.5
  }
}
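For readers who have not met dropout before, here is a minimal Python sketch of the forward pass. It assumes the common "inverted" formulation, in which surviving values are scaled by 1 / (1 - dropout_ratio) during training so that nothing needs to change at test time; the function name and the `rng` parameter are mine.

```python
import random


def dropout_forward(bottom, dropout_ratio=0.5, train=True, rng=random):
    if not train:
        # At test time the input passes through unchanged.
        return list(bottom)
    scale = 1.0 / (1.0 - dropout_ratio)
    # Each value is dropped with probability dropout_ratio, otherwise rescaled.
    return [x * scale if rng.random() >= dropout_ratio else 0.0 for x in bottom]


values = [1.0, 2.0, 3.0, 4.0]
print(dropout_forward(values, dropout_ratio=0.5, rng=random.Random(0)))
print(dropout_forward(values, train=False))
```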

⑦SoftmaxWithLoss Layers

layer {
  type: "SoftmaxWithLoss"
  bottom: "fc1"
  bottom: "label"
  top: "loss"
}

⑧Softmax Layers

layer {
  type: "Softmax"
  bottom: "fc1"
  top: "prob"  # short for probability
}

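# Note the difference between SoftmaxWithLoss and Softmax:

·SoftmaxWithLoss computes the loss value.

·Softmax only computes the probability of each class.

If all you want is the probability of each class, a Softmax layer is enough and there is no need for SoftmaxWithLoss.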
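The difference is easy to see in code. The sketch below handles a single sample in plain Python: `softmax` returns the class probabilities, while `softmax_with_loss` reduces them to one number, the negative log of the probability assigned to the true label. The function names are mine, and a real loss layer would also average over the batch.

```python
from math import exp, log


def softmax(scores):
    # Subtract the maximum first for numerical stability.
    m = max(scores)
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_with_loss(scores, label):
    # Multinomial logistic loss on the softmax probabilities.
    return -log(softmax(scores)[label])


scores = [2.0, 1.0, 0.1]  # hypothetical outputs of the last fully connected layer
print(softmax(scores))               # one probability per class
print(softmax_with_loss(scores, 0))  # a single loss value for true label 0
```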

⑨Accuracy Layers

Outputs the classification (prediction) accuracy. It exists only in the test phase, so an include block is needed.

layer {
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST  # required
  }
}
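What this layer reports can be written out as a short Python sketch: the fraction of samples whose highest-scoring class equals the label. This covers top-1 accuracy only, and the function name and the sample scores are hypothetical.

```python
def accuracy(scores, labels):
    correct = 0
    for row, label in zip(scores, labels):
        # The predicted class is the index of the largest score.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


scores = [[0.1, 0.9],
          [0.8, 0.2],
          [0.3, 0.7]]
labels = [1, 0, 0]
print(accuracy(scores, labels))  # two of the three predictions match
```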

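There are many other layers, but the ones above are the most commonly used, and the rest are well covered elsewhere online.

Now look back at the Layer description from caffe.proto shown earlier: writing a layer comes down to filling in parameters within that framework.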

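3. Net

The Net class instantiates the required layers and intermediate Blobs according to the network definition file and assembles all layers into a directed acyclic graph.

Net also provides the interfaces for running forward and backward passes over the whole network.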
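How layers become a graph can be sketched in Python: the layers are connected purely through Blob names, so each bottom must name a Blob that an earlier layer produced as a top. The checker below is a hypothetical illustration, not Caffe's Net class, and it assumes layers are listed in execution order.

```python
def check_net(layers):
    """Return the layer names in execution order, or raise on a dangling bottom."""
    available = set()   # names of the Blobs produced so far
    order = []
    for layer in layers:
        for bottom in layer.get("bottom", []):
            if bottom not in available:
                raise ValueError("layer %r needs unknown blob %r"
                                 % (layer["name"], bottom))
        available.update(layer.get("top", []))
        order.append(layer["name"])
    return order


# A hypothetical four-layer net described with plain dicts.
net = [
    {"name": "mnist", "top": ["data", "label"]},
    {"name": "conv1", "bottom": ["data"], "top": ["conv1"]},
    {"name": "relu1", "bottom": ["conv1"], "top": ["conv1"]},  # in-place
    {"name": "loss", "bottom": ["conv1", "label"], "top": ["loss"]},
]
print(check_net(net))
```

A forward pass visits the layers in this order, and a backward pass visits them in reverse.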

Ⅰ. The Net description in caffe.proto:

Every Net is built according to this description.

message NetParameter {
  optional string name = 1; // consider giving the network a name

  // DEPRECATED. See InputParameter. The input blobs to the network.
  // # Formerly named the Blobs fed into the network.
  repeated string input = 3;

  // DEPRECATED. See InputParameter. The shape of the input blobs.
  // # Formerly gave the dimensions of the input Blobs.
  repeated BlobShape input_shape = 8;

  // 4D input dimensions -- deprecated.  Use "input_shape" instead.
  // If specified, for each input blob there should be four values specifying
  // the num, channels, height and width of the input blob.
  // Thus, there should be a total of (4 * #input) numbers.
  // # In the old style each input Blob needs four values (Num x Channel x H x W),
  // # so input_dim is repeated four times per input.
  repeated int32 input_dim = 4;

  // Whether the network will force every layer to carry out backward operation.
  // If set False, then whether to carry out backward is determined
  // automatically according to the net structure and learning rates.
  optional bool force_backward = 5 [default = false];

  // The current "state" of the network, including the phase, level, and stage.
  // Some layers may be included/excluded depending on this state and the states
  // specified in the layers' include and exclude fields.
  // # (I have not yet figured out what level and stage are.)
  optional NetState state = 6;

  // Print debugging information about results while running Net::Forward,
  // Net::Backward, and Net::Update.
  optional bool debug_info = 7 [default = false];

  // The layers that make up the net.  Each of their configurations, including
  // connectivity and behavior, is specified as a LayerParameter.
  repeated LayerParameter layer = 100;  // ID 100 so layers are printed last.

  // DEPRECATED: use 'layer' instead.
  repeated V1LayerParameter layers = 2;
}

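Ⅱ. A Net example

I use LeNet as defined in caffe/examples/mnist/lenet_train_test.prototxt as the example; have a look and get a feel for it yourself.

That completes the introduction to Caffe's three levels of data structures: Blob, Layer, and Net. There is a lot of material, but it is worth reading closely. Corrections are welcome. If there is any copyright issue, please leave a comment or send me a private message and I will take the content down.

You have probably noticed how important the descriptions in caffe.proto are. I will write a dedicated post on it when I have time, but I also recommend reading it yourself: anyone who wants to use Caffe well has to read caffe.proto.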

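4. Official documentation

If you still have the energy after the three parts above, or do not quite trust my notes, I recommend reading the description of this material that ships with Caffe.

Besides the definitions in caffe.proto mentioned above, Caffe describes these data structures in one more place:

the caffe/docs/tutorial folder.

It contains a file named net_layer_blob.md, which you can open with vim.

On Windows, a plain text editor works. It is Caffe's own manual, and it is rather long, so I will not paste it here.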
