Caffe layer parameters explained (notes from reading the docs)

Source: Internet | Published by: 矩阵的张量积 | Edited by: 程序博客网 | Time: 2024/06/05 19:25

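Preface: to build and train your own model with Caffe, you need a solid understanding of the layer parameters, because they are what you write into the prototxt configuration files.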

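Data layer

The data layer sits at the bottom of every model; it is where data enters the network as blobs. Some of its parameters also handle preprocessing (mean subtraction, scaling, cropping, mirroring, and so on).
Data source: usually a database (LevelDB or LMDB), in which case the layer has type: "Data".
top and bottom: data flows from bottom (lower) layers to top (higher) layers. A data layer only produces top outputs, since nothing sits below it. In general a layer can have several bottoms or tops.
data and label: a (data, label) pair is what a classification model needs. The tops are named "data" and "label" by convention.
include: whether the layer is used in the training phase or in the test phase.
Transformations: data preprocessing. For example, scale: 0.00390625 is exactly 1/256, which maps input values from 0-255 into the range [0, 1).
Example, a data layer definition: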
layer {
  name: "mnist"
  # Data layer loads leveldb or lmdb storage DBs for high-throughput.
  type: "Data"
  # the 1st top is the data itself: the name is only convention
  top: "data"
  # the 2nd top is the ground truth: the name is only convention
  top: "label"
  include {
    phase: TRAIN
  }

  # the Data layer configuration
  data_param {
    # path to the DB
    source: "examples/mnist/mnist_train_lmdb"
    # type of DB: LEVELDB or LMDB (LMDB supports concurrent reads)
    backend: LMDB
    # batch processing improves efficiency.
    batch_size: 64
  }
  # common data transformations
  transform_param {
    # feature scaling coefficient: this maps the [0, 255] MNIST data to [0, 1]
    scale: 0.00390625
    # mean image that is subtracted from every input
    mean_file: "mean.binaryproto"
    # for images in particular horizontal mirroring and random cropping
    # can be done as simple data augmentations.
    mirror: 1 # 1 = on, 0 = off
    # crop a crop_size x crop_size patch:
    # - at random during training
    # - from the center during testing
    crop_size: 227 # crop size
  }
}
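To make the transform_param settings concrete, here is a minimal Python sketch of what they do to one single-channel image: subtract the mean, multiply by scale, crop, and optionally mirror. The function name `transform` and its arguments are made up for this illustration and are not part of Caffe's API; the real implementation is in C++ and works on multi-channel data.

```python
import random

def transform(image, mean=None, scale=1.0, crop_size=0, mirror=False,
              train=True, rng=random):
    """image, mean: 2-D lists (H x W). Returns the transformed 2-D list."""
    h, w = len(image), len(image[0])
    out_h, out_w, top, left = h, w, 0, 0
    if crop_size:
        out_h = out_w = crop_size
        if train:   # random crop position during training
            top = rng.randint(0, h - crop_size)
            left = rng.randint(0, w - crop_size)
        else:       # center crop during testing
            top = (h - crop_size) // 2
            left = (w - crop_size) // 2
    # Assumption: a mirrored copy is produced with probability 0.5.
    flip = mirror and rng.random() < 0.5
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            src_c = left + (out_w - 1 - c if flip else c)
            m = mean[top + r][src_c] if mean else 0.0
            row.append((image[top + r][src_c] - m) * scale)
        out.append(row)
    return out

image = [[0, 64, 128, 255]] * 4
print(transform(image, scale=0.00390625, crop_size=2, train=False))
```

With scale 0.00390625 the largest pixel value 255 becomes 0.99609375, which is why the scaled range is [0, 1) rather than exactly [0, 1].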

Vision layers: Convolution

The core layer of a convolutional neural network.

The example below is taken from bvlc_reference_caffenet, one of the Caffe models:
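layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1 # learning rate multiplier for the weights
    decay_mult: 1
  }
  param {
    lr_mult: 2 # learning rate multiplier for the bias, usually twice that of the weights
    decay_mult: 0
  }
  convolution_param {
    num_output: 96 # number of kernels (filters); required
    kernel_size: 11 # for non-square kernels set kernel_h and kernel_w separately
    stride: 4 # kernel stride, default 1; stride_h and stride_w can be used instead
    # Weight initialization. The default is "constant" with all values 0;
    # "xavier" is a common choice; "gaussian" is used here.
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    # Bias initialization, usually "constant". The bias term (bias_term) is enabled by default.
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}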
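The effect of kernel_size, stride and num_output on the output blob can be checked with a few lines of Python. This is a sketch: the helper names are made up here, and the 227 x 227 x 3 input is an assumption (the crop_size from the data layer example plus an RGB image).

```python
def conv_output_size(in_size, kernel_size, stride=1, pad=0):
    """Spatial output size of a convolution along one dimension."""
    return (in_size + 2 * pad - kernel_size) // stride + 1

def conv_param_count(num_output, in_channels, kernel_size, bias_term=True):
    """Number of learnable values: one k x k x C kernel per output, plus biases."""
    weights = num_output * in_channels * kernel_size * kernel_size
    return weights + (num_output if bias_term else 0)

# conv1 above: kernel_size 11, stride 4, num_output 96
side = conv_output_size(227, kernel_size=11, stride=4)
print(side)                                  # spatial size of the conv1 output
print(conv_param_count(96, 3, 11))           # weights + biases
```

Under these assumptions conv1 produces a 96 x 55 x 55 output per image.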

Vision layers: Pooling (reduces computation and dimensionality)

layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX # available methods: MAX, AVE, or STOCHASTIC
    kernel_size: 3
    stride: 2 # default 1; 2 is a common setting
  }
}
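A small Python sketch of max pooling with the settings above (kernel_size 3, stride 2). It assumes no padding and kernel_size >= stride, and it rounds the output size up, so the last window may be clipped at the border. The function names are illustrative only.

```python
import math

def pool_output_size(in_size, kernel_size, stride=1):
    """Output size along one dimension, rounded up (no padding assumed)."""
    return int(math.ceil((in_size - kernel_size) / stride)) + 1

def max_pool(x, kernel_size, stride):
    """x: 2-D list (H x W). Windows that run past the border are clipped."""
    h, w = len(x), len(x[0])
    out = []
    for i in range(pool_output_size(h, kernel_size, stride)):
        row = []
        for j in range(pool_output_size(w, kernel_size, stride)):
            r0, c0 = i * stride, j * stride
            window = [x[r][c]
                      for r in range(r0, min(r0 + kernel_size, h))
                      for c in range(c0, min(c0 + kernel_size, w))]
            row.append(max(window))
        out.append(row)
    return out

x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(max_pool(x, kernel_size=3, stride=2))
print(pool_output_size(55, kernel_size=3, stride=2))
```

For a 55 x 55 input (the conv1 output size if the network input is 227 x 227), pool1 gives 27 x 27.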

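Vision layers: LRN

Normalization formula: each input value is divided by
$\left(1 + (\alpha/n) \sum_i x_i^2\right)^{\beta}$
where $n$ is the size of the local region (local_size) and the sum runs over the region centered on that value.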
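layer {
  name: "norm1"
  type: "LRN" # Local Response Normalization: normalizes over local input regions, giving a "lateral inhibition" effect
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5 # default 5; across channels: number of channels summed over; within a channel: side length of the square region summed over
    alpha: 0.0001 # parameter of the normalization formula, default 1
    beta: 0.75 # parameter of the normalization formula, default 0.75
    norm_region: ACROSS_CHANNELS # default; the other option is WITHIN_CHANNEL
  }
}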
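In ACROSS_CHANNELS mode, the local region extends across neighboring channels but has no spatial extent (its size is local_size x 1 x 1).
In WITHIN_CHANNEL mode, the local region extends spatially but stays within a single channel (its size is 1 x local_size x local_size).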
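The across-channel case can be sketched in Python for a single spatial position, where the input is just the list of values along the channel axis. The function name is made up for this illustration, and the window is simply clipped at the first and last channel (equivalent to zero padding).

```python
def lrn_across_channels(x, local_size=5, alpha=1.0, beta=0.75):
    """x: values of all channels at one (h, w) position."""
    n = len(x)
    half = local_size // 2
    out = []
    for c in range(n):
        lo, hi = max(0, c - half), min(n, c + half + 1)
        sq_sum = sum(v * v for v in x[lo:hi])
        # each value is divided by (1 + (alpha / local_size) * sum of squares) ** beta
        out.append(x[c] / (1.0 + (alpha / local_size) * sq_sum) ** beta)
    return out

print(lrn_across_channels([0.0, 0.0, 2.0, 0.0, 0.0], local_size=5, alpha=0.0001, beta=0.75))
```

A channel with a large response increases the divisor for its neighbors, which is the "lateral inhibition" effect mentioned in the layer comment.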

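Vision layers: im2col

im2col splits a large matrix into many overlapping sub-matrices, serializes each sub-matrix into a vector, and stacks the vectors into a new matrix. ***The point is to make convolution efficient.***

In Caffe, convolution first applies im2col to the data and then performs an inner product (a matrix multiplication), which is faster than a naive convolution loop.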
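The idea can be shown with a pure-Python sketch for a single-channel image and one kernel. The names `im2col` and `conv_via_im2col` are used here only for illustration; each patch is stored as a row rather than a column, and padding and multiple channels are left out.

```python
def im2col(image, k, stride=1):
    """Collect every k x k patch of a 2-D list as one flat row."""
    h, w = len(image), len(image[0])
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    patches = []
    for i in range(0, out_h * stride, stride):
        for j in range(0, out_w * stride, stride):
            patches.append([image[i + di][j + dj]
                            for di in range(k) for dj in range(k)])
    return patches, out_h, out_w

def conv_via_im2col(image, kernel, stride=1):
    """Convolution as (patch matrix) x (flattened kernel)."""
    k = len(kernel)
    flat_kernel = [v for row in kernel for v in row]
    patches, out_h, out_w = im2col(image, k, stride)
    flat = [sum(p * q for p, q in zip(patch, flat_kernel)) for patch in patches]
    return [flat[r * out_w:(r + 1) * out_w] for r in range(out_h)]

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(conv_via_im2col(image, [[1, 0], [0, 1]]))
```

Once the patches are laid out as a matrix, all output positions and all kernels can be computed with a single optimized matrix multiplication, at the cost of extra memory for the duplicated pixels.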

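Activation layers

The size of the data is unchanged; the function is applied to the input element by element.
Caffe provides Sigmoid, ReLU (rectified linear, with a leaky variant; the most widely used because it converges faster), TanH (hyperbolic tangent), AbsVal (absolute value), Power, and BNLL (binomial normal log likelihood).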
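The functions behind these layers are short enough to write out. The sketch below uses plain Python with illustrative function names; the parameter names negative_slope, power, scale and shift follow the corresponding Caffe layer parameters, and the formulas are the ones given in the Caffe layer documentation as far as I know.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x, negative_slope=0.0):
    # negative_slope > 0 gives the leaky variant
    return x if x > 0 else negative_slope * x

def tanh(x):
    return math.tanh(x)

def absval(x):
    return abs(x)

def power_fn(x, power=1.0, scale=1.0, shift=0.0):
    return (shift + scale * x) ** power

def bnll(x):
    # log(1 + exp(x)), written so that large positive x does not overflow
    return x + math.log1p(math.exp(-x)) if x > 0 else math.log1p(math.exp(x))

def apply_elementwise(fn, blob):
    """Activation layers keep the blob size and transform each element."""
    return [fn(v) for v in blob]

print(apply_elementwise(relu, [-1.0, 0.5, 2.0]))
```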

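Softmax and SoftmaxWithLoss

The Softmax layer outputs class probabilities (likelihood values); SoftmaxWithLoss turns those probabilities into a loss, which corresponds to maximum likelihood estimation.
Think of them as a probability layer (used at prediction time) and a loss layer, respectively.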
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
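The difference between the two layers can be sketched in a few lines of Python: softmax turns scores (such as the fc8 outputs) into probabilities, and the loss is the negative log of the probability assigned to the true label. The function names are illustrative, and averaging over the batch is an assumption of this sketch.

```python
import math

def softmax(scores):
    """Scores for one sample -> probabilities that sum to 1."""
    m = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_with_loss(batch_scores, labels):
    """Mean negative log-likelihood of the true labels."""
    losses = [-math.log(softmax(s)[y]) for s, y in zip(batch_scores, labels)]
    return sum(losses) / len(losses)

print(softmax([1.0, 2.0, 3.0]))
print(softmax_with_loss([[1.0, 2.0, 3.0]], [2]))
```

Minimizing this loss is the same as maximizing the likelihood of the training labels.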

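Inner Product

The output is a vector: the width and height of the input blob are both reduced to 1. It is effectively a convolution layer whose kernel is the same size as the input, and its parameters are set the same way as for a convolution layer.
Example:
layer {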
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
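As a rough illustration of what the layer computes, the sketch below flattens a C x H x W input into one vector and multiplies it by a num_output x (C*H*W) weight matrix, then adds the bias. The helper names and the tiny weights are made up for the example.

```python
def flatten(blob_chw):
    """C x H x W nested lists -> one flat list of length C*H*W."""
    return [v for channel in blob_chw for row in channel for v in row]

def inner_product(x, weights, bias=None):
    """x: flat input of length K; weights: num_output rows of length K."""
    out = []
    for i, row in enumerate(weights):
        s = sum(w * v for w, v in zip(row, x))
        out.append(s + (bias[i] if bias is not None else 0.0))
    return out

x = flatten([[[1, 2], [3, 4]]])            # one channel, 2 x 2
weights = [[1, 0, 0, 0], [1, 1, 1, 1]]     # num_output: 2
print(inner_product(x, weights, bias=[0.5, -1.0]))
```

Each row of the weight matrix plays the role of one kernel that covers the whole input, which is why the output has width and height 1.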

Accuracy (only present in the test phase)

layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
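What the layer reports can be sketched as the fraction of samples whose true label is among the highest-scoring classes. The function below is an illustration in plain Python; top_k mirrors the parameter of the same name in Caffe's accuracy_param, with 1 meaning ordinary top-1 accuracy.

```python
def accuracy(batch_scores, labels, top_k=1):
    """Fraction of samples whose label is within the top_k scores."""
    hits = 0
    for scores, label in zip(batch_scores, labels):
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label in ranked[:top_k]:
            hits += 1
    return hits / len(labels)

scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
print(accuracy(scores, [1, 0, 0]))
```

Accuracy has no gradient, so it is only useful for monitoring, which is why it is restricted to the TEST phase here.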

Reshape: changes the dimensions of the input without changing the data.
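Since the data itself is untouched, a reshape is only a computation on shapes. The Python sketch below follows the convention described in the Caffe documentation for reshape_param, as far as I recall it: a dim of 0 copies the corresponding dimension from the input, and a single -1 is inferred from the remaining element count. The function name is made up for the illustration.

```python
def reshape_dims(bottom_shape, dims):
    """Resolve the output shape for a given input shape and target dims."""
    count = 1
    for d in bottom_shape:
        count *= d
    out, infer_at = [], None
    for i, d in enumerate(dims):
        if d == 0:                       # copy this dimension from the input
            out.append(bottom_shape[i])
        elif d == -1:                    # infer this dimension later
            if infer_at is not None:
                raise ValueError("at most one dim may be -1")
            infer_at = i
            out.append(1)
        else:
            out.append(d)
    known = 1
    for d in out:
        known *= d
    if infer_at is not None:
        if count % known != 0:
            raise ValueError("cannot infer the -1 dimension")
        out[infer_at] = count // known
    elif known != count:
        raise ValueError("element count must stay the same")
    return out

print(reshape_dims([64, 3, 4, 4], [0, -1]))   # flatten everything except the batch axis
```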

Dropout: a trick to prevent overfitting. During training it randomly switches off some hidden-layer nodes so that they do not contribute.

layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5 # the only parameter that needs to be set
  }
}
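A minimal Python sketch of the mechanism, assuming the usual "inverted dropout" scheme: during training each value is zeroed with probability dropout_ratio and the survivors are scaled by 1 / (1 - dropout_ratio), while at test time the input passes through unchanged. The function name and the rng argument are illustrative.

```python
import random

def dropout(x, dropout_ratio=0.5, train=True, rng=random):
    """x: list of activations. Returns a new list."""
    if not train:
        return list(x)                   # identity at test time
    scale = 1.0 / (1.0 - dropout_ratio)  # keeps the expected value unchanged
    return [v * scale if rng.random() >= dropout_ratio else 0.0 for v in x]

rng = random.Random(0)
print(dropout([1.0, 1.0, 1.0, 1.0], dropout_ratio=0.5, rng=rng))
print(dropout([1.0, 1.0, 1.0, 1.0], train=False))
```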

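Blob, Layer and Net

Blob

A Blob is Caffe's standard array and its unified memory interface. It describes how data is stored and communicated.
For a batch of images the shape is N x C x H x W (batch size, channels, height, width).
Model parameters are also stored and computed as blobs; their dimensions depend on the parameter type.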
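Because a blob is stored as one contiguous array in row-major order, the position of element (n, c, h, w) can be computed directly. A short Python sketch, with an illustrative function name:

```python
def blob_offset(shape, n, c, h, w):
    """Flat index of element (n, c, h, w) in an N x C x H x W blob."""
    N, C, H, W = shape
    return ((n * C + c) * H + h) * W + w

shape = (2, 3, 4, 5)                     # 2 images, 3 channels, 4 x 5 pixels
print(blob_offset(shape, 0, 0, 0, 0))    # first element
print(blob_offset(shape, 1, 2, 3, 4))    # last element
```

The width index changes fastest, so neighboring pixels in a row sit next to each other in memory.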

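Layer
Every layer type defines three key computations: setup, forward, and backward.
Net: like stacking building blocks, you write the name of the net and then define the parameters of each layer.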
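To show how the three computations and the net fit together, here is a toy sketch in Python. ToyScaleLayer and ToyNet are hypothetical classes invented for this illustration; they are not Caffe's C++ classes and leave out blobs, parameters and everything else.

```python
class ToyScaleLayer:
    """Hypothetical layer that multiplies its input by a constant."""

    def __init__(self, factor):
        self.factor = factor
        self.count = None

    def setup(self, bottom):
        # one-time initialization: here we only record the input size
        self.count = len(bottom)

    def forward(self, bottom):
        # compute the top (output) from the bottom (input)
        return [self.factor * v for v in bottom]

    def backward(self, top_diff):
        # pass the gradient from the top back to the bottom
        return [self.factor * g for g in top_diff]


class ToyNet:
    """Hypothetical net: layers stacked like building blocks."""

    def __init__(self, layers):
        self.layers = layers

    def forward(self, data):
        for layer in self.layers:
            if layer.count is None:
                layer.setup(data)
            data = layer.forward(data)
        return data

    def backward(self, top_diff):
        for layer in reversed(self.layers):
            top_diff = layer.backward(top_diff)
        return top_diff


net = ToyNet([ToyScaleLayer(2.0), ToyScaleLayer(3.0)])
print(net.forward([1.0, 2.0]))
print(net.backward([1.0, 1.0]))
```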
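Solver (the core of the core in Caffe)
Caffe provides six optimization algorithms; the default is SGD (Stochastic Gradient Descent).
When writing the configuration file, make sure test_iter * batch_size covers the total number of test samples. One pass over the entire dataset is called an epoch.
momentum usually lies between 0.5 and 0.99 and is typically set to 0.9.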

0 0