Caffe Study Notes 3: Layers



2014-12-09 (updated 2014-12-10)

Table of Contents
  1. Layer
    1.1 data_layer
      1.1.1 DATA
      1.1.2 MEMORY_DATA
      1.1.3 HDF5_DATA
      1.1.4 HDF5_OUTPUT
      1.1.5 IMAGE_DATA
    1.2 neuron_layer
    1.3 loss_layer
    1.4 common_layer
      1.4.1 InnerProductLayer
      1.4.2 SplitLayer
      1.4.3 FlattenLayer
      1.4.4 ConcatLayer
      1.4.5 SilenceLayer
      1.4.6 (Elementwise Operations)
    1.5 vision_layer
      1.5.1 ConvolutionLayer
      1.5.2 Im2colLayer
      1.5.3 LRNLayer
      1.5.4 PoolingLayer

Layer

Layer is the base class of all layers. Five categories of layers are derived from it:

  • data_layer
  • neuron_layer
  • loss_layer
  • common_layer
  • vision_layer

Each category has corresponding [.hpp .cpp] files that declare and implement the interfaces of its classes. The five categories are discussed one by one below.

data_layer

First, look at the headers included by data_layer.hpp:

#include "boost/scoped_ptr.hpp"
#include "hdf5.h"
#include "leveldb/db.h"
#include "lmdb.h"
// the first four headers relate to the supported data formats
#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/data_transformer.hpp"
#include "caffe/filler.hpp"
#include "caffe/internal_thread.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"

It is easy to see that data_layer mainly pulls in data-related headers. The official documentation points out that data layers are the entry point for data into Caffe and sit at the bottom of the network, supporting several input formats. Among them there are five LayerTypes:

  • DATA
  • MEMORY_DATA
  • HDF5_DATA
  • HDF5_OUTPUT
  • IMAGE_DATA

There are actually two more, WINDOW_DATA and DUMMY_DATA, used for testing and as a reserved interface; they are ignored here for now.

DATA

template <typename Dtype>
class BaseDataLayer : public Layer<Dtype>

template <typename Dtype>
class BasePrefetchingDataLayer : public BaseDataLayer<Dtype>, public InternalThread

template <typename Dtype>
class DataLayer : public BasePrefetchingDataLayer<Dtype>

This type reads input in the LevelDB or LMDB format. Its parameters are source, batch_size, (rand_skip), and (backend); the last two are optional.
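As a sketch in the same legacy (2014-era) prototxt syntax used by the other examples in this note — the database path below is hypothetical:

```protobuf
layers {
  name: "data"
  type: DATA
  top: "data"
  top: "label"
  data_param {
    source: "examples/mnist/mnist_train_leveldb"  # hypothetical database path
    batch_size: 64
    backend: LEVELDB   # optional: LEVELDB (default) or LMDB
    rand_skip: 0       # optional: randomly skip up to this many entries at startup
  }
}
```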

MEMORY_DATA

template <typename Dtype>
class MemoryDataLayer : public BaseDataLayer<Dtype>

This type reads data directly from memory. To use it, call MemoryDataLayer::Reset to hand the data in. Its parameters are batch_size, channels, height, and width.
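A minimal sketch in the same legacy prototxt syntax (the blob names and sizes are illustrative):

```protobuf
layers {
  name: "input"
  type: MEMORY_DATA
  top: "data"
  top: "label"
  memory_data_param {
    batch_size: 32   # all four dimensions must be given up front,
    channels: 3      # because the layer reads raw memory that
    height: 227      # carries no shape information of its own
    width: 227
  }
}
```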

HDF5_DATA

template <typename Dtype>
class HDF5DataLayer : public Layer<Dtype>

Input type for data in the HDF5 format; its parameters are source and batch_size.
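A sketch in the same legacy syntax; note that for HDF5 the source is a text file listing the .h5 files to read (the file name here is hypothetical):

```protobuf
layers {
  name: "hdf5_in"
  type: HDF5_DATA
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "train_h5_list.txt"  # hypothetical list file, one .h5 path per line
    batch_size: 64
  }
}
```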

HDF5_OUTPUT

template <typename Dtype>
class HDF5OutputLayer : public Layer<Dtype>

Output type for the HDF5 format; its parameter is file_name.
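A sketch in the same legacy syntax; unlike the input layers, it consumes bottom blobs and writes them to disk (the output path is hypothetical):

```protobuf
layers {
  name: "hdf5_out"
  type: HDF5_OUTPUT
  bottom: "data"
  bottom: "label"
  hdf5_output_param {
    file_name: "output.h5"  # hypothetical output file
  }
}
```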

IMAGE_DATA

template <typename Dtype>
class ImageDataLayer : public BasePrefetchingDataLayer<Dtype>

Input type for image files; its parameters are source, batch_size, (rand_skip), (shuffle), (new_height), and (new_width).
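A sketch in the same legacy syntax; the list file name is hypothetical:

```protobuf
layers {
  name: "images"
  type: IMAGE_DATA
  top: "data"
  top: "label"
  image_data_param {
    source: "file_list.txt"  # hypothetical: each line is "path/to/image.jpg label"
    batch_size: 50
    shuffle: true        # optional: reshuffle the list each epoch
    new_height: 256      # optional: resize every image on load
    new_width: 256
  }
}
```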

neuron_layer

First, look at the headers included by neuron_layer.hpp:

#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"

neuron_layer is likewise a layer category that operates on data; it implements a large number of activation functions. These are mostly element-wise operations, so the bottom and top blobs have the same size. Caffe provides both CPU and GPU implementations for many of them, and their common parent is NeuronLayer:

template <typename Dtype>
class NeuronLayer : public Layer<Dtype>

There is nothing here that needs deep study for now. It is worth noting the usual parameter format, using ReLU as an example:

layers {
  name: "relu1"
  type: RELU
  bottom: "conv1"
  top: "conv1"
}

loss_layer

The loss layers compute the network's error. Header usage in loss_layer.hpp:

#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/layer.hpp"
#include "caffe/neuron_layers.hpp"
#include "caffe/proto/caffe.pb.h"

Notice that it includes neuron_layers.hpp, presumably because computing the loss calls functions defined there. A loss layer is generally placed at the end of the network. Caffe implements many loss functions, and their common parent is LossLayer:

template <typename Dtype>
class LossLayer : public Layer<Dtype>

common_layer

First, the headers included by common_layer.hpp:

#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/data_layers.hpp"
#include "caffe/layer.hpp"
#include "caffe/loss_layers.hpp"
#include "caffe/neuron_layers.hpp"
#include "caffe/proto/caffe.pb.h"

It uses the data_layers.hpp, loss_layers.hpp, and neuron_layers.hpp mentioned above, which suggests that more complex operations begin at this level.
This category is mainly responsible for wiring vision_layers together.
Nine types of common_layer are declared, some with GPU implementations:

  • InnerProductLayer
  • SplitLayer
  • FlattenLayer
  • ConcatLayer
  • SilenceLayer
  • (Elementwise Operations) — element-wise operation layers, often loosely grouped with the activation layers:
    • EltwiseLayer
    • SoftmaxLayer
    • ArgMaxLayer
    • MVNLayer

InnerProductLayer

Often used as a fully connected layer; its configuration format is:

layers {
  name: "fc8"
  type: INNER_PRODUCT
  blobs_lr: 1          # learning rate multiplier for the filters
  blobs_lr: 2          # learning rate multiplier for the biases
  weight_decay: 1      # weight decay multiplier for the filters
  weight_decay: 0      # weight decay multiplier for the biases
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  bottom: "fc7"
  top: "fc8"
}

SplitLayer

Used when one input has to feed multiple outputs (it duplicates a blob).
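A sketch in the same legacy syntax (blob names are illustrative); in practice Caffe inserts SPLIT layers automatically whenever one top blob is consumed by several layers, so writing one by hand is rarely necessary:

```protobuf
layers {
  name: "split"
  type: SPLIT
  bottom: "data"
  top: "data_branch1"  # each top receives a copy of the bottom blob
  top: "data_branch2"
}
```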

FlattenLayer

Reshapes an n * c * h * w blob into the vector form n * (c * h * w) * 1 * 1.
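A sketch in the same legacy syntax (blob names are illustrative); this is typically placed between a convolutional stage and a fully connected one:

```protobuf
layers {
  name: "flatten"
  type: FLATTEN
  bottom: "pool5"    # shape n x c x h x w
  top: "flatten5"    # shape n x (c*h*w) x 1 x 1
}
```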

ConcatLayer

Used when multiple inputs are merged into one output:

layers {
  name: "concat"
  bottom: "in1"
  bottom: "in2"
  top: "out"
  type: CONCAT
  concat_param {
    concat_dim: 1
  }
}

SilenceLayer

Absorbs its input blobs and produces no output; it is used to silence blobs that are not consumed anywhere else (suppression at the layer level).

(Elementwise Operations)

EltwiseLayer, SoftmaxLayer, ArgMaxLayer, MVNLayer

vision_layer

Its header includes all of the files mentioned so far, which means it contains the most complex operations:

#include "caffe/blob.hpp"
#include "caffe/common.hpp"
#include "caffe/common_layers.hpp"
#include "caffe/data_layers.hpp"
#include "caffe/layer.hpp"
#include "caffe/loss_layers.hpp"
#include "caffe/neuron_layers.hpp"
#include "caffe/proto/caffe.pb.h"

It mainly implements the Convolution and Pooling operations, through the following classes:

template <typename Dtype>
class ConvolutionLayer : public Layer<Dtype>

template <typename Dtype>
class Im2colLayer : public Layer<Dtype>

template <typename Dtype>
class LRNLayer : public Layer<Dtype>

template <typename Dtype>
class PoolingLayer : public Layer<Dtype>

ConvolutionLayer

The most commonly used convolution operation; the configuration format is as follows:

layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  blobs_lr: 1          # learning rate multiplier for the filters
  blobs_lr: 2          # learning rate multiplier for the biases
  weight_decay: 1      # weight decay multiplier for the filters
  weight_decay: 0      # weight decay multiplier for the biases
  convolution_param {
    num_output: 96     # learn 96 filters
    kernel_size: 11    # each filter is 11x11
    stride: 4          # step 4 pixels between each filter application
    weight_filler {
      type: "gaussian" # initialize the filters from a Gaussian
      std: 0.01        # distribution with stdev 0.01 (default mean: 0)
    }
    bias_filler {
      type: "constant" # initialize the biases to zero (0)
      value: 0
    }
  }
}

Im2colLayer

Similar to im2col in MATLAB, i.e. an image-to-column transformation: image patches are laid out as columns so that the convolution can then be computed conveniently as a matrix multiplication.

LRNLayer

Its full name is local response normalization layer; it is described in detail in the paper from Hinton's group, "ImageNet Classification with Deep Convolutional Neural Networks".
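A sketch in the same legacy syntax; the parameter values shown are the ones commonly used in the AlexNet reference model:

```protobuf
layers {
  name: "norm1"
  type: LRN
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5   # number of neighboring channels summed over
    alpha: 0.0001   # scaling parameter
    beta: 0.75      # exponent
  }
}
```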

PoolingLayer

The Pooling operation; format:

layers {
  name: "pool1"
  type: POOLING
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3 # pool over a 3x3 region
    stride: 2      # step two pixels (in the bottom blob) between pooling regions
  }
}