Implementing a Convolutional Neural Network

Reposted from: http://blog.csdn.net/l691899397/article/details/52233454

I. Convolutional Neural Networks (CNN)

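A convolutional neural network (Convolutional Neural Network, CNN) is a type of artificial neural network. It is now widely used in image and speech recognition, and it performs especially well at recognizing two-dimensional shapes in a way that is invariant to translation, scaling, and other forms of distortion. This has made it an important research direction.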

A detailed explanation of CNNs can be found here: http://blog.csdn.net/zouxy09/article/details/8781543

and in this paper: http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf

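Next, I will implement a convolutional neural network that recognizes the handwritten digits in the MNIST dataset. For each layer I will give the forward and backward derivations, along with the Caffe implementation and my own implementation. If you spot any mistakes, please point them out.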

II. Network Structure

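The network I am going to design is shown in the figure below: it contains 2 convolutional layers, 2 max pooling layers, 2 fully connected layers, 1 ReLU layer, and 1 softmax layer.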

Below I work out the number of neurons and the number of parameters in each layer.

1. Input layer: takes a 28*28 image as input.

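2. Convolutional layer 1: this layer convolves the input image with 20 separate 5*5 kernels, so it has 20*5*5 = 500 weight parameters (biases not counted). The side length after convolution is (28-5)/1 + 1 = 24, so the layer produces 20 maps of size 24*24, containing 20*24*24 = 11520 neurons.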

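3. Pooling layer 1: downsamples each 2*2 region of the previous layer by taking the maximum value of the region. This layer has no parameters. After downsampling, the height and width of each map are halved, giving 20 maps of size 12*12.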

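4. Convolutional layer 2: this layer uses 20*50 kernels of size 5*5, one for each pairing of the 20 input maps with the 50 output maps, so it has 20*50*5*5 = 25000 weight parameters. The side length after convolution is (12-5)/1 + 1 = 8, so the layer produces 50 maps of size 8*8, containing 50*8*8 = 3200 neurons.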

5. Pooling layer 2: works like the previous pooling layer, downsampling each 8*8 map to a 4*4 map. This layer has no parameters.

6. Fully connected layer 1: connects to every neuron of the previous layer. It has 500 neurons, so there are 50*4*4*500 = 400000 weight parameters in total.

7. ReLU layer: an activation layer that computes x = max(0, x). It has the same number of neurons as the previous layer and no weight parameters.

8. Fully connected layer 2: works like the previous fully connected layer. It has 10 neurons and 500*10 = 5000 weight parameters.

9. Softmax layer: performs classification and normalization; it is described in detail later.
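
The arithmetic in the list above can be checked with a short script. This is only a sketch: the function names (`conv_out`, `pool_out`, `lenet_summary`) are made up for illustration, the layer names are borrowed from the Caffe configuration, and biases are left out, as in the counts above.

```python
def conv_out(size, kernel, stride=1):
    """Side length after a convolution with no padding."""
    return (size - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Side length after 2*2 pooling (the sizes used here divide evenly)."""
    return (size - kernel) // stride + 1


def lenet_summary(input_side=28):
    """Return (layer name, neuron count, weight count) for each layer."""
    rows = []
    side, channels = input_side, 1

    # Convolutional layer 1: 20 kernels of 5*5 over 1 input map.
    side = conv_out(side, 5)
    rows.append(("conv1", 20 * side * side, channels * 20 * 5 * 5))
    channels = 20

    # Pooling layer 1: halves height and width, no weights.
    side = pool_out(side)
    rows.append(("pool1", channels * side * side, 0))

    # Convolutional layer 2: 20*50 kernels of 5*5.
    side = conv_out(side, 5)
    rows.append(("conv2", 50 * side * side, channels * 50 * 5 * 5))
    channels = 50

    # Pooling layer 2.
    side = pool_out(side)
    rows.append(("pool2", channels * side * side, 0))

    # Fully connected layers and the ReLU between them.
    fc_in = channels * side * side
    rows.append(("ip1", 500, fc_in * 500))
    rows.append(("relu1", 500, 0))
    rows.append(("ip2", 10, 500 * 10))
    return rows


if __name__ == "__main__":
    for name, neurons, weights in lenet_summary():
        print(f"{name:6s} neurons={neurons:6d} weights={weights:7d}")
```

Adding up the weight column gives 500 + 25000 + 400000 + 5000 = 430500 weights, most of which sit in the first fully connected layer.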

These are the points about this convolutional neural network that I personally find hardest to understand:

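1. The neuron described in the first article (see the figure below) does in a single layer what takes two layers in the list above (a fully connected layer plus an activation layer). A network as implemented in practice therefore may not correspond one-to-one with the layers described in theory.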

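2. Weight sharing in a convolutional layer means that the weights are shared across the local regions of a single map. Concretely, each map is convolved with a kernel, and as the kernel slides over the map, every local region of that map is computed with the same kernel. It does not mean that multiple maps use the same kernel.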
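
Point 2 can be made concrete with a naive convolution written out as loops. This is a minimal sketch, not how Caffe implements it: `conv2d_valid` and `weight_counts` are hypothetical helper names, the kernel is applied without flipping, and stride 1 with no padding is assumed.

```python
def conv2d_valid(image, kernel):
    """Slide one square kernel over one 2-D map (stride 1, no padding)."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            # Every output position (i, j) reuses the SAME kernel weights;
            # only the window of the input map changes.
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


def weight_counts(in_side=28, k=5, out_maps=20):
    """Weights with sharing vs. a separate kernel for every output position."""
    out_side = in_side - k + 1
    shared = out_maps * k * k
    unshared = out_maps * out_side * out_side * k * k
    return shared, unshared


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    kernel = [[1, 0],
              [0, 1]]
    print(conv2d_valid(image, kernel))
    print(weight_counts())
```

For the first convolutional layer, sharing keeps the count at 500 weights. If every one of the 24*24 output positions had its own 5*5 kernel, the same layer would need 288000.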

III. Implementing the Network in Caffe

Caffe, short for Convolutional Architecture for Fast Feature Embedding, is a framework for CNN-related computation. With Caffe we can easily implement our own convolutional neural networks.

The configuration file that implements the network from the previous section in Caffe is as follows:

name: "LeNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 64 dim: 1 dim: 28 dim: 28 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip2"
  top: "prob"
}

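With the earlier description, you should be able to work out roughly what the other parameters mean. In the later sections on each layer I will explain every parameter in the configuration file in detail.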
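
The two layers in the configuration that carry no weights, `relu1` and `prob`, are simple enough to sketch directly. The code below is an illustration on plain Python lists, not Caffe's implementation; the function names `relu` and `softmax` and the example scores are chosen here for the sketch. Subtracting the maximum score before exponentiating is a common trick to avoid overflow and does not change the result.

```python
import math


def relu(values):
    """Element-wise max(0, x), as in the ReLU layer."""
    return [max(0.0, v) for v in values]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # shift by the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Hypothetical outputs of the last fully connected layer for one image:
    # 10 scores, one per digit class.
    scores = [0.1, -1.2, 3.4, 0.0, 0.5, -0.3, 1.1, 0.2, -2.0, 0.9]
    probs = softmax(scores)
    predicted = probs.index(max(probs))
    print(predicted, sum(probs))
```

The predicted digit is the index of the largest probability, which is also the index of the largest raw score, since softmax preserves the ordering of its inputs.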
