Theano Implementation of a Convolutional Neural Network (CNN, LeNet)
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 20:34
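I. LeNet Architecture
II. Convolutional Layer
ConvOp is the main workhorse for implementing convolutional layers in Theano. ConvOp is used by theano.tensor.nnet.conv2d (the function imported in the code below), which takes two inputs:
1. input: a 4D tensor corresponding to a mini-batch of input images. Its shape is [mini-batch size, number of input feature maps, image height, image width].
2. W: a 4D tensor corresponding to the weight matrix W. Its shape is [number of feature maps at layer m, number of feature maps at layer m-1, filter height, filter width].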
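The two shapes above determine the shape of the convolution output. The following is a minimal sketch of that arithmetic, assuming a stride of 1 and the 'valid' border mode (the filter must fit entirely inside the image); the helper name `conv2d_valid_output_shape` is hypothetical and not part of Theano.

```python
def conv2d_valid_output_shape(input_shape, filter_shape):
    """Return the output shape of a stride-1 'valid' 2D convolution.

    input_shape:  (batch_size, n_input_maps, image_height, image_width)
    filter_shape: (n_output_maps, n_input_maps, filter_height, filter_width)
    """
    batch_size, n_in, img_h, img_w = input_shape
    n_out, n_in_filter, flt_h, flt_w = filter_shape
    if n_in != n_in_filter:
        raise ValueError("input has %d feature maps but filters expect %d"
                         % (n_in, n_in_filter))
    # In 'valid' mode each spatial dimension shrinks by (filter size - 1).
    return (batch_size, n_out, img_h - flt_h + 1, img_w - flt_w + 1)


# One 500x500 RGB image convolved with two 9x9 filters over 3 channels.
print(conv2d_valid_output_shape((1, 3, 500, 500), (2, 3, 9, 9)))
# -> (1, 2, 492, 492)
```

The second dimension of the input and of W must agree, because every output feature map sums over all input feature maps.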
A simple example:
Input: 3 feature maps (one 500×500 RGB image)
Convolution: 2 kernels of size 9×9×3
Output: 2 feature maps
The code is as follows:
    import theano
    from theano import tensor as T
    from theano.tensor.nnet import conv2d
    import numpy
    import pylab
    from PIL import Image

    rng = numpy.random.RandomState(23455)

    # instantiate 4D tensor for input
    input = T.tensor4(name='input')

    # initialize shared variable for weights.
    # shape: [feature maps at layer m, feature maps at layer m-1,
    #         filter height, filter width]
    w_shp = (2, 3, 9, 9)
    w_bound = numpy.sqrt(3 * 9 * 9)
    W = theano.shared(numpy.asarray(
        rng.uniform(
            low=-1.0 / w_bound,
            high=1.0 / w_bound,
            size=w_shp),
        dtype=input.dtype), name='W')

    # initialize shared variable for bias (1D tensor) with random values
    # IMPORTANT: biases are usually initialized to zero. However in this
    # particular application, we simply apply the convolutional layer to
    # an image without learning the parameters. We therefore initialize
    # them to random values to "simulate" learning.
    b_shp = (2,)
    b = theano.shared(numpy.asarray(
        rng.uniform(low=-.5, high=.5, size=b_shp),
        dtype=input.dtype), name='b')

    # build symbolic expression that computes the convolution of input with
    # filters in W. The result is a 4D tensor of shape
    # [batch_size, n_filter, f_map_height, f_map_width] = [1, 2, 492, 492]
    conv_out = conv2d(input, W)

    # build symbolic expression to add bias and apply activation function,
    # i.e. produce neural net layer output
    output = T.nnet.sigmoid(conv_out + b.dimshuffle('x', 0, 'x', 'x'))

    # create theano function to compute filtered images
    f = theano.function([input], output, allow_input_downcast=True)

    # open the input image; the reshape below assumes a 500x500 RGB image
    # path = r'E:\\py\\theano\\3wolfmoon.jpg'
    img = Image.open('cat.jpg')
    # dimensions are (height, width, channel)
    img = numpy.asarray(img, dtype='float64') / 256.

    # put image in 4D tensor of shape (1, 3, height, width)
    img_ = img.transpose(2, 0, 1).reshape(1, 3, 500, 500)
    filtered_img = f(img_)

    # plot original image and first and second components of output
    pylab.subplot(1, 3, 1); pylab.axis('off'); pylab.imshow(img)
    pylab.gray()
    # recall that the convOp output (filtered image) is actually a "minibatch",
    # of size 1 here, so we take index 0 in the first dimension:
    pylab.subplot(1, 3, 2); pylab.axis('off'); pylab.imshow(filtered_img[0, 0, :, :])
    pylab.subplot(1, 3, 3); pylab.axis('off'); pylab.imshow(filtered_img[0, 1, :, :])
    pylab.show()
The result is shown below:
As you can see, even randomly generated convolution kernels produce an effect similar to edge filtering!
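To make the layer's computation concrete, here is a rough NumPy sketch of the same forward pass: convolution, bias, then sigmoid. It assumes the behaviour of conv2d with its default settings (stride 1, 'valid' border mode, filters flipped so that a true convolution is computed). The function name `conv_layer_forward` and the small 20×20 stand-in image are illustrative choices, not part of the original code.

```python
import numpy


def conv_layer_forward(images, filters, bias):
    """Stride-1 'valid' convolution, plus bias, plus sigmoid.

    images:  (batch, n_in, height, width)
    filters: (n_out, n_in, f_height, f_width)
    bias:    (n_out,)
    """
    batch, n_in, height, width = images.shape
    n_out, _, f_h, f_w = filters.shape
    out_h, out_w = height - f_h + 1, width - f_w + 1

    # A true convolution flips each filter along both spatial axes
    # (assumed here to match conv2d's default behaviour).
    flipped = filters[:, :, ::-1, ::-1]

    conv_out = numpy.zeros((batch, n_out, out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Patch under the filter: (batch, n_in, f_h, f_w).
            patch = images[:, :, i:i + f_h, j:j + f_w]
            # Sum over input maps and both filter axes -> (batch, n_out).
            conv_out[:, :, i, j] = numpy.tensordot(
                patch, flipped, axes=([1, 2, 3], [1, 2, 3]))

    # reshape(1, n_out, 1, 1) plays the role of dimshuffle('x', 0, 'x', 'x'):
    # one bias per output feature map, broadcast over batch and pixels.
    pre_activation = conv_out + bias.reshape(1, n_out, 1, 1)
    return 1.0 / (1.0 + numpy.exp(-pre_activation))


rng = numpy.random.RandomState(23455)
images = rng.uniform(size=(1, 3, 20, 20))    # small stand-in for the image
bound = 1.0 / numpy.sqrt(3 * 9 * 9)
filters = rng.uniform(low=-bound, high=bound, size=(2, 3, 9, 9))
bias = rng.uniform(low=-0.5, high=0.5, size=(2,))

output = conv_layer_forward(images, filters, bias)
print(output.shape)
```

The explicit loops are slow and only meant for reading; Theano compiles the equivalent expression into optimized code.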
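III. Max Pooling
Max pooling is a downsampling operation: it outputs the maximum value within each n×n sub-block.
It serves two main purposes:
1. It discards every value except the maximum, which reduces the amount of computation in later layers.
2. It provides invariance to small translations, for the following reason:
In each figure, the bottom row is the output of the convolutional layer and the top row is the output after pooling. The upper figure shows the original input and the lower figure shows the input shifted right by one position. Although every value in the bottom row changes, only half of the pooled values change. Max pooling is sensitive only to the maximum within each window: as long as the value newly introduced on the left is not larger than the previous maximum, and the value that leaves the pooling window on the right was not the previous maximum, the max pooling result stays the same.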
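The argument can be checked numerically with a small sketch. The numbers below are illustrative and are not taken from the original figure; the sketch assumes a pooling window of width 3 moved with stride 1 and clipped at the edges, and `sliding_max` is a hypothetical helper.

```python
def sliding_max(values, width=3):
    """Max over a centred window of `width`, stride 1, clipped at the edges."""
    half = width // 2
    return [max(values[max(0, i - half):i + half + 1])
            for i in range(len(values))]


# Illustrative detector outputs (not the numbers from the original figure).
original = [0.1, 1.0, 0.2, 0.1]
# Shift right by one: 0.3 enters on the left, the last value leaves.
shifted = [0.3] + original[:-1]

pooled_original = sliding_max(original)
pooled_shifted = sliding_max(shifted)

changed_inputs = sum(a != b for a, b in zip(original, shifted))
changed_pooled = sum(a != b for a, b in zip(pooled_original, pooled_shifted))

print(pooled_original)                  # [1.0, 1.0, 1.0, 0.2]
print(pooled_shifted)                   # [0.3, 1.0, 1.0, 1.0]
print(changed_inputs, changed_pooled)   # 4 2
```

All four inputs change after the shift, but only two of the four pooled values do, because the maximum 1.0 still falls inside most windows.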
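Max pooling is performed with theano.tensor.signal.pool.pool_2d.
Inputs:
1. input: an N-dimensional tensor; pooling is applied over the last two dimensions.
2. The downsampling factor, e.g. [2, 2].
3. ignore_border: usually set to True. When True, incomplete windows at the border are dropped (a 5×5 input with factor [2, 2] gives a 2×2 output); when False, they are pooled as well (giving a 3×3 output).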
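The effect of ignore_border on the output size can be sketched as simple arithmetic. This assumes non-overlapping windows (stride equal to the downsampling factor) and no padding; `pooled_size` is a hypothetical helper, not a Theano function.

```python
def pooled_size(size, window, ignore_border=True):
    """Length of one pooled dimension for non-overlapping windows."""
    if ignore_border:
        # Incomplete windows at the border are dropped: floor(size / window).
        return size // window
    # Incomplete windows are kept and pooled: ceil(size / window).
    return -(-size // window)


for ignore_border in (True, False):
    shape = tuple(pooled_size(n, k, ignore_border)
                  for n, k in zip((5, 5), (2, 2)))
    print(ignore_border, shape)
# True (2, 2)
# False (3, 3)
```

When the input size is an exact multiple of the factor, both settings give the same output size.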
Another example:
    from __future__ import print_function

    import numpy
    import theano
    from theano import tensor as T
    from theano.tensor.signal import pool

    input = T.dtensor4('input')
    maxpool_shape = (2, 2)

    # drop incomplete windows at the border
    pool_out = pool.pool_2d(input, maxpool_shape, ignore_border=True)
    f = theano.function([input], pool_out)

    invals = numpy.random.RandomState(1).rand(3, 2, 5, 5)
    print('With ignore_border set to True:')
    print('invals[0, 0, :, :] =\n', invals[0, 0, :, :])
    print('output[0, 0, :, :] =\n', f(invals)[0, 0, :, :])

    # keep incomplete windows at the border
    pool_out = pool.pool_2d(input, maxpool_shape, ignore_border=False)
    f = theano.function([input], pool_out)
    print('With ignore_border set to False:')
    print('invals[1, 0, :, :] =\n ', invals[1, 0, :, :])
    print('output[1, 0, :, :] =\n ', f(invals)[1, 0, :, :])
Output:
    With ignore_border set to True:
    invals[0, 0, :, :] =
    [[ 4.17022005e-01 7.20324493e-01 1.14374817e-04 3.02332573e-01 1.46755891e-01]
     [ 9.23385948e-02 1.86260211e-01 3.45560727e-01 3.96767474e-01 5.38816734e-01]
     [ 4.19194514e-01 6.85219500e-01 2.04452250e-01 8.78117436e-01 2.73875932e-02]
     [ 6.70467510e-01 4.17304802e-01 5.58689828e-01 1.40386939e-01 1.98101489e-01]
     [ 8.00744569e-01 9.68261576e-01 3.13424178e-01 6.92322616e-01 8.76389152e-01]]
    output[0, 0, :, :] =
    [[ 0.72032449 0.39676747]
     [ 0.6852195 0.87811744]]
    With ignore_border set to False:
    invals[1, 0, :, :] =
    [[ 0.01936696 0.67883553 0.21162812 0.26554666 0.49157316]
     [ 0.05336255 0.57411761 0.14672857 0.58930554 0.69975836]
     [ 0.10233443 0.41405599 0.69440016 0.41417927 0.04995346]
     [ 0.53589641 0.66379465 0.51488911 0.94459476 0.58655504]
     [ 0.90340192 0.1374747 0.13927635 0.80739129 0.39767684]]
    output[1, 0, :, :] =
    [[ 0.67883553 0.58930554 0.69975836]
     [ 0.66379465 0.94459476 0.58655504]
     [ 0.90340192 0.80739129 0.39767684]]
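For readers without a Theano installation, the same pooling can be approximated in plain NumPy. This is a sketch under the assumption of non-overlapping windows and no padding; `max_pool_2d` is a hypothetical stand-in, not Theano's pool_2d. It uses the same random input as the example above.

```python
import numpy


def max_pool_2d(x, window=(2, 2), ignore_border=True):
    """Non-overlapping max pooling over the last two axes of `x`."""
    win_h, win_w = window
    height, width = x.shape[-2:]
    if ignore_border:
        out_h, out_w = height // win_h, width // win_w
    else:
        out_h, out_w = -(-height // win_h), -(-width // win_w)

    out = numpy.empty(x.shape[:-2] + (out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Slicing past the edge simply yields a smaller, partial window.
            block = x[..., i * win_h:(i + 1) * win_h,
                      j * win_w:(j + 1) * win_w]
            out[..., i, j] = block.max(axis=(-2, -1))
    return out


invals = numpy.random.RandomState(1).rand(3, 2, 5, 5)
print(max_pool_2d(invals, ignore_border=True)[0, 0])
print(max_pool_2d(invals, ignore_border=False)[1, 0])
```

With ignore_border=True the fifth row and column never enter any window; with ignore_border=False they form partial windows of their own, which is why the second output is 3×3.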
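IV. CNN Architecture
The CNN architecture used here is shown in the figure below:
layer0: conv + pool
layer1: conv + pool
layer2: fully connected hidden layer (tanh)
layer3: softmax (logistic regression)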
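The feature-map sizes passed between these layers can be traced with a few lines of arithmetic. The sketch assumes 28×28 MNIST inputs, 5×5 filters, 2×2 max pooling and nkerns = [20, 50], which are the values used by the model-building code in this section (the 2×2 pooling is inferred from the 24 → 12 reduction); `conv_pool_output_size` is a hypothetical helper.

```python
def conv_pool_output_size(size, filter_size, pool_size=2):
    """Side length after a 'valid' convolution followed by max pooling."""
    return (size - filter_size + 1) // pool_size


nkerns = [20, 50]          # feature maps in layer0 and layer1
side = 28                  # MNIST images are 28x28

side = conv_pool_output_size(side, filter_size=5)   # layer0: 28 -> 24 -> 12
layer0_shape = (nkerns[0], side, side)

side = conv_pool_output_size(side, filter_size=5)   # layer1: 12 -> 8 -> 4
layer1_shape = (nkerns[1], side, side)

# The hidden layer sees each example flattened into one vector.
hidden_n_in = nkerns[1] * side * side

print(layer0_shape)   # (20, 12, 12)
print(layer1_shape)   # (50, 4, 4)
print(hidden_n_in)    # 800
```

These numbers explain the image_shape of the second conv+pool layer and the n_in of the hidden layer in the code.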
Full code download: CSDN download link
Here is the code that builds the model:
    import numpy
    import theano
    from theano import tensor as T

    # load_data, LeNetConvPoolLayer, HiddenLayer and LogisticRegression are
    # not defined in this excerpt; they are assumed to come from the full code.


    def evaluate_lenet(learning_rate=0.1, n_epochs=200,
                       dataset='mnist.pkl.gz',
                       nkerns=[20, 50], batch_size=500):
        rng = numpy.random.RandomState(23455)

        # load data
        datasets = load_data(dataset)
        train_setx, train_sety = datasets[0]
        valid_setx, valid_sety = datasets[1]
        test_setx, test_sety = datasets[2]

        # compute number of minibatches for train/valid/test
        n_train_batches = train_setx.get_value(borrow=True).shape[0] // batch_size
        n_valid_batches = valid_setx.get_value(borrow=True).shape[0] // batch_size
        n_test_batches = test_setx.get_value(borrow=True).shape[0] // batch_size

        # declare variables of the data
        index = T.lscalar()  # index to a minibatch
        x = T.matrix('x')    # data representing images
        y = T.ivector('y')   # data representing tags (classes)

        ###############
        # build model #
        ###############
        print('... building the model')

        # reshape the image data into 4D tensor of the shape (batch_size,
        # number of feature maps, image height, image width) in order to be
        # compatible with our LeNetConvPoolLayer.
        # (28, 28) is the size of MNIST images
        layer0_input = x.reshape((batch_size, 1, 28, 28))

        # conv 5x5: 28 -> 24, then pooling: 24 -> 12
        layer0 = LeNetConvPoolLayer(
            rng,
            input=layer0_input,
            filter_shape=(nkerns[0], 1, 5, 5),
            image_shape=(batch_size, 1, 28, 28),
        )

        # conv 5x5: 12 -> 8, then pooling: 8 -> 4
        layer1 = LeNetConvPoolLayer(
            rng,
            input=layer0.output,
            filter_shape=(nkerns[1], nkerns[0], 5, 5),
            image_shape=(batch_size, nkerns[0], 12, 12)
        )

        # flatten to a matrix of shape (batch_size, nkerns[1] * 4 * 4)
        layer2_input = layer1.output.flatten(2)
        layer2 = HiddenLayer(
            rng,
            input=layer2_input,
            n_in=nkerns[1] * 4 * 4,
            n_out=500,
            activation=T.tanh
        )

        layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)

        cost = layer3.negative_log_likelihood(y)

        params = layer3.params + layer2.params + layer1.params + layer0.params
        grads = T.grad(cost, params)

        updates = [
            (param_i, param_i - learning_rate * grad_i)
            # param and its corresponding grad in pairs
            for param_i, grad_i in zip(params, grads)
        ]

        train_model = theano.function(
            [index],
            cost,
            updates=updates,
            givens={
                x: train_setx[index * batch_size:(index + 1) * batch_size],
                y: train_sety[index * batch_size:(index + 1) * batch_size]
            }
        )
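The updates list and the givens slicing together amount to minibatch stochastic gradient descent. The following pure-Python sketch shows the same two ideas on a toy one-parameter problem; the data, the squared-error cost and the helper names `minibatch` and `sgd_step` are hypothetical and only stand in for the Theano machinery.

```python
def minibatch(data, index, batch_size):
    """Rows [index * batch_size, (index + 1) * batch_size) of `data`."""
    return data[index * batch_size:(index + 1) * batch_size]


def sgd_step(params, grads, learning_rate):
    """One update: param <- param - learning_rate * grad, per parameter."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


# Toy problem (hypothetical, not from the article): fit y = w * x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0 * x for x in xs]          # generated with the true weight 2.0
batch_size = 2
n_batches = len(xs) // batch_size   # same floor division as n_train_batches

w = 0.0
for epoch in range(50):
    for index in range(n_batches):
        x_batch = minibatch(xs, index, batch_size)
        y_batch = minibatch(ys, index, batch_size)
        # Gradient of the mean squared error with respect to w.
        grad = sum(2.0 * (w * x - y) * x
                   for x, y in zip(x_batch, y_batch)) / batch_size
        (w,) = sgd_step([w], [grad], learning_rate=0.01)

print(round(w, 3))   # 2.0
```

In the Theano version the gradient comes from T.grad rather than a hand-written formula, and each call to the compiled training function with a minibatch index performs one such step on all parameters.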
V. Results