TensorFlow Study Notes 9: MNIST Handwritten Digit Recognition with a Convolutional Neural Network (CNN)

Source: Internet. Editor: 程序博客网. Time: 2024/06/05 15:08

Structure of a convolutional neural network:

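Convolutional layer: the most important part of a convolutional layer is the filter. A filter transforms a sub-node matrix (a small patch of nodes) in the current layer into a unit node matrix in the next layer.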

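Example: let the filter be w=[[1,-1],[0,2]] with bias b=1, and let the zero-padded input be a=[[0,0,0,0],[0,1,-1,0],[0,-1,2,1],[0,0,2,-2]]. With stride 2 and activation f=ReLU, the forward pass of the convolutional layer gives the output [[3,1],[2,0]]. For instance, the top-left entry is ReLU(0*1 + 0*(-1) + 0*0 + 1*2 + 1) = 3.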
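The worked example above can be checked with a few lines of NumPy. This is only a minimal sketch: the helper name `conv2d_single` is made up for illustration, and it assumes the input is already zero-padded exactly as in the example (zeros on the top and left). As far as I know, `tf.nn.conv2d` with `padding='SAME'` places the extra padding on the bottom and right instead, so it would not reproduce these particular numbers.

```python
import numpy as np

def conv2d_single(padded, w, b, stride):
    """Slide filter w over an already-padded 2-D input, add bias b, apply ReLU.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    kh, kw = w.shape
    out_h = (padded.shape[0] - kh) // stride + 1
    out_w = (padded.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            patch = padded[r:r + kh, c:c + kw]
            out[i, j] = np.sum(patch * w) + b  # weighted sum plus bias
    return np.maximum(out, 0)  # ReLU

w = np.array([[1, -1], [0, 2]])
a = np.array([[0, 0, 0, 0],
              [0, 1, -1, 0],
              [0, -1, 2, 1],
              [0, 0, 2, -2]])
result = conv2d_single(a, w, b=1, stride=2)
print(result)  # matches the output [[3, 1], [2, 0]] from the example
```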

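Pooling layer: a pooling layer effectively shrinks the size of the matrix, which reduces the number of parameters in the final fully connected layers. Pooling both speeds up computation and helps prevent overfitting. As in a convolutional layer, the forward pass of a pooling layer slides a filter-like window over the input. Instead of a weighted sum of the nodes, however, the pooling window computes a simpler maximum or average.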

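Example: forward pass of a max pooling layer applied to a 3×3×2 node matrix with a 2×2 filter, zero padding, and stride 2. The zero-padded input is a[,,0]=[[0,0,0,0],[0,1,-1,0],[0,-1,2,1],[0,0,2,-2]] and a[,,1]=[[0,0,0,0],[0,-2,0,0],[0,1,-2,0],[0,0,1,-2]]. The output is b[,,0]=[[1,0],[0,2]] and b[,,1]=[[0,0],[1,1]].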
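The pooling example can be reproduced the same way. The sketch below assumes the two 4×4 slices above are the already zero-padded channels and stacks them into a (4, 4, 2) array; `max_pool` is a hypothetical helper, not a TensorFlow function. Because the padding value here is 0, a window holding only negative values would yield 0.

```python
import numpy as np

def max_pool(padded, size, stride):
    """Max-pool each channel of an already-padded (H, W, C) array.

    Hypothetical helper for illustration only.
    """
    h, w, c = padded.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.zeros((out_h, out_w, c), dtype=padded.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, q = i * stride, j * stride
            window = padded[r:r + size, q:q + size, :]
            out[i, j, :] = window.max(axis=(0, 1))  # per-channel maximum
    return out

a0 = np.array([[0, 0, 0, 0], [0, 1, -1, 0], [0, -1, 2, 1], [0, 0, 2, -2]])
a1 = np.array([[0, 0, 0, 0], [0, -2, 0, 0], [0, 1, -2, 0], [0, 0, 1, -2]])
a = np.stack([a0, a1], axis=-1)  # shape (4, 4, 2)

pooled = max_pool(a, size=2, stride=2)
print(pooled[:, :, 0])  # channel 0 of the pooled output
print(pooled[:, :, 1])  # channel 1 of the pooled output
```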

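Structure of the CNN for MNIST handwritten digit recognition:

Input layer (28 * 28 * 1)

Convolutional layer 1 (28 * 28 * 32)

Pooling layer 1 (14 * 14 * 32)

Convolutional layer 2 (14 * 14 * 64)

Pooling layer 2 (7 * 7 * 64)

Fully connected layer (1 * 1024)

Softmax layer (10)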
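The layer sizes listed above follow from the SAME-padding rule (output size = ceil(input size / stride)) used by the code in this post, with 5×5 filters of stride 1 and 2×2 pooling of stride 2. The sketch below traces the shapes and counts the trainable parameters. The function names are made up for illustration, and the filter and pooling sizes are taken from the code in this post.

```python
import math

def same_out(size, stride):
    """Output length along one dimension with SAME padding."""
    return math.ceil(size / stride)

def trace_shapes():
    """Return (list of (layer name, shape), total trainable parameter count)."""
    h, w, c = 28, 28, 1
    shapes = [("input", (h, w, c))]
    params = 0
    for idx, out_c in enumerate([32, 64], start=1):
        # 5x5 convolution, stride 1, SAME padding: weights + biases
        params += 5 * 5 * c * out_c + out_c
        h, w, c = same_out(h, 1), same_out(w, 1), out_c
        shapes.append((f"conv{idx}", (h, w, c)))
        # 2x2 max pooling, stride 2, SAME padding: no parameters
        h, w = same_out(h, 2), same_out(w, 2)
        shapes.append((f"pool{idx}", (h, w, c)))
    flat = h * w * c                  # 7 * 7 * 64 values per sample
    params += flat * 1024 + 1024      # fully connected layer
    shapes.append(("fc", (1024,)))
    params += 1024 * 10 + 10          # softmax output layer
    shapes.append(("softmax", (10,)))
    return shapes, params

shapes, params = trace_shapes()
for name, shape in shapes:
    print(name, shape)
print("trainable parameters:", params)
```

Nearly all of the parameters sit in the first fully connected layer (7 * 7 * 64 * 1024 weights), which illustrates why shrinking the feature maps with pooling before that layer matters.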

# Note: this code uses the TensorFlow 1.x API (tf.placeholder, tf.Session,
# tensorflow.examples.tutorials.mnist) and does not run unmodified on TensorFlow 2.x.
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Digits 0 to 9; the dataset is downloaded automatically
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

# Compute accuracy
def compute_accuracy(v_xs, v_ys):
    global prediction
    y_pre = sess.run(prediction, feed_dict={xs: v_xs, keep_prob: 1})
    # Check whether the predicted labels match the true labels
    correct_prediction = tf.equal(tf.argmax(y_pre, 1), tf.argmax(v_ys, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys, keep_prob: 1})
    return result

"""
Parameter initialization
Weights are drawn from a truncated normal distribution with a small standard
deviation (values close to 0); biases are set to a small positive constant.
"""
def weight_variable(shape):
    # tf.truncated_normal(shape, mean, stddev): shape is the shape of the
    # generated tensor, mean is the mean, stddev is the standard deviation.
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

"""
Convolution: stride 1 with zero padding (SAME)
input: tensor of shape [batch, in_height, in_width, in_channels]
       (batch size, input height and width, number of input channels)
filter: tensor of shape [filter_height, filter_width, in_channels, out_channels]
        (kernel height and width, number of input channels, number of output channels)
strides: how the sliding window (kernel) moves; the batch and in_channels
         entries are normally 1, so the shape is [1, stride_height, stride_width, 1]
padding:
  VALID: new_height = new_width = ceil((W - Filter + 1) / Strides)
  SAME:  new_height = new_width = ceil(W / Strides)
"""
def conv2d(x, W):
    # Two padding modes: SAME pads the border with zeros so the output size is
    # ceil(size / stride); VALID uses no padding, giving ceil((size - filter + 1) / stride).
    # strides is [1, stride_height, stride_width, 1]
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

"""
Pooling: max pooling with a plain 2x2 window
value: the input to pool; a pooling layer usually follows a convolutional layer,
       so the input is typically a feature map of shape [batch, height, width, channels]
ksize: window size, usually [1, height, width, 1]; we do not pool over batch or
       channels, so those entries are 1 (here a 2x2 window)
strides: step of the window along each dimension, usually [1, stride_height, stride_width, 1]
padding: 'VALID' or 'SAME'
"""
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

"""
Define placeholders for the inputs to the network
"""
xs = tf.placeholder(tf.float32, [None, 784])  # 28x28
ys = tf.placeholder(tf.float32, [None, 10])  # 10 output classes
keep_prob = tf.placeholder(tf.float32)  # dropout keep probability
x_image = tf.reshape(xs, [-1, 28, 28, 1])  # batch size, 28x28, 1 input channel

"""
Layer 1: convolutional layer
x_image(batch, 28, 28, 1) -> h_pool1(batch, 14, 14, 32)
"""
W_conv1 = weight_variable([5, 5, 1, 32])  # 5x5 filter, input depth 1, output depth 32
b_conv1 = bias_variable([32])  # one bias per output channel
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)  # convolution + ReLU, then pooling
h_pool1 = max_pool_2x2(h_conv1)  # pooling

"""
Layer 2: convolutional layer
h_pool1(batch, 14, 14, 32) -> h_pool2(batch, 7, 7, 64)
"""
W_conv2 = weight_variable([5, 5, 32, 64])  # 5x5 filter, input depth 32, output depth 64
b_conv2 = bias_variable([64])  # one bias per output channel
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)  # convolution + ReLU, then pooling
h_pool2 = max_pool_2x2(h_conv2)  # pooling

"""
Layer 3: fully connected layer
h_pool2(batch, 7, 7, 64) -> h_fc1(batch, 1024)
"""
W_fc1 = weight_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])  # [n_samples, 7, 7, 64] -> [n_samples, 7*7*64]
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

"""
Dropout
h_fc1 -> h_fc1_drop; enabled during training, disabled during testing
"""
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)  # dropout to reduce overfitting

"""
Layer 4: softmax output layer
"""
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
prediction = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

"""
Train and evaluate the model
The Adam optimizer minimizes the cross-entropy loss; keep_prob in feed_dict
controls the dropout keep probability.
"""
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction), reduction_indices=[1]))  # loss
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)  # Adam, learning rate 0.0001

sess = tf.Session()  # session that runs the graph
init = tf.global_variables_initializer()  # important step
sess.run(init)

for i in range(1000):  # train for 1000 iterations
    batch_xs, batch_ys = mnist.train.next_batch(100)  # batch size 100
    # keep_prob=0.5 keeps 50% of the units during training
    result = sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys, keep_prob: 0.5})
    if i % 50 == 0:
        print(compute_accuracy(mnist.test.images[:1000], mnist.test.labels[:1000]))
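To make the loss in the training code more concrete, here is a NumPy sketch of the same computation: softmax over the logits, then the mean over the batch of -sum(ys * log(prediction)). The max-shift in `softmax` and the `eps` clipping in `cross_entropy` are my own additions for numerical stability, not part of the original code, which takes `tf.log(prediction)` directly and could in principle hit log(0). The input values are made up for illustration.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max avoids overflow in exp."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, one_hot, eps=1e-10):
    """Mean over the batch of -sum(one_hot * log(probs)).

    eps is an assumed safeguard: probabilities are clipped away from 0
    so that log never receives 0.
    """
    clipped = np.clip(probs, eps, 1.0)
    return float(np.mean(-np.sum(one_hot * np.log(clipped), axis=1)))

# Illustrative values: 2 samples, 3 classes
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
labels = np.array([[1, 0, 0],
                   [0, 0, 1]])

probs = softmax(logits)
loss = cross_entropy(probs, labels)
print(probs)
print(loss)
```

With all-zero logits the softmax is uniform, so the second sample alone contributes a loss of ln 3, which is the value an untrained three-class model would be expected to sit near.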

The final accuracy is much higher than that of the plain neural network used earlier for handwritten digit recognition.
