TensorFlow in Practice 5: Image Classification with a Convolutional Neural Network (Beginner: MNIST Handwritten Digits), Code Implementation

Source: Internet | Published by: 亚麻籽油知乎 | Editor: 程序博客网 | Date: 2024/05/22 19:57

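I previously implemented handwritten digit recognition with a simple neural network. This time I use a convolutional neural network (CNN) for the same task.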

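An ordinary fully connected neural network (ANN) has the following main drawbacks when used for image recognition:
1. Too many parameters.
2. It does not use the spatial relationship between pixels; in image recognition, each pixel is closely related to its neighbors.
3. The number of layers is limited.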
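
To make the first drawback concrete, here is a minimal sketch that compares parameter counts. The 1024-unit dense layer acting directly on the 784 pixels is an illustrative assumption (it is not the earlier ANN from this series); the 5x5 convolution with 32 filters matches the first convolutional layer used later in this article.

```python
# Parameter counts for a 28x28 grayscale MNIST image.
# Assumption: the dense layer size (1024 units) is chosen only for illustration.

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def conv_params(k_h, k_w, c_in, c_out):
    """Weights plus biases of a convolutional layer.
    The same filters are reused at every image position."""
    return k_h * k_w * c_in * c_out + c_out

dense = dense_params(28 * 28, 1024)   # every pixel connects to every unit
conv = conv_params(5, 5, 1, 32)       # 32 filters of size 5x5 on 1 channel

print("dense:", dense)  # dense: 803840
print("conv:", conv)    # conv: 832
```

Because the filter weights are shared across positions, the convolutional layer's parameter count does not depend on the image size.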

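Using a convolutional neural network for image classification can avoid these problems.
In this article, the CNN classifier is implemented in the following steps:
1. Prepare the data.
2. Convolution, activation, pooling (two such layers).
3. Two fully connected layers (the first applies a weighted sum followed by an activation; the second applies only a weighted sum).
4. Compute the loss with softmax and cross-entropy.
5. Reduce the loss with gradient descent and compute the accuracy.
6. When running the session, train for 1000 iterations and print the result every 100 iterations.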
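
Steps 4 and 5 can be illustrated without TensorFlow. The sketch below is plain Python on a made-up 3-class batch (the logits and labels are invented for illustration); it shows what a softmax cross-entropy loss and an argmax-based accuracy compute, not how TensorFlow implements them internally.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax using the log-sum-exp trick."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]

def cross_entropy(logits, one_hot):
    """Cross-entropy between a one-hot label and softmax(logits)."""
    return -sum(t * lp for t, lp in zip(one_hot, log_softmax(logits)))

def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(batch_logits, batch_labels):
    """Fraction of samples whose predicted class equals the true class."""
    hits = [argmax(l) == argmax(t) for l, t in zip(batch_logits, batch_labels)]
    return sum(hits) / len(hits)

# Hypothetical toy batch: 2 samples, 3 classes.
batch_logits = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
batch_labels = [[1, 0, 0], [0, 0, 1]]

losses = [cross_entropy(l, t) for l, t in zip(batch_logits, batch_labels)]
mean_loss = sum(losses) / len(losses)   # analogous to reduce_mean over the batch
acc = accuracy(batch_logits, batch_labels)
print("mean loss:", mean_loss)
print("accuracy:", acc)
```

The first sample is classified correctly and the second is not, so the accuracy of this toy batch is 0.5.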

The code is as follows:

# Note: this code uses the TensorFlow 1.x API (tf.placeholder, tf.Session, tf.app.run).
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data


# Create a weight variable
def weight_variable(shape):
    w = tf.Variable(tf.random_normal(shape=shape, mean=0.0, stddev=1.0))
    return w


# Create a bias variable
def bias_variable(shape):
    b = tf.Variable(tf.constant(0.0, shape=shape))
    return b


def model():
    """
    Build the model.
    :return: model output (logits), label placeholder, feature placeholder
    """
    # 1. Prepare the input placeholders x and y_true
    with tf.variable_scope("data"):
        # Features
        x = tf.placeholder(tf.float32, [None, 784])
        # Labels (one-hot)
        y_true = tf.placeholder(tf.int32, [None, 10])

    # 2. Convolutional layer 1
    with tf.variable_scope("conv_1"):
        # Parameters: weights and biases
        w_conv1 = weight_variable([5, 5, 1, 32])
        b_conv1 = bias_variable([32])
        # Reshape the input to the 4-D layout that conv2d requires
        x_reshape = tf.reshape(x, [-1, 28, 28, 1])
        # Convolution, activation, pooling
        x_relu1 = tf.nn.relu(tf.nn.conv2d(x_reshape, w_conv1, strides=[1, 1, 1, 1], padding="SAME") + b_conv1)
        x_pool1 = tf.nn.max_pool(x_relu1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")

    # 3. Convolutional layer 2
    with tf.variable_scope("conv_2"):
        # Parameters: the input channel count is the 32 filters of the previous layer; the output is 64
        w_conv2 = weight_variable([5, 5, 32, 64])
        b_conv2 = bias_variable([64])
        # Convolution, activation, pooling
        x_relu2 = tf.nn.relu(tf.nn.conv2d(x_pool1, w_conv2, strides=[1, 1, 1, 1], padding="SAME") + b_conv2)
        x_pool2 = tf.nn.max_pool(x_relu2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")
        # print(x_pool2.get_shape().as_list())  # shape after the second pooling: [None, 7, 7, 64]

    # 4. Fully connected layer 1
    with tf.variable_scope("FC1"):
        # Parameters: weights and biases
        w_fc1 = weight_variable([7 * 7 * 64, 1024])
        b_fc1 = bias_variable([1024])
        # Flatten the input: [None, 7, 7, 64] --> [None, 7*7*64]
        x_fc1 = tf.reshape(x_pool2, [-1, 7 * 7 * 64])
        # First fully connected computation, followed by ReLU
        x_fc1_relu = tf.nn.relu(tf.matmul(x_fc1, w_fc1) + b_fc1)

    # 5. Fully connected layer 2
    with tf.variable_scope("FC2"):
        # Parameters: weights and biases
        w_fc2 = weight_variable([1024, 10])
        b_fc2 = bias_variable([10])
        # Weighted sum only (no activation); these are the logits
        y_logit = tf.matmul(x_fc1_relu, w_fc2) + b_fc2

    return y_logit, y_true, x


def compute_loss(y_logit, y_true):
    """
    Compute the loss.
    :param y_logit: model output (logits)
    :param y_true: ground-truth labels
    :return: loss
    """
    with tf.variable_scope("compute_loss"):
        loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_logit, labels=y_true))
    return loss


def train(loss, y_true, y_logit):
    """
    Minimize the loss and compute the accuracy.
    :param loss: loss value
    :param y_true: ground-truth labels
    :param y_logit: model output (logits)
    :return: train_op, accuracy
    """
    with tf.variable_scope("train"):
        # Reduce the loss with gradient descent
        train_op = tf.train.GradientDescentOptimizer(0.0001).minimize(loss)
        # Compute the accuracy
        # 1-D tensor marking whether each sample is predicted correctly, e.g. [1,0,1,1,1,0,1]
        equal_list = tf.equal(tf.argmax(y_true, 1), tf.argmax(y_logit, 1))
        # Average the correctness flags
        accuracy = tf.reduce_mean(tf.cast(equal_list, tf.float32))
    return train_op, accuracy


def main(argv):
    """
    Main function that drives the whole workflow.
    :param argv: command-line arguments passed in by tf.app.run (unused)
    :return: None
    """
    # Load the data
    mnist = input_data.read_data_sets("./data/input_data/", one_hot=True)
    # Build the network and get its output
    y_logit, y_true, x = model()
    # Softmax and cross-entropy loss
    loss = compute_loss(y_logit, y_true)
    # Gradient descent op and accuracy
    train_op, accuracy = train(loss, y_true, y_logit)
    init_op = tf.global_variables_initializer()

    with tf.Session() as sess:
        # Initialize the variables
        sess.run(init_op)
        # Training loop
        for i in range(1000):
            # mnist_x: features, mnist_y: labels; 50 samples per batch
            mnist_x, mnist_y = mnist.train.next_batch(50)
            sess.run(train_op, feed_dict={x: mnist_x, y_true: mnist_y})
            if i % 100 == 0:
                print("Accuracy:", sess.run(accuracy, feed_dict={x: mnist_x, y_true: mnist_y}))
        # Accuracy on the test set
        print("Test accuracy:", sess.run(accuracy, feed_dict={x: mnist.test.images, y_true: mnist.test.labels}))
    return None


if __name__ == '__main__':
    tf.app.run()  # runs main()

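The program prints the accuracy on the current training batch every 100 iterations, followed by the accuracy on the test set.

[Figure: screenshot of the program output; the image is not available]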

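Notes:

While building the model, the shape of the data changes from layer to layer. The changes are summarized below.

The input data has shape [None,784]; after the reshape it becomes [None,28,28,1].

First convolutional block:
1. Convolution: 32 filters of size 5*5, stride 1, padding="SAME" (2 pixels of zero padding on each side); the output is [None,28,28,32].
2. Activation: the shape does not change.
3. Pooling: ksize=[1,2,2,1], stride 2; the output is [None,14,14,32].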
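
The sizes in this block can be checked with the usual output-size formulas. The helper names below are hypothetical and written only for this check; they are not TensorFlow functions.

```python
import math

def conv_out(n, k, stride, pad):
    """Output size of a convolution along one dimension:
    floor((n + 2*pad - k) / stride) + 1."""
    return (n + 2 * pad - k) // stride + 1

def same_pool_out(n, stride):
    """Output size of pooling with padding="SAME": ceil(n / stride)."""
    return math.ceil(n / stride)

after_conv = conv_out(28, k=5, stride=1, pad=2)     # 5x5 filter, stride 1, padding 2
after_pool = same_pool_out(after_conv, stride=2)    # 2x2 pooling, stride 2

print(after_conv, after_pool)  # 28 14
```

With a 5*5 filter and stride 1, a padding of 2 is exactly what keeps the 28*28 size unchanged, which is what padding="SAME" does here.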

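Second convolutional block:
1. Convolution: 64 filters of size 5*5, stride 1, padding="SAME" (2 pixels of zero padding on each side); the output is [None,14,14,64].
2. Activation: the shape does not change.
3. Pooling: ksize=[1,2,2,1], stride 2; the output is [None,7,7,64].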
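
To show why pooling halves the height and width, here is a minimal plain-Python sketch of 2*2 max pooling with stride 2 on a single channel. It assumes even dimensions and a made-up 4*4 input; it illustrates the operation and is not TensorFlow's implementation.

```python
def max_pool_2x2(grid):
    """2x2 max pooling with stride 2 on a 2-D list.
    Assumes the number of rows and columns is even."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]

# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 9, 8],
    [2, 6, 3, 7],
]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 5], [6, 9]]
```

Each 2*2 window is reduced to its largest value, so a 14*14 map becomes 7*7 while the number of channels stays the same.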

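First fully connected layer:
The input is first flattened to [None,7*7*64]. With a weight matrix w of shape [7*7*64,1024], the output of the layer is [None,1024].

Second fully connected layer:
With a weight matrix of shape [1024,10], the layer produces the final output [None,10].

In the shapes above, None stands for the number of samples in each batch (here, the number of images per batch, for example 50 during training in the code above). The 1024 dimension in the fully connected layers can be chosen freely, as long as it is consistent between the two layers.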
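
The whole shape walk-through can be reproduced with a small sketch. The function `trace_shapes` is a hypothetical helper written for this summary; its `hidden` parameter plays the role of the freely chosen 1024 dimension.

```python
import math

def trace_shapes(batch, hidden=1024, classes=10):
    """List (layer name, output shape) for the network described above.
    batch may be None to mirror the placeholder's unknown batch size."""
    shapes = [("input", (batch, 784))]
    h = w = 28
    c = 1
    shapes.append(("reshape", (batch, h, w, c)))
    for i, filters in enumerate((32, 64), start=1):
        c = filters
        # SAME convolution with stride 1 keeps height and width
        shapes.append(("conv%d" % i, (batch, h, w, c)))
        # 2x2 SAME pooling with stride 2 halves height and width (rounded up)
        h, w = math.ceil(h / 2), math.ceil(w / 2)
        shapes.append(("pool%d" % i, (batch, h, w, c)))
    shapes.append(("flatten", (batch, h * w * c)))
    shapes.append(("fc1", (batch, hidden)))
    shapes.append(("fc2", (batch, classes)))
    return shapes

for name, shape in trace_shapes(50):
    print(name, shape)
```

Changing `hidden` only affects the fc1 output and the matching weight shapes; the final output stays [batch, 10].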
