TensorFlow Handwritten Digit Recognition: Comparing a Simple Neural Network Classifier with a CNN

Source: Internet  Editor: 程序博客网  Time: 2024/06/08 09:15

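Using TensorFlow for deep learning and AI work has several advantages: development is simple, models can be built quickly, and accuracy is high. Handwritten digit recognition is a good first example for learning image classification.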

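The MNIST package contains 60,000 grayscale images of handwritten digits as a training set. Each image is stored at 28*28 pixels, and a label set marks each of the 60,000 training images one by one. There is also a test set of 10,000 new handwritten grayscale images with a matching set of 10,000 labels. Here the 60,000 training images and their labels are used to build a simple MNIST model and a CNN (convolutional neural network) model, and the 10,000 test images and their labels are then used to compare the two models.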

A. Steps to build the simple neural network model:

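1. Each image is 28*28 pixels, that is, 28 rows by 28 columns of data. For the simple MNIST model this structure is more complex than necessary: if the two-dimensional arrangement of the pixels is turned into a one-dimensional one, the model becomes very easy to build and train. The pixels of each image are therefore serialized into a single row of 784 columns (a 1*784 structure). The model output is a row of ten columns giving the probability that the handwritten image is each of the digits 0~9; in the matching label, the entry for the correct digit is 1 and the other nine are 0. With n input images, the input data set is a two-dimensional tensor [n, 784] and the output is a two-dimensional tensor [n, 10]. The program represents both with placeholders. The image count n is written as None and is fixed by the number of images actually fed in.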

# define placeholders for inputs to the network (TensorFlow 1.x API)
xs = tf.placeholder(tf.float32, [None, 784])  # 28*28
ys = tf.placeholder(tf.float32, [None, 10])
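
The serialization and label layout described in step 1 can be sketched in plain Python. This is only an illustration: flatten_image and one_hot are hypothetical helper names, and the all-zero image is placeholder data, not an MNIST sample.

```python
def flatten_image(image):
    """Serialize a 2-D grid of pixels row by row into one flat list."""
    return [pixel for row in image for pixel in row]


def one_hot(digit, num_classes=10):
    """Return a list with 1.0 at index `digit` and 0.0 everywhere else."""
    return [1.0 if i == digit else 0.0 for i in range(num_classes)]


# A dummy 28x28 "image" whose pixels are all 0.0 (placeholder data).
image = [[0.0] * 28 for _ in range(28)]
flat = flatten_image(image)   # one row of 784 values
label = one_hot(3)            # label for the digit 3

print(len(flat))
print(label)
```

A batch of n such rows and n such labels has exactly the [n, 784] and [n, 10] shapes that the placeholders expect.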

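2. Add the network layer. The layer is defined as Y = XW + b, where X is the input data set (a two-dimensional tensor [n, 784]); W is the weight tensor of shape [784, 10], so the matrix product XW is an [n, 10] tensor; b is the bias, of shape [1, 10]; and Y is the prediction tensor. Y still has to pass through an activation function, which spreads the predicted probabilities of the digits apart and improves prediction accuracy. This program uses tf.nn.softmax, which is designed for picking one class out of n.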

def add_layer(inputs, in_size, out_size, activation_function=None):
    # add one more layer and return the output of this layer
    W = tf.Variable(tf.random_normal([in_size, out_size]))  # weights
    b = tf.Variable(tf.zeros([1, out_size]) + 0.1)          # bias
    Wb = tf.matmul(inputs, W) + b
    if activation_function is None:
        outputs = Wb
    else:
        outputs = activation_function(Wb)
    return outputs
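
To make Y = softmax(XW + b) concrete, here is a minimal plain-Python sketch on a tiny made-up example with 2 inputs and 3 outputs instead of 784 and 10. The helper names dense and softmax and all the numbers are assumptions for illustration; they are not part of the original program.

```python
import math


def softmax(row):
    """Turn a row of scores into probabilities that sum to 1."""
    m = max(row)                              # subtract the max for stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dense(X, W, b):
    """Compute XW + b for X of shape [n, in], W of shape [in, out], b of length out."""
    return [[sum(x[k] * W[k][j] for k in range(len(W))) + b[j]
             for j in range(len(b))]
            for x in X]


X = [[1.0, 2.0], [0.0, 1.0]]                  # n = 2 samples, in_size = 2
W = [[0.5, -0.5, 0.0], [0.25, 0.25, 1.0]]     # in_size = 2, out_size = 3
b = [0.1, 0.1, 0.1]

logits = dense(X, W, b)                       # shape [2, 3]
probs = [softmax(row) for row in logits]      # each row sums to 1

for row in probs:
    print(row)
```

The largest score in each row receives the largest probability, which is the "one out of n" behaviour that the text attributes to softmax.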


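3. Create and define the network. First define the prediction tensor as the tensor returned by the layer added above. Then compute the cross entropy cross_entropy and hand it to the gradient descent optimizer GradientDescentOptimizer, whose minimize call returns the training op train_step.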

# add output layer
prediction = add_layer(xs, 784, 10, activation_function=tf.nn.softmax)
# the error between prediction and real data
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction), reduction_indices=[1]))  # loss
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
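
The loss above is the mean over samples of -sum(y * log(p)). A small plain-Python sketch of the same formula follows; the labels and predictions are made-up values, and the eps guard against log(0) is an addition of this sketch that the original code does not have.

```python
import math


def cross_entropy(labels, predictions, eps=1e-10):
    """Mean over samples of -sum(y * log(p)).

    eps is an assumption of this sketch: it keeps log() away from 0.
    """
    total = 0.0
    for y_row, p_row in zip(labels, predictions):
        total += -sum(y * math.log(max(p, eps)) for y, p in zip(y_row, p_row))
    return total / len(labels)


labels = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]          # one-hot labels
predictions = [[0.1, 0.8, 0.1], [0.5, 0.25, 0.25]]   # softmax-like outputs

loss = cross_entropy(labels, predictions)
print(loss)
```

Because the labels are one-hot, only the probability assigned to the correct class contributes, so the loss falls as that probability approaches 1.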

 

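4. Train the network. First initialize all variables. Then, at each step, take a batch of 100 samples from the training set and train on it. The loop runs 1001 times in total to produce the trained model.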

with tf.Session() as sess:
    # tf.initialize_all_variables() is the initializer for TensorFlow before 0.12
    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
        init = tf.initialize_all_variables()
    else:
        init = tf.global_variables_initializer()
    print(tf.__version__)
    sess.run(init)
    for i in range(1001):
        # take the next batch of 100 training samples
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys})

 

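5. Compute the model accuracy as follows. v_xs is the set of test images and v_ys is the matching label set. The predictions y_pre computed from v_xs are compared with the labels v_ys: a sample counts as correct when the two agree and as wrong otherwise, and the comparison results are stored in correct_prediction. The correct_prediction tensor is then cast to float32 and averaged to give the accuracy.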

def compute_accuracy(v_xs, v_ys):
    global prediction
    # run the network on the test images to get the predicted probabilities
    y_pre = sess.run(prediction, feed_dict={xs: v_xs})
    # a sample is correct when the predicted digit matches the label
    correct_prediction = tf.equal(tf.argmax(y_pre, 1), tf.argmax(v_ys, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys})
    return result
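
The argmax comparison in compute_accuracy can be traced by hand on a few made-up rows. The sketch below uses plain Python and only three classes to keep it short; argmax and accuracy are hypothetical helper names, not functions from the original program.

```python
def argmax(row):
    """Index of the largest entry in a row."""
    return max(range(len(row)), key=lambda i: row[i])


def accuracy(predictions, labels):
    """Fraction of rows whose predicted class matches the labelled class."""
    correct = [argmax(p) == argmax(y) for p, y in zip(predictions, labels)]
    return sum(correct) / float(len(correct))


# Four made-up samples; the third one is predicted wrongly.
predictions = [[0.7, 0.2, 0.1],
               [0.1, 0.8, 0.1],
               [0.6, 0.3, 0.1],
               [0.2, 0.2, 0.6]]
labels = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0]]

acc = accuracy(predictions, labels)
print(acc)
```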



B. Steps to build the CNN model:

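1. For the CNN there is no need to flatten the image into a one-dimensional tensor. The image keeps its 28*28*1 layout for convolution (the 1 is the number of channels: 1 for a grayscale image, 3 for a color image), and the first convolution turns it into a 28*28*32 tensor.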

2. Define the convolution kernel. The kernel is a four-dimensional tensor [5,5,1,32]: a 5*5 window with input size 1 and output size 32.

def kernel_variable(shape):
    # small random initial weights drawn from a truncated normal distribution
    initial = tf.truncated_normal(shape=shape, stddev=0.1)
    return tf.Variable(initial)

w_conv1 = kernel_variable([5, 5, 1, 32])

 

3. Define the bias, whose output size is 32.

def bias_variable(shape):
    # every bias starts at the constant 0.1
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

b_conv1 = bias_variable([32])

 

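4. Build two convolutional layers. The output of each convolution passes through the relu activation function and is then pooled before becoming the input to the next layer. The first convolutional layer turns the n*28*28*1 image set into n*28*28*32, which pooling reduces to n*14*14*32. The second convolutional layer turns that n*14*14*32 output into n*14*14*64, which pooling reduces to n*7*7*64.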

# x_image, conv2d and max_pool_2x2 are defined in the complete code in section D
# conv1 layer
w_conv1 = kernel_variable([5, 5, 1, 32])   # kernel 5*5, in size 1, out size 32
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)  # output size 28*28*32
h_pool1 = max_pool_2x2(h_conv1)            # output size 14*14*32
# conv2 layer
w_conv2 = kernel_variable([5, 5, 32, 64])  # kernel 5*5, in size 32, out size 64
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)  # output size 14*14*64
h_pool2 = max_pool_2x2(h_conv2)            # output size 7*7*64
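
The shape changes listed in step 4 follow from the 'SAME' padding used by conv2d and max_pool_2x2 in the complete code: the spatial output size is ceil(input size / stride). The sketch below replays that arithmetic in plain Python; same_out is a hypothetical helper name and the layer table is a transcription of the strides and channel counts from the code above.

```python
import math


def same_out(size, stride):
    """Spatial output size under 'SAME' padding: ceil(size / stride)."""
    return int(math.ceil(size / float(stride)))


# (layer name, stride, output channels), taken from the layers above
layers = [("conv1", 1, 32), ("pool1", 2, 32), ("conv2", 1, 64), ("pool2", 2, 64)]

h = w = 28
shapes = []
for name, stride, channels in layers:
    h, w = same_out(h, stride), same_out(w, stride)
    shapes.append((name, h, w, channels))
    print(name, h, w, channels)

# size of one sample after flattening the last pooling output
last = shapes[-1]
flat_size = last[1] * last[2] * last[3]
print(flat_size)
```

The final value is the 7*7*64 length that the first fully connected layer takes as its input size.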

 

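5. Build two fully connected layers to produce the prediction. The first layer reshapes the n*7*7*64 four-dimensional tensor left by the second pooling into an n*3136 two-dimensional tensor (3136 is the 7*7*64 three-dimensional data laid out in one dimension), multiplies it by the weight matrix (a [3136,1024] tensor), and passes the resulting n*1024 tensor to the second layer. To counter overfitting, dropout deliberately discards network nodes with probability 0.5 during training, which makes the network more robust. The second layer has a 1024*10 weight matrix; multiplying it with the first layer's output gives the n*10 result set. A single yes/no output can be handled with sigmoid; a one-of-many output, as in this example, uses softmax.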

# fc1 layer
w_fc1 = kernel_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])  # [n, 7, 7, 64] -> [n, 3136]
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)      # keep_prob is a placeholder
# fc2 layer
w_fc2 = kernel_variable([1024, 10])
b_fc2 = bias_variable([10])
prediction_CNN = tf.nn.softmax(tf.matmul(h_fc1_drop, w_fc2) + b_fc2)
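
The effect of dropout with a keep probability of 0.5 can be sketched in plain Python. In TensorFlow 1.x, tf.nn.dropout keeps each element with probability keep_prob and scales the kept elements by 1/keep_prob; the function below imitates that behaviour on a list. The name dropout, the seeded generator, and the all-ones activations are assumptions of this sketch.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob, scaled by 1/keep_prob; zero the rest."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)                  # seeded so the run is repeatable
activations = [1.0] * 1000              # made-up layer output

dropped = dropout(activations, 0.5, rng)
kept = sum(1 for v in dropped if v != 0.0)
print(kept, "of", len(dropped), "units kept")

# With keep_prob = 1 nothing is dropped, which is what evaluation uses.
unchanged = dropout(activations, 1.0, rng)
print(unchanged == activations)
```

Scaling the surviving values keeps the expected sum of the layer output the same, so the same weights can be used at test time with keep_prob set to 1.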

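6. Train the CNN. First initialize all variables, then take 100 images and labels from the training set at each step and train on them. As in the code below, the loop runs 1001 times.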

cross_entropy_CNN = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction_CNN), reduction_indices=[1]))  # loss
train_step_CNN = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy_CNN)
with tf.Session() as sess:
    # tf.initialize_all_variables() is the initializer for TensorFlow before 0.12
    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
        init = tf.initialize_all_variables()
    else:
        init = tf.global_variables_initializer()
    print(tf.__version__)
    sess.run(init)

    for i in range(1001):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        # keep_prob 0.5: dropout is active while training
        sess.run(train_step_CNN, feed_dict={xs: batch_xs, ys: batch_ys, keep_prob: 0.5})

 

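7. Compute the model accuracy as follows. v_xs is the set of test images and v_ys is the matching label set. The predictions y_pre computed from v_xs are compared with the labels v_ys: a sample counts as correct when the two agree and as wrong otherwise, and the comparison results are stored in correct_prediction. The correct_prediction tensor is then cast to float32 and averaged to give the accuracy. keep_prob is fed as 1 so that dropout is switched off during evaluation.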

def compute_accuracy_CNN(v_xs, v_ys):
    global prediction_CNN
    # keep_prob must be fed; 1 disables dropout for evaluation
    y_pre = sess.run(prediction_CNN, feed_dict={xs: v_xs, keep_prob: 1})
    correct_prediction = tf.equal(tf.argmax(y_pre, 1), tf.argmax(v_ys, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys, keep_prob: 1})
    return result

8. Every 100 training steps, evaluate the current state of both networks on the test set and print the accuracy.

# this loop runs inside the "with tf.Session() as sess:" block
for i in range(1001):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys})
    sess.run(train_step_CNN, feed_dict={xs: batch_xs, ys: batch_ys, keep_prob: 0.5})

    if i % 100 == 0:
        print('correctness:         ', i, ' is ', compute_accuracy(mnist.test.images, mnist.test.labels))
        print('correctness_CNN:     ', i, ' is ', compute_accuracy_CNN(mnist.test.images, mnist.test.labels))

 

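C. Results. As the figure below shows, the accuracy of the CNN rises as training proceeds and finally reaches 0.9683 (1 means fully correct), while the simple MNIST model overfits at around step 800 and its accuracy falls from a peak of 0.8692 to 0.098. An accuracy near 0.098 is roughly chance level for ten classes, so a collapse like this may also come from numerical trouble, such as tf.log(prediction) hitting log(0), and not only from overfitting. My computer is fairly old, with an i5 (2410M) CPU and 8 GB of memory. Training takes about 15 minutes and is CPU-intensive, and memory use is high while the CNN is training.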



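In the figure, the red line is the ordinary neural network and the blue line is the CNN. The left panel shows that the loss of both methods falls as training proceeds, but the CNN gets closer to 0 and performs better. Prediction accuracy behaves the same way: the ordinary network reaches about 87%, while the CNN reaches about 97%, a significant improvement. The results at each checkpoint are: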

correctness:          0   is  0.147100001574
correctness_CNN:      0   is  0.12120000273
loss:                 0   is  9.97904
loss_CNN:             0   is  5.7561
correctness:          100   is  0.73710000515
correctness_CNN:      100   is  0.888899981976
loss:                 100   is  1.38197
loss_CNN:             100   is  0.353873
correctness:          200   is  0.805999994278
correctness_CNN:      200   is  0.930100023746
loss:                 200   is  0.997057
loss_CNN:             200   is  0.235152
correctness:          300   is  0.825699985027
correctness_CNN:      300   is  0.940500020981
loss:                 300   is  0.866042
loss_CNN:             300   is  0.196917
correctness:          400   is  0.847999989986
correctness_CNN:      400   is  0.951200008392
loss:                 400   is  0.753898
loss_CNN:             400   is  0.165623
correctness:          500   is  0.853100001812
correctness_CNN:      500   is  0.954999983311
loss:                 500   is  0.697782
loss_CNN:             500   is  0.147157
correctness:          600   is  0.860800027847
correctness_CNN:      600   is  0.960699975491
loss:                 600   is  0.666501
loss_CNN:             600   is  0.137592
correctness:          700   is  0.866400003433
correctness_CNN:      700   is  0.963800013065
loss:                 700   is  0.618222
loss_CNN:             700   is  0.119138
correctness:          800   is  0.868799984455
correctness_CNN:      800   is  0.967599987984
loss:                 800   is  0.59465
loss_CNN:             800   is  0.108558
correctness:          900   is  0.875800013542
correctness_CNN:      900   is  0.969799995422
loss:                 900   is  0.567654
loss_CNN:             900   is  0.101511
correctness:          1000   is  0.87349998951
correctness_CNN:      1000   is  0.971400022507
loss:                 1000   is  0.564226
loss_CNN:             1000   is  0.0913478
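
The heavier CPU and memory use of the CNN is at least consistent with its size. The sketch below counts the trainable parameters of both models from the weight and bias shapes used in the code; count_params is a hypothetical helper, and parameter count is only a rough proxy for training cost, since activations and optimizer state also take memory.

```python
def count_params(weight_shape, bias_size):
    """Number of entries in a weight tensor plus its bias vector."""
    n = 1
    for dim in weight_shape:
        n *= dim
    return n + bias_size


# simple model: one [784, 10] layer with 10 biases
simple_total = count_params([784, 10], 10)

# CNN: shapes copied from the kernel_variable / bias_variable calls
cnn_layers = [
    ("conv1", [5, 5, 1, 32], 32),
    ("conv2", [5, 5, 32, 64], 64),
    ("fc1", [7 * 7 * 64, 1024], 1024),
    ("fc2", [1024, 10], 10),
]
cnn_total = 0
for name, shape, bias in cnn_layers:
    n = count_params(shape, bias)
    cnn_total += n
    print(name, n)

print("simple model:", simple_total)
print("CNN:", cnn_total)
```

Almost all of the CNN's parameters sit in the first fully connected layer, which is the [3136, 1024] weight matrix.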

D. The complete code:

from __future__ import print_function
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import os
import numpy as np
import matplotlib.pyplot as plt

MODEL_SAVE_PATH = "my_net/"
MODEL_NAME = "save_net.ckpt"

# digits 0 to 9
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

def add_layer(inputs, in_size, out_size, activation_function=None):
    # add one more layer and return the output of this layer
    W = tf.Variable(tf.random_normal([in_size, out_size]))
    b = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    Wb = tf.matmul(inputs, W) + b
    if activation_function is None:
        outputs = Wb
    else:
        outputs = activation_function(Wb)
    return outputs

def compute_accuracy(v_xs, v_ys):
    global prediction
    y_pre = sess.run(prediction, feed_dict={xs: v_xs})
    correct_prediction = tf.equal(tf.argmax(y_pre, 1), tf.argmax(v_ys, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys})
    return result

def compute_accuracy_CNN(v_xs, v_ys):
    global prediction_CNN
    y_pre = sess.run(prediction_CNN, feed_dict={xs: v_xs, keep_prob: 1})
    correct_prediction = tf.equal(tf.argmax(y_pre, 1), tf.argmax(v_ys, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys, keep_prob: 1})
    return result

def kernel_variable(shape):
    initial = tf.truncated_normal(shape=shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    # stride [1, x_movement, y_movement, 1]
    # stride[0] and stride[3] must be 1
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # stride [1, x_movement, y_movement, 1]
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# define placeholders for inputs to the network
xs = tf.placeholder(tf.float32, [None, 784])  # 28*28
ys = tf.placeholder(tf.float32, [None, 10])
keep_prob = tf.placeholder(tf.float32)
x_image = tf.reshape(xs, [-1, 28, 28, 1])

# conv1 layer
w_conv1 = kernel_variable([5, 5, 1, 32])   # kernel 5*5, in size 1, out size 32
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)  # output size 28*28*32
h_pool1 = max_pool_2x2(h_conv1)            # output size 14*14*32

# conv2 layer
w_conv2 = kernel_variable([5, 5, 32, 64])  # kernel 5*5, in size 32, out size 64
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)  # output size 14*14*64
h_pool2 = max_pool_2x2(h_conv2)            # output size 7*7*64

# fc1 layer
w_fc1 = kernel_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# fc2 layer
w_fc2 = kernel_variable([1024, 10])
b_fc2 = bias_variable([10])
prediction_CNN = tf.nn.softmax(tf.matmul(h_fc1_drop, w_fc2) + b_fc2)

# add output layer
prediction = add_layer(xs, 784, 10, activation_function=tf.nn.softmax)

# the error between prediction and real data
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction), reduction_indices=[1]))  # loss
cross_entropy_CNN = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction_CNN), reduction_indices=[1]))  # loss
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
train_step_CNN = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy_CNN)

saver = tf.train.Saver()  # define a saver for saving and restoring

Total_test_loss = np.zeros((int(1001/100)+1), float)
Total_test_loss_CNN = np.zeros((int(1001/100)+1), float)
Total_test_acc = np.zeros((int(1001/100)+1), float)
Total_test_acc_CNN = np.zeros((int(1001/100)+1), float)
count = 0

with tf.Session() as sess:
    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
        init = tf.initialize_all_variables()
    else:
        init = tf.global_variables_initializer()
    print(tf.__version__)
    sess.run(init)
    for i in range(1001):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys})
        sess.run(train_step_CNN, feed_dict={xs: batch_xs, ys: batch_ys, keep_prob: 0.5})
        if i % 100 == 0:
            Total_test_acc[count] = compute_accuracy(mnist.test.images, mnist.test.labels)
            Total_test_acc_CNN[count] = compute_accuracy_CNN(mnist.test.images, mnist.test.labels)
            print('correctness:         ', i, ' \tis \t', Total_test_acc[count])
            print('correctness_CNN:     ', i, ' \tis \t', Total_test_acc_CNN[count])
            loss = sess.run(cross_entropy, feed_dict={xs: mnist.test.images, ys: mnist.test.labels, keep_prob: 1.0})
            loss_CNN = sess.run(cross_entropy_CNN,
                                feed_dict={xs: mnist.test.images, ys: mnist.test.labels, keep_prob: 1.0})
            print('loss:                ', i, ' \tis \t', loss)
            print('loss_CNN:            ', i, ' \tis \t', loss_CNN)
            Total_test_loss[count] = loss
            Total_test_loss_CNN[count] = loss_CNN
            count += 1
    saver.save(sess, os.path.join(MODEL_SAVE_PATH, MODEL_NAME), write_meta_graph=False)

    # plotting
    plt.figure(1, figsize=(15, 5))
    plt.subplot(121)
    # plt.scatter(x, y)
    plt.ylabel('Compare Losses')
    plt.plot(Total_test_loss, 'r-', lw=5)
    plt.plot(Total_test_loss_CNN, 'b-', lw=5)
    plt.text(-1, -1, 'Loss Chart')
    plt.subplot(122)
    # plt.scatter(x, y)
    plt.ylabel('Compare Accuracy:')
    plt.plot(Total_test_acc, 'r-', lw=5)
    plt.plot(Total_test_acc_CNN, 'b-', lw=5)
    plt.text(-1, -1, 'Accuracy Chart')
    plt.show()

