The "Hello World" of machine learning on TensorFlow: MNIST handwritten digit recognition


Softmax experiment: procedure

After activating the tfgpu virtual environment, first change to the directory /anaconda2/envs/tfgpu/lib/python2.7/site-packages/tensorflow/examples/tutorials/mnist/, then start an IPython interactive session.

In [4]: from tensorflow.examples.tutorials.mnist import input_data
   ...: mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
   ...:
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz

In [5]: import tensorflow as tf

In [6]: x = tf.placeholder(tf.float32, [None, 784])

In [7]: W = tf.Variable(tf.zeros([784, 10]))
   ...: b = tf.Variable(tf.zeros([10]))
   ...:

In [8]: y = tf.nn.softmax(tf.matmul(x, W) + b)

In [9]: y_ = tf.placeholder(tf.float32, [None, 10])

In [10]: cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

In [11]: train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

In [12]: init = tf.initialize_all_variables()

In [13]: sess = tf.Session()
    ...: sess.run(init)
    ...:
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce 940M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:08:00.0
Total memory: 1023.88MiB
Free memory: 997.54MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 940M, pci bus id: 0000:08:00.0)

In [14]: for i in range(1000):
    ...:   batch_xs, batch_ys = mnist.train.next_batch(100)
    ...:   sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
    ...:

In [15]: correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

In [16]: accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    ...:

In [17]: print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
0.9186

Softmax experiment: explanation

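In [4]: downloads the data.
In [6]: allocates the input placeholder x. None means the number of input images is decided later, at run time; 784 is 28*28, i.e. each 28*28 image is flattened into a one-dimensional vector. Any flattening order works as long as it is the same for every image.
In [7]: allocates the weights W and the bias b.
In [8]: implements the softmax model and produces the predicted output y.
In [9]: allocates the placeholder y_ for the true labels.
In [10]: defines the cost function in cross-entropy form.
In [11]: defines a training step that runs gradient descent with a learning rate (step size) of 0.5.
In [12]: creates the op that initializes all variables.
In [13]: opens a session and runs the initialization, which starts the model.
In [14]: runs 1000 steps of stochastic gradient descent, each on a batch of 100 training images.
In [15]: compares the predicted output y with the true label y_.
In [16]: defines the accuracy.
In [17]: evaluates the accuracy on the test set: 91.86%.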
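To make steps In [8], In [10], and In [15] concrete, here is a minimal pure-Python sketch of softmax, cross-entropy, and the argmax comparison for a single example. It is an illustration only and not how TensorFlow implements these ops; the helper names and the sample label are assumptions chosen for the example.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(one_hot, probs):
    """-sum(y_ * log(y)) for one example; only the true class contributes."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)

def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])

# With W and b initialized to zero, as in In [7], every logit is 0,
# so the untrained model predicts a uniform distribution over 10 digits.
logits = [0.0] * 10
probs = softmax(logits)

label = [0.0] * 10
label[3] = 1.0  # hypothetical example whose true digit is 3

loss = cross_entropy(label, probs)  # equals log(10), roughly 2.3026
is_correct = argmax(probs) == argmax(label)
print("loss = %.4f, correct = %s" % (loss, is_correct))
```

The starting loss of log(10) is what a model with no information about the digits would score; training in In [14] lowers it from there.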

Neural network experiment: procedure

In [1]: from tensorflow.examples.tutorials.mnist import input_data
   ...: mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
   ...:
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz

In [2]: import tensorflow as tf
   ...: sess = tf.InteractiveSession()
   ...:
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce 940M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:08:00.0
Total memory: 1023.88MiB
Free memory: 997.54MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 940M, pci bus id: 0000:08:00.0)

In [3]: def weight_variable(shape):
   ...:   initial = tf.truncated_normal(shape, stddev=0.1)
   ...:   return tf.Variable(initial)
   ...:
   ...: def bias_variable(shape):
   ...:   initial = tf.constant(0.1, shape=shape)
   ...:   return tf.Variable(initial)
   ...:

In [4]: def conv2d(x, W):
   ...:   return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
   ...:
   ...: def max_pool_2x2(x):
   ...:   return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
   ...:                         strides=[1, 2, 2, 1], padding='SAME')
   ...:

In [5]: W_conv1 = weight_variable([5, 5, 1, 32])
   ...: b_conv1 = bias_variable([32])
   ...:

In [7]: x = tf.placeholder(tf.float32, shape=[None, 784])
   ...: y_ = tf.placeholder(tf.float32, shape=[None, 10])
   ...:

In [8]: x_image = tf.reshape(x, [-1,28,28,1])
   ...:

In [9]: h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
   ...: h_pool1 = max_pool_2x2(h_conv1)
   ...:

In [10]: W_conv2 = weight_variable([5, 5, 32, 64])
    ...: b_conv2 = bias_variable([64])
    ...:
    ...: h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    ...: h_pool2 = max_pool_2x2(h_conv2)
    ...:

In [11]: W_fc1 = weight_variable([7 * 7 * 64, 1024])
    ...: b_fc1 = bias_variable([1024])
    ...:
    ...: h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
    ...: h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    ...:

In [12]: keep_prob = tf.placeholder(tf.float32)
    ...: h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
    ...:

In [13]: W_fc2 = weight_variable([1024, 10])
    ...: b_fc2 = bias_variable([10])
    ...:
    ...: y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
    ...:

In [14]: cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))
    ...: train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    ...: correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
    ...: accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    ...: sess.run(tf.initialize_all_variables())
    ...: for i in range(2000):
    ...:   batch = mnist.train.next_batch(50)
    ...:   if i%100 == 0:
    ...:     train_accuracy = accuracy.eval(feed_dict={
    ...:         x:batch[0], y_: batch[1], keep_prob: 1.0})
    ...:     print("step %d, training accuracy %g"%(i, train_accuracy))
    ...:   train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    ...:
    ...: print("test accuracy %g"%accuracy.eval(feed_dict={
    ...:     x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
    ...:
step 0, training accuracy 0.04
step 100, training accuracy 0.86
step 200, training accuracy 0.92
step 300, training accuracy 0.88
step 400, training accuracy 0.96
step 500, training accuracy 0.9
step 600, training accuracy 1
step 700, training accuracy 0.98
step 800, training accuracy 0.92
step 900, training accuracy 0.98
step 1000, training accuracy 0.94
step 1100, training accuracy 0.96
step 1200, training accuracy 1
step 1300, training accuracy 0.98
step 1400, training accuracy 0.94
step 1500, training accuracy 0.96
step 1600, training accuracy 1
step 1700, training accuracy 0.92
step 1800, training accuracy 0.92
step 1900, training accuracy 0.96
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (256):     Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (512):     Total Chunks: 1, Chunks in use: 0 768B allocated for chunks. 6.4KiB client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
I tensorflow/core/common_runtime/bfc_allocator.cc:639] Bin (1024):     Total Chunks: 0, Chunks in use: 0 0B allocated for chunks. 0B client-requested for chunks. 0B in use in bin. 0B client-requested in use in bin.
.............
Limit:                   836280320
InUse:                    83845120
MaxInUse:                117678336
NumAllocs:                  246915
MaxAllocSize:             45883392
W tensorflow/core/common_runtime/bfc_allocator.cc:270] *****_******________________________________________________________________________________________
W tensorflow/core/common_runtime/bfc_allocator.cc:271] Ran out of memory trying to allocate 957.03MiB.  See logs for memory state.
W tensorflow/core/framework/op_kernel.cc:936] Resource exhausted: OOM when allocating tensor with shape[10000,28,28,32]
E tensorflow/core/client/tensor_c_api.cc:485] OOM when allocating tensor with shape[10000,28,28,32]
     [[Node: Conv2D = Conv2D[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](Reshape, Variable/read)]]

In [20]: cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))
    ...: train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    ...: correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
    ...: accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    ...: sess.run(tf.initialize_all_variables())
    ...: for i in range(20000):
    ...:   batch = mnist.train.next_batch(50)
    ...:   if i%100 == 0:
    ...:     train_accuracy = accuracy.eval(feed_dict={
    ...:         x:batch[0], y_: batch[1], keep_prob: 1.0})
    ...:     print("step %d, training accuracy %g"%(i, train_accuracy))
    ...:   train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    ...:
    ...: print("test accuracy %g"%accuracy.eval(feed_dict={
    ...:     x: mnist.test.images[0:200,:], y_: mnist.test.labels[0:200,:], keep_prob: 1.0}))
    ...:
step 0, training accuracy 0.12
step 100, training accuracy 0.78
step 200, training accuracy 0.88
step 300, training accuracy 0.96
step 400, training accuracy 0.9
step 500, training accuracy 0.96
step 600, training accuracy 0.94
step 700, training accuracy 0.92
step 800, training accuracy 0.92
step 900, training accuracy 0.96
step 1000, training accuracy 0.94
step 1100, training accuracy 0.98
step 1200, training accuracy 0.96
step 1300, training accuracy 1
step 1400, training accuracy 0.98
.........
test accuracy 0.995

In [21]: cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
    ...: train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    ...: correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
    ...: accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    ...: sess.run(tf.initialize_all_variables())
    ...: for i in range(20000):
    ...:   batch = mnist.train.next_batch(50)
    ...:   if i%100 == 0:
    ...:     train_accuracy = accuracy.eval(feed_dict={
    ...:         x:batch[0], y_: batch[1], keep_prob: 1.0})
    ...:     print "step %d, training accuracy %g"%(i, train_accuracy)
    ...:   train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    ...:
    ...: print "test accuracy %g"%accuracy.eval(feed_dict={
    ...:     x: mnist.test.images[200:400,:], y_: mnist.test.labels[200:400,:], keep_prob: 1.0})
    ...:
step 0, training accuracy 0.12
step 100, training accuracy 0.94
step 200, training accuracy 0.86
step 300, training accuracy 0.96
step 400, training accuracy 0.9
step 500, training accuracy 1
step 600, training accuracy 0.96
step 700, training accuracy 0.88
step 800, training accuracy 1
step 900, training accuracy 0.98
step 1000, training accuracy 0.96
step 1100, training accuracy 0.94
step 1200, training accuracy 0.96
step 1300, training accuracy 0.96
step 1400, training accuracy 0.94
step 1500, training accuracy 0.98
step 1600, training accuracy 0.96
step 1700, training accuracy 0.98
.........
test accuracy 0.975

In [22]: for i in range(20000):
    ...:   batch = mnist.train.next_batch(50)
    ...:   if i%100 == 0:
    ...:     train_accuracy = accuracy.eval(feed_dict={
    ...:         x:batch[0], y_: batch[1], keep_prob: 1.0})
    ...:     print "step %d, training accuracy %g"%(i, train_accuracy)
    ...:   train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    ...:
    ...: print "test accuracy %g"%accuracy.eval(feed_dict={
    ...:     x: mnist.test.images[400:1000,:], y_: mnist.test.labels[400:1000,:], keep_prob: 1.0})
    ...:
step 0, training accuracy 1
step 100, training accuracy 1
step 200, training accuracy 0.98
step 300, training accuracy 0.98
step 400, training accuracy 0.98
step 500, training accuracy 1
step 600, training accuracy 0.96
step 700, training accuracy 0.96
.........
test accuracy  0.983333

Neural network experiment: explanation

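In [1]: loads the data, i.e. the training, validation, and test sets.
In [2]: imports tensorflow and starts an InteractiveSession (more flexible than a plain Session).
In [3]: defines two helper functions that initialize W and b, to simplify the later steps.
In [4]: defines the convolution and pooling functions. The convolution uses padding so that the output image is the same size as the input; the pooling is 2x2, so every 4 cells become 1.
In [5]: defines W and b for the first convolutional layer.
In [7]: allocates the inputs x and y_.
In [8]: reshapes x into a 28x28 single-channel image.
In [9]: convolves x_image with W, adds b, applies the ReLU activation, and finally applies max pooling.
In [10]: the second convolutional layer, analogous to the first.
In [11]: the fully connected layer.
In [12]: to reduce overfitting, dropout can be added before the output layer. (This example is simple enough that leaving it out makes little difference.)
In [13]: a softmax layer produces the output.
In [14]: defines the cost function and the training step, using Adam as the optimizer. As the log shows, the full test set was too large for my GPU memory.
In [20]: retrains for 20000 steps and uses only test images 1 to 200 as the test set; the accuracy is 0.995.
In [21]: re-initializes and retrains, then uses only test images 201 to 400 as the test set; the accuracy is 0.975.
In [22]: trains for another 20000 steps, then uses only test images 401 to 1000 as the test set; the accuracy is 0.983333.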
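The shape bookkeeping behind In [4] through In [11], and the size of the tensor named in the OOM message, can be checked with a few lines of arithmetic. The sketch below is plain Python with hypothetical helper names; it assumes float32 values (4 bytes each) and the rule that a 'SAME'-padded convolution or pooling op outputs ceil(size / stride) cells per side.

```python
import math

def same_padding_out(size, stride):
    """Output side length of a 'SAME'-padded conv or pool op."""
    return int(math.ceil(size / float(stride)))

def tensor_mib(shape, bytes_per_value=4):
    """Size in MiB of a dense tensor, assuming float32 by default."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_value / float(1024 ** 2)

side = 28
after_conv1 = same_padding_out(side, 1)          # stride-1 conv keeps 28
after_pool1 = same_padding_out(after_conv1, 2)   # 2x2 pooling halves it: 14
after_conv2 = same_padding_out(after_pool1, 1)   # still 14
after_pool2 = same_padding_out(after_conv2, 2)   # 7
fc_inputs = after_pool2 * after_pool2 * 64       # 7*7*64 inputs to the FC layer

# First-layer activations for the whole test set versus a 200-image slice.
full_test = tensor_mib([10000, 28, 28, 32])
slice_200 = tensor_mib([200, 28, 28, 32])

print("%d -> %d -> %d, fc inputs = %d" % (side, after_pool1, after_pool2, fc_inputs))
print("%.2f MiB for 10000 images, %.2f MiB for 200" % (full_test, slice_200))
```

The first figure comes to 957.03 MiB, the same amount the allocator reports failing to allocate, on a card whose log shows 997.54 MiB free before any variables were created. A 200-image slice needs about 19 MiB for that tensor, which is why the sliced evaluations in In [20] to In [22] succeed.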

The structure of this CNN is shown in the figure below:

Modifying the CNN

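Next, I try modifying this CNN structure by increasing the number of features, hoping for better results. The modified CNN structure is shown in the figure: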

The experiment proceeds as follows:

In [23]: def weight_variable(shape):
    ...:   initial = tf.truncated_normal(shape, stddev=0.1)
    ...:   return tf.Variable(initial)
    ...:
    ...: def bias_variable(shape):
    ...:   initial = tf.constant(0.1, shape=shape)
    ...:   return tf.Variable(initial)
    ...: def conv2d(x, W):
    ...:   return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
    ...: def max_pool_2x2(x):
    ...:   return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
    ...:                         strides=[1, 2, 2, 1], padding='SAME')
    ...: W_conv1 = weight_variable([5, 5, 1, 64])
    ...: b_conv1 = bias_variable([64])
    ...: h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
    ...: h_pool1 = max_pool_2x2(h_conv1)
    ...: W_conv2 = weight_variable([5, 5, 64, 128])
    ...: b_conv2 = bias_variable([128])
    ...:
    ...: h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    ...: h_pool2 = max_pool_2x2(h_conv2)
    ...: W_fc1 = weight_variable([7 * 7 * 128, 1024])
    ...: b_fc1 = bias_variable([1024])
    ...:
    ...: h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*128])
    ...: h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
    ...: keep_prob = tf.placeholder("float")
    ...: h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
    ...: W_fc2 = weight_variable([1024, 10])
    ...: b_fc2 = bias_variable([10])
    ...:
    ...: y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
    ...: cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
    ...: train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    ...: correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
    ...: accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    ...: sess.run(tf.initialize_all_variables())
    ...: for i in range(20000):
    ...:   batch = mnist.train.next_batch(50)
    ...:   if i%100 == 0:
    ...:     train_accuracy = accuracy.eval(feed_dict={
    ...:         x:batch[0], y_: batch[1], keep_prob: 1.0})
    ...:     print "step %d, training accuracy %g"%(i, train_accuracy)
    ...:   train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    ...:
    ...: print "test accuracy %g"%accuracy.eval(feed_dict={
    ...:     x: mnist.test.images[0:200,:], y_: mnist.test.labels[0:200,:], keep_prob: 1.0})
    ...:
...
test accuracy 1

In [24]: for i in range(20000):
    ...:   batch = mnist.train.next_batch(50)
    ...:   if i%100 == 0:
    ...:     train_accuracy = accuracy.eval(feed_dict={
    ...:         x:batch[0], y_: batch[1], keep_prob: 1.0})
    ...:     print "step %d, training accuracy %g"%(i, train_accuracy)
    ...:   train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    ...:
    ...: print "test accuracy %g"%accuracy.eval(feed_dict={x: mnist.test.images[200:400,:], y_: mnist.test.labels[200:400,:], keep_prob: 1.0})
    ...:
...
test accuracy 0.975

In [25]: for i in range(20000):
    ...:   batch = mnist.train.next_batch(50)
    ...:   if i%100 == 0:
    ...:     train_accuracy = accuracy.eval(feed_dict={
    ...:         x:batch[0], y_: batch[1], keep_prob: 1.0})
    ...:     print "step %d, training accuracy %g"%(i, train_accuracy)
    ...:   train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    ...:
    ...: print "test accuracy %g"%accuracy.eval(feed_dict={x: mnist.test.images[400:1000,:], y_: mnist.test.labels[400:1000,:], keep_prob: 1.0})
    ...:
...
W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 717.77MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.

In [26]: print "test accuracy %g"%accuracy.eval(feed_dict={x: mnist.test.images[400:600,:], y_: mnist.test.labels[400:600,:], keep_prob: 1.0})
    ...:
test accuracy 0.985

In [28]: print "test accuracy %g"%accuracy.eval(feed_dict={x: mnist.test.images[600:800,:], y_: mnist.test.labels[600:800,:], keep_prob: 1.0})
    ...:
test accuracy 0.985

In [29]: print "test accuracy %g"%accuracy.eval(feed_dict={x: mnist.test.images[800:1000,:], y_: mnist.test.labels[800:1000,:], keep_prob: 1.0})
    ...:
test accuracy 0.995
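One way to gauge how much larger the modified network is: count its trainable parameters from the weight and bias shapes used in the two sessions. The sketch below is plain Python with hypothetical helper names; it assumes 5x5 kernels, a single input channel, a 7x7 feature map after the second pooling, a 1024-unit hidden layer, and 10 classes, as in the sessions above.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus biases of a kernel x kernel convolution layer."""
    return kernel * kernel * c_in * c_out + c_out

def fc_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def cnn_params(c1, c2, hidden=1024, classes=10, kernel=5, pooled_side=7):
    """Total parameters of the two-conv, two-FC network used here."""
    return (conv_params(kernel, 1, c1)
            + conv_params(kernel, c1, c2)
            + fc_params(pooled_side * pooled_side * c2, hidden)
            + fc_params(hidden, classes))

original = cnn_params(32, 64)    # 32 and 64 feature maps
modified = cnn_params(64, 128)   # 64 and 128 feature maps

print("original: %d parameters" % original)
print("modified: %d parameters" % modified)
print("ratio: %.2f" % (modified / float(original)))
```

Under these assumptions the count roughly doubles, from about 3.27 million to about 6.64 million parameters, most of it in the first fully connected layer. Activation memory during evaluation grows as well, since each convolutional layer now outputs twice as many feature maps.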

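The average accuracy before the modification, weighting each result by the number of test images it covers (in hundreds), is:

(0.995*2 + 0.975*2 + 0.9833*6) / 10 = 0.98398

The average accuracy after the modification is:

(1*2 + 0.975*2 + 0.985*4 + 0.995*2) / 10 = 0.98800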
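The two averages are weighted means, with each slice's accuracy weighted by the number of test images in that slice. A small Python sketch of the same calculation, using the accuracies and slice sizes reported in the sessions (the function name is hypothetical):

```python
def weighted_accuracy(results):
    """results: list of (accuracy, number_of_images) pairs."""
    total = sum(n for _, n in results)
    return sum(acc * n for acc, n in results) / float(total)

# Original CNN: slices 0:200, 200:400, 400:1000
before = weighted_accuracy([(0.995, 200), (0.975, 200), (0.983333, 600)])

# Modified CNN: slices 0:200, 200:400, 400:600, 600:800, 800:1000
after = weighted_accuracy([(1.0, 200), (0.975, 200), (0.985, 200),
                           (0.985, 200), (0.995, 200)])

print("before: %.4f" % before)
print("after:  %.4f" % after)
```

Using 0.983333 rather than the rounded 0.9833 gives about 0.9840 for the original network, so the gap to 0.9880 is about 0.4 percentage points, or roughly 4 more correctly classified images out of 1000.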

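As can be seen, accuracy improved after increasing the number of features, but memory consumption also grew (evaluating test images 400 to 1000 in one pass ran out of memory), and the experiment also took noticeably longer to run. Which trade-off to choose depends on the specific requirements and the hardware available.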
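The manual slicing used in the sessions can be wrapped in a loop that evaluates the test set chunk by chunk and accumulates the number of correct predictions, so that memory use is bounded by the chunk size rather than the test-set size. The sketch below is plain Python under stated assumptions: `predict_batch` is a hypothetical callable that maps a list of images to predicted class indices (with the graph above it would correspond to evaluating `tf.argmax(y_conv, 1)` with `keep_prob` set to 1.0), and the toy data stands in for MNIST.

```python
def chunked_accuracy(images, labels, predict_batch, chunk_size=200):
    """Accuracy over all images, evaluated chunk_size images at a time.

    predict_batch: hypothetical callable, list of images -> list of
    predicted class indices. labels: list of true class indices.
    """
    correct = 0
    for start in range(0, len(images), chunk_size):
        batch = images[start:start + chunk_size]
        truth = labels[start:start + chunk_size]
        predicted = predict_batch(batch)
        correct += sum(1 for p, t in zip(predicted, truth) if p == t)
    return correct / float(len(images))

# Toy stand-in for a trained model: each "image" is an integer whose
# true class is image % 10; the fake classifier gets every multiple
# of 50 wrong, i.e. 20 mistakes in 1000 images.
toy_images = list(range(1000))
toy_labels = [i % 10 for i in toy_images]

def toy_predict(batch):
    return [(i + 1) % 10 if i % 50 == 0 else i % 10 for i in batch]

acc = chunked_accuracy(toy_images, toy_labels, toy_predict, chunk_size=200)
print("accuracy = %.3f" % acc)
```

Counting correct predictions per chunk and dividing once at the end gives the same number as a single pass over all images, including when the last chunk is smaller than the others.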

0 0