Learning TensorFlow: Implementing a Simple Convolutional Network

Source: Internet  Editor: 程序博客网  Date: 2024/06/05 19:15

Building a multilayer convolutional network:

Convolution:

conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, data_format=None, name=None)

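input:
The data to convolve: a 4-D tensor of type float32 or float64, with shape [batch, height, width, channels], i.e. [number of images in a batch, image height, image width, number of image channels].

filter:
The filter plays the role of the convolution kernel in a CNN. It is a 4-D tensor of type float32 or float64 (matching input), with shape [filter_height, filter_width, in_channels, out_channels], i.e. [kernel height, kernel width, number of input channels, number of kernels].

strides:
The stride along each dimension of the input, given as a 1-D vector of length 4.
The first (batch) and fourth (channel) dimensions of input are normally not strided, so strides usually has the form [1, X, X, 1].

padding:
A string, either "SAME" or "VALID".
With "SAME", the input is zero-padded so that the output height and width equal ceil(input size / stride); with a stride of 1 the output has the same size as the input.
With "VALID", the input is not zero-padded, so the output is smaller than the input whenever the filter is larger than 1x1.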
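The effect of the two padding modes on the output size can be checked with a little arithmetic. The sketch below is plain Python, not TensorFlow code; the helper name conv_output_size is made up for illustration, and it applies the per-dimension formulas ceil(in / stride) for "SAME" and ceil((in - filter + 1) / stride) for "VALID".

```python
import math


def conv_output_size(in_size, filter_size, stride, padding):
    """Output size along one spatial dimension (height or width)."""
    if padding == "SAME":
        # Zero-padding is added, so the result depends only on the stride.
        return int(math.ceil(in_size / float(stride)))
    if padding == "VALID":
        # No padding: the filter must fit entirely inside the input.
        return int(math.ceil((in_size - filter_size + 1) / float(stride)))
    raise ValueError("padding must be 'SAME' or 'VALID'")


print(conv_output_size(28, 5, 1, "SAME"))   # 28
print(conv_output_size(28, 5, 1, "VALID"))  # 24
print(conv_output_size(28, 2, 2, "SAME"))   # 14
```

The first and third calls correspond to the 5x5 convolution and the 2x2 pooling used in the code further down.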

use_cudnn_on_gpu:
A bool: whether to use cuDNN acceleration on the GPU. Defaults to True.

data_format:
A string, either "NHWC" or "NCHW". Defaults to "NHWC".
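To make the roles of input, filter and strides concrete, here is a deliberately naive pure-Python sketch of the computation for "VALID" padding in NHWC layout. It is an illustration only, not how TensorFlow implements the op; the function name conv2d_valid and the nested-list representation are assumptions made for the example. Like conv2d, it slides the filter without flipping it (cross-correlation).

```python
def conv2d_valid(inp, filt, strides):
    """Naive convolution with VALID padding.

    inp:     nested lists shaped [batch][height][width][in_channels]
    filt:    nested lists shaped [filter_height][filter_width][in_channels][out_channels]
    strides: [1, stride_h, stride_w, 1]
    """
    _, stride_h, stride_w, _ = strides
    fh, fw = len(filt), len(filt[0])
    in_c, out_c = len(filt[0][0]), len(filt[0][0][0])
    out = []
    for image in inp:
        h, w = len(image), len(image[0])
        rows = []
        for i in range(0, h - fh + 1, stride_h):
            cols = []
            for j in range(0, w - fw + 1, stride_w):
                pixel = []
                for k in range(out_c):
                    # Sum of element-wise products over the window and all input channels.
                    total = 0.0
                    for di in range(fh):
                        for dj in range(fw):
                            for c in range(in_c):
                                total += image[i + di][j + dj][c] * filt[di][dj][c][k]
                    pixel.append(total)
                cols.append(pixel)
            rows.append(cols)
        out.append(rows)
    return out


# One 3x3 single-channel image and one 2x2 filter of ones (sums each window).
image = [[[[1.0], [2.0], [3.0]],
          [[4.0], [5.0], [6.0]],
          [[7.0], [8.0], [9.0]]]]
ones = [[[[1.0]], [[1.0]]],
        [[[1.0]], [[1.0]]]]
result = conv2d_valid(image, ones, [1, 1, 1, 1])
print(result)  # [[[[12.0], [16.0]], [[24.0], [28.0]]]]
```

The output has shape [1, 2, 2, 1]: one image, a 2x2 feature map, and one output channel because the filter's out_channels is 1.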

Max pooling:

max_pool(value, ksize, strides, padding, data_format="NHWC", name=None)

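Performs max pooling on the input.
value: a 4-D tensor of type tf.float32 with shape [batch, height, width, channels], i.e. [number of images in a batch, image height, image width, number of channels].
ksize: a list of ints of length >= 4 giving the window size for each dimension of the input tensor.
strides: a list of ints of length >= 4 giving the stride of the sliding window for each dimension of the input tensor.
padding: a string, either "VALID" or "SAME".
data_format: a string; "NHWC" and "NCHW" are supported.
name: a name for the operation.
Returns a tensor of type tf.float32.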
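As a rough illustration of what ksize and strides mean, the following pure-Python sketch pools a single-channel feature map with a 2x2 window and a stride of 2. The helper max_pool_2d is hypothetical and only handles one image and one channel; it drops windows that do not fit, which matches "VALID" padding.

```python
def max_pool_2d(image, ksize, stride):
    """Max pooling over one single-channel image given as a list of rows.

    Windows that do not fit entirely inside the image are dropped (VALID).
    """
    h, w = len(image), len(image[0])
    return [[max(image[i + di][j + dj]
                 for di in range(ksize) for dj in range(ksize))
             for j in range(0, w - ksize + 1, stride)]
            for i in range(0, h - ksize + 1, stride)]


feature_map = [[1, 3, 2, 4],
               [5, 6, 1, 2],
               [7, 2, 9, 0],
               [1, 8, 3, 4]]
pooled = max_pool_2d(feature_map, ksize=2, stride=2)
print(pooled)  # [[6, 4], [8, 9]]
```

In the max_pool call this corresponds to ksize=[1, 2, 2, 1] and strides=[1, 2, 2, 1]: the batch and channel dimensions use a window and stride of 1, so only height and width are reduced.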

Rectified linear unit (ReLU):

relu(features, name=None)

Computes the rectified linear unit: max(features, 0).
features: a tensor; its type may be float32, float64, int32, int64, uint8, int16, int8, uint16, or half.
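The element-wise behaviour of max(features, 0) is easy to see on a small list. This is a plain-Python sketch rather than the TensorFlow op, and the function below is defined only for this example.

```python
def relu(features):
    """Apply max(x, 0) to every element of a flat list of floats."""
    return [max(x, 0.0) for x in features]


activations = relu([-2.0, -0.5, 0.0, 1.5, 3.0])
print(activations)  # [0.0, 0.0, 0.0, 1.5, 3.0]
```

Negative inputs are clamped to zero and non-negative inputs pass through unchanged.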

Code and comments:

#!/usr/bin/python
# coding:utf-8
import tensorflow as tf
# import tensorflow.examples.tutorials.mnist.input_data as input_data
import input_data

# Download and read the data
mnist = input_data.read_data_sets("Mnist_data/", one_hot=True)

# Run an interactive computation graph
# sess = tf.InteractiveSession()
x = tf.placeholder("float", [None, 784])
y_ = tf.placeholder("float", shape=[None, 10])


# Weight initialization
# Initialize the biases with a small positive value so that the neurons' outputs are not stuck at 0
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)


# Convolution and pooling
# The convolution uses a stride of 1 and pads the borders with zeros
def conv2d(x, W):
    # Input image: shape=[batch, in_height, in_width, in_channels], float32/float64
    # Kernel:      shape=[filter_height, filter_width, in_channels, out_channels]
    # strides: the stride along each dimension of the image
    # padding: a string, "SAME" or "VALID"
    # Returns a Tensor, i.e. the feature map
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


# Pooling: max pooling over 2x2 windows
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')


# [number of images, height, width, number of channels]
x_image = tf.reshape(x, [-1, 28, 28, 1])

# First convolutional layer
# The weight tensor has shape [5, 5, 1, 32]: the first two dimensions are the patch size,
# followed by the number of input channels and then the number of output channels
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# Second convolutional layer
# Each 5x5 patch yields 64 features
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# Densely connected layer
# The image size has been reduced to 7x7; add a fully connected layer with 1024 neurons
W_fc1 = weight_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout
# A placeholder holds the probability that a neuron's output is kept during dropout
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Output layer
# Add a softmax layer
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

# Train and evaluate the model
# The extra keep_prob entry in feed_dict controls the dropout rate
# A log line is printed every 100 iterations
cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess = tf.InteractiveSession()
# Deprecated in later TensorFlow 1.x releases in favour of tf.global_variables_initializer()
sess.run(tf.initialize_all_variables())
for i in range(2000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        # Measure accuracy on the current batch with dropout disabled (keep_prob = 1.0)
        train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    # Dropout is enabled during training (keep_prob = 0.5)
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

# Dropout is disabled during testing
acc = accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0})
print("test accuracy %g" % acc)
sess.close()
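The 7*7*64 in the fully connected layer follows from how each layer changes the tensor shape. As a sanity check, the sketch below traces the per-image shapes through the network in plain Python, without TensorFlow. The helper names same_padding_size and trace_shapes are made up for this example; the tensor names are the ones used in the code above.

```python
import math


def same_padding_size(size, stride):
    # With SAME padding the output size is ceil(size / stride).
    return int(math.ceil(size / float(stride)))


def trace_shapes(height=28, width=28):
    """Return (name, shape) pairs for one image, ignoring the batch dimension."""
    shapes = [("x_image", (height, width, 1))]
    for layer, channels in ((1, 32), (2, 64)):
        # conv2d: stride 1, SAME padding, so height and width are unchanged.
        height, width = same_padding_size(height, 1), same_padding_size(width, 1)
        shapes.append(("h_conv%d" % layer, (height, width, channels)))
        # max_pool_2x2: stride 2, SAME padding, so height and width are halved (rounded up).
        height, width = same_padding_size(height, 2), same_padding_size(width, 2)
        shapes.append(("h_pool%d" % layer, (height, width, channels)))
    # Flatten the last pooled feature map before the fully connected layer.
    shapes.append(("h_pool2_flat", (height * width * 64,)))
    return shapes


for name, shape in trace_shapes():
    print("%-13s %s" % (name, shape))
```

The trace ends at h_pool2 with shape (7, 7, 64), which flattens to 3136 values, the first dimension of W_fc1.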

Output:

step 0, training accuracy 0.14
step 100, training accuracy 0.82
step 200, training accuracy 0.92
step 300, training accuracy 0.88
step 400, training accuracy 0.98
step 500, training accuracy 0.92
step 600, training accuracy 0.98
step 700, training accuracy 0.96
step 800, training accuracy 0.88
step 900, training accuracy 1
step 1000, training accuracy 0.94
step 1100, training accuracy 1
step 1200, training accuracy 0.98
step 1300, training accuracy 0.96
step 1400, training accuracy 0.94
step 1500, training accuracy 0.92
step 1600, training accuracy 0.96
step 1700, training accuracy 0.98
step 1800, training accuracy 0.98
step 1900, training accuracy 0.96
test accuracy 0.9767