TensorFlow Study Notes

References

Zhihu
Morvan Zhou's TensorFlow tutorials (莫烦 Tensorflow)

Installation

# Python 2+ users:
$ pip install tensorflow
# Python 3+ users:
$ pip3 install tensorflow

Updating

# If you are on Python 2, copy the line below
pip uninstall tensorflow
# If you are on Python 3, copy the line below
pip3 uninstall tensorflow

Basic Structure

TensorFlow uses dataflow graphs (you assemble a computation graph, like a pipeline):
- Graph: the structure to be assembled, made up of many operations.
- Operations (ops): take in (flow in) zero or more inputs (the liquid) and return (flow out) zero or more outputs.
- Data types: mainly tensor, variable and constant.

Data exists in the form of tensors (tensor):
- Tensors come in several ranks. A rank-0 tensor is a scalar, i.e. a single value, e.g. [1].
- A rank-1 tensor is a vector, e.g. the one-dimensional [1, 2, 3].
- A rank-2 tensor is a matrix, e.g. the two-dimensional [[1, 2, 3], [4, 5, 6], [7, 8, 9]].
- And so on: rank-3 tensors are three-dimensional, ... multi-dimensional arrays or lists (the liquid in the pipeline).
- Create one with tensor_name = tf.placeholder(type, shape, name), as in the sketch below.
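
A minimal sketch of the three data types and how ops connect them (the shapes and values are illustrative assumptions; the TensorFlow 1.x API used throughout these notes is assumed):

import tensorflow as tf

c = tf.constant([[1., 2., 3.]])                             # constant: a fixed value
v = tf.Variable(tf.zeros([3, 1]), name='v')                 # variable: trainable state, must be initialized
p = tf.placeholder(tf.float32, shape=[None, 3], name='x')   # placeholder: a slot filled via feed_dict at run time

y = tf.matmul(p, v) + c                                     # ops consume tensors and produce tensors

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y, feed_dict={p: [[1., 2., 3.]]}))       # [[1. 2. 3.]]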

import tensorflow as tf
import numpy as np

# create data
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data*0.1 + 0.3

### create tensorflow structure start ###
Weights = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
biases = tf.Variable(tf.zeros([1]))

y = Weights*x_data + biases

loss = tf.reduce_mean(tf.square(y-y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

init = tf.initialize_all_variables()
### create tensorflow structure end ###

sess = tf.Session()
# tf.initialize_all_variables() no longer valid from
# 2017-03-02 if using tensorflow >= 0.12
if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
    init = tf.initialize_all_variables()
else:
    init = tf.global_variables_initializer()
sess.run(init)

for step in range(201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(Weights), sess.run(biases))

Basic Operations

## create an instance
matrix1 = tf.constant([[3., 3.]])
## multiplication and matrix multiplication
mul = tf.mul(tf.constant(3.0), tf.constant(4.0))  # tf.multiply in tensorflow >= 1.0
product = tf.matmul(matrix1, matrix2)             # matrix2 assumed defined with a compatible shape
## subtraction: sub
sub = tf.sub(x, a)                                # tf.subtract in tensorflow >= 1.0
print sub.eval()
## addition: add
new_value = tf.add(state, one)
## update: assign
update = tf.assign(state, new_value)
## randomly generate a 3x1 matrix
tf.random_normal([3, 1])
## generate a 1x2 zero matrix
tf.zeros([1, 2])
tf.zeros([1, 2]) + 0.1  # add 0.1 to every element
## square
tf.square(2)
## index of the maximum value along one axis
tf.argmax(y, 1)

tf.equal(A, B)

tf.equal(A, B) compares the elements of the two matrices or vectors: where they are equal it returns True, otherwise False.

import tensorflow as tf
import numpy as np

A = [[1, 3, 4, 5, 6]]
B = [[1, 3, 4, 3, 2]]

with tf.Session() as sess:
    print(sess.run(tf.equal(A, B)))  # [[ True  True  True False False]]

tf.argmax(input, axis)

tf.argmax(input, axis) returns the index of the maximum value along the given axis: axis=0 works down each column, axis=1 works across each row.
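
A minimal sketch (the matrix values are illustrative assumptions):

import tensorflow as tf

m = tf.constant([[1, 9, 3],
                 [7, 2, 8]])

with tf.Session() as sess:
    print(sess.run(tf.argmax(m, 0)))  # [1 0 1] -> for each column, the row index of its maximum
    print(sess.run(tf.argmax(m, 1)))  # [1 2]   -> for each row, the column index of its maximum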

tf.cast(x, dtype, name=None)

tf.cast(x, dtype, name=None) converts x to the data type dtype.
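
A minimal sketch of the bool-to-float cast that the MNIST evaluation code below relies on (the example values are taken from that section):

import tensorflow as tf

correct_prediction = tf.constant([True, False, True, True])
as_float = tf.cast(correct_prediction, tf.float32)  # True/False become 1.0/0.0

with tf.Session() as sess:
    print(sess.run(as_float))                  # [1. 0. 1. 1.]
    print(sess.run(tf.reduce_mean(as_float)))  # 0.75 -> the accuracy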

tf.reduce_mean(x, axis)

Computes the mean; if the second argument is not given, the mean is taken over all elements.
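
A minimal sketch (the matrix is an illustrative assumption):

import tensorflow as tf

x = tf.constant([[1., 2.],
                 [3., 4.]])

with tf.Session() as sess:
    print(sess.run(tf.reduce_mean(x)))     # 2.5       -> mean over all elements
    print(sess.run(tf.reduce_mean(x, 0)))  # [2. 3.]   -> mean of each column
    print(sess.run(tf.reduce_mean(x, 1)))  # [1.5 3.5] -> mean of each row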

tf.truncated_normal(shape, mean, stddev)

Generates values from a truncated normal distribution with the specified mean and standard deviation.
tf.truncated_normal(shape=[10,10], mean=0, stddev=1)

tf.constant(value, dtype=None, shape=None, name='Const')

Creates a constant tensor filled according to the given value; shape can be used to specify its shape.
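
A minimal sketch (the values and shapes are illustrative assumptions):

import tensorflow as tf

a = tf.constant([1, 2, 3, 4, 5, 6], shape=[2, 3])  # the list is laid out as a 2x3 matrix
b = tf.constant(-1.0, shape=[2, 3])                # a scalar value is broadcast to fill the 2x3 shape

with tf.Session() as sess:
    print(sess.run(a))  # [[1 2 3]
                        #  [4 5 6]]
    print(sess.run(b))  # [[-1. -1. -1.]
                        #  [-1. -1. -1.]]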

Session (session control)

A Session is the statement TensorFlow uses to control execution and produce output: it launches the computation graph you assembled (drawing the liquid out of the pipeline).
Running session.run() returns the result you want to know, or executes just the part of the graph you ask for.
There are two ways to use a Session.
Example: multiply two matrices and print the product.

import tensorflow as tf

# create two matrixes
matrix1 = tf.constant([[3, 3, 1]])
matrix2 = tf.constant([[2],
                       [2],
                       [1]])
product = tf.matmul(matrix1, matrix2)  # defining it does not produce a result; a Session must compute it

# method 1: run the statement sess.run(op)
sess = tf.Session()
result = sess.run(product)
print(result)
sess.close()
# [[13]]

# method 2: a "with" block closes the session automatically
with tf.Session() as sess:
    result2 = sess.run(product)
    print(result2)
# [[13]]

Variables

In TensorFlow, something is a variable only if it has been defined as one.
Definition syntax: state = tf.Variable()
Once variables are defined in TensorFlow, they must be initialized:
init = tf.initialize_all_variables()
sess.run(init) then activates init.

import tensorflow as tf

state = tf.Variable(0, name='counter')
# define the constant one
one = tf.constant(1)
# define the addition step (note: nothing is computed at this point)
new_value = tf.add(state, one)
# update state to new_value
update = tf.assign(state, new_value)

# whenever Variables are defined, they must be initialized
init = tf.initialize_all_variables()  # tf is about to deprecate this form
# init = tf.global_variables_initializer()  # replace it with this

# use a Session
with tf.Session() as sess:
    sess.run(init)
    for _ in range(3):
        sess.run(update)
        print(sess.run(state))  # print(state) by itself does not work

Placeholders

A placeholder is TensorFlow's way of reserving a spot for data and storing it temporarily.
If you want to feed data into TensorFlow from outside, you need tf.placeholder(),
and then pass the data in the form sess.run(*, feed_dict={input: }).
A feed temporarily replaces the output of an operation with a tensor value.
Feed: supply the input values of an operation (pour liquid in).
Fetch: retrieve the output values of an operation (draw liquid out).

import tensorflow as tf

# feed
# in TensorFlow the type of a placeholder must be defined, usually float32
input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)
# mul = multiply: multiply input1 and input2 and return the result as output
output = tf.mul(input1, input2)

# feed: supply the input values of an operation (pour liquid in)
with tf.Session() as sess:
    print(sess.run(output, feed_dict={input1: [7.], input2: [2.]}))

# fetch: retrieve the output values of an operation (draw liquid out)
# to fetch the outputs of operations, pass some tensors in when calling run() on the Session object
input1 = tf.constant(3.0)
input2 = tf.constant(2.0)
input3 = tf.constant(5.0)
intermed = tf.add(input2, input3)
mul = tf.mul(input1, intermed)

with tf.Session() as sess:
    result = sess.run([mul, intermed])
    print result

Activation Functions
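
An activation function turns a layer's linear output Wx + b into a non-linear one; in add_layer() below it is passed in as activation_function. A minimal sketch of the activations used later in these notes (the input values are illustrative assumptions):

import tensorflow as tf

x = tf.constant([[-1.0, 0.0, 2.0]])

with tf.Session() as sess:
    print(sess.run(tf.nn.relu(x)))     # [[0. 0. 2.]]  -> negative values clipped to 0
    print(sess.run(tf.nn.tanh(x)))     # values squashed into (-1, 1)
    print(sess.run(tf.nn.softmax(x)))  # a probability distribution that sums to 1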

Defining a layer-adding function: def add_layer()

Define a function add_layer() that adds a neural layer.
It takes four arguments: the inputs, the input size, the output size, and the activation function.

def add_layer(inputs, in_size, out_size, activation_function=None):
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))  # initial parameters: the weights
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)  # biases; in machine learning a non-zero initial bias is recommended
    Wx_plus_b = tf.matmul(inputs, Weights) + biases  # the layer's value before activation
    # when activation_function is None the output is the current prediction (output = input)
    # otherwise Wx_plus_b is passed through activation_function()
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs

Building the Network

import tensorflow as tf
import numpy as np

def add_layer(inputs, in_size, out_size, activation_function=None):
    # add one more layer and return the output of this layer
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs

# Make up some real data
x_data = np.linspace(-1, 1, 300)[:, np.newaxis]  # linspace returns evenly spaced numbers; np.newaxis makes the shape (300, 1)
noise = np.random.normal(0, 0.05, x_data.shape)
y_data = np.square(x_data) - 0.5 + noise

# define placeholder for inputs to network
xs = tf.placeholder(tf.float32, [None, 1])  # None allows any number of samples; there is only one input feature, hence the 1
ys = tf.placeholder(tf.float32, [None, 1])

# a network with 1 input unit, 10 hidden units and 1 output unit
# add hidden layer
l1 = add_layer(xs, 1, 10, activation_function=tf.nn.relu)
# add output layer
prediction = add_layer(l1, 10, 1, activation_function=None)

# the error between prediction and real data
loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction),
                                    reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# important step
# tf.initialize_all_variables() no longer valid from
# 2017-03-02 if using tensorflow >= 0.12
if int((tf.__version__).split('.')[1]) < 12:
    init = tf.initialize_all_variables()
else:
    init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

for i in range(1000):
    # training
    sess.run(train_step, feed_dict={xs: x_data, ys: y_data})
    if i % 50 == 0:
        # to see the step improvement
        print(sess.run(loss, feed_dict={xs: x_data, ys: y_data}))

## visualize the result
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.scatter(x_data, y_data)
# plt.ion()  # comment this out for a single run, keep it when running everything; plt.ion() enables continuous display
# plt.show()

# every 50 training steps refresh the plot: draw the prediction against the input as a red line of width 5, then pause 0.1 s
for i in range(1000):
    # training
    sess.run(train_step, feed_dict={xs: x_data, ys: y_data})
    if i % 50 == 0:
        # to visualize the result and improvement
        try:
            ax.lines.remove(lines[0])
        except Exception:
            pass
        prediction_value = sess.run(prediction, feed_dict={xs: x_data})
        # plot the prediction
        lines = ax.plot(x_data, prediction_value, 'r-', lw=5)
        plt.pause(0.1)
plt.show()

Optimizers

TensorFlow's optimizers:
The most commonly used is GradientDescentOptimizer.

tf.train.Optimizer
tf.train.GradientDescentOptimizer
tf.train.AdadeltaOptimizer
tf.train.AdagradOptimizer
tf.train.AdagradDAOptimizer
tf.train.MomentumOptimizer
tf.train.AdamOptimizer
tf.train.FtrlOptimizer
tf.train.ProximalGradientDescentOptimizer
tf.train.ProximalAdagradOptimizer
tf.train.RMSPropOptimizer

Optimizer overview
The recommended updates are SGD + Nesterov Momentum, or Adam.
- Stochastic Gradient Descent (SGD): the data is split into small batches which are fed into the NN batch by batch, which speeds up the computation.
- AdaGrad: adapts the learning rate; v += dx^2, W += -learning_rate * dx / sqrt(v); it acts as resistance against wrong directions.
- Momentum: replaces W += -learning_rate * dx with m = b1 * m - learning_rate * dx, W += m; descends faster thanks to an inertia principle.
- RMSProp: combines Momentum's inertia principle with AdaGrad's resistance against wrong directions.
- Adam: computing m carries Momentum's downhill property, computing v carries AdaGrad's resistance property, and the parameter update takes both m and v into account. In most cases Adam reaches the goal quickly and well, converging rapidly.
A comparison of the various optimizers:
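
A minimal sketch of how the different tf.train optimizers are swapped in (the toy loss and the learning rates are illustrative assumptions); only the constructor changes, the minimize() call stays the same:

import tensorflow as tf

# a toy quadratic loss over a single variable, just to show the optimizer API
w = tf.Variable(5.0)
loss = tf.square(w - 2.0)

train_gd = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(loss)
train_momentum = tf.train.MomentumOptimizer(learning_rate=0.01, momentum=0.9).minimize(loss)
train_adam = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(train_gd)      # swap in train_momentum or train_adam to compare
    print(sess.run([w, loss]))  # w approaches 2.0, loss approaches 0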

TensorBoard Visualization

With this tool we can see the structure and framework of the whole neural network very intuitively; view it in Chrome.
Use with tf.name_scope to name each layer.
Start the service with tensorboard --logdir logs, then open the web address it prints and inspect the details under Graphs.

The process of visualizing the whole neural network structure with TensorBoard:

import tensorflow as tf

def add_layer(inputs, in_size, out_size, activation_function=None):
    # add one more layer and return the output of this layer
    with tf.name_scope('layer'):
        with tf.name_scope('weights'):
            Weights = tf.Variable(tf.random_normal([in_size, out_size]), name='W')
        with tf.name_scope('biases'):
            biases = tf.Variable(tf.zeros([1, out_size]) + 0.1, name='b')
        with tf.name_scope('Wx_plus_b'):
            Wx_plus_b = tf.add(tf.matmul(inputs, Weights), biases)
        if activation_function is None:
            outputs = Wx_plus_b
        else:
            outputs = activation_function(Wx_plus_b, )
        return outputs

# define placeholder for inputs to network
with tf.name_scope('inputs'):
    xs = tf.placeholder(tf.float32, [None, 1], name='x_input')
    ys = tf.placeholder(tf.float32, [None, 1], name='y_input')

# add hidden layer
l1 = add_layer(xs, 1, 10, activation_function=tf.nn.relu)
# add output layer
prediction = add_layer(l1, 10, 1, activation_function=None)

# the error between prediction and real data
with tf.name_scope('loss'):
    loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction),
                                        reduction_indices=[1]))

with tf.name_scope('train'):
    train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

sess = tf.Session()

# tf.train.SummaryWriter will soon be deprecated, use the following
if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
    # tensorflow version < 0.12: save the graph 'drawn' above into a directory
    writer = tf.train.SummaryWriter('logs/', sess.graph)
else:  # tensorflow version >= 0.12
    writer = tf.summary.FileWriter("logs/", sess.graph)

# tf.initialize_all_variables() no longer valid from
# 2017-03-02 if using tensorflow >= 0.12
if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
    init = tf.initialize_all_variables()
else:
    init = tf.global_variables_initializer()
sess.run(init)

# direct to the local dir and run this in terminal:
# $ tensorboard --logdir=logs

Visualizing the training process:

import tensorflow as tfimport numpy as np#在 layer 中为 Weights, biases 设置变化图表#添加一个参数 n_layer,用来标识层数,layer_name 代表其每层的名称def add_layer(inputs, in_size, out_size, n_layer, activation_function=None):    # add one more layer and return the output of this layer     layer_name = 'layer%s' % n_layer    with tf.name_scope(layer_name):        with tf.name_scope('weights'):            Weights = tf.Variable(tf.random_normal([in_size, out_size]), name='W')            #tf.histogram_summary()方法,用来绘制图片, 第一个参数是图表的名称, 第二个参数是图表要记录的变量            tf.histogram_summary(layer_name+'/weights',Weights)   # tensorflow 0.12 以下版的            # tf.summary.histogram(layer_name + '/weights', Weights) # tensorflow >= 0.12        with tf.name_scope('biases'):            biases = tf.Variable(tf.zeros([1, out_size]) + 0.1, name='b')            tf.histogram_summary(layer_name+'/biases',biases)            #tf.summary.histogram(layer_name + '/biases', biases)        with tf.name_scope('Wx_plus_b'):            Wx_plus_b = tf.add(tf.matmul(inputs, Weights), biases)        if activation_function is None:            outputs = Wx_plus_b        else:            outputs = activation_function(Wx_plus_b, )        tf.histogram_summary(layer_name + '/outputs', outputs)        #tf.summary.histogram(layer_name + '/outputs', outputs)    return outputs# Make up some real datax_data = np.linspace(-1, 1, 300)[:, np.newaxis]noise = np.random.normal(0, 0.05, x_data.shape)y_data = np.square(x_data) - 0.5 + noise# define placeholder for inputs to networkwith tf.name_scope('inputs'):    xs = tf.placeholder(tf.float32, [None, 1], name='x_input')    ys = tf.placeholder(tf.float32, [None, 1], name='y_input')#可视化,显示每一层的情况,添加n_layer参数# add hidden layerl1 = add_layer(xs, 1, 10, n_layer=1, activation_function=tf.nn.relu)# add output layerprediction = add_layer(l1, 10, 1, n_layer=2, activation_function=None)# the error between prediciton and real datawith tf.name_scope('loss'):    loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction),                                        reduction_indices=[1]))    #tf.scalar_summary() 方法.在tesnorBorad 的event下面可视化           tf.scalar_summary('loss',loss) # tensorflow < 0.12    #tf.summary.scalar('loss', loss) # tensorflow >= 0.12                            with tf.name_scope('train'):    train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)sess = tf.Session()#给所有训练图‘合并‘ tf.merge_all_summaries()merged= tf.merge_all_summaries()    # tensorflow < 0.12# merged = tf.summary.merge_all()  # tensorflow >= 0.12writer = tf.train.SummaryWriter('logs/', sess.graph)    # tensorflow < 0.12# writer = tf.summary.FileWriter("logs/", sess.graph) # tensorflow >=0.12init = tf.initialize_all_variables() # tensorflow < 0.12# init = tf.global_variables_initializer()sess.run(init)for i in range(1000):    sess.run(train_step, feed_dict={xs: x_data, ys: y_data})    if i % 50 == 0:        ##run放入merged记录训练结果        result = sess.run(merged,                          feed_dict={xs: x_data, ys: y_data})        writer.add_summary(result, i)# direct to the local dir and run this in terminal:# $ tensorboard --logdir logs

Classification

Getting started with MNIST machine learning, using Softmax.

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
# train: 60000 images of 28x28 pixels, test: 10000 images of 28x28 pixels; 28x28 = 784

import tensorflow as tf

# placeholder
x = tf.placeholder("float", [None, 784])
# variables
W = tf.Variable(tf.zeros([784, 10]))  # digits 0-9: 10 classes in total
b = tf.Variable(tf.zeros([10]))

## the softmax model assigns probabilities to the different classes:
## it treats the inputs as exponents and then normalizes the results
y = tf.nn.softmax(tf.matmul(x, W) + b)

# train the model
y_ = tf.placeholder("float", [None, 10])
# cross entropy
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
# gradient descent with backpropagation
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

# training
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

## evaluation
# tf.argmax(y,1) is the index of the maximum value along axis 1
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
# cast the boolean correct_prediction to floats: [True, False, True, True] becomes [1,0,1,1]
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
## the best models can reach over 99.7% accuracy:
## http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html
print sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})
import tensorflow as tffrom tensorflow.examples.tutorials.mnist import input_data# number 1 to 10 datamnist = input_data.read_data_sets('MNIST_data', one_hot=True)def add_layer(inputs, in_size, out_size, activation_function=None,):    # add one more layer and return the output of this layer    Weights = tf.Variable(tf.random_normal([in_size, out_size]))    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1,)    Wx_plus_b = tf.matmul(inputs, Weights) + biases    if activation_function is None:        outputs = Wx_plus_b    else:        outputs = activation_function(Wx_plus_b,)    return outputsdef compute_accuracy(v_xs, v_ys):    global prediction    y_pre = sess.run(prediction, feed_dict={xs: v_xs})    correct_prediction = tf.equal(tf.argmax(y_pre,1), tf.argmax(v_ys,1))    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys})    return result# define placeholder for inputs to networkxs = tf.placeholder(tf.float32, [None, 784]) # 28x28ys = tf.placeholder(tf.float32, [None, 10])# add output layerprediction = add_layer(xs, 784, 10,  activation_function=tf.nn.softmax)# the error between prediction and real datacross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction),                                              reduction_indices=[1]))       # losstrain_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)sess = tf.Session()# important step# tf.initialize_all_variables() no long valid from# 2017-03-02 if using tensorflow >= 0.12if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:    init = tf.initialize_all_variables()else:    init = tf.global_variables_initializer()sess.run(init)for i in range(1000):    batch_xs, batch_ys = mnist.train.next_batch(100)    sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys})    if i % 50 == 0:        print(compute_accuracy(            mnist.test.images, mnist.test.labels))

Dropout to Address Overfitting

keep_prob is the keep probability, i.e. the proportion of results we want to retain; it is a placeholder whose value is passed in via feed_dict at run time, as in the sketch below.
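
A minimal sketch of the idea, matching the add_layer() used in the full example below (the layer sizes are taken from that example): dropout is applied to the layer's pre-activation value, and keep_prob is fed as a small value during training but as 1 during evaluation:

import tensorflow as tf

keep_prob = tf.placeholder(tf.float32)           # the keep probability, fed in at run time
xs = tf.placeholder(tf.float32, [None, 64])      # 8x8 digit images

Weights = tf.Variable(tf.random_normal([64, 50]))
biases = tf.Variable(tf.zeros([1, 50]) + 0.1)
Wx_plus_b = tf.matmul(xs, Weights) + biases
Wx_plus_b = tf.nn.dropout(Wx_plus_b, keep_prob)  # randomly drops a fraction (1 - keep_prob) of the values
outputs = tf.nn.tanh(Wx_plus_b)

# training:   sess.run(train_step, feed_dict={xs: X_train, ys: y_train, keep_prob: 0.5})
# evaluation: sess.run(loss,       feed_dict={xs: X_test,  ys: y_test,  keep_prob: 1})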

import tensorflow as tffrom sklearn.datasets import load_digitsfrom sklearn.cross_validation import train_test_splitfrom sklearn.preprocessing import LabelBinarizer# load datadigits = load_digits()X = digits.datay = digits.targety = LabelBinarizer().fit_transform(y)X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3)def add_layer(inputs, in_size, out_size, layer_name, activation_function=None, ):    # add one more layer and return the output of this layer    Weights = tf.Variable(tf.random_normal([in_size, out_size]))    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1, )    Wx_plus_b = tf.matmul(inputs, Weights) + biases    # here to dropout    Wx_plus_b = tf.nn.dropout(Wx_plus_b, keep_prob)    if activation_function is None:        outputs = Wx_plus_b    else:        outputs = activation_function(Wx_plus_b, )    # tf.histogram_summary(layer_name + '/outputs', outputs)          ##tf.summary.histogram(layer_name + '/outputs', outputs)    return outputs# define placeholder for inputs to network#keep_prob是保留概率,即我们要保留的结果所占比例,它作为一个placeholder,在run时传入keep_prob = tf.placeholder(tf.float32)xs = tf.placeholder(tf.float32, [None, 64])  # 8x8ys = tf.placeholder(tf.float32, [None, 10])# add output layerl1 = add_layer(xs, 64, 50, 'l1', activation_function=tf.nn.tanh)prediction = add_layer(l1, 50, 10, 'l2', activation_function=tf.nn.softmax)# the loss between prediction and real datacross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction),                                              reduction_indices=[1]))  # loss# tf.scalar_summary('loss',cross_entropy) #tensorflow<0.12                      #tf.summary.scalar('loss', cross_entropy)train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)sess = tf.Session() #merged = tf.summary.merge_all()# merged= tf.merge_all_summaries()    # tensorflow < 0.12# summary writer goes in here# train_writer = tf.SummaryWriter('logs/train', sess.graph)    # tensorflow < 0.12# test_writer = tf.SummaryWriter('logs/test', sess.graph)    # tensorflow < 0.12#train_writer = tf.summary.FileWriter("logs/train", sess.graph)#test_writer = tf.summary.FileWriter("logs/test", sess.graph)# tf.initialize_all_variables() no long valid from# 2017-03-02 if using tensorflow >= 0.12if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:    init = tf.initialize_all_variables()else:    init = tf.global_variables_initializer()sess.run(init)for i in range(500):    # here to determine the keeping probability    sess.run(train_step, feed_dict={xs: X_train, ys: y_train, keep_prob: 0.5})    if i % 50 == 0:        # record loss        train_result = sess.run(train_step,feed_dict={xs: X_train, ys: y_train, keep_prob: 1})        test_result = sess.run(train_step,feed_dict={xs: X_test, ys: y_test, keep_prob: 1})#        train_result = sess.run(merged, feed_dict={xs: X_train, ys: y_train, keep_prob: 1})#        test_result = sess.run(merged, feed_dict={xs: X_test, ys: y_test, keep_prob: 1})#       train_writer.add_summary(train_result, i)#       test_writer.add_summary(test_result, i)

CNN (Convolutional Neural Networks)

from __future__ import print_functionimport tensorflow as tffrom tensorflow.examples.tutorials.mnist import input_data# number 1 to 10 datamnist = input_data.read_data_sets('MNIST_data', one_hot=True)def compute_accuracy(v_xs, v_ys):    global prediction    y_pre = sess.run(prediction, feed_dict={xs: v_xs, keep_prob: 1})    correct_prediction = tf.equal(tf.argmax(y_pre,1), tf.argmax(v_ys,1))    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys, keep_prob: 1})    return result##产生随机变量def weight_variable(shape):    initial = tf.truncated_normal(shape, stddev=0.1)    return tf.Variable(initial)def bias_variable(shape):    initial = tf.constant(0.1, shape=shape)    return tf.Variable(initial)##tf.nn.conv2d二维的卷积函数 x是图片的所有参数,W是此卷积层的权重 strides步长 中间两个1代表padding时在x方向运动一步,y方向运动一步def conv2d(x, W):    # stride [1, x_movement, y_movement, 1]    # Must have strides[0] = strides[3] = 1    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')def max_pool_2x2(x):    # stride [1, x_movement, y_movement, 1]    return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')# define placeholder for inputs to networkxs = tf.placeholder(tf.float32, [None, 784])/255.   # 28x28ys = tf.placeholder(tf.float32, [None, 10])keep_prob = tf.placeholder(tf.float32)## -1 不管输入大小 1是黑白 rgb 3通道x_image = tf.reshape(xs, [-1, 28, 28, 1])# print(x_image.shape)  # [n_samples, 28,28,1]## conv1 layer ##W_conv1 = weight_variable([5,5, 1,32]) # patch 5x5, in size 1, out size 32b_conv1 = bias_variable([32])h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1) # output size 28x28x32h_pool1 = max_pool_2x2(h_conv1)                                         # output size 14x14x32## conv2 layer ##W_conv2 = weight_variable([5,5, 32, 64]) # patch 5x5, in size 32, out size 64b_conv2 = bias_variable([64])h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2) # output size 14x14x64h_pool2 = max_pool_2x2(h_conv2)                                         # output size 7x7x64## fc1 layer ##W_fc1 = weight_variable([7*7*64, 1024])b_fc1 = bias_variable([1024])# [n_samples, 7, 7, 64] ->> [n_samples, 7*7*64]h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)## fc2 layer ##W_fc2 = weight_variable([1024, 10])b_fc2 = bias_variable([10])prediction = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)# the error between prediction and real datacross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction),                                              reduction_indices=[1]))       # losstrain_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)sess = tf.Session()# important step# tf.initialize_all_variables() no long valid from# 2017-03-02 if using tensorflow >= 0.12if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:    init = tf.initialize_all_variables()else:    init = tf.global_variables_initializer()sess.run(init)for i in range(1000):    batch_xs, batch_ys = mnist.train.next_batch(100)    sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys, keep_prob: 0.5})    if i % 50 == 0:        print(compute_accuracy(            mnist.test.images, mnist.test.labels))

Saving and Restoring a Model

from __future__ import print_function
import tensorflow as tf
import numpy as np

# Save to file
# remember to define the same dtype and shape when restore
# W = tf.Variable([[1,2,3],[3,4,5]], dtype=tf.float32, name='weights')
# b = tf.Variable([[1,2,3]], dtype=tf.float32, name='biases')

# tf.initialize_all_variables() no longer valid from
# 2017-03-02 if using tensorflow >= 0.12
# if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
#     init = tf.initialize_all_variables()
# else:
#     init = tf.global_variables_initializer()
#
# saver = tf.train.Saver()
#
# with tf.Session() as sess:
#    sess.run(init)
#    save_path = saver.save(sess, "my_net/save_net.ckpt")
#    print("Save to path: ", save_path)

################################################
# restore variables
# redefine the same shape and same type for your variables
W = tf.Variable(np.arange(6).reshape((2, 3)), dtype=tf.float32, name="weights")
b = tf.Variable(np.arange(3).reshape((1, 3)), dtype=tf.float32, name="biases")

# no init step needed
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, "my_net/save_net.ckpt")
    print("weights:", sess.run(W))
    print("biases:", sess.run(b))

RNN Regression

import tensorflow as tfimport numpy as npimport matplotlib.pyplot as pltBATCH_START = 0TIME_STEPS = 20BATCH_SIZE = 50INPUT_SIZE = 1OUTPUT_SIZE = 1CELL_SIZE = 10LR = 0.006def get_batch():    global BATCH_START, TIME_STEPS    # xs shape (50batch, 20steps)    xs = np.arange(BATCH_START, BATCH_START+TIME_STEPS*BATCH_SIZE).reshape((BATCH_SIZE, TIME_STEPS)) / (10*np.pi)    seq = np.sin(xs)    res = np.cos(xs)    BATCH_START += TIME_STEPS    # plt.plot(xs[0, :], res[0, :], 'r', xs[0, :], seq[0, :], 'b--')    # plt.show()    # returned seq, res and xs: shape (batch, step, input)    return [seq[:, :, np.newaxis], res[:, :, np.newaxis], xs]class LSTMRNN(object):    def __init__(self, n_steps, input_size, output_size, cell_size, batch_size):        self.n_steps = n_steps        self.input_size = input_size        self.output_size = output_size        self.cell_size = cell_size        self.batch_size = batch_size        with tf.name_scope('inputs'):            self.xs = tf.placeholder(tf.float32, [None, n_steps, input_size], name='xs')            self.ys = tf.placeholder(tf.float32, [None, n_steps, output_size], name='ys')        with tf.variable_scope('in_hidden'):            self.add_input_layer()        with tf.variable_scope('LSTM_cell'):            self.add_cell()        with tf.variable_scope('out_hidden'):            self.add_output_layer()        with tf.name_scope('cost'):            self.compute_cost()        with tf.name_scope('train'):            self.train_op = tf.train.AdamOptimizer(LR).minimize(self.cost)    def add_input_layer(self,):        l_in_x = tf.reshape(self.xs, [-1, self.input_size], name='2_2D')  # (batch*n_step, in_size)        # Ws (in_size, cell_size)        Ws_in = self._weight_variable([self.input_size, self.cell_size])        # bs (cell_size, )        bs_in = self._bias_variable([self.cell_size,])        # l_in_y = (batch * n_steps, cell_size)        with tf.name_scope('Wx_plus_b'):            l_in_y = tf.matmul(l_in_x, Ws_in) + bs_in        # reshape l_in_y ==> (batch, n_steps, cell_size)        self.l_in_y = tf.reshape(l_in_y, [-1, self.n_steps, self.cell_size], name='2_3D')    def add_cell(self):        lstm_cell = tf.contrib.rnn.BasicLSTMCell(self.cell_size, forget_bias=1.0, state_is_tuple=True)        with tf.name_scope('initial_state'):            self.cell_init_state = lstm_cell.zero_state(self.batch_size, dtype=tf.float32)        self.cell_outputs, self.cell_final_state = tf.nn.dynamic_rnn(            lstm_cell, self.l_in_y, initial_state=self.cell_init_state, time_major=False)    def add_output_layer(self):        # shape = (batch * steps, cell_size)        l_out_x = tf.reshape(self.cell_outputs, [-1, self.cell_size], name='2_2D')        Ws_out = self._weight_variable([self.cell_size, self.output_size])        bs_out = self._bias_variable([self.output_size, ])        # shape = (batch * steps, output_size)        with tf.name_scope('Wx_plus_b'):            self.pred = tf.matmul(l_out_x, Ws_out) + bs_out    def compute_cost(self):        losses = tf.contrib.legacy_seq2seq.sequence_loss_by_example(            [tf.reshape(self.pred, [-1], name='reshape_pred')],            [tf.reshape(self.ys, [-1], name='reshape_target')],            [tf.ones([self.batch_size * self.n_steps], dtype=tf.float32)],            average_across_timesteps=True,            softmax_loss_function=self.ms_error,            name='losses'        )        with tf.name_scope('average_cost'):            self.cost = tf.div(                tf.reduce_sum(losses, name='losses_sum'),                
self.batch_size,                name='average_cost')            tf.summary.scalar('cost', self.cost)    def ms_error(self, labels, logits):        return tf.square(tf.subtract(labels, logits))    def _weight_variable(self, shape, name='weights'):        initializer = tf.random_normal_initializer(mean=0., stddev=1.,)        return tf.get_variable(shape=shape, initializer=initializer, name=name)    def _bias_variable(self, shape, name='biases'):        initializer = tf.constant_initializer(0.1)        return tf.get_variable(name=name, shape=shape, initializer=initializer)if __name__ == '__main__':    model = LSTMRNN(TIME_STEPS, INPUT_SIZE, OUTPUT_SIZE, CELL_SIZE, BATCH_SIZE)    sess = tf.Session()    merged = tf.summary.merge_all()    writer = tf.summary.FileWriter("logs", sess.graph)    # tf.initialize_all_variables() no long valid from    # 2017-03-02 if using tensorflow >= 0.12    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:        init = tf.initialize_all_variables()    else:        init = tf.global_variables_initializer()    sess.run(init)    # relocate to the local dir and run this line to view it on Chrome (http://0.0.0.0:6006/):    # $ tensorboard --logdir='logs'    plt.ion()    plt.show()    for i in range(200):        seq, res, xs = get_batch()        if i == 0:            feed_dict = {                    model.xs: seq,                    model.ys: res,                    # create initial state            }        else:            feed_dict = {                model.xs: seq,                model.ys: res,                model.cell_init_state: state    # use last state as the initial state for this run            }        _, cost, state, pred = sess.run(            [model.train_op, model.cost, model.cell_final_state, model.pred],            feed_dict=feed_dict)        # plotting        plt.plot(xs[0, :], res[0].flatten(), 'r', xs[0, :], pred.flatten()[:TIME_STEPS], 'b--')        plt.ylim((-1.2, 1.2))        plt.draw()        plt.pause(0.3)        if i % 20 == 0:            print('cost: ', round(cost, 4))            result = sess.run(merged, feed_dict)            writer.add_summary(result, i)

RNN Classification

import tensorflow as tffrom tensorflow.examples.tutorials.mnist import input_data# set random seed for comparing the two result calculationstf.set_random_seed(1)# this is datamnist = input_data.read_data_sets('MNIST_data', one_hot=True)# hyperparameterslr = 0.001training_iters = 100000batch_size = 128n_inputs = 28   # MNIST data input (img shape: 28*28)n_steps = 28    # time stepsn_hidden_units = 128   # neurons in hidden layern_classes = 10      # MNIST classes (0-9 digits)# tf Graph inputx = tf.placeholder(tf.float32, [None, n_steps, n_inputs])y = tf.placeholder(tf.float32, [None, n_classes])# Define weightsweights = {    # (28, 128)    'in': tf.Variable(tf.random_normal([n_inputs, n_hidden_units])),    # (128, 10)    'out': tf.Variable(tf.random_normal([n_hidden_units, n_classes]))}biases = {    # (128, )    'in': tf.Variable(tf.constant(0.1, shape=[n_hidden_units, ])),    # (10, )    'out': tf.Variable(tf.constant(0.1, shape=[n_classes, ]))}def RNN(X, weights, biases):    # hidden layer for input to cell    ########################################    # transpose the inputs shape from    # X ==> (128 batch * 28 steps, 28 inputs)    X = tf.reshape(X, [-1, n_inputs])    # into hidden    # X_in = (128 batch * 28 steps, 128 hidden)    X_in = tf.matmul(X, weights['in']) + biases['in']    # X_in ==> (128 batch, 28 steps, 128 hidden)    X_in = tf.reshape(X_in, [-1, n_steps, n_hidden_units])    # cell    ##########################################    # basic LSTM Cell.    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:        cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden_units, forget_bias=1.0, state_is_tuple=True)    else:        cell = tf.contrib.rnn.BasicLSTMCell(n_hidden_units)    # lstm cell is divided into two parts (c_state, h_state)    init_state = cell.zero_state(batch_size, dtype=tf.float32)    # You have 2 options for following step.    # 1: tf.nn.rnn(cell, inputs);    # 2: tf.nn.dynamic_rnn(cell, inputs).    # If use option 1, you have to modified the shape of X_in, go and check out this:    # https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/recurrent_network.py    # In here, we go for option 2.    # dynamic_rnn receive Tensor (batch, steps, inputs) or (steps, batch, inputs) as X_in.    # Make sure the time_major is changed accordingly.    outputs, final_state = tf.nn.dynamic_rnn(cell, X_in, initial_state=init_state, time_major=False)    # hidden layer for output as the final results    #############################################    # results = tf.matmul(final_state[1], weights['out']) + biases['out']    # # or    # unpack to list [(batch, outputs)..] 
* steps    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:        outputs = tf.unpack(tf.transpose(outputs, [1, 0, 2]))    # states is the last outputs    else:        outputs = tf.unstack(tf.transpose(outputs, [1,0,2]))    results = tf.matmul(outputs[-1], weights['out']) + biases['out']    # shape = (128, 10)    return resultspred = RNN(x, weights, biases)cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))train_op = tf.train.AdamOptimizer(lr).minimize(cost)correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))with tf.Session() as sess:    # tf.initialize_all_variables() no long valid from    # 2017-03-02 if using tensorflow >= 0.12    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:        init = tf.initialize_all_variables()    else:        init = tf.global_variables_initializer()    sess.run(init)    step = 0    while step * batch_size < training_iters:        batch_xs, batch_ys = mnist.train.next_batch(batch_size)        batch_xs = batch_xs.reshape([batch_size, n_steps, n_inputs])        sess.run([train_op], feed_dict={            x: batch_xs,            y: batch_ys,        })        if step % 20 == 0:            print(sess.run(accuracy, feed_dict={            x: batch_xs,            y: batch_ys,            }))        step += 1

Autoencoder

# View more python learning tutorial on my Youtube and Youku channel!!!# My tutorial website: https://morvanzhou.github.io/tutorials/from __future__ import division, print_function, absolute_importimport tensorflow as tfimport numpy as npimport matplotlib.pyplot as plt# Import MNIST datafrom tensorflow.examples.tutorials.mnist import input_datamnist = input_data.read_data_sets("/tmp/data/", one_hot=False)# Visualize decoder setting# Parameterslearning_rate = 0.01training_epochs = 5batch_size = 256display_step = 1examples_to_show = 10# Network Parametersn_input = 784  # MNIST data input (img shape: 28*28)# tf Graph input (only pictures)X = tf.placeholder("float", [None, n_input])# hidden layer settingsn_hidden_1 = 256 # 1st layer num featuresn_hidden_2 = 128 # 2nd layer num featuresweights = {    'encoder_h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),    'encoder_h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),    'decoder_h1': tf.Variable(tf.random_normal([n_hidden_2, n_hidden_1])),    'decoder_h2': tf.Variable(tf.random_normal([n_hidden_1, n_input])),}biases = {    'encoder_b1': tf.Variable(tf.random_normal([n_hidden_1])),    'encoder_b2': tf.Variable(tf.random_normal([n_hidden_2])),    'decoder_b1': tf.Variable(tf.random_normal([n_hidden_1])),    'decoder_b2': tf.Variable(tf.random_normal([n_input])),}# Building the encoderdef encoder(x):    # Encoder Hidden layer with sigmoid activation #1    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['encoder_h1']),                                   biases['encoder_b1']))    # Decoder Hidden layer with sigmoid activation #2    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['encoder_h2']),                                   biases['encoder_b2']))    return layer_2# Building the decoderdef decoder(x):    # Encoder Hidden layer with sigmoid activation #1    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['decoder_h1']),                                   biases['decoder_b1']))    # Decoder Hidden layer with sigmoid activation #2    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['decoder_h2']),                                   biases['decoder_b2']))    return layer_2"""# Visualize encoder setting# Parameterslearning_rate = 0.01    # 0.01 this learning rate will be better! 
Testedtraining_epochs = 10batch_size = 256display_step = 1# Network Parametersn_input = 784  # MNIST data input (img shape: 28*28)# tf Graph input (only pictures)X = tf.placeholder("float", [None, n_input])# hidden layer settingsn_hidden_1 = 128n_hidden_2 = 64n_hidden_3 = 10n_hidden_4 = 2weights = {    'encoder_h1': tf.Variable(tf.truncated_normal([n_input, n_hidden_1],)),    'encoder_h2': tf.Variable(tf.truncated_normal([n_hidden_1, n_hidden_2],)),    'encoder_h3': tf.Variable(tf.truncated_normal([n_hidden_2, n_hidden_3],)),    'encoder_h4': tf.Variable(tf.truncated_normal([n_hidden_3, n_hidden_4],)),    'decoder_h1': tf.Variable(tf.truncated_normal([n_hidden_4, n_hidden_3],)),    'decoder_h2': tf.Variable(tf.truncated_normal([n_hidden_3, n_hidden_2],)),    'decoder_h3': tf.Variable(tf.truncated_normal([n_hidden_2, n_hidden_1],)),    'decoder_h4': tf.Variable(tf.truncated_normal([n_hidden_1, n_input],)),}biases = {    'encoder_b1': tf.Variable(tf.random_normal([n_hidden_1])),    'encoder_b2': tf.Variable(tf.random_normal([n_hidden_2])),    'encoder_b3': tf.Variable(tf.random_normal([n_hidden_3])),    'encoder_b4': tf.Variable(tf.random_normal([n_hidden_4])),    'decoder_b1': tf.Variable(tf.random_normal([n_hidden_3])),    'decoder_b2': tf.Variable(tf.random_normal([n_hidden_2])),    'decoder_b3': tf.Variable(tf.random_normal([n_hidden_1])),    'decoder_b4': tf.Variable(tf.random_normal([n_input])),}def encoder(x):    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['encoder_h1']),                                   biases['encoder_b1']))    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['encoder_h2']),                                   biases['encoder_b2']))    layer_3 = tf.nn.sigmoid(tf.add(tf.matmul(layer_2, weights['encoder_h3']),                                   biases['encoder_b3']))    layer_4 = tf.add(tf.matmul(layer_3, weights['encoder_h4']),                                    biases['encoder_b4'])    return layer_4def decoder(x):    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['decoder_h1']),                                   biases['decoder_b1']))    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['decoder_h2']),                                   biases['decoder_b2']))    layer_3 = tf.nn.sigmoid(tf.add(tf.matmul(layer_2, weights['decoder_h3']),                                biases['decoder_b3']))    layer_4 = tf.nn.sigmoid(tf.add(tf.matmul(layer_3, weights['decoder_h4']),                                biases['decoder_b4']))    return layer_4"""# Construct modelencoder_op = encoder(X)decoder_op = decoder(encoder_op)# Predictiony_pred = decoder_op# Targets (Labels) are the input data.y_true = X# Define loss and optimizer, minimize the squared errorcost = tf.reduce_mean(tf.pow(y_true - y_pred, 2))optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)# Launch the graphwith tf.Session() as sess:    # tf.initialize_all_variables() no long valid from    # 2017-03-02 if using tensorflow >= 0.12    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:        init = tf.initialize_all_variables()    else:        init = tf.global_variables_initializer()    sess.run(init)    total_batch = int(mnist.train.num_examples/batch_size)    # Training cycle    for epoch in range(training_epochs):        # Loop over all batches        for i in range(total_batch):            batch_xs, batch_ys = mnist.train.next_batch(batch_size)  # max(x) = 1, min(x) = 0            # Run optimization op (backprop) and cost op (to get loss 
value)            _, c = sess.run([optimizer, cost], feed_dict={X: batch_xs})        # Display logs per epoch step        if epoch % display_step == 0:            print("Epoch:", '%04d' % (epoch+1),                  "cost=", "{:.9f}".format(c))    print("Optimization Finished!")    # # Applying encode and decode over test set    encode_decode = sess.run(        y_pred, feed_dict={X: mnist.test.images[:examples_to_show]})    # Compare original images with their reconstructions    f, a = plt.subplots(2, 10, figsize=(10, 2))    for i in range(examples_to_show):        a[0][i].imshow(np.reshape(mnist.test.images[i], (28, 28)))        a[1][i].imshow(np.reshape(encode_decode[i], (28, 28)))    plt.show()    # encoder_result = sess.run(encoder_op, feed_dict={X: mnist.test.images})    # plt.scatter(encoder_result[:, 0], encoder_result[:, 1], c=mnist.test.labels)    # plt.colorbar()    # plt.show()

Batch Normalization

During the forward propagation of each batch of data, a normalization step is applied to every layer.
Batch Normalization (BN) is inserted between each fully connected layer and its activation function, as in the sketch below.
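
A minimal sketch of the normalization step used in the full example below (the layer size is illustrative; epsilon and the scale/shift parameters follow that example): the pre-activation Wx + b is normalized over the batch dimension and then rescaled with learned scale and shift variables:

import tensorflow as tf

out_size = 30
Wx_plus_b = tf.placeholder(tf.float32, [None, out_size])  # pre-activation values of one layer

# batch statistics over axis 0 (the batch dimension)
fc_mean, fc_var = tf.nn.moments(Wx_plus_b, axes=[0])

scale = tf.Variable(tf.ones([out_size]))    # learned rescaling
shift = tf.Variable(tf.zeros([out_size]))   # learned re-centering
epsilon = 0.001

Wx_plus_b_bn = tf.nn.batch_normalization(Wx_plus_b, fc_mean, fc_var, shift, scale, epsilon)
# roughly equivalent to: (Wx_plus_b - fc_mean) / tf.sqrt(fc_var + epsilon) * scale + shift
outputs = tf.nn.relu(Wx_plus_b_bn)          # BN sits between the linear layer and the activation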

# 23 Batch Normalizationimport numpy as npimport tensorflow as tfimport matplotlib.pyplot as pltACTIVATION = tf.nn.reluN_LAYERS = 7N_HIDDEN_UNITS = 30def fix_seed(seed=1):    # reproducible    np.random.seed(seed)    tf.set_random_seed(seed)def plot_his(inputs, inputs_norm):    # plot histogram for the inputs of every layer    for j, all_inputs in enumerate([inputs, inputs_norm]):        for i, input in enumerate(all_inputs):            plt.subplot(2, len(all_inputs), j*len(all_inputs)+(i+1))            plt.cla()            if i == 0:                the_range = (-7, 10)            else:                the_range = (-1, 1)            plt.hist(input.ravel(), bins=15, range=the_range, color='#FF5733')            plt.yticks(())            if j == 1:                plt.xticks(the_range)            else:                plt.xticks(())            ax = plt.gca()            ax.spines['right'].set_color('none')            ax.spines['top'].set_color('none')        plt.title("%s normalizing" % ("Without" if j == 0 else "With"))    plt.draw()    plt.pause(0.01)def built_net(xs, ys, norm):    def add_layer(inputs, in_size, out_size, activation_function=None, norm=False):        # weights and biases (bad initialization for this case)        Weights = tf.Variable(tf.random_normal([in_size, out_size], mean=0., stddev=1.))        biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)        # fully connected product        Wx_plus_b = tf.matmul(inputs, Weights) + biases        # normalize fully connected product        if norm:            # Batch Normalize            fc_mean, fc_var = tf.nn.moments(                Wx_plus_b,                axes=[0],   # the dimension you wanna normalize, here [0] for batch                            # for image, you wanna do [0, 1, 2] for [batch, height, width] but not channel            )            scale = tf.Variable(tf.ones([out_size]))            shift = tf.Variable(tf.zeros([out_size]))            epsilon = 0.001            # apply moving average for mean and var when train on batch            ema = tf.train.ExponentialMovingAverage(decay=0.5)            def mean_var_with_update():                ema_apply_op = ema.apply([fc_mean, fc_var])                with tf.control_dependencies([ema_apply_op]):                    return tf.identity(fc_mean), tf.identity(fc_var)            mean, var = mean_var_with_update()            Wx_plus_b = tf.nn.batch_normalization(Wx_plus_b, mean, var, shift, scale, epsilon)            # similar with this two steps:            # Wx_plus_b = (Wx_plus_b - fc_mean) / tf.sqrt(fc_var + 0.001)            # Wx_plus_b = Wx_plus_b * scale + shift        # activation        if activation_function is None:            outputs = Wx_plus_b        else:            outputs = activation_function(Wx_plus_b)        return outputs    fix_seed(1)    if norm:        # BN for the first input        fc_mean, fc_var = tf.nn.moments(            xs,            axes=[0],        )        scale = tf.Variable(tf.ones([1]))        shift = tf.Variable(tf.zeros([1]))        epsilon = 0.001        # apply moving average for mean and var when train on batch        ema = tf.train.ExponentialMovingAverage(decay=0.5)        def mean_var_with_update():            ema_apply_op = ema.apply([fc_mean, fc_var])            with tf.control_dependencies([ema_apply_op]):                return tf.identity(fc_mean), tf.identity(fc_var)        mean, var = mean_var_with_update()        xs = tf.nn.batch_normalization(xs, mean, var, shift, scale, epsilon)    # record inputs for every layer    
layers_inputs = [xs]    # build hidden layers    for l_n in range(N_LAYERS):        layer_input = layers_inputs[l_n]        in_size = layers_inputs[l_n].get_shape()[1].value        output = add_layer(            layer_input,    # input            in_size,        # input size            N_HIDDEN_UNITS, # output size            ACTIVATION,     # activation function            norm,           # normalize before activation        )        layers_inputs.append(output)    # add output for next run    # build output layer    prediction = add_layer(layers_inputs[-1], 30, 1, activation_function=None)    cost = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction), reduction_indices=[1]))    train_op = tf.train.GradientDescentOptimizer(0.001).minimize(cost)    return [train_op, cost, layers_inputs]# make up datafix_seed(1)x_data = np.linspace(-7, 10, 2500)[:, np.newaxis]np.random.shuffle(x_data)noise = np.random.normal(0, 8, x_data.shape)y_data = np.square(x_data) - 5 + noise# plot input dataplt.scatter(x_data, y_data)plt.show()xs = tf.placeholder(tf.float32, [None, 1])  # [num_samples, num_features]ys = tf.placeholder(tf.float32, [None, 1])train_op, cost, layers_inputs = built_net(xs, ys, norm=False)   # without BNtrain_op_norm, cost_norm, layers_inputs_norm = built_net(xs, ys, norm=True) # with BNsess = tf.Session()if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:    init = tf.initialize_all_variables()else:    init = tf.global_variables_initializer()sess.run(init)# record costcost_his = []cost_his_norm = []record_step = 5plt.ion()plt.figure(figsize=(7, 3))for i in range(250):    if i % 50 == 0:        # plot histogram        all_inputs, all_inputs_norm = sess.run([layers_inputs, layers_inputs_norm], feed_dict={xs: x_data, ys: y_data})        plot_his(all_inputs, all_inputs_norm)    # train on batch    sess.run([train_op, train_op_norm], feed_dict={xs: x_data[i*10:i*10+10], ys: y_data[i*10:i*10+10]})    if i % record_step == 0:        # record cost        cost_his.append(sess.run(cost, feed_dict={xs: x_data, ys: y_data}))        cost_his_norm.append(sess.run(cost_norm, feed_dict={xs: x_data, ys: y_data}))plt.ioff()plt.figure()plt.plot(np.arange(len(cost_his))*record_step, np.array(cost_his), label='no BN')     # no normplt.plot(np.arange(len(cost_his))*record_step, np.array(cost_his_norm), label='BN')   # normplt.legend()plt.show()