TensorFlow Study Notes
References
Zhihu
Morvan (莫烦) TensorFlow tutorials
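Installation
# Python 2 users:
$ pip install tensorflow
# Python 3 users:
$ pip3 install tensorflow
Updating
To update, first uninstall the installed version, then install again with the command above.
# Python 2 users:
$ pip uninstall tensorflow
# Python 3 users:
$ pip3 uninstall tensorflow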
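Basic structure
TensorFlow expresses a computation as a dataflow graph. A useful picture is a network of pipes that you assemble first and then run liquid through:
- Graph: the structure you assemble. It is made up of many operations.
- Operation (op): takes zero or more inputs (liquid flowing in) and returns zero or more outputs (liquid flowing out).
- Data types: mainly tensors, variables, and constants.
Data exists in the form of tensors:
- A rank-0 tensor is a scalar, that is, a single number such as 1.
- A rank-1 tensor is a vector, such as the one-dimensional [1, 2, 3].
- A rank-2 tensor is a matrix, such as the two-dimensional [[1, 2, 3],[4, 5, 6],[7, 8, 9]].
- And so on for rank 3 and higher: a tensor is a multi-dimensional array or list (the liquid in the pipes).
- To create a tensor that is fed at run time: tensor_name = tf.placeholder(dtype, shape, name)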
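The notion of tensor rank in the list above can be checked with plain nested Python lists. This is only an illustrative sketch: the helper name rank is made up for this note and is not a TensorFlow function, and it assumes the nesting is non-empty and rectangular.

```python
def rank(value):
    """Return the nesting depth of a list, i.e. the rank of the tensor it encodes."""
    depth = 0
    while isinstance(value, list):
        depth += 1
        value = value[0]  # assumes a non-empty, rectangular nesting
    return depth


scalar = 1                                   # rank 0
vector = [1, 2, 3]                           # rank 1
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # rank 2

print(rank(scalar), rank(vector), rank(matrix))
```

Note that [1] has rank 1 under this definition, which is why a bare number is the better example of a scalar.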
import tensorflow as tf
import numpy as np

# create data: points on the line y = 0.1*x + 0.3
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data*0.1 + 0.3

### create tensorflow structure start ###
Weights = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
biases = tf.Variable(tf.zeros([1]))

y = Weights*x_data + biases

loss = tf.reduce_mean(tf.square(y-y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)
### create tensorflow structure end ###

sess = tf.Session()
# tf.initialize_all_variables() is deprecated since 2017-03-02;
# use tf.global_variables_initializer() with tensorflow >= 0.12
if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
    init = tf.initialize_all_variables()
else:
    init = tf.global_variables_initializer()
sess.run(init)

for step in range(201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(Weights), sess.run(biases))
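Basic operations
## create constants
matrix1 = tf.constant([[3., 3.]])
matrix2 = tf.constant([[2.], [2.]])  # not defined in the original notes; added so that matmul below has a second operand
## element-wise multiplication and matrix multiplication
## (tf.multiply was called tf.mul before TensorFlow 1.0)
mul = tf.multiply(tf.constant(3.0), tf.constant(4.0))
product = tf.matmul(matrix1, matrix2)
## subtraction (tf.subtract was called tf.sub before TensorFlow 1.0); x and a are tensors defined elsewhere
sub = tf.subtract(x, a)
print(sub.eval())
## addition: add
new_value = tf.add(state, one)
## update: assign
update = tf.assign(state, new_value)
## random 3x1 matrix drawn from a normal distribution
tf.random_normal([3,1])
## 1x2 matrix of zeros
tf.zeros([1, 2])
tf.zeros([1, 2])+0.1  # add 0.1 to every element
## square
tf.square(2)
## index of the largest value along a given axis
tf.argmax(y,1)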
tf.equal(A, B)
tf.equal(A, B) compares two matrices or vectors element by element. It returns True at each position where the elements are equal and False otherwise.
import tensorflow as tf
import numpy as np

A = [[1,3,4,5,6]]
B = [[1,3,4,3,2]]

with tf.Session() as sess:
    print(sess.run(tf.equal(A, B)))
## [[ True True True False False]]
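tf.argmax(input, axis)
tf.argmax(input, axis) returns the index of the largest value along the given axis. For a matrix, axis=0 searches down each column and axis=1 searches across each row.
tf.cast(x, dtype, name=None)
tf.cast(x, dtype, name=None) converts x to the data type dtype.
tf.reduce_mean(x, axis)
Computes the mean. If the second argument is omitted, the mean is taken over all elements.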
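The four functions above are typically chained to compute classification accuracy. The following is a plain-Python sketch of that chain, not TensorFlow code; the helper argmax and the sample data are invented for illustration.

```python
def argmax(row):
    """Index of the largest entry in a list (the first one wins on ties)."""
    return max(range(len(row)), key=lambda i: row[i])


# Hypothetical model outputs and one-hot labels for four samples.
predictions = [[0.1, 0.8, 0.1], [0.7, 0.2, 0.1], [0.2, 0.2, 0.6], [0.3, 0.4, 0.3]]
labels = [[0, 1, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0]]

pred_idx = [argmax(row) for row in predictions]           # plays the role of tf.argmax(y, 1)
label_idx = [argmax(row) for row in labels]               # plays the role of tf.argmax(y_, 1)
correct = [p == t for p, t in zip(pred_idx, label_idx)]   # plays the role of tf.equal
as_float = [float(c) for c in correct]                    # plays the role of tf.cast(..., "float")
accuracy = sum(as_float) / len(as_float)                  # plays the role of tf.reduce_mean

print(correct, accuracy)
```

Three of the four predicted indices match the labels, so the mean of the cast values is 0.75.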
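tf.truncated_normal(shape, mean, stddev)
Draws values from a truncated normal distribution with the given mean and standard deviation: values more than two standard deviations from the mean are discarded and drawn again.
tf.truncated_normal(shape=[10,10], mean=0, stddev=1)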
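One way to picture what "truncated" means is rejection sampling: draw from an ordinary normal distribution and redraw anything further than two standard deviations from the mean. The sketch below uses only Python's random module; the function name and the seed parameter are assumptions for this note, and it is not how TensorFlow implements the op internally.

```python
import random


def truncated_normal(n, mean=0.0, stddev=1.0, seed=None):
    """Draw n samples, redrawing any that fall more than 2*stddev from the mean."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        x = rng.gauss(mean, stddev)
        if abs(x - mean) <= 2 * stddev:  # keep only values inside the allowed band
            samples.append(x)
    return samples


values = truncated_normal(1000, mean=0.0, stddev=1.0, seed=0)
print(min(values), max(values))  # both lie within [-2, 2]
```

Bounding the initial values this way is why the function is a common choice for weight initialization, as in the CNN example later in these notes.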
tf.constant(value, dtype=None, shape=None, name='Const')
Creates a constant tensor filled according to value; shape optionally specifies its shape.
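Session
A Session is the object TensorFlow uses to execute a graph that has already been built. Building the graph only assembles the pipes; launching it in a session is what makes the liquid flow.
Calling session.run() evaluates the result you ask for, or just the part of the graph you want to run.
There are two ways to use a session:
Example: print the product of two matrices
import tensorflow as tf

# create two matrices
matrix1 = tf.constant([[3,3,1]])
matrix2 = tf.constant([[2],
                       [2],
                       [1]])
product = tf.matmul(matrix1,matrix2)  # defining the op computes nothing; a session has to run it

# method 1: run the op with sess.run(op), then close the session explicitly
sess = tf.Session()
result = sess.run(product)
print(result)
sess.close()
# [[13]]

# method 2: a "with" block closes the session automatically
with tf.Session() as sess:
    result2 = sess.run(product)
    print(result2)
# [[13]]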
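The value that the session returns for this product can be verified by hand: 3*2 + 3*2 + 1*1 = 13. Below is a minimal pure-Python matrix multiplication that reproduces it; the matmul helper is written for this note and only mimics what tf.matmul computes for small lists.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]


matrix1 = [[3, 3, 1]]        # shape (1, 3)
matrix2 = [[2], [2], [1]]    # shape (3, 1)
product = matmul(matrix1, matrix2)

print(product)  # [[13]]
```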
Variable
In TensorFlow, a value is a variable only if you explicitly define it as one.
Syntax: state = tf.Variable()
Once variables are defined, they must be initialized:
init = tf.global_variables_initializer() (tf.initialize_all_variables() before TensorFlow 0.12)
sess.run(init) then actually runs the initializer.
import tensorflow as tf

state = tf.Variable(0, name='counter')

# define the constant one
one = tf.constant(1)

# define the addition step (note: nothing is computed yet)
new_value = tf.add(state, one)

# update state to new_value
update = tf.assign(state, new_value)

# any Variable that is defined must be initialized
# init = tf.initialize_all_variables()  # deprecated spelling
init = tf.global_variables_initializer()

# use a Session
with tf.Session() as sess:
    sess.run(init)
    for _ in range(3):
        sess.run(update)
        print(sess.run(state))  # print(state) alone does not show the value
Placeholder
A placeholder reserves a slot in the graph for a value that will be supplied later.
To pass data into TensorFlow from outside, use tf.placeholder()
and supply the data in the form sess.run(..., feed_dict={input: ...}).
A feed temporarily replaces the output of an operation with a tensor value.
Feed: supply the input values of operations (pour liquid in).
Fetch: retrieve the output values of operations (collect liquid).
import tensorflow as tf

# feed
# a placeholder needs a dtype, usually float32
input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)

# multiply input1 by input2 (tf.multiply was called tf.mul before TensorFlow 1.0)
output = tf.multiply(input1, input2)

# feed: supply the input values when running
with tf.Session() as sess:
    print(sess.run(output, feed_dict={input1: [7.], input2: [2.]}))

# fetch: to get op outputs back, pass the tensors you want to run()
input1 = tf.constant(3.0)
input2 = tf.constant(2.0)
input3 = tf.constant(5.0)
intermed = tf.add(input2, input3)
mul = tf.multiply(input1, intermed)

with tf.Session() as sess:
    result = sess.run([mul, intermed])
    print(result)
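To make the feed and fetch idea concrete without TensorFlow, here is a toy deferred-evaluation graph in plain Python. Everything in it (the Placeholder and Multiply classes and the run function) is hypothetical and written only for this note; it mirrors the idea of feed_dict, not TensorFlow's implementation.

```python
class Placeholder:
    """A named slot that has no value until run() receives a feed for it."""
    def __init__(self, name):
        self.name = name


class Multiply:
    """An operation node that multiplies the values of two upstream nodes."""
    def __init__(self, left, right):
        self.left, self.right = left, right


def run(node, feed_dict):
    """Evaluate (fetch) a node, looking up placeholder values in feed_dict."""
    if isinstance(node, Placeholder):
        return feed_dict[node]
    if isinstance(node, Multiply):
        return run(node.left, feed_dict) * run(node.right, feed_dict)
    return node  # a plain Python number acts as a constant


input1 = Placeholder('input1')
input2 = Placeholder('input2')
output = Multiply(input1, input2)   # building the graph computes nothing

result = run(output, {input1: 7.0, input2: 2.0})  # feeding values triggers the computation
print(result)  # 14.0
```

The same graph can be run again with a different feed, which is the point of separating graph construction from execution.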
Activation Function
Defining a layer: add_layer()
Define a function add_layer() that adds one neural-network layer.
It takes four parameters: the input values, the input size, the output size, and the activation function.
def add_layer(inputs, in_size, out_size, activation_function=None):
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))  # initial weights
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)  # biases; a non-zero initial value is recommended
    Wx_plus_b = tf.matmul(inputs, Weights) + biases  # pre-activation value of the layer
    # with no activation function, the layer output is the linear value Wx+b itself;
    # otherwise Wx+b is passed through activation_function()
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs
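The arithmetic inside add_layer can be followed on a single sample with plain lists. This is a sketch under stated assumptions: dense and relu are helper names invented here, and the weights are fixed by hand rather than drawn at random as tf.random_normal would do.

```python
def relu(values):
    """Rectified linear unit: negative entries become 0."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases, activation=None):
    """Compute inputs . weights + biases for one sample, then apply the activation."""
    out = [sum(inputs[i] * weights[i][j] for i in range(len(inputs))) + biases[j]
           for j in range(len(biases))]
    return out if activation is None else activation(out)


x = [1.0, -2.0]                            # in_size = 2
W = [[0.5, -1.0, 2.0], [1.0, 0.5, 0.5]]    # shape (in_size, out_size) = (2, 3)
b = [0.1, 0.1, 0.1]                        # biases start at 0.1, as in add_layer

linear = dense(x, W, b)                    # no activation: the raw Wx + b
activated = dense(x, W, b, activation=relu)
print(linear, activated)
```

With these numbers the first two pre-activation values are negative, so relu zeroes them and passes only the third through.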
Building the network
import tensorflow as tf
import numpy as np

def add_layer(inputs, in_size, out_size, activation_function=None):
    # add one more layer and return the output of this layer
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs

# Make up some real data
x_data = np.linspace(-1,1,300)[:, np.newaxis]  # 300 evenly spaced values; np.newaxis turns shape (300,) into (300, 1)
noise = np.random.normal(0, 0.05, x_data.shape)
y_data = np.square(x_data) - 0.5 + noise

# define placeholder for inputs to network
xs = tf.placeholder(tf.float32, [None, 1])  # None accepts any number of samples; 1 because each sample has one feature
ys = tf.placeholder(tf.float32, [None, 1])

# a network with 1 input unit, 10 hidden units, and 1 output unit
# add hidden layer
l1 = add_layer(xs, 1, 10, activation_function=tf.nn.relu)
# add output layer
prediction = add_layer(l1, 10, 1, activation_function=None)

# the error between prediction and real data
loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# important step
# tf.initialize_all_variables() is deprecated since 2017-03-02;
# use tf.global_variables_initializer() with tensorflow >= 0.12
if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
    init = tf.initialize_all_variables()
else:
    init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

for i in range(1000):
    # training
    sess.run(train_step, feed_dict={xs: x_data, ys: y_data})
    if i % 50 == 0:
        # to see the step improvement
        print(sess.run(loss, feed_dict={xs: x_data, ys: y_data}))

## visualize the result
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
ax.scatter(x_data, y_data)
# plt.ion()  # enables continuous (interactive) display; keep it commented when running this part alone, uncomment it when running the whole script
# plt.show()

# every 50 training steps, redraw the prediction as a red line of width 5 and pause for 0.1 s
for i in range(1000):
    # training
    sess.run(train_step, feed_dict={xs: x_data, ys: y_data})
    if i % 50 == 0:
        # to visualize the result and improvement
        try:
            lines[0].remove()  # remove the previous line; on the first pass lines does not exist yet
        except Exception:
            pass
        prediction_value = sess.run(prediction, feed_dict={xs: x_data})
        # plot the prediction
        lines = ax.plot(x_data, prediction_value, 'r-', lw=5)
        plt.pause(0.1)
plt.show()
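Optimizers
TensorFlow's optimizers
The most commonly used: GradientDescentOptimizer
tf.train.Optimizer (the base class)
tf.train.GradientDescentOptimizer
tf.train.AdadeltaOptimizer
tf.train.AdagradOptimizer
tf.train.AdagradDAOptimizer
tf.train.MomentumOptimizer
tf.train.AdamOptimizer
tf.train.FtrlOptimizer
tf.train.ProximalGradientDescentOptimizer
tf.train.ProximalAdagradOptimizer
tf.train.RMSPropOptimizer
Overview of the update rules
The recommended update rules are SGD + Nesterov Momentum, or Adam.
- Stochastic Gradient Descent (SGD): split the data into mini-batches and feed them into the network one batch at a time, which speeds up training.
- AdaGrad: adapts the learning rate. v += dx^2, then W += -learning_rate * dx / v^0.5. The accumulated v acts as resistance against moving in the wrong direction.
- Momentum: the plain update W += -learning_rate * dx becomes m = b1 * m - learning_rate * dx, then W += m. It descends faster by following the principle of inertia.
- RMSProp: keeps AdaGrad's resistance against the wrong direction, but replaces the ever-growing sum with a decaying average, v = b1 * v + (1 - b1) * dx^2, borrowing the decay idea from momentum.
- Adam: m is computed with momentum's downhill inertia and v with AdaGrad's resistance, and the parameter update takes both m and v into account. Most of the time Adam reaches the target quickly and well, converging fast.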
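The update rules in the list above can be tried on a one-dimensional toy problem, minimizing f(w) = w^2 whose gradient is dx = 2w. The sketch below is plain Python, not the tf.train implementations. The learning rates, step counts, and epsilon terms are arbitrary choices for this illustration, and the bias-correction step in adam comes from the standard formulation of Adam rather than from these notes.

```python
import math


def grad(w):
    """Gradient of f(w) = w**2."""
    return 2.0 * w


def sgd(w, lr=0.1, steps=500):
    for _ in range(steps):
        w += -lr * grad(w)
    return w


def momentum(w, lr=0.1, b1=0.9, steps=500):
    m = 0.0
    for _ in range(steps):
        m = b1 * m - lr * grad(w)   # inertia: keep part of the previous step
        w += m
    return w


def adagrad(w, lr=0.5, steps=500, eps=1e-8):
    v = 0.0
    for _ in range(steps):
        dx = grad(w)
        v += dx ** 2                # resistance grows with every gradient seen
        w += -lr * dx / (math.sqrt(v) + eps)
    return w


def rmsprop(w, lr=0.01, b1=0.9, steps=500, eps=1e-8):
    v = 0.0
    for _ in range(steps):
        dx = grad(w)
        v = b1 * v + (1 - b1) * dx ** 2   # decaying average instead of a running sum
        w += -lr * dx / (math.sqrt(v) + eps)
    return w


def adam(w, lr=0.01, b1=0.9, b2=0.999, steps=500, eps=1e-8):
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        dx = grad(w)
        m = b1 * m + (1 - b1) * dx        # momentum-like average of gradients
        v = b2 * v + (1 - b2) * dx ** 2   # AdaGrad-like average of squared gradients
        m_hat = m / (1 - b1 ** t)         # bias correction (standard Adam, an addition to the notes)
        v_hat = v / (1 - b2 ** t)
        w += -lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


results = {name: fn(1.0) for name, fn in
           [('sgd', sgd), ('momentum', momentum), ('adagrad', adagrad),
            ('rmsprop', rmsprop), ('adam', adam)]}
for name, w in results.items():
    print(name, w)
```

All five end up close to the minimum at w = 0 from the starting point w = 1.0. On such a simple problem the differences between them are small; they matter more on noisy, badly scaled losses.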
Comparison of the optimizers
TensorBoard visualization
TensorBoard gives a direct view of the structure of the whole neural network; view it in Chrome.
Use with tf.name_scope(...) to name each layer.
Start the server with tensorboard --logdir logs, then open the URL it prints and look at the Graphs tab.
Visualizing the structure of the whole network in TensorBoard:
import tensorflow as tf

def add_layer(inputs, in_size, out_size, activation_function=None):
    # add one more layer and return the output of this layer
    with tf.name_scope('layer'):
        with tf.name_scope('weights'):
            Weights = tf.Variable(tf.random_normal([in_size, out_size]), name='W')
        with tf.name_scope('biases'):
            biases = tf.Variable(tf.zeros([1, out_size]) + 0.1, name='b')
        with tf.name_scope('Wx_plus_b'):
            Wx_plus_b = tf.add(tf.matmul(inputs, Weights), biases)
        if activation_function is None:
            outputs = Wx_plus_b
        else:
            outputs = activation_function(Wx_plus_b, )
        return outputs

# define placeholder for inputs to network
with tf.name_scope('inputs'):
    xs = tf.placeholder(tf.float32, [None, 1], name='x_input')
    ys = tf.placeholder(tf.float32, [None, 1], name='y_input')

# add hidden layer
l1 = add_layer(xs, 1, 10, activation_function=tf.nn.relu)
# add output layer
prediction = add_layer(l1, 10, 1, activation_function=None)

# the error between prediction and real data
with tf.name_scope('loss'):
    loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction), reduction_indices=[1]))

with tf.name_scope('train'):
    train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

sess = tf.Session()

# tf.train.SummaryWriter is deprecated; use tf.summary.FileWriter with tensorflow >= 0.12
if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
    # tensorflow version < 0.12: save the graph drawn above into a directory
    writer = tf.train.SummaryWriter('logs/', sess.graph)
else:
    # tensorflow version >= 0.12
    writer = tf.summary.FileWriter("logs/", sess.graph)

# tf.initialize_all_variables() is deprecated since 2017-03-02;
# use tf.global_variables_initializer() with tensorflow >= 0.12
if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
    init = tf.initialize_all_variables()
else:
    init = tf.global_variables_initializer()
sess.run(init)

# direct to the local dir and run this in terminal:
# $ tensorboard --logdir=logs
Visualizing the training process:
import tensorflow as tf
import numpy as np

# record histograms of Weights and biases inside each layer
# the extra parameter n_layer identifies the layer; layer_name is the name given to it
def add_layer(inputs, in_size, out_size, n_layer, activation_function=None):
    # add one more layer and return the output of this layer
    layer_name = 'layer%s' % n_layer
    with tf.name_scope(layer_name):
        with tf.name_scope('weights'):
            Weights = tf.Variable(tf.random_normal([in_size, out_size]), name='W')
            # tf.summary.histogram() records a histogram: the first argument is
            # the chart name, the second is the tensor to record
            tf.summary.histogram(layer_name + '/weights', Weights)  # tensorflow >= 0.12
            # tf.histogram_summary(layer_name+'/weights',Weights)  # tensorflow < 0.12
        with tf.name_scope('biases'):
            biases = tf.Variable(tf.zeros([1, out_size]) + 0.1, name='b')
            tf.summary.histogram(layer_name + '/biases', biases)
            # tf.histogram_summary(layer_name+'/biases',biases)  # tensorflow < 0.12
        with tf.name_scope('Wx_plus_b'):
            Wx_plus_b = tf.add(tf.matmul(inputs, Weights), biases)
        if activation_function is None:
            outputs = Wx_plus_b
        else:
            outputs = activation_function(Wx_plus_b, )
        tf.summary.histogram(layer_name + '/outputs', outputs)
        # tf.histogram_summary(layer_name + '/outputs', outputs)  # tensorflow < 0.12
    return outputs

# Make up some real data
x_data = np.linspace(-1, 1, 300)[:, np.newaxis]
noise = np.random.normal(0, 0.05, x_data.shape)
y_data = np.square(x_data) - 0.5 + noise

# define placeholder for inputs to network
with tf.name_scope('inputs'):
    xs = tf.placeholder(tf.float32, [None, 1], name='x_input')
    ys = tf.placeholder(tf.float32, [None, 1], name='y_input')

# pass n_layer so that each layer shows up separately in the visualization
# add hidden layer
l1 = add_layer(xs, 1, 10, n_layer=1, activation_function=tf.nn.relu)
# add output layer
prediction = add_layer(l1, 10, 1, n_layer=2, activation_function=None)

# the error between prediction and real data
with tf.name_scope('loss'):
    loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction), reduction_indices=[1]))
    # tf.summary.scalar() makes the value visible under the Events/Scalars tab in TensorBoard
    tf.summary.scalar('loss', loss)  # tensorflow >= 0.12
    # tf.scalar_summary('loss',loss)  # tensorflow < 0.12

with tf.name_scope('train'):
    train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

sess = tf.Session()
# merge all summaries into a single op
merged = tf.summary.merge_all()  # tensorflow >= 0.12
# merged = tf.merge_all_summaries()  # tensorflow < 0.12
writer = tf.summary.FileWriter("logs/", sess.graph)  # tensorflow >= 0.12
# writer = tf.train.SummaryWriter('logs/', sess.graph)  # tensorflow < 0.12
init = tf.global_variables_initializer()  # tensorflow >= 0.12
# init = tf.initialize_all_variables()  # tensorflow < 0.12
sess.run(init)

for i in range(1000):
    sess.run(train_step, feed_dict={xs: x_data, ys: y_data})
    if i % 50 == 0:
        # run merged to record the current training state
        result = sess.run(merged, feed_dict={xs: x_data, ys: y_data})
        writer.add_summary(result, i)

# direct to the local dir and run this in terminal:
# $ tensorboard --logdir logs
Classification
MNIST for machine-learning beginners, using softmax
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
# 60000 training images and 10000 test images, each 28x28 pixels; 28x28 = 784
import tensorflow as tf

# placeholder
x = tf.placeholder("float", [None, 784])
# variables
W = tf.Variable(tf.zeros([784,10]))  # digits 0-9, 10 classes in total
b = tf.Variable(tf.zeros([10]))
## a softmax model assigns probabilities to the different classes:
## it exponentiates its inputs and then normalizes the results
y = tf.nn.softmax(tf.matmul(x,W) + b)

# training the model
y_ = tf.placeholder("float", [None,10])
# cross-entropy
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
# gradient descent with backpropagation
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

init = tf.global_variables_initializer()  # tf.initialize_all_variables() before tensorflow 0.12
sess = tf.Session()
sess.run(init)

# training
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

## evaluation
# tf.argmax(y,1) is the index of the largest value along axis 1
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
# cast the booleans in correct_prediction to floats: [True, False, True, True] becomes [1,0,1,1]
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
## the best models reach an accuracy above 99.7%
## http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
# digits 0 to 9
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

def add_layer(inputs, in_size, out_size, activation_function=None,):
    # add one more layer and return the output of this layer
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1,)
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b,)
    return outputs

def compute_accuracy(v_xs, v_ys):
    global prediction
    y_pre = sess.run(prediction, feed_dict={xs: v_xs})
    correct_prediction = tf.equal(tf.argmax(y_pre,1), tf.argmax(v_ys,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys})
    return result

# define placeholder for inputs to network
xs = tf.placeholder(tf.float32, [None, 784])  # 28x28
ys = tf.placeholder(tf.float32, [None, 10])

# add output layer
prediction = add_layer(xs, 784, 10, activation_function=tf.nn.softmax)

# the error between prediction and real data
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction),
                                              reduction_indices=[1]))  # loss
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

sess = tf.Session()
# important step
# tf.initialize_all_variables() is deprecated since 2017-03-02;
# use tf.global_variables_initializer() with tensorflow >= 0.12
if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
    init = tf.initialize_all_variables()
else:
    init = tf.global_variables_initializer()
sess.run(init)

for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys})
    if i % 50 == 0:
        print(compute_accuracy(
            mnist.test.images, mnist.test.labels))
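Both classifiers above rest on the same two formulas: softmax turns raw scores into probabilities, and cross-entropy measures how far those probabilities are from a one-hot label. A small pure-Python sketch follows. The function names and sample scores are made up for illustration, and subtracting the maximum score is a common numerical-stability trick that the TensorFlow snippets above do not show.

```python
import math


def softmax(logits):
    """Exponentiate each score and normalize so the results sum to 1."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]  # shifting by the max avoids overflow
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot, probs):
    """-sum(y_ * log(y)) for a single sample, skipping the zero entries of y_."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)


logits = [2.0, 1.0, 0.1]         # hypothetical raw scores for three classes
probs = softmax(logits)
loss = cross_entropy([1, 0, 0], probs)

print(probs, loss)
```

Because the label is one-hot, the loss reduces to the negative log of the probability assigned to the true class, so a confident correct prediction gives a loss near 0.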
Dropout against overfitting
keep_prob is the keep probability, that is, the fraction of outputs to keep. It is defined as a placeholder and supplied at run time.
import tensorflow as tf
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split  # sklearn.cross_validation before scikit-learn 0.18
from sklearn.preprocessing import LabelBinarizer

# load data
digits = load_digits()
X = digits.data
y = digits.target
y = LabelBinarizer().fit_transform(y)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3)

def add_layer(inputs, in_size, out_size, layer_name, activation_function=None, ):
    # add one more layer and return the output of this layer
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1, )
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    # here to dropout
    Wx_plus_b = tf.nn.dropout(Wx_plus_b, keep_prob)
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b, )
    # tf.summary.histogram(layer_name + '/outputs', outputs)  # tf.histogram_summary before tensorflow 0.12
    return outputs

# define placeholder for inputs to network
# keep_prob is the keep probability; it is a placeholder fed at run time
keep_prob = tf.placeholder(tf.float32)
xs = tf.placeholder(tf.float32, [None, 64])  # 8x8
ys = tf.placeholder(tf.float32, [None, 10])

# add hidden layer and output layer
l1 = add_layer(xs, 64, 50, 'l1', activation_function=tf.nn.tanh)
prediction = add_layer(l1, 50, 10, 'l2', activation_function=tf.nn.softmax)

# the loss between prediction and real data
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction),
                                              reduction_indices=[1]))  # loss
# tf.summary.scalar('loss', cross_entropy)  # tf.scalar_summary before tensorflow 0.12
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

sess = tf.Session()
# merged = tf.summary.merge_all()  # tf.merge_all_summaries() before tensorflow 0.12
# summary writer goes in here
# train_writer = tf.summary.FileWriter("logs/train", sess.graph)  # tf.train.SummaryWriter before tensorflow 0.12
# test_writer = tf.summary.FileWriter("logs/test", sess.graph)

# tf.initialize_all_variables() is deprecated since 2017-03-02;
# use tf.global_variables_initializer() with tensorflow >= 0.12
if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
    init = tf.initialize_all_variables()
else:
    init = tf.global_variables_initializer()
sess.run(init)

for i in range(500):
    # here to determine the keeping probability
    sess.run(train_step, feed_dict={xs: X_train, ys: y_train, keep_prob: 0.5})
    if i % 50 == 0:
        # record loss; evaluate only, with dropout switched off (keep_prob: 1)
        train_result = sess.run(cross_entropy, feed_dict={xs: X_train, ys: y_train, keep_prob: 1})
        test_result = sess.run(cross_entropy, feed_dict={xs: X_test, ys: y_test, keep_prob: 1})
        print(i, train_result, test_result)
        # with the summary writers enabled, record summaries instead:
        # train_result = sess.run(merged, feed_dict={xs: X_train, ys: y_train, keep_prob: 1})
        # test_result = sess.run(merged, feed_dict={xs: X_test, ys: y_test, keep_prob: 1})
        # train_writer.add_summary(train_result, i)
        # test_writer.add_summary(test_result, i)
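What keep_prob does to a layer's outputs can be sketched in a few lines of plain Python. tf.nn.dropout keeps each element with probability keep_prob and scales the kept ones by 1/keep_prob so that the expected sum is unchanged; the function below imitates that behavior on a list. The name dropout and the explicit rng argument are choices made for this illustration.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob, scaled by 1/keep_prob; zero the rest."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0] * 10000

train_out = dropout(activations, keep_prob=0.5, rng=rng)  # training: about half are zeroed
test_out = dropout(activations, keep_prob=1.0, rng=rng)   # evaluation: nothing is dropped

print(sum(train_out) / len(train_out))  # close to 1.0 thanks to the 1/keep_prob scaling
```

This is why the training loop feeds keep_prob: 0.5 while the evaluation runs feed keep_prob: 1.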
CNN: convolutional neural network
from __future__ import print_function
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
# digits 0 to 9
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

def compute_accuracy(v_xs, v_ys):
    global prediction
    y_pre = sess.run(prediction, feed_dict={xs: v_xs, keep_prob: 1})
    correct_prediction = tf.equal(tf.argmax(y_pre,1), tf.argmax(v_ys,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys, keep_prob: 1})
    return result

## create randomly initialized variables
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

## tf.nn.conv2d is the 2-D convolution function: x holds the image data, W is the
## weight of this convolution layer, and strides is the step size; the two middle
## values mean the filter moves one step in the x direction and one step in the y direction
def conv2d(x, W):
    # stride [1, x_movement, y_movement, 1]
    # Must have strides[0] = strides[3] = 1
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # stride [1, x_movement, y_movement, 1]
    return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')

# define placeholder for inputs to network
# the images from input_data are already scaled to [0, 1], so no division by 255 is needed
xs = tf.placeholder(tf.float32, [None, 784])  # 28x28
ys = tf.placeholder(tf.float32, [None, 10])
keep_prob = tf.placeholder(tf.float32)
## -1 lets the batch size be anything; the final 1 is the channel count (1 for grayscale, 3 for RGB)
x_image = tf.reshape(xs, [-1, 28, 28, 1])
# print(x_image.shape)  # [n_samples, 28,28,1]

## conv1 layer ##
W_conv1 = weight_variable([5,5, 1,32])  # patch 5x5, in size 1, out size 32
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)  # output size 28x28x32
h_pool1 = max_pool_2x2(h_conv1)  # output size 14x14x32

## conv2 layer ##
W_conv2 = weight_variable([5,5, 32, 64])  # patch 5x5, in size 32, out size 64
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)  # output size 14x14x64
h_pool2 = max_pool_2x2(h_conv2)  # output size 7x7x64

## fc1 layer ##
W_fc1 = weight_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])
# [n_samples, 7, 7, 64] ->> [n_samples, 7*7*64]
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

## fc2 layer ##
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
prediction = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

# the error between prediction and real data
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction),
                                              reduction_indices=[1]))  # loss
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

sess = tf.Session()
# important step
# tf.initialize_all_variables() is deprecated since 2017-03-02;
# use tf.global_variables_initializer() with tensorflow >= 0.12
if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
    init = tf.initialize_all_variables()
else:
    init = tf.global_variables_initializer()
sess.run(init)

for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys, keep_prob: 0.5})
    if i % 50 == 0:
        print(compute_accuracy(
            mnist.test.images, mnist.test.labels))
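The output sizes noted in the comments of the CNN code (28x28, 14x14, 7x7, and the 7*7*64 flattened vector) follow from one rule: with padding='SAME', the spatial output size is the input size divided by the stride, rounded up. A short arithmetic check in plain Python, where same_padding_out is a helper name invented for this note:

```python
import math


def same_padding_out(size, stride):
    """Spatial output size of a conv or pool layer that uses padding='SAME'."""
    return math.ceil(size / stride)


sizes = []
size = 28
for name, stride in [('conv1', 1), ('pool1', 2), ('conv2', 1), ('pool2', 2)]:
    size = same_padding_out(size, stride)
    sizes.append(size)
    print(name, size)

channels = 64                 # output channels of conv2
flat = size * size * channels
print(flat)  # 3136 inputs to the first fully connected layer
```

The convolutions (stride 1) leave the 28x28 and 14x14 sizes unchanged, and each 2x2 pooling with stride 2 halves them, which is where the 7*7*64 in W_fc1 comes from.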
Saving and restoring a model
from __future__ import print_function
import tensorflow as tf
import numpy as np

# Save to file
# remember to define the same dtype and shape when restoring
# W = tf.Variable([[1,2,3],[3,4,5]], dtype=tf.float32, name='weights')
# b = tf.Variable([[1,2,3]], dtype=tf.float32, name='biases')

# tf.initialize_all_variables() is deprecated since 2017-03-02;
# use tf.global_variables_initializer() with tensorflow >= 0.12
# if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
#     init = tf.initialize_all_variables()
# else:
#     init = tf.global_variables_initializer()
#
# saver = tf.train.Saver()
#
# with tf.Session() as sess:
#     sess.run(init)
#     save_path = saver.save(sess, "my_net/save_net.ckpt")
#     print("Save to path: ", save_path)

################################################
# restore variables
# redefine the same shape and same type for your variables
W = tf.Variable(np.arange(6).reshape((2, 3)), dtype=tf.float32, name="weights")
b = tf.Variable(np.arange(3).reshape((1, 3)), dtype=tf.float32, name="biases")

# no init step is needed when restoring
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, "my_net/save_net.ckpt")
    print("weights:", sess.run(W))
    print("biases:", sess.run(b))
RNN regression
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

BATCH_START = 0
TIME_STEPS = 20
BATCH_SIZE = 50
INPUT_SIZE = 1
OUTPUT_SIZE = 1
CELL_SIZE = 10
LR = 0.006

def get_batch():
    global BATCH_START, TIME_STEPS
    # xs shape (50batch, 20steps)
    xs = np.arange(BATCH_START, BATCH_START+TIME_STEPS*BATCH_SIZE).reshape((BATCH_SIZE, TIME_STEPS)) / (10*np.pi)
    seq = np.sin(xs)
    res = np.cos(xs)
    BATCH_START += TIME_STEPS
    # plt.plot(xs[0, :], res[0, :], 'r', xs[0, :], seq[0, :], 'b--')
    # plt.show()
    # returned seq, res and xs: shape (batch, step, input)
    return [seq[:, :, np.newaxis], res[:, :, np.newaxis], xs]

class LSTMRNN(object):
    def __init__(self, n_steps, input_size, output_size, cell_size, batch_size):
        self.n_steps = n_steps
        self.input_size = input_size
        self.output_size = output_size
        self.cell_size = cell_size
        self.batch_size = batch_size
        with tf.name_scope('inputs'):
            self.xs = tf.placeholder(tf.float32, [None, n_steps, input_size], name='xs')
            self.ys = tf.placeholder(tf.float32, [None, n_steps, output_size], name='ys')
        with tf.variable_scope('in_hidden'):
            self.add_input_layer()
        with tf.variable_scope('LSTM_cell'):
            self.add_cell()
        with tf.variable_scope('out_hidden'):
            self.add_output_layer()
        with tf.name_scope('cost'):
            self.compute_cost()
        with tf.name_scope('train'):
            self.train_op = tf.train.AdamOptimizer(LR).minimize(self.cost)

    def add_input_layer(self,):
        l_in_x = tf.reshape(self.xs, [-1, self.input_size], name='2_2D')  # (batch*n_step, in_size)
        # Ws (in_size, cell_size)
        Ws_in = self._weight_variable([self.input_size, self.cell_size])
        # bs (cell_size, )
        bs_in = self._bias_variable([self.cell_size,])
        # l_in_y = (batch * n_steps, cell_size)
        with tf.name_scope('Wx_plus_b'):
            l_in_y = tf.matmul(l_in_x, Ws_in) + bs_in
        # reshape l_in_y ==> (batch, n_steps, cell_size)
        self.l_in_y = tf.reshape(l_in_y, [-1, self.n_steps, self.cell_size], name='2_3D')

    def add_cell(self):
        lstm_cell = tf.contrib.rnn.BasicLSTMCell(self.cell_size, forget_bias=1.0, state_is_tuple=True)
        with tf.name_scope('initial_state'):
            self.cell_init_state = lstm_cell.zero_state(self.batch_size, dtype=tf.float32)
        self.cell_outputs, self.cell_final_state = tf.nn.dynamic_rnn(
            lstm_cell, self.l_in_y, initial_state=self.cell_init_state, time_major=False)

    def add_output_layer(self):
        # shape = (batch * steps, cell_size)
        l_out_x = tf.reshape(self.cell_outputs, [-1, self.cell_size], name='2_2D')
        Ws_out = self._weight_variable([self.cell_size, self.output_size])
        bs_out = self._bias_variable([self.output_size, ])
        # shape = (batch * steps, output_size)
        with tf.name_scope('Wx_plus_b'):
            self.pred = tf.matmul(l_out_x, Ws_out) + bs_out

    def compute_cost(self):
        losses = tf.contrib.legacy_seq2seq.sequence_loss_by_example(
            [tf.reshape(self.pred, [-1], name='reshape_pred')],
            [tf.reshape(self.ys, [-1], name='reshape_target')],
            [tf.ones([self.batch_size * self.n_steps], dtype=tf.float32)],
            average_across_timesteps=True,
            softmax_loss_function=self.ms_error,
            name='losses'
        )
        with tf.name_scope('average_cost'):
            self.cost = tf.div(
                tf.reduce_sum(losses, name='losses_sum'),
                self.batch_size,
                name='average_cost')
            tf.summary.scalar('cost', self.cost)

    def ms_error(self, labels, logits):
        return tf.square(tf.subtract(labels, logits))

    def _weight_variable(self, shape, name='weights'):
        initializer = tf.random_normal_initializer(mean=0., stddev=1.,)
        return tf.get_variable(shape=shape, initializer=initializer, name=name)

    def _bias_variable(self, shape, name='biases'):
        initializer = tf.constant_initializer(0.1)
        return tf.get_variable(name=name, shape=shape, initializer=initializer)

if __name__ == '__main__':
    model = LSTMRNN(TIME_STEPS, INPUT_SIZE, OUTPUT_SIZE, CELL_SIZE, BATCH_SIZE)
    sess = tf.Session()
    merged = tf.summary.merge_all()
    writer = tf.summary.FileWriter("logs", sess.graph)
    # tf.initialize_all_variables() is deprecated since 2017-03-02;
    # use tf.global_variables_initializer() with tensorflow >= 0.12
    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
        init = tf.initialize_all_variables()
    else:
        init = tf.global_variables_initializer()
    sess.run(init)
    # relocate to the local dir and run this line to view it on Chrome (http://0.0.0.0:6006/):
    # $ tensorboard --logdir='logs'

    plt.ion()
    plt.show()
    for i in range(200):
        seq, res, xs = get_batch()
        if i == 0:
            feed_dict = {
                model.xs: seq,
                model.ys: res,
                # create initial state
            }
        else:
            feed_dict = {
                model.xs: seq,
                model.ys: res,
                model.cell_init_state: state  # use last state as the initial state for this run
            }

        _, cost, state, pred = sess.run(
            [model.train_op, model.cost, model.cell_final_state, model.pred],
            feed_dict=feed_dict)

        # plotting
        plt.plot(xs[0, :], res[0].flatten(), 'r', xs[0, :], pred.flatten()[:TIME_STEPS], 'b--')
        plt.ylim((-1.2, 1.2))
        plt.draw()
        plt.pause(0.3)

        if i % 20 == 0:
            print('cost: ', round(cost, 4))
            result = sess.run(merged, feed_dict)
            writer.add_summary(result, i)
RNN classification
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# set random seed for comparing the two result calculations
tf.set_random_seed(1)

# this is data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

# hyperparameters
lr = 0.001
training_iters = 100000
batch_size = 128

n_inputs = 28  # MNIST data input (img shape: 28*28)
n_steps = 28  # time steps
n_hidden_units = 128  # neurons in hidden layer
n_classes = 10  # MNIST classes (0-9 digits)

# tf Graph input
x = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
y = tf.placeholder(tf.float32, [None, n_classes])

# Define weights
weights = {
    # (28, 128)
    'in': tf.Variable(tf.random_normal([n_inputs, n_hidden_units])),
    # (128, 10)
    'out': tf.Variable(tf.random_normal([n_hidden_units, n_classes]))
}
biases = {
    # (128, )
    'in': tf.Variable(tf.constant(0.1, shape=[n_hidden_units, ])),
    # (10, )
    'out': tf.Variable(tf.constant(0.1, shape=[n_classes, ]))
}

def RNN(X, weights, biases):
    # hidden layer for input to cell
    ########################################

    # reshape the inputs:
    # X ==> (128 batch * 28 steps, 28 inputs)
    X = tf.reshape(X, [-1, n_inputs])

    # into hidden
    # X_in = (128 batch * 28 steps, 128 hidden)
    X_in = tf.matmul(X, weights['in']) + biases['in']
    # X_in ==> (128 batch, 28 steps, 128 hidden)
    X_in = tf.reshape(X_in, [-1, n_steps, n_hidden_units])

    # cell
    ##########################################

    # basic LSTM Cell.
    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
        cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden_units, forget_bias=1.0, state_is_tuple=True)
    else:
        cell = tf.contrib.rnn.BasicLSTMCell(n_hidden_units)
    # lstm cell is divided into two parts (c_state, h_state)
    init_state = cell.zero_state(batch_size, dtype=tf.float32)

    # You have 2 options for the following step.
    # 1: tf.nn.rnn(cell, inputs);
    # 2: tf.nn.dynamic_rnn(cell, inputs).
    # If you use option 1, you have to modify the shape of X_in, go and check out this:
    # https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/recurrent_network.py
    # In here, we go for option 2.
    # dynamic_rnn receives a Tensor (batch, steps, inputs) or (steps, batch, inputs) as X_in.
    # Make sure the time_major is changed accordingly.
    outputs, final_state = tf.nn.dynamic_rnn(cell, X_in, initial_state=init_state, time_major=False)

    # hidden layer for output as the final results
    #############################################
    # results = tf.matmul(final_state[1], weights['out']) + biases['out']

    # # or
    # unpack to list [(batch, outputs)..] * steps
    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
        outputs = tf.unpack(tf.transpose(outputs, [1, 0, 2]))  # states is the last outputs
    else:
        outputs = tf.unstack(tf.transpose(outputs, [1,0,2]))
    results = tf.matmul(outputs[-1], weights['out']) + biases['out']  # shape = (128, 10)

    return results

pred = RNN(x, weights, biases)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
train_op = tf.train.AdamOptimizer(lr).minimize(cost)

correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

with tf.Session() as sess:
    # tf.initialize_all_variables() is deprecated since 2017-03-02;
    # use tf.global_variables_initializer() with tensorflow >= 0.12
    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
        init = tf.initialize_all_variables()
    else:
        init = tf.global_variables_initializer()
    sess.run(init)
    step = 0
    while step * batch_size < training_iters:
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        batch_xs = batch_xs.reshape([batch_size, n_steps, n_inputs])
        sess.run([train_op], feed_dict={
            x: batch_xs,
            y: batch_ys,
        })
        if step % 20 == 0:
            print(sess.run(accuracy, feed_dict={
                x: batch_xs,
                y: batch_ys,
            }))
        step += 1
Autoencoder
# View more python learning tutorial on my Youtube and Youku channel!!!
# My tutorial website: https://morvanzhou.github.io/tutorials/
from __future__ import division, print_function, absolute_import

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=False)


# Visualize decoder setting
# Parameters
learning_rate = 0.01
training_epochs = 5
batch_size = 256
display_step = 1
examples_to_show = 10

# Network Parameters
n_input = 784  # MNIST data input (img shape: 28*28)

# tf Graph input (only pictures)
X = tf.placeholder("float", [None, n_input])

# hidden layer settings
n_hidden_1 = 256  # 1st layer num features
n_hidden_2 = 128  # 2nd layer num features
weights = {
    'encoder_h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'encoder_h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'decoder_h1': tf.Variable(tf.random_normal([n_hidden_2, n_hidden_1])),
    'decoder_h2': tf.Variable(tf.random_normal([n_hidden_1, n_input])),
}
biases = {
    'encoder_b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'encoder_b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'decoder_b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'decoder_b2': tf.Variable(tf.random_normal([n_input])),
}


# Building the encoder
def encoder(x):
    # Encoder hidden layer with sigmoid activation #1
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['encoder_h1']),
                                   biases['encoder_b1']))
    # Encoder hidden layer with sigmoid activation #2
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['encoder_h2']),
                                   biases['encoder_b2']))
    return layer_2


# Building the decoder
def decoder(x):
    # Decoder hidden layer with sigmoid activation #1
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['decoder_h1']),
                                   biases['decoder_b1']))
    # Decoder hidden layer with sigmoid activation #2
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['decoder_h2']),
                                   biases['decoder_b2']))
    return layer_2


"""
# Visualize encoder setting
# Parameters
learning_rate = 0.01    # 0.01 this learning rate will be better! Tested
training_epochs = 10
batch_size = 256
display_step = 1

# Network Parameters
n_input = 784  # MNIST data input (img shape: 28*28)

# tf Graph input (only pictures)
X = tf.placeholder("float", [None, n_input])

# hidden layer settings
n_hidden_1 = 128
n_hidden_2 = 64
n_hidden_3 = 10
n_hidden_4 = 2

weights = {
    'encoder_h1': tf.Variable(tf.truncated_normal([n_input, n_hidden_1],)),
    'encoder_h2': tf.Variable(tf.truncated_normal([n_hidden_1, n_hidden_2],)),
    'encoder_h3': tf.Variable(tf.truncated_normal([n_hidden_2, n_hidden_3],)),
    'encoder_h4': tf.Variable(tf.truncated_normal([n_hidden_3, n_hidden_4],)),

    'decoder_h1': tf.Variable(tf.truncated_normal([n_hidden_4, n_hidden_3],)),
    'decoder_h2': tf.Variable(tf.truncated_normal([n_hidden_3, n_hidden_2],)),
    'decoder_h3': tf.Variable(tf.truncated_normal([n_hidden_2, n_hidden_1],)),
    'decoder_h4': tf.Variable(tf.truncated_normal([n_hidden_1, n_input],)),
}
biases = {
    'encoder_b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'encoder_b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'encoder_b3': tf.Variable(tf.random_normal([n_hidden_3])),
    'encoder_b4': tf.Variable(tf.random_normal([n_hidden_4])),

    'decoder_b1': tf.Variable(tf.random_normal([n_hidden_3])),
    'decoder_b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'decoder_b3': tf.Variable(tf.random_normal([n_hidden_1])),
    'decoder_b4': tf.Variable(tf.random_normal([n_input])),
}


def encoder(x):
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['encoder_h1']),
                                   biases['encoder_b1']))
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['encoder_h2']),
                                   biases['encoder_b2']))
    layer_3 = tf.nn.sigmoid(tf.add(tf.matmul(layer_2, weights['encoder_h3']),
                                   biases['encoder_b3']))
    layer_4 = tf.add(tf.matmul(layer_3, weights['encoder_h4']),
                     biases['encoder_b4'])
    return layer_4


def decoder(x):
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x, weights['decoder_h1']),
                                   biases['decoder_b1']))
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, weights['decoder_h2']),
                                   biases['decoder_b2']))
    layer_3 = tf.nn.sigmoid(tf.add(tf.matmul(layer_2, weights['decoder_h3']),
                                   biases['decoder_b3']))
    layer_4 = tf.nn.sigmoid(tf.add(tf.matmul(layer_3, weights['decoder_h4']),
                                   biases['decoder_b4']))
    return layer_4
"""

# Construct model
encoder_op = encoder(X)
decoder_op = decoder(encoder_op)

# Prediction
y_pred = decoder_op
# Targets (Labels) are the input data.
y_true = X

# Define loss and optimizer, minimize the squared error
cost = tf.reduce_mean(tf.pow(y_true - y_pred, 2))
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)


# Launch the graph
with tf.Session() as sess:
    # tf.initialize_all_variables() is no longer valid from
    # 2017-03-02 if using tensorflow >= 0.12
    if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
        init = tf.initialize_all_variables()
    else:
        init = tf.global_variables_initializer()
    sess.run(init)
    total_batch = int(mnist.train.num_examples/batch_size)
    # Training cycle
    for epoch in range(training_epochs):
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)  # max(x) = 1, min(x) = 0
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c = sess.run([optimizer, cost], feed_dict={X: batch_xs})
        # Display logs per epoch step
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch+1),
                  "cost=", "{:.9f}".format(c))

    print("Optimization Finished!")

    # Applying encode and decode over test set
    encode_decode = sess.run(
        y_pred, feed_dict={X: mnist.test.images[:examples_to_show]})
    # Compare original images with their reconstructions
    f, a = plt.subplots(2, 10, figsize=(10, 2))
    for i in range(examples_to_show):
        a[0][i].imshow(np.reshape(mnist.test.images[i], (28, 28)))
        a[1][i].imshow(np.reshape(encode_decode[i], (28, 28)))
    plt.show()

    # encoder_result = sess.run(encoder_op, feed_dict={X: mnist.test.images})
    # plt.scatter(encoder_result[:, 0], encoder_result[:, 1], c=mnist.test.labels)
    # plt.colorbar()
    # plt.show()
Batch Normalization
During the forward propagation of each batch of data, normalization is applied at every layer.
Batch Normalization (BN) is inserted between each fully connected layer and its activation function.
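To make the normalization step concrete, here is a minimal pure-Python sketch of what BN computes on one batch of pre-activation values: per-feature mean and variance over the batch, followed by a scale and a shift. The function name batch_norm and the sample numbers are illustrative assumptions; the epsilon of 0.001 matches the value used in the TensorFlow listing in this section.

```python
import math


def batch_norm(batch, scale=1.0, shift=0.0, epsilon=0.001):
    """Normalize every feature (column) of a batch given as a list of rows.

    Uses the batch mean and the population variance of each column, then
    applies scale * x_hat + shift.
    """
    n = len(batch)
    n_features = len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(n_features)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n
        for j in range(n_features)
    ]
    return [
        [
            scale * (row[j] - means[j]) / math.sqrt(variances[j] + epsilon) + shift
            for j in range(n_features)
        ]
        for row in batch
    ]


# Hypothetical pre-activation values: 4 samples, 2 features on very different scales
pre_activation = [[1.0, 200.0], [2.0, 220.0], [3.0, 240.0], [4.0, 260.0]]
normalized = batch_norm(pre_activation)

for row in normalized:
    print(row)
```

After normalization both columns have a mean close to 0 and a variance close to 1, so the activation function that follows receives inputs in a comparable range regardless of the original scale.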
# 23 Batch Normalization
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

ACTIVATION = tf.nn.relu
N_LAYERS = 7
N_HIDDEN_UNITS = 30


def fix_seed(seed=1):
    # reproducible
    np.random.seed(seed)
    tf.set_random_seed(seed)


def plot_his(inputs, inputs_norm):
    # plot histogram for the inputs of every layer
    for j, all_inputs in enumerate([inputs, inputs_norm]):
        for i, input in enumerate(all_inputs):
            plt.subplot(2, len(all_inputs), j*len(all_inputs)+(i+1))
            plt.cla()
            if i == 0:
                the_range = (-7, 10)
            else:
                the_range = (-1, 1)
            plt.hist(input.ravel(), bins=15, range=the_range, color='#FF5733')
            plt.yticks(())
            if j == 1:
                plt.xticks(the_range)
            else:
                plt.xticks(())
            ax = plt.gca()
            ax.spines['right'].set_color('none')
            ax.spines['top'].set_color('none')
        plt.title("%s normalizing" % ("Without" if j == 0 else "With"))
    plt.draw()
    plt.pause(0.01)


def built_net(xs, ys, norm):
    def add_layer(inputs, in_size, out_size, activation_function=None, norm=False):
        # weights and biases (bad initialization for this case)
        Weights = tf.Variable(tf.random_normal([in_size, out_size], mean=0., stddev=1.))
        biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)

        # fully connected product
        Wx_plus_b = tf.matmul(inputs, Weights) + biases

        # normalize fully connected product
        if norm:
            # Batch Normalize
            fc_mean, fc_var = tf.nn.moments(
                Wx_plus_b,
                axes=[0],   # the dimension you want to normalize, here [0] for batch
                            # for images, use [0, 1, 2] for [batch, height, width] but not channel
            )
            scale = tf.Variable(tf.ones([out_size]))
            shift = tf.Variable(tf.zeros([out_size]))
            epsilon = 0.001

            # apply moving average for mean and var when training on a batch
            ema = tf.train.ExponentialMovingAverage(decay=0.5)

            def mean_var_with_update():
                ema_apply_op = ema.apply([fc_mean, fc_var])
                with tf.control_dependencies([ema_apply_op]):
                    return tf.identity(fc_mean), tf.identity(fc_var)
            mean, var = mean_var_with_update()

            Wx_plus_b = tf.nn.batch_normalization(Wx_plus_b, mean, var, shift, scale, epsilon)
            # similar to these two steps:
            # Wx_plus_b = (Wx_plus_b - fc_mean) / tf.sqrt(fc_var + 0.001)
            # Wx_plus_b = Wx_plus_b * scale + shift

        # activation
        if activation_function is None:
            outputs = Wx_plus_b
        else:
            outputs = activation_function(Wx_plus_b)

        return outputs

    fix_seed(1)

    if norm:
        # BN for the first input
        fc_mean, fc_var = tf.nn.moments(
            xs,
            axes=[0],
        )
        scale = tf.Variable(tf.ones([1]))
        shift = tf.Variable(tf.zeros([1]))
        epsilon = 0.001

        # apply moving average for mean and var when training on a batch
        ema = tf.train.ExponentialMovingAverage(decay=0.5)

        def mean_var_with_update():
            ema_apply_op = ema.apply([fc_mean, fc_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(fc_mean), tf.identity(fc_var)
        mean, var = mean_var_with_update()
        xs = tf.nn.batch_normalization(xs, mean, var, shift, scale, epsilon)

    # record inputs for every layer
    layers_inputs = [xs]

    # build hidden layers
    for l_n in range(N_LAYERS):
        layer_input = layers_inputs[l_n]
        in_size = layers_inputs[l_n].get_shape()[1].value

        output = add_layer(
            layer_input,    # input
            in_size,        # input size
            N_HIDDEN_UNITS, # output size
            ACTIVATION,     # activation function
            norm,           # normalize before activation
        )
        layers_inputs.append(output)    # add output for next run

    # build output layer
    prediction = add_layer(layers_inputs[-1], N_HIDDEN_UNITS, 1, activation_function=None)

    cost = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction), reduction_indices=[1]))
    train_op = tf.train.GradientDescentOptimizer(0.001).minimize(cost)
    return [train_op, cost, layers_inputs]


# make up data
fix_seed(1)
x_data = np.linspace(-7, 10, 2500)[:, np.newaxis]
np.random.shuffle(x_data)
noise = np.random.normal(0, 8, x_data.shape)
y_data = np.square(x_data) - 5 + noise

# plot input data
plt.scatter(x_data, y_data)
plt.show()

xs = tf.placeholder(tf.float32, [None, 1])  # [num_samples, num_features]
ys = tf.placeholder(tf.float32, [None, 1])

train_op, cost, layers_inputs = built_net(xs, ys, norm=False)   # without BN
train_op_norm, cost_norm, layers_inputs_norm = built_net(xs, ys, norm=True)  # with BN

sess = tf.Session()
if int((tf.__version__).split('.')[1]) < 12 and int((tf.__version__).split('.')[0]) < 1:
    init = tf.initialize_all_variables()
else:
    init = tf.global_variables_initializer()
sess.run(init)

# record cost
cost_his = []
cost_his_norm = []
record_step = 5

plt.ion()
plt.figure(figsize=(7, 3))
for i in range(250):
    if i % 50 == 0:
        # plot histogram
        all_inputs, all_inputs_norm = sess.run([layers_inputs, layers_inputs_norm], feed_dict={xs: x_data, ys: y_data})
        plot_his(all_inputs, all_inputs_norm)

    # train on batch
    sess.run([train_op, train_op_norm], feed_dict={xs: x_data[i*10:i*10+10], ys: y_data[i*10:i*10+10]})

    if i % record_step == 0:
        # record cost
        cost_his.append(sess.run(cost, feed_dict={xs: x_data, ys: y_data}))
        cost_his_norm.append(sess.run(cost_norm, feed_dict={xs: x_data, ys: y_data}))

plt.ioff()
plt.figure()
plt.plot(np.arange(len(cost_his))*record_step, np.array(cost_his), label='no BN')     # no norm
plt.plot(np.arange(len(cost_his))*record_step, np.array(cost_his_norm), label='BN')   # norm
plt.legend()
plt.show()
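The listing above keeps a moving average of the batch mean and variance through tf.train.ExponentialMovingAverage(decay=0.5), although it still normalizes with the statistics of the current batch. As a rough sketch of the update rule only, the plain-Python loop below tracks such an average; the helper name ema_update, the starting value of 0.0, and the batch means are assumptions chosen for illustration.

```python
def ema_update(shadow, value, decay=0.5):
    """One exponential-moving-average step: keep `decay` of the old estimate."""
    return decay * shadow + (1.0 - decay) * value


# Hypothetical per-batch means observed during training
batch_means = [4.0, 2.0, 3.0]

shadow = 0.0  # assumed starting value for this sketch
history = []
for batch_mean in batch_means:
    shadow = ema_update(shadow, batch_mean)
    history.append(shadow)

print(history)
```

With a decay of 0.5 the estimate follows recent batches closely; a decay nearer to 1 would give a smoother estimate that reacts more slowly.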