Implementing Gradient Descent in TensorFlow with tf.gradients
Author: chen_h
WeChat & QQ: 862251340
WeChat official account: coderpai
Jianshu: http://www.jianshu.com/p/13e024c8ea44
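One of the reasons I like TensorFlow is that it can compute the gradient of a function automatically. We only need to define the function and then call tf.gradients. It really is that simple.
Let's walk through a concrete example.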
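Softmax regression on MNIST with TensorFlow's built-in optimizer
Before implementing gradient descent with tf.gradients, let's first solve the MNIST classification problem with one of TensorFlow's built-in optimizers (here, GradientDescentOptimizer). The code uses the TensorFlow 1.x graph API.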
    import tensorflow as tf

    # Import MNIST data
    from tensorflow.examples.tutorials.mnist import input_data
    mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

    # Parameters
    learning_rate = 0.01
    training_epochs = 10
    batch_size = 100
    display_step = 1

    # tf Graph Input
    x = tf.placeholder(tf.float32, [None, 784])  # mnist data image of shape 28*28=784
    y = tf.placeholder(tf.float32, [None, 10])   # 0-9 digits recognition => 10 classes

    # Set model weights
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))

    # Construct model
    pred = tf.nn.softmax(tf.matmul(x, W) + b)  # Softmax

    # Minimize error using cross entropy
    cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

    # Start training
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        # Training cycle
        for epoch in range(training_epochs):
            avg_cost = 0.
            total_batch = int(mnist.train.num_examples/batch_size)
            # Loop over all batches
            for i in range(total_batch):
                batch_xs, batch_ys = mnist.train.next_batch(batch_size)
                # Fit training using batch data
                _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs, y: batch_ys})
                # Compute average loss
                avg_cost += c / total_batch
            # Display logs per epoch step
            if (epoch+1) % display_step == 0:
                print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

        print("Optimization Finished!")

        # Test model
        correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
        # Calculate accuracy for 3000 examples
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        print("Accuracy:", accuracy.eval({x: mnist.test.images[:3000], y: mnist.test.labels[:3000]}))

    # Output
    # Extracting /tmp/data/train-images-idx3-ubyte.gz
    # Extracting /tmp/data/train-labels-idx1-ubyte.gz
    # Extracting /tmp/data/t10k-images-idx3-ubyte.gz
    # Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
    # Epoch: 0001 cost= 1.184285608
    # Epoch: 0002 cost= 0.665428013
    # Epoch: 0003 cost= 0.552858426
    # Epoch: 0004 cost= 0.498728328
    # Epoch: 0005 cost= 0.465593693
    # Epoch: 0006 cost= 0.442609185
    # Epoch: 0007 cost= 0.425552949
    # Epoch: 0008 cost= 0.412188290
    # Epoch: 0009 cost= 0.401390140
    # Epoch: 0010 cost= 0.392354651
    # Optimization Finished!
    # Accuracy: 0.873333
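To make the softmax and cross-entropy cost in the listing above more concrete, here is a minimal NumPy sketch of the same two computations. The tiny random batch and the helper names `softmax` and `cross_entropy` are assumptions made for illustration; they are not part of the original code. With all-zero weights every class gets probability 0.1, so the cost starts at ln(10).

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability; the result is unchanged.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(pred, labels):
    # Mean over the batch of -sum_k y_k * log(p_k), mirroring the `cost` tensor.
    return float(np.mean(-np.sum(labels * np.log(pred), axis=1)))

# Hypothetical tiny batch: 4 examples, 784 features, 10 classes.
rng = np.random.default_rng(0)
x = rng.random((4, 784))
y = np.eye(10)[[3, 1, 4, 1]]   # one-hot labels
W = np.zeros((784, 10))        # same zero initialization as the listing
b = np.zeros(10)

pred = softmax(x @ W + b)
initial_cost = cross_entropy(pred, y)
print(initial_cost)            # ln(10), about 2.3026
```

This is consistent with the logged first-epoch average cost being below 2.3: training starts from ln(10) and the cost falls over the course of the epoch.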
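So what we did here was use the built-in optimizer to compute the gradients and update the weights for us. What if we want to compute the gradients and update the weights ourselves? That is where tf.gradients comes in.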
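Softmax regression on MNIST with tf.gradients
By the gradient descent formula, the weights are updated as follows:

$$ W \leftarrow W - \alpha \frac{\partial J}{\partial W}, \qquad b \leftarrow b - \alpha \frac{\partial J}{\partial b} $$

where $J$ is the cost and $\alpha$ is the learning rate (learning_rate in the code).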
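As a small worked instance of this update rule, the sketch below runs gradient descent on a toy one-dimensional loss, J(w) = (w - 3)^2. The loss, the starting point, and the step count are illustrative assumptions, not taken from the article.

```python
def grad(w):
    # Derivative of the toy loss J(w) = (w - 3)^2.
    return 2.0 * (w - 3.0)

w = 0.0
learning_rate = 0.1
for step in range(100):
    # One gradient descent step: w <- w - learning_rate * dJ/dw
    w = w - learning_rate * grad(w)

print(w)  # very close to the minimizer w = 3
```

Each step shrinks the distance to the minimizer by a factor of 0.8 here, which is why 100 steps are more than enough.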
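To implement gradient descent, I will not use the optimizer; instead I will write the weight update myself.
Because the model has a weight matrix W and a bias vector b, we need the gradient of the cost with respect to both. The code is as follows: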
    # Computing the gradient of cost with respect to W and b
    grad_W, grad_b = tf.gradients(xs=[W, b], ys=cost)

    # Gradient Step
    new_W = W.assign(W - learning_rate * grad_W)
    new_b = b.assign(b - learning_rate * grad_b)
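These three lines merely replace a single line of the earlier code, so why go to the trouble? Because if you need the gradient of your own loss function and do not want to derive the math by hand, TensorFlow can do it for you.
We have already built the computation graph, so all that remains is to run it in a session. Let's try it.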
    # Fit training using batch data
    _, _, c = sess.run([new_W, new_b, cost], feed_dict={x: batch_xs, y: batch_ys})
We do not need the outputs of new_W and new_b, so I discard them.
The full code is as follows:
    import tensorflow as tf

    # Import MNIST data
    from tensorflow.examples.tutorials.mnist import input_data
    mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

    # Parameters
    learning_rate = 0.01
    training_epochs = 10
    batch_size = 100
    display_step = 1

    # tf Graph Input
    x = tf.placeholder(tf.float32, [None, 784])  # mnist data image of shape 28*28=784
    y = tf.placeholder(tf.float32, [None, 10])   # 0-9 digits recognition => 10 classes

    # Set model weights
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))

    # Construct model
    pred = tf.nn.softmax(tf.matmul(x, W) + b)  # Softmax

    # Minimize error using cross entropy
    cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))

    # Gradients of cost with respect to W and b, and the manual update step
    grad_W, grad_b = tf.gradients(xs=[W, b], ys=cost)
    new_W = W.assign(W - learning_rate * grad_W)
    new_b = b.assign(b - learning_rate * grad_b)

    # Initialize the variables (i.e. assign their default value)
    init = tf.global_variables_initializer()

    # Start training
    with tf.Session() as sess:
        sess.run(init)

        # Training cycle
        for epoch in range(training_epochs):
            avg_cost = 0.
            total_batch = int(mnist.train.num_examples/batch_size)
            # Loop over all batches
            for i in range(total_batch):
                batch_xs, batch_ys = mnist.train.next_batch(batch_size)
                # Fit training using batch data
                _, _, c = sess.run([new_W, new_b, cost], feed_dict={x: batch_xs, y: batch_ys})
                # Compute average loss
                avg_cost += c / total_batch
            # Display logs per epoch step
            if (epoch+1) % display_step == 0:
                print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

        print("Optimization Finished!")

        # Test model
        correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
        # Calculate accuracy for 3000 examples
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        print("Accuracy:", accuracy.eval({x: mnist.test.images[:3000], y: mnist.test.labels[:3000]}))

    # Output
    # Epoch: 0001 cost= 1.183741399
    # Epoch: 0002 cost= 0.665312284
    # Epoch: 0003 cost= 0.552796521
    # Epoch: 0004 cost= 0.498697014
    # Epoch: 0005 cost= 0.465521633
    # Epoch: 0006 cost= 0.442611256
    # Epoch: 0007 cost= 0.425528946
    # Epoch: 0008 cost= 0.412203073
    # Epoch: 0009 cost= 0.401364554
    # Epoch: 0010 cost= 0.392398663
    # Optimization Finished!
    # Accuracy: 0.874
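Softmax regression using the gradient formula
For softmax regression with a cross-entropy cost, the gradient with respect to the weights W is:

$$ \frac{\partial J}{\partial W} = -\frac{1}{N} X^{\top} (Y - P) $$

where $X$ is the batch of inputs (x), $Y$ the one-hot labels (y), $P$ the softmax outputs (pred), and $N$ the batch size. The code below omits the $1/N$ factor, which is equivalent to multiplying the learning rate by the batch size.
So we can implement the gradient equation directly, without tf.gradients and without TensorFlow's built-in optimizer. The full code is as follows: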
    import tensorflow as tf

    # Import MNIST data
    from tensorflow.examples.tutorials.mnist import input_data
    mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

    # Parameters
    learning_rate = 0.01
    training_epochs = 10
    batch_size = 100
    display_step = 1

    # tf Graph Input
    x = tf.placeholder(tf.float32, [None, 784])  # mnist data image of shape 28*28=784
    y = tf.placeholder(tf.float32, [None, 10])   # 0-9 digits recognition => 10 classes

    # Set model weights
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))

    # Construct model
    # Note: b is not added to the logits here, so updating b does not change pred.
    pred = tf.nn.softmax(tf.matmul(x, W))  # Softmax

    # Minimize error using cross entropy
    cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))

    # Hand-written gradient, summed over the batch (no 1/batch_size factor)
    W_grad = - tf.matmul(tf.transpose(x), y - pred)
    # Note: if b were part of the logits, its gradient in the same summed
    # convention would be -tf.reduce_sum(y - pred, reduction_indices=0).
    b_grad = - tf.reduce_mean(tf.matmul(tf.transpose(x), y - pred), reduction_indices=0)

    new_W = W.assign(W - learning_rate * W_grad)
    new_b = b.assign(b - learning_rate * b_grad)

    init = tf.global_variables_initializer()

    with tf.Session() as sess:
        sess.run(init)

        # Training cycle
        for epoch in range(training_epochs):
            avg_cost = 0.
            total_batch = int(mnist.train.num_examples/batch_size)
            # Loop over all batches
            for i in range(total_batch):
                batch_xs, batch_ys = mnist.train.next_batch(batch_size)
                # Fit training using batch data
                _, _, c = sess.run([new_W, new_b, cost], feed_dict={x: batch_xs, y: batch_ys})
                # Compute average loss
                avg_cost += c / total_batch
            # Display logs per epoch step
            if (epoch+1) % display_step == 0:
                print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

        print("Optimization Finished!")

        # Test model
        correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
        # Calculate accuracy for 3000 examples
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        print("Accuracy:", accuracy.eval({x: mnist.test.images[:3000], y: mnist.test.labels[:3000]}))

    # Output
    # Extracting /tmp/data/train-images-idx3-ubyte.gz
    # Extracting /tmp/data/train-labels-idx1-ubyte.gz
    # Extracting /tmp/data/t10k-images-idx3-ubyte.gz
    # Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
    # Epoch: 0001 cost= 0.432943137
    # Epoch: 0002 cost= 0.330031527
    # Epoch: 0003 cost= 0.313661941
    # Epoch: 0004 cost= 0.306443773
    # Epoch: 0005 cost= 0.300219418
    # Epoch: 0006 cost= 0.298976618
    # Epoch: 0007 cost= 0.293222957
    # Epoch: 0008 cost= 0.291407861
    # Epoch: 0009 cost= 0.288372261
    # Epoch: 0010 cost= 0.286749691
    # Optimization Finished!
    # Accuracy: 0.898
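A hand-derived gradient like the one in this listing can be sanity-checked against finite differences. The NumPy sketch below does that for a softmax model that includes the bias term, with the 1/N factor that matches a mean cross-entropy cost. The small random problem and the helper names (`analytic_grads`, `numeric_grad`) are assumptions for illustration, not part of the original code.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cost(W, b, x, y):
    pred = softmax(x @ W + b)
    return np.mean(-np.sum(y * np.log(pred), axis=1))

def analytic_grads(W, b, x, y):
    # Gradient of the mean cross-entropy: -(1/N) X^T (Y - P) and -(1/N) sum_i (Y - P)_i
    pred = softmax(x @ W + b)
    n = x.shape[0]
    return -(x.T @ (y - pred)) / n, -np.sum(y - pred, axis=0) / n

def numeric_grad(f, param, eps=1e-6):
    # Central differences, perturbing one entry of `param` at a time (in place).
    grad = np.zeros_like(param)
    for idx in np.ndindex(*param.shape):
        old = param[idx]
        param[idx] = old + eps
        up = f()
        param[idx] = old - eps
        down = f()
        param[idx] = old
        grad[idx] = (up - down) / (2 * eps)
    return grad

# Hypothetical small problem: 5 examples, 4 features, 3 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))
y = np.eye(3)[rng.integers(0, 3, size=5)]
W = rng.normal(size=(4, 3))
b = rng.normal(size=3)

grad_W, grad_b = analytic_grads(W, b, x, y)
num_W = numeric_grad(lambda: cost(W, b, x, y), W)
num_b = numeric_grad(lambda: cost(W, b, x, y), b)
print(np.abs(grad_W - num_W).max(), np.abs(grad_b - num_b).max())
```

Because the listing sums over the batch instead of averaging, its effective step size is batch_size times larger than in the two earlier versions. That is a likely reason its cost falls faster at the same learning_rate.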
How does TensorFlow compute gradients?
You may be wondering how TensorFlow computes the gradient of a function.
TensorFlow uses a method called automatic differentiation; see the Wikipedia article on it for details.
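To give a feel for the idea, here is a toy reverse-mode automatic differentiation sketch for scalars in pure Python. It is an illustration of the general technique only, not TensorFlow's implementation; the `Var` class and the `exp` and `backward` helpers are hypothetical names introduced for this example.

```python
import math

class Var:
    """A scalar node that remembers how it was computed."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # tuple of (parent_node, local_derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def exp(v):
    e = math.exp(v.value)
    return Var(e, ((v, e),))     # d/dv exp(v) = exp(v)

def backward(output):
    # Order nodes so that every node comes after its parents.
    order, seen = [], set()
    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)
    visit(output)

    # Apply the chain rule from the output back to the inputs.
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local

# f(a, b) = a * b + exp(a), evaluated at a = 2, b = 3
a, b = Var(2.0), Var(3.0)
f = a * b + exp(a)
backward(f)
print(a.grad, b.grad)   # df/da = b + exp(a), df/db = a
```

Each operation records only its local derivatives, and a single backward pass combines them into the gradient with respect to every input, which is what makes this approach practical for models with many parameters.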
I hope you found this article helpful.