[Deep Learning] A First Look at TensorFlow (Python)
Source: Internet. Editor: 程序博客网. Date: 2024/05/21 16:53
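Overview
TensorFlow is a programming system that represents computations as graphs. The nodes of a graph are called ops (short for operations). An op takes zero or more Tensors, performs a computation, and produces zero or more Tensors. Each Tensor is a typed multi-dimensional array. For example, a small batch of images can be represented as a four-dimensional array of floating-point numbers with dimensions [batch, height, width, channels].
A TensorFlow graph describes a computation. To actually compute anything, the graph must be launched in a session. A session places the graph's ops onto devices such as CPUs or GPUs and provides methods to execute them. These methods return the tensors produced by the ops: numpy ndarray objects in Python, and tensorflow::Tensor instances in C and C++.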
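Basic concepts:
- Computations are represented as graphs.
- Graphs are executed in the context of a session (Session).
- Data is represented as tensors.
- State is maintained with variables (Variable).
- Feeds and fetches pass data into, and get data out of, arbitrary operations.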
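The split between building a graph and running it in a session can be illustrated with a toy sketch in plain Python. This is not how TensorFlow is implemented; the names `Node`, `placeholder`, `constant`, `add`, `mul`, and `ToySession` are hypothetical and exist only for this illustration.

```python
# Toy illustration only: these classes are hypothetical and are NOT TensorFlow's API.

class Node:
    """A graph node: an operation plus the nodes it takes as inputs."""
    def __init__(self, op, inputs=(), name=None):
        self.op = op            # callable that computes the node's value
        self.inputs = inputs    # upstream nodes
        self.name = name


def placeholder(name):
    # A node with no op: its value must be supplied through feed_dict.
    return Node(op=None, name=name)


def constant(value):
    return Node(op=lambda: value)


def add(a, b):
    return Node(op=lambda x, y: x + y, inputs=(a, b))


def mul(a, b):
    return Node(op=lambda x, y: x * y, inputs=(a, b))


class ToySession:
    def run(self, fetch, feed_dict=None):
        """Evaluate `fetch`, taking fed values from `feed_dict`."""
        feed_dict = feed_dict or {}
        if fetch in feed_dict:
            return feed_dict[fetch]
        if fetch.op is None:
            raise ValueError("placeholder %r was not fed" % fetch.name)
        args = [self.run(node, feed_dict) for node in fetch.inputs]
        return fetch.op(*args)


# Building the graph computes nothing yet.
x = placeholder("x")
f = add(mul(x, x), constant(10))   # f = x*x + 10

# Only running it in a session produces a value.
sess = ToySession()
result = sess.run(f, feed_dict={x: 5})
print(result)  # 35
```

Here the feed is the `feed_dict` entry for `x` and the fetch is `f`; running the same graph with a different feed gives a different result without rebuilding anything.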
Official installation guide
Graphs and Sessions
Creating a graph and running it in a session
The following code creates a graph:
    import tensorflow as tf

    x = tf.Variable(5, name='x')
    y = tf.Variable(2, name='y')
    f = x*x*y + y + 10
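The code above builds the computation graph but does not perform any computation. To evaluate the graph, open a TensorFlow Session, then use it to initialize the variables and evaluate f: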
    sess = tf.Session()
    sess.run(x.initializer)
    sess.run(y.initializer)
    print(sess.run(f))
    sess.close()
With many variables, sess.run() has to be repeated over and over. Instead, we can use a with block, which sets the session as the default session:
    with tf.Session() as sess:
        x.initializer.run()   # equivalent to tf.get_default_session().run(x.initializer)
        y.initializer.run()
        result = f.eval()     # equivalent to calling tf.get_default_session().run(f)
        print(result)
    # The session is closed automatically when the with block exits.
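The code above initializes each variable by hand. We can also use global_variables_initializer() to initialize all variables. Calling it does not run the initialization immediately; it creates a node in the graph that initializes every variable when that node is run: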
    init = tf.global_variables_initializer()

    with tf.Session() as sess:
        init.run()
        result = f.eval()
        print(result)
Managing Graphs
All the code so far uses the default graph. To run code in a separate graph, create one yourself:
    import tensorflow as tf

    x1 = tf.Variable(1)
    print(x1.graph is tf.get_default_graph())  # True

    graph = tf.Graph()  # a separate Graph
    with graph.as_default():
        x2 = tf.Variable(2)

    print(x2.graph is tf.get_default_graph())  # False
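Lifecycle of a Node Value
A variable's life starts when its initializer is run and ends when the session is closed. Other node values are not kept between graph runs. In the code below, the first session evaluates w and x twice (once for y, once for z), while the second session fetches y and z in a single run and evaluates w and x only once: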
    import tensorflow as tf

    w = tf.constant(3)
    x = w + 2
    y = x + 3
    z = x + 4

    # w and x are evaluated twice
    with tf.Session() as sess:
        print(y.eval())
        print(z.eval())

    # w and x are evaluated once
    with tf.Session() as sess:
        y_eval, z_eval = sess.run([y, z])
        print(y_eval)
        print(z_eval)
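The difference between the two sessions can be mimicked in plain Python by counting how often each node is computed. This is only a toy model under the assumption that intermediate values are cached within one run and discarded afterwards; the helper names (`run`, `eval_counts`) are hypothetical and not part of TensorFlow.

```python
# Toy model of graph evaluation; hypothetical helper names, not TensorFlow code.
eval_counts = {"w": 0, "x": 0}

def w():
    eval_counts["w"] += 1
    return 3

def x(cache):
    # Within one run, a value that was already computed is reused.
    if "x" not in cache:
        eval_counts["x"] += 1
        cache["x"] = w() + 2
    return cache["x"]

def run(fetches):
    """Evaluate the requested nodes in a single pass with a fresh cache."""
    cache = {}   # thrown away when the run ends
    ops = {"y": lambda: x(cache) + 3, "z": lambda: x(cache) + 4}
    return [ops[name]() for name in fetches]

# Two separate runs: w and x are each computed twice.
y_val = run(["y"])[0]
z_val = run(["z"])[0]
counts_separate = dict(eval_counts)

# One run fetching both: w and x are each computed once.
eval_counts.update(w=0, x=0)
y_val2, z_val2 = run(["y", "z"])
counts_single = dict(eval_counts)

print(y_val, z_val, counts_separate)    # 8 9 {'w': 2, 'x': 2}
print(y_val2, z_val2, counts_single)    # 8 9 {'w': 1, 'x': 1}
```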
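Example: Linear Regression with TensorFlow
Computing θ with the Normal Equation
For linear regression we use the normal equation:
$$\hat{\theta} = (X^{T} X)^{-1} X^{T} y$$
We demonstrate it on the California housing dataset from sklearn (fetch_california_housing):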
    import tensorflow as tf
    import numpy as np
    from sklearn.datasets import fetch_california_housing

    housing = fetch_california_housing()
    m, n = housing.data.shape
    housing_data_with_bias = np.c_[np.ones([m, 1]), housing.data]

    X = tf.constant(housing_data_with_bias, dtype=tf.float32, name='X')
    y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')
    XT = tf.transpose(X)
    theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)  # (X^T * X)^-1 * X^T * y

    with tf.Session() as sess:
        theta_value = theta.eval()
        print(theta_value)
Output:
    [[ -3.74651413e+01]
     [  4.35734153e-01]
     [  9.33829229e-03]
     [ -1.06622010e-01]
     [  6.44106984e-01]
     [ -4.25131839e-06]
     [ -3.77322501e-03]
     [ -4.26648885e-01]
     [ -4.40514028e-01]]
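To see what the normal equation does without any library, here is a minimal sketch on a made-up four-point dataset with one feature plus a bias term. The data (generated from y = 1 + 2x) and all variable names are assumptions for illustration only; the 2x2 inverse is written out by hand.

```python
# Plain-Python sketch of theta = (X^T X)^-1 X^T y for one feature plus a bias term.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # generated from y = 1 + 2x

m = len(xs)
# Entries of X^T X, where each row of X is [1, x].
a, b = float(m), sum(xs)
c, d = sum(xs), sum(x * x for x in xs)
# Entries of X^T y.
p = sum(ys)
q = sum(x * y for x, y in zip(xs, ys))

# Invert the 2x2 matrix [[a, b], [c, d]] and multiply by [p, q].
det = a * d - b * c
theta0 = (d * p - b * q) / det      # intercept
theta1 = (-c * p + a * q) / det     # slope

print(theta0, theta1)  # 1.0 2.0
```

The recovered intercept and slope match the values used to generate the data, which is what the closed-form solution should give on noise-free input.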
Implementing Gradient Descent
Next, we replace the equation above with gradient descent:
    import tensorflow as tf
    import numpy as np
    import numpy.random as rnd
    from sklearn.preprocessing import StandardScaler
    from sklearn.datasets import fetch_california_housing
    from datetime import datetime

    scaler = StandardScaler()
    housing = fetch_california_housing()
    m, n = housing.data.shape
    scale_housing_data = scaler.fit_transform(housing.data)
    scaled_housing_data_plus_bias = np.c_[np.ones([m, 1]), scale_housing_data]

    # ### Compute the gradients (batch) ###
    tf.reset_default_graph()

    n_epochs = 1000
    learning_rate = 0.01

    X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name='X')
    y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')
    theta = tf.Variable(tf.random_uniform([n+1, 1], -1.0, 1.0, seed=42), name='theta')
    y_pred = tf.matmul(X, theta, name='predictions')
    error = y_pred - y
    mse = tf.reduce_mean(tf.square(error), name='mse')

    # gradients = 2/m * tf.matmul(tf.transpose(X), error)  # ① compute the gradients manually
    # training_op = tf.assign(theta, theta - gradients * learning_rate)

    # gradients = tf.gradients(mse, [theta])[0]  # ② compute the gradients with autodiff
    # training_op = tf.assign(theta, theta - gradients * learning_rate)

    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)  # ③ gradient descent optimizer
    # optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.25)  # other optimizers can be used
    training_op = optimizer.minimize(mse)

    init = tf.global_variables_initializer()
    saver = tf.train.Saver()

    with tf.Session() as sess:
        # saver.restore(sess, 'my_model_final.ckpt')
        sess.run(init)

        for epoch in range(n_epochs):
            if epoch % 100 == 0:
                print("Epoch", epoch, "MSE =", mse.eval())
                save_path = saver.save(sess, '/tmp/my_model.ckpt')
            sess.run(training_op)

        best_theta = theta.eval()
        save_path = saver.save(sess, "my_model_final.ckpt")

    print("Best theta:")
    print(best_theta)
Computing the Gradients Manually
    gradients = 2/m * tf.matmul(tf.transpose(X), error)  # ① compute the gradients manually
    training_op = tf.assign(theta, theta - gradients * learning_rate)
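- tf.random_uniform() generates random values drawn from a uniform distribution; here it provides the initial value of theta.
- tf.assign() creates a node that assigns a new value to a variable. In variant ① (manual gradients) we use it to implement the gradient descent step $\theta^{(\text{next step})} = \theta - \eta \nabla_{\theta} \mathrm{MSE}(\theta)$, where $\eta$ is learning_rate.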
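The manual update rule can be traced end to end in plain Python on a tiny dataset. The sketch below assumes made-up data generated from y = 1 + 2x with an already centered feature; every name in it is illustrative and none of it is TensorFlow code.

```python
# Plain-Python sketch of batch gradient descent for linear regression.
xs = [-1.5, -0.5, 0.5, 1.5]            # already centered
ys = [1.0 + 2.0 * x for x in xs]       # y = 1 + 2x
X = [[1.0, x] for x in xs]             # bias column plus one feature
m = len(X)

theta = [0.0, 0.0]
learning_rate = 0.1
n_epochs = 500

for epoch in range(n_epochs):
    # error_i = X_i . theta - y_i
    errors = [sum(xi * t for xi, t in zip(row, theta)) - y
              for row, y in zip(X, ys)]
    # gradients = 2/m * X^T . error
    gradients = [2.0 / m * sum(row[j] * e for row, e in zip(X, errors))
                 for j in range(len(theta))]
    # theta(next step) = theta - learning_rate * gradients
    theta = [t - learning_rate * g for t, g in zip(theta, gradients)]

print(theta)
```

After enough epochs theta should approach the intercept and slope used to generate the data; with a much larger learning rate the same loop would diverge instead.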
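Using autodiff
Deriving the gradients by hand works for linear regression, but for a deep neural network the code can become long and error-prone. One could use symbolic differentiation to find the equations of the partial derivatives automatically; TensorFlow instead offers autodiff, which computes the gradients for us. tf.gradients() takes an op (here mse) and a list of variables (here just theta), and creates the ops that compute the gradient of the op with respect to each variable:
    gradients = tf.gradients(mse, [theta])[0]  # ② compute the gradients with autodiff
    training_op = tf.assign(theta, theta - gradients * learning_rate)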
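As a rough picture of what reverse-mode autodiff does, here is a minimal scalar-only sketch. It is a teaching toy, not TensorFlow's implementation; the `Var` class and `backward` function are hypothetical names. It differentiates f = x*x*y + y + 10, the expression used for the first graph in this article.

```python
# Minimal reverse-mode autodiff for scalars; a teaching sketch, not how TensorFlow is implemented.

class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def backward(output):
    """Propagate d(output)/d(node) from the output back to every input."""
    # Order the nodes so that each one is processed after all of its consumers.
    order, seen = [], set()

    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local   # chain rule


x, y = Var(5.0), Var(2.0)
f = x * x * y + y + 10
backward(f)

print(f.value, x.grad, y.grad)  # 62.0 20.0 26.0
```

The results agree with the hand-derived partial derivatives 2xy = 20 and x² + 1 = 26 at x = 5, y = 2, obtained in a single backward pass over the graph.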
Using an Optimizer
TensorFlow provides a number of optimizers. Our code uses tf.train.GradientDescentOptimizer(), but other optimizers such as tf.train.MomentumOptimizer() can be used as well:
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)  # ③ gradient descent optimizer
    # optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.25)  # other optimizers can be used
    training_op = optimizer.minimize(mse)
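To get a feel for how the two optimizers differ, the sketch below applies both update rules in plain Python to a made-up one-dimensional function f(θ) = (θ − 3)². The momentum rule follows the form given in the TensorFlow documentation for MomentumOptimizer without Nesterov momentum (accumulate the gradients, then step along the accumulation); the function, the starting point, and the step count are assumptions for illustration.

```python
# Plain-Python sketch of the two update rules on f(theta) = (theta - 3)^2.

def gradient(theta):
    return 2.0 * (theta - 3.0)

learning_rate = 0.01
momentum = 0.25

# Plain gradient descent: theta <- theta - learning_rate * gradient
theta_gd = 0.0
# Momentum: accumulation <- momentum * accumulation + gradient
#           theta <- theta - learning_rate * accumulation
theta_mom, accumulation = 0.0, 0.0

for step in range(200):
    theta_gd -= learning_rate * gradient(theta_gd)

    accumulation = momentum * accumulation + gradient(theta_mom)
    theta_mom -= learning_rate * accumulation

print(theta_gd, theta_mom)
```

On this simple function both runs move toward the minimum at 3, and the momentum run gets closer in the same number of steps because past gradients keep contributing to each update.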
Saving and Restoring Models
    saver = tf.train.Saver()
    [...]
    save_path = saver.save(sess, '/tmp/my_model.ckpt')
    [...]
    saver.restore(sess, 'my_model_final.ckpt')
Mini-batch Gradient Descent: Feeding Data Step by Step
Mini-batch Gradient Descent requires replacing X and y with a new batch at every iteration. The simplest way to do this is with tf.placeholder() nodes:
    X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
    y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
At each iteration, the data is supplied through the feed_dict argument:
    X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
    sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
The complete code:
    X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")  # specifying None for a dimension means "any size"
    y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
    theta = tf.Variable(tf.random_uniform([n+1, 1], -1.0, 1, seed=42), name='theta')
    y_pred = tf.matmul(X, theta, name='predictions')
    error = y_pred - y
    mse = tf.reduce_mean(tf.square(error), name='mse')
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
    training_op = optimizer.minimize(mse)

    init = tf.global_variables_initializer()

    rnd.seed(42)

    def fetch_batch(epoch, batch_index, batch_size):
        rnd.seed(epoch * n_batches + batch_index)
        indices = rnd.randint(m, size=batch_size)
        X_batch = scaled_housing_data_plus_bias[indices]
        y_batch = housing.target.reshape(-1, 1)[indices]
        return X_batch, y_batch

    n_epochs = 10
    batch_size = 100
    n_batches = int(np.ceil(m / batch_size))

    with tf.Session() as sess:
        sess.run(init)

        for epoch in range(n_epochs):
            for batch_index in range(n_batches):
                X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
                sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

        best_theta = theta.eval()

    print("Best theta:")
    print(best_theta)
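The batch selection logic in fetch_batch() can be examined on its own. The sketch below reproduces the idea with the standard library only: the dataset size `m = 1050` is a made-up number, and `fetch_batch_indices` is a hypothetical helper that returns row indices rather than data.

```python
import math
import random

m = 1050            # hypothetical number of training instances
batch_size = 100
n_batches = int(math.ceil(m / batch_size))   # the last partial batch counts too

def fetch_batch_indices(epoch, batch_index, batch_size):
    """Pick batch_size random row indices, reproducibly for a given (epoch, batch)."""
    rng = random.Random(epoch * n_batches + batch_index)
    return [rng.randrange(m) for _ in range(batch_size)]

first = fetch_batch_indices(0, 0, batch_size)
again = fetch_batch_indices(0, 0, batch_size)
other = fetch_batch_indices(0, 1, batch_size)

print(n_batches)          # 11
print(first == again)     # True: same seed, same batch
```

Seeding with `epoch * n_batches + batch_index` makes every batch reproducible. Note that the indices are drawn with replacement, so one epoch is not guaranteed to visit each instance exactly once.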
Visualization with TensorBoard
First, define the log directory and a name for the run:
    now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
    root_logdir = "tf_logs"
    logdir = "{}/run-{}/".format(root_logdir, now)
Then add the following code:
    mse_summary = tf.summary.scalar('MSE', mse)
    summary_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())
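The first line creates a node in the graph that records the MSE into a summary (a TensorBoard-compatible binary log string). The second line creates a tf.summary.FileWriter(), which writes summaries to log files in the log directory.
Finally, call add_summary() during training to append to the log file. The complete code: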
    tf.reset_default_graph()

    now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
    root_logdir = "tf_logs"
    logdir = "{}/run-{}/".format(root_logdir, now)

    n_epochs = 100
    learning_rate = 0.01

    X = tf.placeholder(tf.float32, shape=(None, n+1), name='X')
    y = tf.placeholder(tf.float32, shape=(None, 1), name='y')
    theta = tf.Variable(tf.random_uniform([n+1, 1], -1.0, 1, seed=42), name='theta')
    y_pred = tf.matmul(X, theta, name='predictions')
    with tf.name_scope('loss') as scope:  # name scope
        error = y_pred - y
        mse = tf.reduce_mean(tf.square(error), name='mse')
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
    training_op = optimizer.minimize(mse)

    init = tf.global_variables_initializer()

    mse_summary = tf.summary.scalar('MSE', mse)
    summary_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())

    n_epochs = 10  # overrides the value set above
    batch_size = 100
    n_batches = int(np.ceil(m / batch_size))

    with tf.Session() as sess:
        sess.run(init)

        for epoch in range(n_epochs):
            for batch_index in range(n_batches):
                X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
                if batch_index % 10 == 0:
                    summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
                    step = epoch * n_batches + batch_index
                    summary_writer.add_summary(summary_str, step)
                sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

        best_theta = theta.eval()

    summary_writer.flush()
    summary_writer.close()
    print("Best theta:")
    print(best_theta)
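Start TensorBoard from a terminal. Point --logdir at the directory the FileWriter writes to: the training code above writes to tf_logs, whereas the session below was run against ./logs.
    (tensorflow) ➜ ch09 git:(master) ✗ tensorboard --logdir ./logs
    Starting TensorBoard b'41' on port 6006
    (You can navigate to http://127.0.0.1:6006)
    ...
The graph and the logged values can now be viewed in a browser at http://127.0.0.1:6006.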
Name Scopes, Modularity, and Sharing Variables
Name Scopes
A complex model easily produces a large number of nodes, and the graph becomes cluttered. To avoid this, we use name scopes to group related nodes:
    with tf.name_scope('loss') as scope:
        error = y_pred - y
        mse = tf.reduce_mean(tf.square(error), name="mse")

    print(error.op.name)  # loss/sub
    print(mse.op.name)    # loss/mse
Modularity
Take a look at the following code:
    tf.reset_default_graph()

    n_features = 3
    X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")

    w1 = tf.Variable(tf.random_normal((n_features, 1)), name="weights1")
    w2 = tf.Variable(tf.random_normal((n_features, 1)), name="weights2")
    b1 = tf.Variable(0.0, name="bias1")
    b2 = tf.Variable(0.0, name="bias2")

    linear1 = tf.add(tf.matmul(X, w1), b1, name="linear1")
    linear2 = tf.add(tf.matmul(X, w2), b2, name="linear2")

    relu1 = tf.maximum(linear1, 0, name="relu1")
    relu2 = tf.maximum(linear1, 0, name="relu2")  # Oops, cut&paste error! Did you spot it?

    output = tf.add_n([relu1, relu2], name="output")
Pretty ugly, isn't it? When the same operations are repeated many times, it is better to factor them into a function:
    tf.reset_default_graph()

    def relu(X):
        with tf.name_scope("relu"):
            w_shape = int(X.get_shape()[1]), 1
            w = tf.Variable(tf.random_normal(w_shape), name="weights")
            b = tf.Variable(0.0, name="bias")
            linear = tf.add(tf.matmul(X, w), b, name="linear")
            return tf.maximum(linear, 0, name="max")

    n_features = 3
    X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
    relus = [relu(X) for i in range(5)]
    output = tf.add_n(relus, name="output")
    summary_writer = tf.summary.FileWriter("logs/relu2", tf.get_default_graph())
Sharing Variables
What can we do when several parts of the graph need to share a variable? Consider the following options:
- Create the variable first and pass it to the function as a parameter. This becomes painful when there are many shared variables.
    tf.reset_default_graph()

    def relu(X, threshold):
        with tf.name_scope("relu"):
            w_shape = int(X.get_shape()[1]), 1
            w = tf.Variable(tf.random_normal(w_shape), name="weights")
            b = tf.Variable(0.0, name="bias")
            linear = tf.add(tf.matmul(X, w), b, name="linear")
            return tf.maximum(linear, threshold, name="max")

    threshold = tf.Variable(0.0, name="threshold")
    X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
    relus = [relu(X, threshold) for i in range(5)]
    output = tf.add_n(relus, name="output")
- Keep the shared variables in a class or a dictionary, or set the shared variable as an attribute of relu() the first time it is called.
    tf.reset_default_graph()

    def relu(X):
        with tf.name_scope("relu"):
            if not hasattr(relu, "threshold"):
                relu.threshold = tf.Variable(0.0, name="threshold")
            w_shape = int(X.get_shape()[1]), 1
            w = tf.Variable(tf.random_normal(w_shape), name="weights")
            b = tf.Variable(0.0, name="bias")
            linear = tf.add(tf.matmul(X, w), b, name="linear")
            return tf.maximum(linear, relu.threshold, name="max")

    X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
    relus = [relu(X) for i in range(5)]
    output = tf.add_n(relus, name="output")
- The TensorFlow way
TensorFlow handles shared variables with get_variable(), which creates the variable if it does not exist yet and reuses it if it does. Which of the two happens (create or reuse) is controlled by the reuse setting of the enclosing variable_scope():
    tf.reset_default_graph()

    def relu(X):
        with tf.variable_scope("relu", reuse=True):
            threshold = tf.get_variable("threshold", shape=(), initializer=tf.constant_initializer(0.0))
            w_shape = int(X.get_shape()[1]), 1
            w = tf.Variable(tf.random_normal(w_shape), name="weights")
            b = tf.Variable(0.0, name="bias")
            linear = tf.add(tf.matmul(X, w), b, name="linear")
            return tf.maximum(linear, threshold, name="max")

    X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
    with tf.variable_scope("relu"):
        threshold = tf.get_variable("threshold", shape=(), initializer=tf.constant_initializer(0.0))
    relus = [relu(X) for i in range(5)]
    output = tf.add_n(relus, name="output")
    summary_writer = tf.summary.FileWriter("logs/relu6", tf.get_default_graph())
    summary_writer.close()
In the code above, the shared variable is defined outside the body of relu(). The following code moves it inside the function:
    import tensorflow as tf

    n_features = 3

    def relu(X):
        with tf.variable_scope("relu"):
            threshold = tf.get_variable("threshold", shape=(), initializer=tf.constant_initializer(0.0))
            w_shape = int(X.get_shape()[1]), 1
            w = tf.Variable(tf.random_normal(w_shape), name="weights")
            b = tf.Variable(0.0, name="bias")
            linear = tf.add(tf.matmul(X, w), b, name="linear")
            return tf.maximum(linear, threshold, name="max")

    X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
    with tf.variable_scope("", default_name="") as scope:
        first_relu = relu(X)      # create the shared variable
        scope.reuse_variables()   # then reuse it
        relus = [first_relu] + [relu(X) for i in range(4)]
    output = tf.add_n(relus, name="output")
    summary_writer = tf.summary.FileWriter("logs/relu8", tf.get_default_graph())
    summary_writer.close()
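The create-or-reuse rule can be pictured as a lookup in a registry keyed by scope and name. The sketch below is a toy in plain Python, not TensorFlow's implementation; this `get_variable` function and the `_variables` dictionary are hypothetical. It mirrors the TensorFlow 1.x behaviour that, without reuse, asking for an existing name raises an error, and with reuse, asking for a missing name raises an error.

```python
# Toy registry mimicking the create-or-reuse rule; not TensorFlow's implementation.
_variables = {}

def get_variable(scope, name, initial_value, reuse=False):
    key = scope + "/" + name
    if reuse:
        if key not in _variables:
            raise ValueError("Variable %s does not exist" % key)
        return _variables[key]
    if key in _variables:
        raise ValueError("Variable %s already exists" % key)
    _variables[key] = [initial_value]   # a one-element list stands in for mutable state
    return _variables[key]

created = get_variable("relu", "threshold", 0.0)               # creates
shared = get_variable("relu", "threshold", 0.0, reuse=True)    # reuses
shared[0] = 0.5                                                # visible through both names

try:
    get_variable("relu", "threshold", 0.0)                     # second create is refused
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)

print(created is shared, created[0])   # True 0.5
print(duplicate_error)                 # Variable relu/threshold already exists
```

This is why the last TensorFlow example calls scope.reuse_variables() after the first relu(): the first call creates relu/threshold, and every later call must be in reuse mode to get the same variable back instead of an error.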