TensorFlow tricks and details

Source: Internet. Editor: 程序博客网. Time: 2024/05/01 12:10

Using different functions for the forward and backward passes

How Can I Define Only the Gradient for a Tensorflow Subgraph?

Suppose you want a group of ops that behaves as f(x) in forward mode but as g(x) in backward mode. You implement it as

t = g(x)
y = t + tf.stop_gradient(f(x) - t)
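The pattern above can be checked numerically. The following is a minimal sketch, assuming TensorFlow 2.x with eager execution and tf.GradientTape; the helper name forward_f_backward_g and the choice of tf.sign for f and tf.tanh for g are illustrative assumptions, not part of the original answer.

```python
import tensorflow as tf

def forward_f_backward_g(x, f, g):
    """Hypothetical helper: value of f(x) forward, gradient of g(x) backward."""
    t = g(x)
    # The stop_gradient term contributes f(x) - g(x) to the value
    # but nothing to the gradient, so only t = g(x) is differentiated.
    return t + tf.stop_gradient(f(x) - t)

x = tf.constant([-1.5, 0.5, 2.0])
with tf.GradientTape() as tape:
    tape.watch(x)  # x is a constant, so it has to be watched explicitly
    y = forward_f_backward_g(x, tf.sign, tf.tanh)
grad = tape.gradient(y, x)

print(y.numpy())     # approximately sign(x)
print(grad.numpy())  # approximately 1 - tanh(x)**2, the derivative of tanh
```

The forward value is only equal to f(x) up to floating-point rounding, because it is computed as g(x) + (f(x) - g(x)).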

One BNN implementation contains the following snippet:

from keras import backend as K  # K is the Keras backend

def round_through(x):
    # Backward pass:  g(x) = x (identity)
    # Forward pass:   f(x) = round(x)
    rounded = K.round(x)
    return x + K.stop_gradient(rounded - x)

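Here the forward pass uses the rounded value of x, while the backward pass differentiates x itself (the identity), so the fractional precision is preserved in the gradient.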
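To see the effect, the same function can be written with plain TensorFlow ops and differentiated. This is a minimal sketch, assuming TensorFlow 2.x with tf.GradientTape; the input values and the squared-sum loss are made up for illustration.

```python
import tensorflow as tf

def round_through(x):
    # Forward: round(x). Backward: gradient of x, i.e. 1 per entry.
    rounded = tf.round(x)
    return x + tf.stop_gradient(rounded - x)

x = tf.Variable([0.2, 0.7, 1.4])
with tf.GradientTape() as tape:
    y = round_through(x)
    loss = tf.reduce_sum(y * y)
grad = tape.gradient(loss, x)

print(y.numpy())     # approximately [0. 1. 1.]
print(grad.numpy())  # approximately [0. 2. 2.], i.e. 2 * y times d(y)/d(x) = 1
```

Rounding on its own has a zero derivative almost everywhere, so without the stop_gradient term no useful gradient would reach x.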

Stopping the gradient for part of a variable

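In the retraining step of model compression, some entries of a variable are frozen while the others are retrained. However, tf.stop_gradient in TensorFlow can only stop the gradient for an entire Tensor.
The question "How to stop gradient for some entry of a tensor in tensorflow" offers a good solution: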

res_matrix = tf.stop_gradient(mask_h*E) + mask*E

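Here mask has the same shape as E and decides, entry by entry, whether the gradient is kept (1) or stopped (0); mask_h is its complement.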

def entry_stop_gradients(target, mask):
    mask_h = tf.abs(mask-1)
    return tf.stop_gradient(mask_h * target) + mask * target
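A short usage example shows that the forward value is unchanged while the gradient is zero at the frozen entries. It is a sketch, assuming TensorFlow 2.x with tf.GradientTape; the matrix E, the mask values and the squared-sum loss are made up for illustration.

```python
import tensorflow as tf

def entry_stop_gradients(target, mask):
    mask_h = tf.abs(mask - 1)  # complement of the 0/1 mask
    return tf.stop_gradient(mask_h * target) + mask * target

E = tf.Variable([[1.0, 2.0], [3.0, 4.0]])
mask = tf.constant([[1.0, 0.0], [0.0, 1.0]])  # 1 = trainable, 0 = frozen

with tf.GradientTape() as tape:
    res_matrix = entry_stop_gradients(E, mask)
    loss = tf.reduce_sum(res_matrix ** 2)
grad = tape.gradient(loss, E)

print(res_matrix.numpy())  # same values as E
print(grad.numpy())        # 2 * E where mask is 1, 0 where mask is 0
```

Because the frozen entries receive a zero gradient, a plain gradient-descent update leaves them untouched. Optimizers that add other terms to the update, such as weight decay, may still change them.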

Tensor and Variable

First, look at this program:

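import tensorflow as tf  # TensorFlow 1.x API (tf.Session)

a = tf.Variable([1])
with tf.device("/cpu:0"):
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print('a:', a.eval())
        print('type of a:', a)
        a = a + 1  # rebinds the Python name a to the output Tensor of an add op
        print('a:', a.eval())
        print('type of a:', a)
        b = a + 1
        print('b:', b.eval())
        print('type of b:', b)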

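This is simple addition, and the values are what you would expect: 1, 2 and 3.
But the program also prints the type of each object, and that is where things differ.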

a: [1]
type of a: <tf.Variable 'Variable:0' shape=(1,) dtype=int32_ref>
a: [2]
type of a: Tensor("add:0", shape=(1,), dtype=int32, device=/device:CPU:0)
b: [3]
type of b: Tensor("add_1:0", shape=(1,), dtype=int32, device=/device:CPU:0)

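a was clearly defined as a tf.Variable, yet after a single addition it became a Tensor, and the same is true of b. This shows that these operations only describe computations. They are like recipes: without ingredients, no meal gets cooked. Put differently, the value of the original variable a has not actually changed after the computation, because none of these operations act on the Variable itself; the name a was merely rebound to a new Tensor. Updating a variable requires an op such as tf.scatter_update, which takes a Variable as input. But even adding such a statement has no effect on its own, because the function returns a Tensor, which is again a "recipe" and only takes effect when it is executed.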
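The difference between describing an update and executing it can be reproduced in a few lines. This is a sketch, assuming a TensorFlow version that provides the tf.compat.v1 graph-mode API; it uses assign_add instead of tf.scatter_update to keep the example short, and the names update, before and after are chosen for illustration.

```python
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()  # build a graph instead of running ops immediately

a = tf1.Variable([1])
b = a + 1                        # a Tensor: a recipe that reads a and adds 1
update = tf1.assign_add(a, [1])  # also only a recipe until it is run

with tf1.Session() as sess:
    sess.run(tf1.global_variables_initializer())
    b_val = sess.run(b)     # [2]
    before = sess.run(a)    # [1]: computing b did not change a
    sess.run(update)        # executing the update op modifies the Variable
    after = sess.run(a)     # [2]

print(b_val, before, after)
```

Defining update changes nothing by itself; the Variable only changes once sess.run(update) executes the op.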

So the distinction between Variable and Tensor in TensorFlow deserves a deeper understanding.
