theano - scan - Personal Understanding
Theano's scan has been one of the harder things for me to understand lately. My first thought was: if an ordinary loop can already traverse elements and compute over them, why provide a scan function at all?
The scan page, http://deeplearning.net/software/theano/library/scan.html, gives this description:
The scan function provides the basic functionality needed to do loops in Theano. Scan comes with many whistles and bells, which we will introduce by way of examples.
To understand why scan was introduced, it helps to start with Theano itself (see http://blog.csdn.net/wangjian1204/article/details/50518591):
In Theano programming, the graph is the only way to tell Theano how to operate on variables; Theano variables and Theano Ops (operations) are the graph's two basic building blocks. A graph can consist only of Theano variables (including shared variables) and constants.
A graph is usually built in three steps: first declare the Theano variables, whose scope in a Python file is the same as that of ordinary Python variables; then use Theano Ops to establish relations between the variables, e.g. T.add(a, b); finally, use theano.function to combine the variables and their relations into a complete graph.
Suppose a function has been created, say fn = theano.function(...). The shared variables in the graph already contain the data needed when fn is called, whereas an ordinary Theano variable is merely a placeholder: it must be passed as an input to the function, and given a concrete value (e.g. a NumPy array or a constant) when fn is called.
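The deferred-evaluation idea behind a graph can be illustrated with a toy sketch in plain Python. The Var, Op, and compile_graph names below are made up for illustration and are not Theano's API: a variable is only a placeholder, an op records a relation between nodes, and "compiling" produces a callable that binds placeholders to concrete values at call time.

```python
# Toy expression graph: placeholders + ops, evaluated only when called.
class Var:
    """A placeholder, analogous to a Theano variable."""
    def __init__(self, name):
        self.name = name
    def __add__(self, other):
        return Op(self, other)
    def eval(self, env):
        return env[self.name]          # value supplied at call time

class Op:
    """An addition node, analogous to a Theano Op linking variables."""
    def __init__(self, left, right):
        self.left, self.right = left, right
    def __add__(self, other):
        return Op(self, other)
    def eval(self, env):
        return self.left.eval(env) + self.right.eval(env)

def compile_graph(inputs, output):
    """Analogous to theano.function: fix the inputs, return a callable."""
    def fn(*values):
        env = {v.name: val for v, val in zip(inputs, values)}
        return output.eval(env)
    return fn

a, b = Var("a"), Var("b")
graph = a + b                # builds the graph; computes nothing yet
fn = compile_graph([a, b], graph)
print(fn(2, 3))              # 5
```

Declaring the variables, relating them with an op, and compiling mirror the three steps above; the real theano.function additionally optimizes the graph before producing the callable.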
Shared variables deserve a special mention here. First, the official explanation (http://deeplearning.net/software/theano/tutorial/examples.html?highlight=tanh#using-shared-variables):
Shared variables can be used in symbolic expressions just like the objects returned by dmatrices(...) but they also have an internal value that defines the value taken by this symbolic variable in all the functions that use it. It is called a shared variable because its value is shared between many functions. The value can be accessed and modified by the .get_value() and .set_value() methods.
The word to focus on in the explanation above is "internal" (which I read as an intermediate result). Shared variables are usually the parameters a model has to learn: the parameters are updated during iteration, and the current value depends on the value at the previous step (the internal value), so they should be defined as shared variables. In general, shared variables are not used as inputs, since an input is fixed and has no internal value to speak of. That said, you will sometimes see an input turned into a shared variable (dtype=theano.config.floatX); I take this to be for GPU acceleration. Consider the following tips (http://deeplearning.net/software/theano/tutorial/using_gpu.html):
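The "internal value shared between many functions" behavior can be mimicked in plain Python. Shared and accumulator below are hypothetical stand-ins, not Theano code; in real Theano the update would instead be declared through the updates argument of theano.function.

```python
# Plain-Python analogy of a shared variable: the value lives outside
# the function and persists across calls.
class Shared:
    def __init__(self, value):
        self._value = value
    def get_value(self):
        return self._value
    def set_value(self, value):
        self._value = value

state = Shared(0)

def accumulator(inc):
    # mimics theano.function([inc], state, updates=[(state, state + inc)]):
    # return the old value, then update the internal one
    old = state.get_value()
    state.set_value(old + inc)
    return old

print(accumulator(1))      # 0
print(accumulator(10))     # 1
print(state.get_value())   # 11
```

Each call sees the value left behind by the previous call, which is exactly the "current value depends on the previous value" property that model parameters need.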
Prefer constructors like matrix, vector and scalar to dmatrix, dvector and dscalar because the former will give you float32 variables when floatX=float32.
Ensure that your output variables have a float32 dtype and not float64. The more float32 variables are in your graph, the more work the GPU can do for you.
Minimize transfers to the GPU device by using shared float32 variables to store frequently-accessed data (see shared()). When using the GPU, float32 tensor shared variables are stored on the GPU by default to eliminate transfer time for GPU ops using those variables.
In addition, the borrow flag of a shared variable is roughly the counterpart of a C++ reference.
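A rough NumPy analogy for the aliasing that borrow=True permits: the shared variable may reuse the caller's buffer instead of copying it, so writes through one name become visible through the other.

```python
import numpy as np

# "borrow=True": alias the same memory, no copy is made.
buf = np.zeros(3)
alias = buf
# "borrow=False" (the safe default): an independent copy.
copy = buf.copy()

buf[0] = 7.0
print(alias[0])  # 7.0 -- the alias sees the write
print(copy[0])   # 0.0 -- the copy does not
```

Borrowing saves a copy, but, like a C++ reference, it means later in-place modifications of the original array can silently change what the shared variable holds.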
Good. Now let's begin the discussion of scan properly, starting with an example:
```python
import numpy
import theano
import theano.tensor as T

coefficients = T.vector("coefficients")
x = T.scalar("x")
max_coefficients_supported = 10000

# Generate the components of the polynomial
components, updates = theano.scan(
    fn=lambda coefficient, power, free_variable: coefficient * (free_variable ** power),
    sequences=[coefficients, T.arange(max_coefficients_supported)],
    outputs_info=None,
    non_sequences=x)

# Sum them up
polynomial = components.sum()

# Compile a function
calculate_polynomial = theano.function(inputs=[coefficients, x], outputs=polynomial)

# Test
test_coefficients = numpy.asarray([1, 0, 2], dtype=numpy.float32)
test_value = 3
print(calculate_polynomial(test_coefficients, test_value))
print(1.0 * (3 ** 0) + 0.0 * (3 ** 1) + 2.0 * (3 ** 2))
```
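To check what the scan computes, here is the same polynomial evaluation as an explicit Python loop: one iteration per (coefficient, power) pair drawn from the two sequences, with x passed in unchanged, just as non_sequences does.

```python
import numpy as np

coefficients = np.asarray([1, 0, 2], dtype=np.float32)
x = 3.0
# one "step" per element of each sequence, like scan's fn
components = [c * (x ** power) for power, c in enumerate(coefficients)]
print(sum(components))  # 19.0
```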
Here fn=lambda coefficient, power, free_variable: coefficient * (free_variable ** power) is the function defining the details of a single step. Of course, fn can also be defined as a named function:

```python
def accumulate_by_adding(arange_val, sum_to_date):
    return sum_to_date * 2 + arange_val
```

and then passed as fn=accumulate_by_adding.
Next, the arguments of fn. They fall into three kinds, and must be written in the following order:
sequences: the objects from which one element is taken at each step, given with the sequences keyword.
prior result(s): the result(s) of the previous step, given with the outputs_info keyword. How outputs_info is set depends on fn's return value: if a single step of fn returns two results, then outputs_info=[result1, result2]. If only result2 is needed at the next step, set outputs_info=[None, result2]; then result1 is not passed back into fn, and only result2 is. Also, since outputs_info holds the previous step's results, it must be initialized for the very first step, for example outputs_info=T.zeros_like(constant).
non_sequences: the constants passed unchanged into every single step, given with the non_sequences keyword.
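The calling convention above can be sketched with a toy mini_scan in plain Python (a hypothetical name, and a simplification: it assumes every output is fed back to the next step). Note the order of the arguments handed to fn: sequence elements first, then the previous results, then the non-sequences.

```python
def mini_scan(fn, sequences, outputs_info, non_sequences=()):
    # Simplified sketch of scan's argument ordering; every output
    # returned by fn is fed back as a prior result at the next step.
    results = []
    prior = list(outputs_info)
    for step in zip(*sequences):
        prior = list(fn(*step, *prior, *non_sequences))
        results.append(prior)
    # transpose: one list per output, like scan's return value
    return [list(col) for col in zip(*results)]

# cumulative sum: s_t = s_{t-1} + x_t, with outputs_info giving s_0 = 0
(sums,) = mini_scan(lambda x, s: [s + x],
                    sequences=[[1, 2, 3, 4]],
                    outputs_info=[0])
print(sums)  # [1, 3, 6, 10]
```

The real scan is far more general (taps, None entries in outputs_info, updates for shared variables), but the argument ordering it presents to fn is the same.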
Let's work through another example:
```python
import numpy as np
import theano
import theano.tensor as T

def accumulate_by_adding(arange_val, sum_to_date):
    return [sum_to_date * 2 + arange_val, sum_to_date + arange_val * 2]

up_to = T.iscalar("up_to")
seq = T.arange(up_to)
scan_result, scan_updates = theano.scan(
    fn=accumulate_by_adding,
    sequences=seq,
    outputs_info=[None, T.as_tensor_variable(np.asarray(0, seq.dtype))],
    non_sequences=None)
triangular_sequence = theano.function(inputs=[up_to], outputs=scan_result[1])

# test
some_num = 15
print(triangular_sequence(some_num))
```
Output: [ 0 2 6 12 20 30 42 56 72 90 110 132 156 182 210]
At each step, one element of seq is taken as arange_val, and T.as_tensor_variable(np.asarray(0, seq.dtype)) initializes sum_to_date. Once the step finishes,
the second result, sum_to_date + arange_val*2, is stored back as the outputs_info entry and becomes the value of sum_to_date for the next step.
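The same recurrence can be hand-traced in an ordinary Python loop: both return values are computed from the previous sum_to_date, and only the second one is carried forward.

```python
out1, out2 = [], []
sum_to_date = 0
for arange_val in range(15):
    out1.append(sum_to_date * 2 + arange_val)   # first result, not fed back
    sum_to_date = sum_to_date + arange_val * 2  # second result, fed back
    out2.append(sum_to_date)
print(out2)  # [0, 2, 6, 12, 20, 30, 42, 56, 72, 90, 110, 132, 156, 182, 210]
```

out2 matches the scan output above, confirming that the None in outputs_info makes scan drop the first result from the recurrence while still collecting it.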
Well, that's all for now.