Matrix operations

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 11:48

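1. SpatialConvolution

require('nn')
require('sys')

local model = nn.Sequential()
x = torch.Tensor(1, 5, 8):fill(1)  -- 1 input plane, height 5, width 8
model:add(nn.Identity())
model:add(nn.SpatialConvolution(1, 2, 2, 3))  -- nInputPlane=1, nOutputPlane=2, kW=2, kH=3
y = model:forward(x)
print(y)

The returned tensor has size 2 x (5-3+1) x (8-2+1), i.e. 2 x 3 x 7.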
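The size arithmetic above can be checked with a small helper. This is a minimal pure-Python sketch, not part of Torch; the function name `conv_output_size` and its stride and padding parameters are assumptions added for illustration.

```python
def conv_output_size(in_h, in_w, k_w, k_h, d_w=1, d_h=1, pad_w=0, pad_h=0):
    """Return (out_h, out_w) for a 2D convolution over an in_h x in_w plane.

    Hypothetical helper: uses the usual formula
    floor((in + 2 * pad - kernel) / stride) + 1 for each spatial dimension.
    """
    out_h = (in_h + 2 * pad_h - k_h) // d_h + 1
    out_w = (in_w + 2 * pad_w - k_w) // d_w + 1
    return out_h, out_w


# Input plane of height 5 and width 8, kernel width 2 and height 3.
print(conv_output_size(5, 8, k_w=2, k_h=3))  # (3, 7)
```

With 2 output planes, this gives the 2 x 3 x 7 result from the example.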

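2. When querying the size of each dimension of a tensor, the index starts at 1, not 0

x = torch.Tensor(2,3):fill(1)
x:size(1)  -- returns the length of the first dimension, 2
x:size(2)  -- returns the length of the second dimension, 3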

3. linspace(a,b,N)

linspace(a,b,N)

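Returns an evenly spaced array of N values from a to b, where a is the start point and b is the end point. The spacing between values is (b-a)/(N-1).
Also, when an array has shape (5,), it is one-dimensional with length 5.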
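To make the spacing concrete, here is a minimal pure-Python sketch of the same idea. It is not NumPy's implementation; the function below is a hypothetical stand-in that uses a step of (b - a) / (N - 1).

```python
def linspace(a, b, n):
    """Return n evenly spaced values from a to b, both endpoints included."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return [float(a)]
    step = (b - a) / (n - 1)  # spacing between neighbouring values
    return [a + i * step for i in range(n)]


values = linspace(0, 1, 5)
print(values)       # [0.0, 0.25, 0.5, 0.75, 1.0]
print(len(values))  # 5, matching a shape of (5,)
```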

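4. arr.transpose

arr.transpose((1,0,2))

For a three-dimensional array, the call above swaps the first two axes. In (1,0,2), the 1 means that the first dimension of the result has the length of the second dimension of the original array; likewise, the 0 means that the second dimension of the result has the length of the first dimension of the original array. The 2 leaves the third dimension where it is.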
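The effect of the axes tuple on the shape can be sketched without NumPy. The helper `transposed_shape` below is hypothetical and only computes the resulting shape, not the rearranged data.

```python
def transposed_shape(shape, axes):
    """Return the shape after permuting axes: result dimension i takes shape[axes[i]]."""
    if sorted(axes) != list(range(len(shape))):
        raise ValueError("axes must be a permutation of the dimension indices")
    return tuple(shape[ax] for ax in axes)


print(transposed_shape((1, 2, 3), (1, 0, 2)))  # (2, 1, 3)
```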

5. torch.Storage

torch.Storage

A one-dimensional contiguous array.

6. torch.ones(*sizes, out=None) → Tensor

>>> torch.ones(2, 3)
 1  1  1
 1  1  1
[torch.FloatTensor of size 2x3]
>>> torch.ones(5)
 1
 1
 1
 1
 1
[torch.FloatTensor of size 5]

The following creates a 1 x 2 x 3 array of ones and swaps its first and second dimensions.

>>> x = np.ones((1, 2, 3))
>>> np.transpose(x, (1, 0, 2)).shape
(2, 1, 3)

7. torch.randperm(n, out=None) → LongTensor

>>> torch.randperm(4)
 2
 1
 3
 0
[torch.LongTensor of size 4]

Returns a random permutation of the integers from 0 to n - 1.
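The behaviour can be mimicked with the standard library. This is a sketch, not how torch implements it; the `seed` parameter is an assumption added so the result can be reproduced.

```python
import random


def randperm(n, seed=None):
    """Return a random permutation of the integers 0 .. n - 1 as a list."""
    rng = random.Random(seed)
    perm = list(range(n))
    rng.shuffle(perm)  # in-place Fisher-Yates shuffle
    return perm


perm = randperm(4)
print(perm)          # some ordering of 0, 1, 2, 3
print(sorted(perm))  # [0, 1, 2, 3]
```

Note that n itself never appears in the result.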

8. torch.cat(inputs, dimension=0) → Tensor

>>> x = torch.randn(2, 3)
>>> x
 0.5983 -0.0341  2.4918
 1.5981 -0.5265 -0.8735
[torch.FloatTensor of size 2x3]
>>> torch.cat((x, x, x), 0)
 0.5983 -0.0341  2.4918
 1.5981 -0.5265 -0.8735
 0.5983 -0.0341  2.4918
 1.5981 -0.5265 -0.8735
 0.5983 -0.0341  2.4918
 1.5981 -0.5265 -0.8735
[torch.FloatTensor of size 6x3]
>>> torch.cat((x, x, x), 1)
 0.5983 -0.0341  2.4918  0.5983 -0.0341  2.4918  0.5983 -0.0341  2.4918
 1.5981 -0.5265 -0.8735  1.5981 -0.5265 -0.8735  1.5981 -0.5265 -0.8735
[torch.FloatTensor of size 2x9]

Concatenates the sequence inputs along the given dimension.
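For two-dimensional data, the two cases can be sketched with nested lists. The function `cat2d` is a hypothetical illustration restricted to 2D inputs; it is not the torch implementation.

```python
def cat2d(mats, dim=0):
    """Concatenate 2D nested lists along dim 0 (stack rows) or dim 1 (extend rows)."""
    if dim == 0:
        return [list(row) for m in mats for row in m]
    if dim == 1:
        rows = []
        for i in range(len(mats[0])):
            row = []
            for m in mats:
                row.extend(m[i])  # append row i of each matrix side by side
            rows.append(row)
        return rows
    raise ValueError("dim must be 0 or 1 for 2D input")


x = [[1, 2, 3], [4, 5, 6]]           # 2x3
print(len(cat2d([x, x, x], 0)))      # 6 rows, so 6x3
print(len(cat2d([x, x, x], 1)[0]))   # 9 columns, so 2x9
```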

9. torch.index_select(input, dim, index, out=None) → Tensor

>>> x = torch.randn(3, 4)
>>> x
 1.2045  2.4084  0.4001  1.1372
 0.5596  1.5677  0.6219 -0.7954
 1.3635 -1.2313 -0.5414 -1.8478
[torch.FloatTensor of size 3x4]
>>> indices = torch.LongTensor([0, 2])
>>> torch.index_select(x, 0, indices)
 1.2045  2.4084  0.4001  1.1372
 1.3635 -1.2313 -0.5414 -1.8478
[torch.FloatTensor of size 2x4]
>>> torch.index_select(x, 1, indices)
 1.2045  0.4001
 0.5596  0.6219
 1.3635 -0.5414
[torch.FloatTensor of size 3x2]

Selects values from input along dimension dim at the positions given by index. The output has the same length as the original along every other dimension, and along dimension dim its length equals the length of index.
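The same selection can be written out for nested lists. The function `index_select2d` is a hypothetical 2D-only sketch of the semantics described above.

```python
def index_select2d(mat, dim, index):
    """Pick whole rows (dim 0) or whole columns (dim 1) of a 2D nested list."""
    if dim == 0:
        return [list(mat[i]) for i in index]
    if dim == 1:
        return [[row[j] for j in index] for row in mat]
    raise ValueError("dim must be 0 or 1 for 2D input")


x = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]                # 3x4
print(index_select2d(x, 0, [0, 2]))  # [[1, 2, 3, 4], [9, 10, 11, 12]]  (2x4)
print(index_select2d(x, 1, [0, 2]))  # [[1, 3], [5, 7], [9, 11]]        (3x2)
```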

10. torch.nonzero(input, out=None) → LongTensor
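Finds the non-zero elements of input. Each row of the output holds the index of one non-zero element of input.
The result has size z x n, where z is the number of non-zero elements in input (each non-zero element needs one row of values to index it) and n is the number of dimensions of input.

>>> torch.nonzero(torch.Tensor([1, 1, 1, 0, 1]))
 0
 1
 2
 4
[torch.LongTensor of size 4x1]
>>> torch.nonzero(torch.Tensor([[0.6, 0.0, 0.0, 0.0],
...                             [0.0, 0.4, 0.0, 0.0],
...                             [0.0, 0.0, 1.2, 0.0],
...                             [0.0, 0.0, 0.0,-0.4]]))
 0  0
 1  1
 2  2
 3  3
[torch.LongTensor of size 4x2]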
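The z x n layout can be reproduced for the 2D case with a short list comprehension. The function `nonzero2d` is a hypothetical sketch, not the torch implementation.

```python
def nonzero2d(mat):
    """Return one [row, col] pair per non-zero element of a 2D nested list."""
    return [[i, j]
            for i, row in enumerate(mat)
            for j, value in enumerate(row)
            if value != 0]


m = [[0.6, 0.0, 0.0, 0.0],
     [0.0, 0.4, 0.0, 0.0],
     [0.0, 0.0, 1.2, 0.0],
     [0.0, 0.0, 0.0, -0.4]]
print(nonzero2d(m))  # [[0, 0], [1, 1], [2, 2], [3, 3]]  (z = 4 rows, n = 2 columns)
```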

11. torch.max(input, dim, max=None, max_indices=None) -> (Tensor, LongTensor)
Takes the maximum of input along dimension dim. The result has length 1 along dim and the same length as input along every other dimension.
It also returns the index of each maximum value within input.

>>> a = torch.randn(4, 4)
>>> a
 0.0692  0.3142  1.2513 -0.5428
 0.9288  0.8552 -0.2073  0.6409
 1.0695 -0.0101 -2.4507 -1.2230
 0.7426 -0.7666  0.4862 -0.6628
[torch.FloatTensor of size 4x4]
>>> torch.max(a, 1)
(
 1.2513
 0.9288
 1.0695
 0.7426
[torch.FloatTensor of size 4x1]
,
 2
 0
 0
 0
[torch.LongTensor of size 4x1]
)
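The pair of results (values and indices) can be reproduced on nested lists. The function `max_along_dim` is a hypothetical 2D-only sketch; unlike the output shown above, it returns flat lists rather than keeping a dimension of length 1.

```python
def max_along_dim(mat, dim):
    """Return (values, indices) of the maximum along dim for a 2D nested list."""
    if dim == 1:
        lines = [list(row) for row in mat]    # reduce across each row
    elif dim == 0:
        lines = [list(col) for col in zip(*mat)]  # reduce down each column
    else:
        raise ValueError("dim must be 0 or 1 for 2D input")
    values = [max(line) for line in lines]
    indices = [line.index(v) for line, v in zip(lines, values)]
    return values, indices


a = [[0.0692, 0.3142, 1.2513, -0.5428],
     [0.9288, 0.8552, -0.2073, 0.6409],
     [1.0695, -0.0101, -2.4507, -1.2230],
     [0.7426, -0.7666, 0.4862, -0.6628]]
print(max_along_dim(a, 1))  # ([1.2513, 0.9288, 1.0695, 0.7426], [2, 0, 0, 0])
```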

model:add(nn.View(1, -1, nhid):setNumInputDims(2))
model:add(cudnn.SpatialConvolution(1, nhid, nhid, kwidth, 1, 1, 0))
model:add(cudnn.SpatialMaxPooling(1, 2, 1, 2))
model:add(nn.Threshold())
model:add(nn.Transpose({2,4}))

b = torch.Tensor(2, 2)
b[1][1] = 1
b[1][2] = 2
b[2][1] = 3
b[2][2] = 4
print(b[{{}, 1}]:contiguous())  -- prints the first column

c = torch.range(1, 3):view(1, 3):expand(2, 3):contiguous()
print(c)
print(c+2)

This prints:

yi@yi:~$ luajit test.lua
 1  2  3
 1  2  3
[torch.DoubleTensor of size 2x3]

 3  4  5
 3  4  5
[torch.DoubleTensor of size 2x3]

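x = torch.Tensor(3,4,4):fill(1)
net = nn.Sequential()
-- the View below produces a 3x1x4x4 tensor (each 4x4 sample becomes 1x4x4)
net:add(nn.View(1, -1, 4):setNumInputDims(2))
net:add(nn.SpatialConvolution(1, 2, 2, 2))
net:add(nn.Tanh())
print(net:forward(x))
-- prints a 3x2x3x3 tensor, so when the input to a convolution is 4-dimensional, the first dimension is the batch size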

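model = nn.Sequential()
x = torch.Tensor(1,6,3):fill(1)
m = nn.SpatialSubSampling(1,1,2)  -- nInputPlane=1, kW=1, kH=2
model:add(m)
print(model:forward(x))
print(m.weight)
print(m.bias)
-- This is effectively a pooling operation, similar to max pooling: it reduces the values in each window to a single value.
-- The weight of SpatialSubSampling has as many entries as there are input planes, so with a single input plane there is only one weight. The weights are initialized randomly.
-- Each output value is the sum of the values in the window, each multiplied by the weight, plus a bias.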
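The per-window computation for a single input plane can be sketched as follows. This is a pure-Python illustration under the description above (one weight and one bias per plane); the function `sub_sample` and its argument names are hypothetical, and fixed values are passed in place of Torch's random initialization.

```python
def sub_sample(plane, k_w, k_h, weight, bias, d_w=1, d_h=1):
    """Sub-sample one 2D plane: weight * (sum of each k_h x k_w window) + bias."""
    in_h, in_w = len(plane), len(plane[0])
    out_h = (in_h - k_h) // d_h + 1
    out_w = (in_w - k_w) // d_w + 1
    out = []
    for oy in range(out_h):
        row = []
        for ox in range(out_w):
            window_sum = sum(plane[oy * d_h + dy][ox * d_w + dx]
                             for dy in range(k_h)
                             for dx in range(k_w))
            row.append(weight * window_sum + bias)
        out.append(row)
    return out


plane = [[1.0] * 3 for _ in range(6)]  # 6 rows, 3 columns, all ones
result = sub_sample(plane, k_w=1, k_h=2, weight=0.5, bias=0.25)
print(len(result), len(result[0]))     # 5 3
print(result[0][0])                    # 1.25  (0.5 * (1 + 1) + 0.25)
```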

updateGradInput(input, gradOutput)

Computing the gradient of the module with respect to its own input. This is returned in gradInput. Also, the gradInput state variable is updated accordingly.

accGradParameters(input, gradOutput, scale)
Computing the gradient of the module with respect to its own parameters. Many modules do not perform this step as they do not have any parameters. The state variable name for the parameters is module dependent. The module is expected to accumulate the gradients with respect to the parameters in some variable (that is, it stores the gradients for those parameters).

scale is a scale factor that is multiplied with the gradParameters before being accumulated.

Zeroing this accumulation is achieved with zeroGradParameters() and updating the parameters according to this accumulation is done with updateParameters().

zeroGradParameters()
If the module has parameters, this will zero the accumulation of the gradients with respect to these parameters, accumulated through accGradParameters(input, gradOutput,scale) calls. Otherwise, it does nothing.

updateParameters(learningRate)

Updates the parameters using the previously computed gradients. If the module has parameters, this will update these parameters, according to the accumulation of the gradients with respect to these parameters, accumulated through backward() calls.

The update is basically:

parameters = parameters - learningRate * gradients_wrt_parameters
If the module does not have parameters, it does nothing.
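How accumulating, zeroing and updating fit together can be sketched with a toy class. This is an illustration only: `ToyModule` and its snake_case method names are hypothetical stand-ins for the Torch methods described above, and the gradients are passed in directly instead of being computed from input and gradOutput.

```python
class ToyModule:
    """Toy parameter holder that mirrors the accumulate / zero / update cycle."""

    def __init__(self, parameters):
        self.parameters = list(parameters)
        self.grad_parameters = [0.0] * len(self.parameters)

    def acc_grad_parameters(self, gradients, scale=1.0):
        # Accumulate: each gradient is multiplied by scale before being added.
        self.grad_parameters = [acc + scale * g
                                for acc, g in zip(self.grad_parameters, gradients)]

    def zero_grad_parameters(self):
        # Reset the accumulated gradients to zero.
        self.grad_parameters = [0.0] * len(self.parameters)

    def update_parameters(self, learning_rate):
        # parameters = parameters - learningRate * gradients_wrt_parameters
        self.parameters = [p - learning_rate * g
                           for p, g in zip(self.parameters, self.grad_parameters)]


m = ToyModule([1.0, 2.0])
m.acc_grad_parameters([0.5, 1.0])
m.acc_grad_parameters([0.5, 1.0], scale=2.0)
print(m.grad_parameters)  # [1.5, 3.0]
m.update_parameters(0.5)
print(m.parameters)       # [0.25, 0.5]
m.zero_grad_parameters()
print(m.grad_parameters)  # [0.0, 0.0]
```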

StochasticGradient
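It has a maxIteration field that sets the maximum number of iterations, and the training set also has a size. In each iteration it visits the training samples one at a time in random order, computing the gradient and updating the parameters after each sample; training ends when the number of iterations reaches maxIteration.
https://zhuanlan.zhihu.com/p/21550685 explains the following methods: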
updateOutput(input)
updateGradInput(input, gradOutput)
accGradParameters(input, gradOutput)

∀x ∈ M, p(x): "for every x in M, p(x) holds."
∃x ∈ M, p(x): "there exists an x in M such that p(x) holds."

0 0