Torch
Notes on the Torch Modules
1. Serialization
Torch provides four methods for serializing and deserializing Lua/Torch objects.
torch.save(filename, object [, format, referenced])
Writes object to the file filename.
format can be ascii or binary (the default); the binary format is compact but platform-dependent, while ascii is portable across machines.
Example:
-- arbitrary object:
obj = { mat = torch.randn(10,10), name = '10', test = { entry = 1 } }
-- save to disk:
torch.save('test.dat', obj)
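To produce a portable text file instead of the default binary format, the format argument can be passed explicitly; a small sketch reusing obj from above:

-- save in the portable ascii format:
torch.save('test_ascii.dat', obj, 'ascii')
-- reload, specifying the same format:
obj2 = torch.load('test_ascii.dat', 'ascii')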
[object] torch.load(filename [, format, referenced])
Reads an object previously written with torch.save from the file filename.
Example:
-- given serialized object from section above, reload:
obj = torch.load('test.dat')
print(obj)
-- will print:
-- {[mat]  = DoubleTensor - size: 10x10
--  [name] = string : "10"
--  [test] = table - size: 0}
[str] torch.serialize(object)
Serializes object into a string.
Example:
-- arbitrary object:
obj = { mat = torch.randn(10,10), name = '10', test = { entry = 1 } }
-- serialize:
str = torch.serialize(obj)
[object] torch.deserialize(str)
Deserializes an object from a string.
Example:
-- given serialized object from section above, deserialize:
obj = torch.deserialize(str)
print(obj)
-- will print:
-- {[mat]  = DoubleTensor - size: 10x10
--  [name] = string : "10"
--  [test] = table - size: 0}
2. Module
Module is an abstract class defining the fundamental methods needed to train a neural network; concrete network structures are built out of its member functions.
A module maintains two state variables: output and gradInput.
[output] forward(input)
Computes the output corresponding to the given input. In general, input and output are Tensors.
It is not advised to override forward itself; instead, custom modules should implement updateOutput(input) (a sketch follows the backward subsection below).
[gradInput] backward(input, gradOutput)
Performs a backpropagation step through the module, computing gradients with respect to the given input. It must be called after forward (on the same input); this is the step that drives network optimization.
In general input, gradOutput and gradInput are Tensors.
Backpropagation involves two kinds of gradient computations, each with its own function (as sketched below):
the gradient with respect to the input - updateGradInput(input, gradOutput)
the gradient with respect to the module's parameters - accGradParameters(input, gradOutput, scale), which accumulates the parameter gradients scaled by scale.
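As a sketch of how a custom module plugs into this machinery, here is a hypothetical constant-scaling layer; the class name nn.Scale and its factor argument are invented for illustration, and since the layer has no parameters, accGradParameters is not needed:

-- a toy custom module that multiplies its input by a fixed constant
local Scale, parent = torch.class('nn.Scale', 'nn.Module')

function Scale:__init(factor)
   parent.__init(self)
   self.factor = factor
end

-- called by forward(input); writes the result into self.output
function Scale:updateOutput(input)
   self.output:resizeAs(input):copy(input):mul(self.factor)
   return self.output
end

-- called by backward(input, gradOutput); the layer is linear, so the
-- gradient w.r.t. the input is just gradOutput scaled by the same factor
function Scale:updateGradInput(input, gradOutput)
   self.gradInput:resizeAs(gradOutput):copy(gradOutput):mul(self.factor)
   return self.gradInput
end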
zeroGradParameters()
If the module has parameters, this function zeroes their accumulated gradient gradParameters.
updateParameters(learningRate)
Updates the module's parameters according to:
parameters = parameters - learningRate * gradients_wrt_parameters
accUpdateGradParameters(input, gradOutput, learningRate)
Computes the parameter gradients and updates the parameters in a single step. A minimal training step using the separate functions is sketched below:
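The sketch chains forward, backward and the update functions above; the layer size, data and learning rate are illustrative placeholders:

mlp = nn.Linear(10, 1)
criterion = nn.MSECriterion()
x, y = torch.randn(10), torch.randn(1)

mlp:zeroGradParameters()                  -- clear accumulated gradParameters
local pred = mlp:forward(x)               -- forward pass
local err  = criterion:forward(pred, y)   -- loss value
local gradOutput = criterion:backward(pred, y)
mlp:backward(x, gradOutput)               -- accumulate parameter gradients
mlp:updateParameters(0.01)                -- parameters = parameters - 0.01 * gradParameters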
share(mlp,s1,s2,…,sn)
Modifies the parameters named s1, ..., sn of this module (if they exist) so that they are shared with (i.e., point to the same storage as) the parameters with the same names in the given module mlp.
Example:
-- make an mlp
mlp1 = nn.Sequential();
mlp1:add(nn.Linear(100,10));

-- make a second mlp
mlp2 = nn.Sequential();
mlp2:add(nn.Linear(100,10));

-- the second mlp shares the bias of the first
mlp2:share(mlp1, 'bias');

-- we change the bias of the first
mlp1:get(1).bias[1] = 99;

-- and see that the second one's bias has also changed..
print(mlp2:get(1).bias[1])
clone(mlp,…)
Creates a deep copy of the module, including the current state of its parameters. If extra arguments are provided, clone also calls share(...) with them on the copy, so the named parameters stay shared with the original (as in the example below).
Example:
-- make an mlp
mlp1 = nn.Sequential();
mlp1:add(nn.Linear(100,10));

-- make a copy that shares the weights and biases
mlp2 = mlp1:clone('weight','bias');

-- we change the bias of the first mlp
mlp1:get(1).bias[1] = 99;

-- and see that the second one's bias has also changed..
print(mlp2:get(1).bias[1])
type(type[, tensorCache])
Converts all the parameters of the module to the given type, one of the torch.Tensor types.
If tensors are shared between several modules in a network, calling type on each module separately breaks that sharing.
To preserve sharing across multiple modules and/or tensors, convert them together with nn.utils.recursiveType:
Example:
-- make an mlp
mlp1 = nn.Sequential();
mlp1:add(nn.Linear(100,10));

-- make a second mlp
mlp2 = nn.Sequential();
mlp2:add(nn.Linear(100,10));

-- the second mlp shares the bias of the first
mlp2:share(mlp1, 'bias');

-- mlp1 and mlp2 will be converted to float, and will share bias
-- note: tensors can be provided as inputs as well as modules
nn.utils.recursiveType({mlp1, mlp2}, 'torch.FloatTensor')
float([tensorCache])
Convenience method for module:type('torch.FloatTensor'[, tensorCache]).
double([tensorCache])
Convenience method for module:type('torch.DoubleTensor'[, tensorCache]).
cuda([tensorCache])
Convenience method for module:type('torch.CudaTensor'[, tensorCache]).
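For example (the float and double calls are self-contained; the cuda line additionally assumes the cunn package is installed):

mlp = nn.Sequential():add(nn.Linear(10, 5))
mlp:float()    -- same as mlp:type('torch.FloatTensor')
mlp:double()   -- same as mlp:type('torch.DoubleTensor')
-- require 'cunn'
-- mlp:cuda()  -- same as mlp:type('torch.CudaTensor')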
3. Containers
Complex neural networks can be built using container classes.
3.1 Container
An abstract class defining the methods that all containers implement.
add(module)
Adds the given module to the container; modules are kept in the order in which they are added.
get(index)
Returns the contained module at position index.
size()
Returns the number of contained modules.
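A small sketch of these three methods on a Sequential container (Sequential, like the other containers below, derives from Container):

c = nn.Sequential()
c:add(nn.Linear(10, 5))  -- first module
c:add(nn.Tanh())         -- second module
print(c:size())          -- prints 2
print(c:get(1))          -- prints nn.Linear(10 -> 5)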
3.2 Sequential
Plugs layers together in a feed-forward, fully connected manner.
Example:
mlp = nn.Sequential()
mlp:add(nn.Linear(10, 25)) -- Linear module (10 inputs, 25 hidden units)
mlp:add(nn.Tanh())         -- apply hyperbolic tangent transfer function on each hidden units
mlp:add(nn.Linear(25, 1))  -- Linear module (25 inputs, 1 output)

> mlp
-- nn.Sequential {
--   [input -> (1) -> (2) -> (3) -> output]
--   (1): nn.Linear(10 -> 25)
--   (2): nn.Tanh
--   (3): nn.Linear(25 -> 1)
-- }

> print(mlp:forward(torch.randn(10)))
-- -0.1815
-- [torch.Tensor of dimension 1]
remove([index])
Removes the module at the given index. If index is not specified, the last layer is removed.
Example:
model = nn.Sequential()
model:add(nn.Linear(10, 20))
model:add(nn.Linear(20, 20))
model:add(nn.Linear(20, 30))
model:remove(2)

> model
-- nn.Sequential {
--   [input -> (1) -> (2) -> output]
--   (1): nn.Linear(10 -> 20)
--   (2): nn.Linear(20 -> 30)
-- }
insert(module, [index])
Inserts the given module at position index. If index is not specified, this is equivalent to add(module).
Example:
model = nn.Sequential()
model:add(nn.Linear(10, 20))
model:add(nn.Linear(20, 30))
model:insert(nn.Linear(20, 20), 2)

> model
-- nn.Sequential {
--   [input -> (1) -> (2) -> (3) -> output]
--   (1): nn.Linear(10 -> 20)
--   (2): nn.Linear(20 -> 20) -- The inserted layer
--   (3): nn.Linear(20 -> 30)
-- }
3.3 Parallel
Usage:
module = nn.Parallel(inputDimension, outputDimension)
Creates a container module that applies its ith child module to the ith slice of the input Tensor, where the slices are taken along dimension inputDimension. The outputs of the child modules are then concatenated along dimension outputDimension.
Example 1:
mlp = nn.Parallel(2,1);   -- Parallel container will associate a module to each slice of dimension 2
                          -- (column space), and concatenate the outputs over the 1st dimension.

mlp:add(nn.Linear(10,3)); -- Linear module (input 10, output 3), applied on 1st slice of dimension 2
mlp:add(nn.Linear(10,2))  -- Linear module (input 10, output 2), applied on 2nd slice of dimension 2

                                 -- After going through the Linear module the outputs are
                                 -- concatenated along the unique dimension, to form 1D Tensor
> mlp:forward(torch.randn(10,2)) -- of size 5.
-0.5300
-1.1015
 0.7764
 0.2819
-0.6026
[torch.Tensor of dimension 5]
Example 2:
mlp = nn.Sequential();
c = nn.Parallel(1,2)      -- Parallel container will associate a module to each slice of dimension 1
                          -- (row space), and concatenate the outputs over the 2nd dimension.

for i = 1, 10 do          -- Add 10 Linear+Reshape modules in parallel (input = 3, output = 2x1)
   local t = nn.Sequential()
   t:add(nn.Linear(3,2))  -- Linear module (input = 3, output = 2)
   t:add(nn.Reshape(2,1)) -- Reshape 1D Tensor of size 2 to 2D Tensor of size 2x1
   c:add(t)
end

mlp:add(c)                -- Add the Parallel container in the Sequential container

pred = mlp:forward(torch.randn(10,3)) -- 2D Tensor of size 10x3 goes through the Sequential container
                                      -- which contains a Parallel container of 10 Linear+Reshape.
                                      -- Each Linear+Reshape module receives a slice of dimension 1
                                      -- which corresponds to a 1D Tensor of size 3.
                                      -- Eventually all the Linear+Reshape modules' outputs of size 2x1
                                      -- are concatenated along the 2nd dimension (column space)
                                      -- to form pred, a 2D Tensor of size 2x10.

> pred
-0.7987 -0.4677 -0.1602 -0.8060  1.1337 -0.4781  0.1990  0.2665 -0.1364  0.8109
-0.2135 -0.3815  0.3964 -0.4078  0.0516 -0.5029 -0.9783 -0.5826  0.4474  0.6092
[torch.DoubleTensor of size 2x10]

for i = 1, 10000 do       -- Train for a few iterations
   x = torch.randn(10,3);
   y = torch.ones(2,10);
   pred = mlp:forward(x)

   criterion = nn.MSECriterion()
   local err = criterion:forward(pred,y)
   local gradCriterion = criterion:backward(pred,y);
   mlp:zeroGradParameters();
   mlp:backward(x, gradCriterion);
   mlp:updateParameters(0.01);
   print(err)
end
3.4 Concat
Usage:
module = nn.Concat(dim)
Concatenates the outputs of its child modules along the given dimension dim: each module receives the same input, and their outputs are joined together.
Example:
mlp = nn.Concat(1);
mlp:add(nn.Linear(5,3))
mlp:add(nn.Linear(5,7))

> print(mlp:forward(torch.randn(5)))
 0.7486
 0.1349
 0.7924
-0.0371
-0.4794
 0.3044
-0.0835
-0.7928
 0.7856
-0.1815
[torch.Tensor of dimension 10]
3.5 Weight Normalization
Usage:
module = nn.WeightNorm(module)
Decorates module with weight normalization, reparameterizing its weight into a direction and a magnitude that are trained separately (Salimans and Kingma, 2016).
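A minimal usage sketch of the decorator shown above:

-- wrap a Linear layer with weight normalization; the decorated
-- module is then used like any other module
wn = nn.WeightNorm(nn.Linear(5, 3))
out = wn:forward(torch.randn(5))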
3.6 NaN
Usage:
dmodule = nn.NaN(module, [id])
The NaN module asserts that the output and gradInput of the decorated module do not contain NaNs; this is useful for locating the source of NaN errors. The id defaults to an automatically incremented counter (1, 2, 3, ..., as in the example below).
Example:
linear = nn.Linear(3,4)
mlp = nn.Sequential()
mlp:add(nn.NaN(nn.Identity()))
mlp:add(nn.NaN(linear))
mlp:add(nn.NaN(nn.Linear(4,2)))
print(mlp)
-- nn.Sequential {
--   [input -> (1) -> (2) -> (3) -> output]
--   (1): nn.NaN(1) @ nn.Identity
--   (2): nn.NaN(2) @ nn.Linear(3 -> 4)
--   (3): nn.NaN(3) @ nn.Linear(4 -> 2)
-- }
4. Table Layers
4.1 ConcatTable
Usage:
module = nn.ConcatTable()
Applies each member module to the same input.
                  +-----------+
             +----> {member1, |
+-------+    |    |           |
| input +----+----> member2,  |
+-------+    |    |           |
   or        +----> member3}  |
 {input}          +-----------+
Example:
mlp = nn.ConcatTable()
mlp:add(nn.Linear(5, 2))
mlp:add(nn.Linear(5, 3))

pred = mlp:forward(torch.randn(5))
for i, k in ipairs(pred) do print(i, k) end
4.2 ParallelTable
Usage:
module = nn.ParallelTable()
Applies the ith member module to the ith input element (each module receives its own input).
+----------+         +-----------+
| {input1, +---------> {member1, |
|          |         |           |
|  input2, +---------> member2,  |
|          |         |           |
|  input3} +---------> member3}  |
+----------+         +-----------+
Example:
mlp = nn.ParallelTable()
mlp:add(nn.Linear(10, 2))
mlp:add(nn.Linear(5, 3))

x = torch.randn(10)
y = torch.rand(5)

pred = mlp:forward{x, y}
for i, k in pairs(pred) do print(i, k) end
4.3 MapTable
Usage:
module = nn.MapTable(m, share)
Applies a single module to every element of the input table, cloning the module as needed so that each input element gets its own copy.
+----------+         +-----------+
| {input1, +---------> {member,  |
|          |         |           |
|  input2, +--------->  clone,   |
|          |         |           |
|  input3} +--------->  clone}   |
+----------+         +-----------+
Example:
map = nn.MapTable()
map:add(nn.Linear(10, 3))

x1 = torch.rand(10)
x2 = torch.rand(10)

y = map:forward{x1, x2}
for i, k in pairs(y) do print(i, k) end
4.4 SplitTable
Usage:
module = SplitTable(dimension, nInputDims)
Takes a Tensor as input and splits it along the given dimension, outputting a table of Tensors.
    +----------+         +-----------+
    | input[1] +---------> {member1, |
  +----------+-+         |           |
  | input[2] +-----------> member2,  |
+----------+-+           |           |
| input[3] +-------------> member3}  |
+----------+             +-----------+
Example:
mlp = nn.SplitTable(2)
x = torch.randn(4, 3)
pred = mlp:forward(x)
for i, k in ipairs(pred) do print(i, k) end
4.5 JoinTable
Usage:
module = JoinTable(dimension, nInputDims)
Takes a table of Tensors as input and joins them along the given dimension, producing a single output Tensor.
+----------+             +-----------+
| {input1, +-------------> output[1] |
|          |           +-----------+-+
|  input2, +-----------> output[2] |
|          |         +-----------+-+
|  input3} +---------> output[3] |
+----------+         +-----------+
Example:
x = torch.randn(5, 1)
y = torch.randn(5, 1)
z = torch.randn(2, 1)

print(nn.JoinTable(1):forward{x, y})
print(nn.JoinTable(2):forward{x, y})
print(nn.JoinTable(1):forward{x, z})
5. Convolutional layers
6. Criterions
7. Reference
[1] - Neural Network Package