PyTorch Learning (10) --- A Walkthrough of the Neural Style Code


Overview

I have already written a walkthrough of the Torch version of the neural style code; see Torch7学习(七)——Neural-Style代码解析. That framework, however, was built around the traditional layer-by-layer idea, whereas today everything is based on computation graphs. The PyTorch version is written quite differently from the old one, and above all it is much simpler. Compare the two and you will be amazed.

The neural style code from the official PyTorch tutorials

Most of the file is routine; the core code is what matters:

class ContentLoss(nn.Module):

    def __init__(self, target, weight):
        super(ContentLoss, self).__init__()
        # we 'detach' the target content from the tree used
        self.target = target.detach() * weight
        # to dynamically compute the gradient: this is a stated value,
        # not a variable. Otherwise the forward method of the criterion
        # will throw an error.
        self.weight = weight
        self.criterion = nn.MSELoss()

    def forward(self, input):
        self.loss = self.criterion(input * self.weight, self.target)
        self.output = input
        return self.output

    def backward(self, retain_graph=True):
        self.loss.backward(retain_graph=retain_graph)
        return self.loss


class GramMatrix(nn.Module):

    def forward(self, input):
        a, b, c, d = input.size()  # a=batch size(=1)
        # b=number of feature maps
        # (c,d)=dimensions of a f. map (N=c*d)
        features = input.view(a * b, c * d)  # resise F_XL into \hat F_XL
        G = torch.mm(features, features.t())  # compute the gram product
        # we 'normalize' the values of the gram matrix
        # by dividing by the number of element in each feature maps.
        return G.div(a * b * c * d)


class StyleLoss(nn.Module):

    def __init__(self, target, weight):
        super(StyleLoss, self).__init__()
        self.target = target.detach() * weight
        self.weight = weight
        self.gram = GramMatrix()
        self.criterion = nn.MSELoss()

    def forward(self, input):
        self.output = input.clone()
        self.G = self.gram(input)
        self.G.mul_(self.weight)
        self.loss = self.criterion(self.G, self.target)
        return self.output

    def backward(self, retain_graph=True):
        self.loss.backward(retain_graph=retain_graph)
        return self.loss

The code above has a few puzzling points:
1. From Pytorch入门学习(八)—–自定义层的实现(甚至不可导operation的backward写法) we know that to extend a custom layer it is enough to override nn.Module's forward; inside that forward you call xxxFunction.apply to invoke a custom autograd Function, and it is the Function whose forward and backward you override. So why does this custom module define a backward of its own?
2. What is retain_graph for?
3. For GramMatrix it is enough to write forward; no backward pass has to be written at all, which is remarkable!

Compare the Torch version of GramMatrix

local Gram, parent = torch.class('nn.GramMatrix', 'nn.Module')

function Gram:__init()
  parent.__init(self)
end

function Gram:updateOutput(input)
  assert(input:dim() == 3)
  local C, H, W = input:size(1), input:size(2), input:size(3)
  local x_flat = input:view(C, H * W)
  self.output:resize(C, C)
  self.output:mm(x_flat, x_flat:t())
  return self.output
end

function Gram:updateGradInput(input, gradOutput)
  assert(input:dim() == 3 and input:size(1))
  local C, H, W = input:size(1), input:size(2), input:size(3)
  local x_flat = input:view(C, H * W)
  self.gradInput:resize(C, H * W):mm(gradOutput, x_flat)
  self.gradInput:addmm(gradOutput:t(), x_flat)
  self.gradInput = self.gradInput:view(C, H, W)
  return self.gradInput
end

-- Define an nn Module to compute style loss in-place
local StyleLoss, parent = torch.class('nn.StyleLoss', 'nn.Module')

function StyleLoss:__init(strength, normalize)
  parent.__init(self)
  self.normalize = normalize or false
  self.strength = strength
  self.target = torch.Tensor()
  self.mode = 'none'
  self.loss = 0
  self.gram = nn.GramMatrix()
  self.blend_weight = nil
  self.G = nil
  self.crit = nn.MSECriterion()
end

function StyleLoss:updateOutput(input)
  self.G = self.gram:forward(input)
  self.G:div(input:nElement())
  if self.mode == 'capture' then
    if self.blend_weight == nil then
      self.target:resizeAs(self.G):copy(self.G)
    elseif self.target:nElement() == 0 then
      self.target:resizeAs(self.G):copy(self.G):mul(self.blend_weight)
    else
      self.target:add(self.blend_weight, self.G)
    end
  elseif self.mode == 'loss' then
    self.loss = self.strength * self.crit:forward(self.G, self.target)
  end
  self.output = input
  return self.output
end

function StyleLoss:updateGradInput(input, gradOutput)
  if self.mode == 'loss' then
    local dG = self.crit:backward(self.G, self.target)
    dG:div(input:nElement())
    self.gradInput = self.gram:backward(input, dG)
    if self.normalize then
      self.gradInput:div(torch.norm(self.gradInput, 1) + 1e-8)
    end
    self.gradInput:mul(self.strength)
    self.gradInput:add(gradOutput)
  else
    self.gradInput = gradOutput
  end
  return self.gradInput
end

The Torch version has easily more than twice as much code, and it is much harder to write. The key difference is that in Torch a custom layer has to implement its own updateGradInput, and the backward pass of the Gram matrix is not exactly easy to derive by hand. Another part that takes some effort to understand is the self.gradInput:add(gradOutput) line in StyleLoss's backward; that is explained in 隐藏层加监督(feature matching)的代码书写方法—- 附加optim包的功能. With an automatic-differentiation framework, by contrast, as long as forward is computed entirely with Variables, a correct graph is built and the backward pass comes out correct automatically. No hand-written backward at all. It really is that convenient.
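To see this concretely, here is a minimal sketch (not from the original post) that reuses the GramMatrix module defined above: only forward is written, yet autograd still delivers gradients with respect to the input.

import torch
from torch.autograd import Variable

# GramMatrix is the module defined earlier in this post; it only has forward().
gram = GramMatrix()
x = Variable(torch.randn(1, 3, 4, 4), requires_grad=True)
G = gram(x)                 # 3x3 Gram matrix, built purely from Variable ops
G.sum().backward()          # no hand-written updateGradInput needed
print(x.grad.size())        # torch.Size([1, 3, 4, 4])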

What is the backward defined in Neural Style's custom modules?

This "backward" is just an ordinary method; it does not override any internal backward of autograd.
All it does is call backward on the stored criterion loss and then return that loss, so that the calling code can read off the corresponding loss value.
This becomes clear further down in the code.

The custom layers can simply be collected in an ordinary Python list!

# desired depth layers to compute style/content losses :
content_layers_default = ['conv_4']
style_layers_default = ['conv_1', 'conv_2', 'conv_3', 'conv_4', 'conv_5']


def get_style_model_and_losses(cnn, style_img, content_img,
                               style_weight=1000, content_weight=1,
                               content_layers=content_layers_default,
                               style_layers=style_layers_default):
    cnn = copy.deepcopy(cnn)

    # just in order to have an iterable access to or list of content/syle
    # losses (plain Python lists that hold the custom loss layers)
    content_losses = []
    style_losses = []

    model = nn.Sequential()  # the new Sequential module network
    gram = GramMatrix()  # we need a gram module in order to compute style targets

    # move these modules to the GPU if possible:
    if use_cuda:
        model = model.cuda()
        gram = gram.cuda()

    i = 1
    for layer in list(cnn):
        if isinstance(layer, nn.Conv2d):
            name = "conv_" + str(i)
            model.add_module(name, layer)

            if name in content_layers:
                # add content loss:
                target = model(content_img).clone()
                content_loss = ContentLoss(target, content_weight)
                model.add_module("content_loss_" + str(i), content_loss)
                content_losses.append(content_loss)

            if name in style_layers:
                # add style loss:
                target_feature = model(style_img).clone()
                target_feature_gram = gram(target_feature)
                style_loss = StyleLoss(target_feature_gram, style_weight)
                model.add_module("style_loss_" + str(i), style_loss)
                style_losses.append(style_loss)

        if isinstance(layer, nn.ReLU):
            name = "relu_" + str(i)
            model.add_module(name, layer)

            if name in content_layers:
                # add content loss:
                target = model(content_img).clone()
                content_loss = ContentLoss(target, content_weight)
                model.add_module("content_loss_" + str(i), content_loss)
                content_losses.append(content_loss)

            if name in style_layers:
                # add style loss:
                target_feature = model(style_img).clone()
                target_feature_gram = gram(target_feature)
                style_loss = StyleLoss(target_feature_gram, style_weight)
                model.add_module("style_loss_" + str(i), style_loss)
                style_losses.append(style_loss)

            i += 1

        if isinstance(layer, nn.MaxPool2d):
            name = "pool_" + str(i)
            model.add_module(name, layer)  # ***

    # return the whole model together with the lists of loss layers
    return model, style_losses, content_losses
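A small hypothetical sketch (not from the original post) of the trick being exploited here: because model is assembled layer by layer, calling model(img) inside the loop yields the activations at exactly the depth built so far, and those activations become the fixed targets of the loss modules.

import torch
import torch.nn as nn
from torch.autograd import Variable

model = nn.Sequential()
model.add_module("conv_1", nn.Conv2d(3, 8, 3, padding=1))
img = Variable(torch.randn(1, 3, 16, 16))
feat_1 = model(img)                  # features after conv_1 only
model.add_module("relu_1", nn.ReLU())
feat_2 = model(img)                  # now features after conv_1 + relu_1
print(feat_1.size(), feat_2.size())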

And after that:

# nn.Parameter is a Variable that requires grad!
# Here we iteratively optimize the input x itself, so x is wrapped as a
# parameter and handed to the optimizer.
def get_input_param_optimizer(input_img):
    # this line to show that input is a parameter that requires a gradient
    input_param = nn.Parameter(input_img.data)
    optimizer = optim.LBFGS([input_param])
    return input_param, optimizer


def run_style_transfer(cnn, content_img, style_img, input_img, num_steps=300,
                       style_weight=1000, content_weight=1):
    """Run the style transfer."""
    print('Building the style transfer model..')
    model, style_losses, content_losses = get_style_model_and_losses(cnn,
        style_img, content_img, style_weight, content_weight)
    input_param, optimizer = get_input_param_optimizer(input_img)

    print('Optimizing..')
    run = [0]
    while run[0] <= num_steps:

        def closure():
            # correct the values of updated input image
            input_param.data.clamp_(0, 1)

            # first zero the gradients
            optimizer.zero_grad()
            model(input_param)
            style_score = 0
            content_score = 0

            # This is the interesting part: we directly call the "fake" backward.
            # It in turn calls the real backward of the loss attached to the
            # hidden layer, and mainly exists to return the loss value.
            # A very neat trick.
            for sl in style_losses:
                style_score += sl.backward()
            # The content losses and the style losses can even be backwarded
            # separately!
            for cl in content_losses:
                content_score += cl.backward()

            run[0] += 1
            if run[0] % 50 == 0:
                print("run {}:".format(run))
                print('Style Loss : {:4f} Content Loss: {:4f}'.format(
                    style_score.data[0], content_score.data[0]))
                print()

            return style_score + content_score

        optimizer.step(closure)

    # a last correction...
    input_param.data.clamp_(0, 1)

    return input_param.data


output = run_style_transfer(cnn, content_img, style_img, input_img)

plt.figure()
imshow(output, title='Output Image')

# sphinx_gallery_thumbnail_number = 4
plt.ioff()
plt.show()
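As an aside, L-BFGS is the reason for the closure: unlike SGD-style optimizers it may re-evaluate the objective several times per step, so optim.LBFGS expects a closure that zeroes the gradients, recomputes the loss, backwards it, and returns it. A minimal, generic sketch of the pattern (the names here are mine, not from the original code):

import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable

x = nn.Parameter(torch.randn(10))   # the thing being optimized
target = Variable(torch.zeros(10))
optimizer = optim.LBFGS([x])

def closure():
    optimizer.zero_grad()
    loss = ((x - target) ** 2).sum()
    loss.backward()
    return loss

for _ in range(5):
    optimizer.step(closure)         # LBFGS calls closure() as needed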

Very convenient! The snippet below shows that the custom loss layers can even be backwarded separately, without interfering with each other at all:

            for sl in style_losses:
                style_score += sl.backward()
            for cl in content_losses:
                content_score += cl.backward()

The reason is simple: each backward call computes gradients for the relevant nodes of the graph, and if the gradients are not zeroed out they simply keep accumulating. Note, however, that to save memory PyTorch clears the gradients of every node except the leaves (the Variables you create yourself) as soon as they have been used, so you normally cannot inspect the gradients of intermediate nodes unless you register a hook. Likewise, the buffers of the graph itself are freed after one backward pass; to backward through the same graph again, you need retain_graph.
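A tiny sketch (not from the original post) of the accumulation behaviour: two losses sharing the same leaf, backwarded one after the other, add their gradients into x.grad until you zero it.

import torch
from torch.autograd import Variable

x = Variable(torch.ones(3), requires_grad=True)
loss1 = (x * 2).sum()
loss2 = (x * 3).sum()
loss1.backward()
print(x.grad)      # 2, 2, 2
loss2.backward()
print(x.grad)      # 5, 5, 5 -- the two backward passes add up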

Using retain_graph and detach

retain_graph keeps the graph that was built for the backward pass alive. When retain_graph is True you can call backward several times, each loss on its own, without the graph being freed after the first call. That is exactly why the backward of the loss modules is written like this:

    def backward(self, retain_graph=True):
        self.loss.backward(retain_graph=retain_graph)
        return self.loss
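A minimal sketch (not from the original post): the first backward keeps the graph alive with retain_graph=True, so a second backward over the same graph succeeds instead of raising a RuntimeError about freed buffers.

import torch
from torch.autograd import Variable

x = Variable(torch.ones(3), requires_grad=True)
y = (x * x).sum()
y.backward(retain_graph=True)   # graph buffers are kept alive
y.backward()                    # second pass works; x.grad is now 4, 4, 4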

The other important detail is the call to detach(). The target here is obtained by passing the style image through the network, so it is a Variable with its own computation graph. detach() cuts this node off from that graph and turns it into a leaf, just as if it were a Variable we had created ourselves: target.grad_fn becomes None, and gradients stop there instead of propagating further back.

    def __init__(self, target, weight):
        super(StyleLoss, self).__init__()
        self.target = target.detach() * weight
        self.weight = weight
        self.gram = GramMatrix()
        self.criterion = nn.MSELoss()
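A quick sketch (not from the original post) of what detach() does: the detached result has grad_fn set to None, so it behaves like a leaf and gradients do not flow past it.

import torch
import torch.nn as nn
from torch.autograd import Variable

x = Variable(torch.randn(1, 3), requires_grad=True)
y = nn.Linear(3, 2)(x)
print(y.grad_fn is None)            # False: y is part of the graph
print(y.detach().grad_fn is None)   # True: the detached copy is a leaf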

Summary

  1. With an automatic-differentiation framework you do not have to write the backward pass, provided the custom module operates entirely on Variables; otherwise the correct graph cannot be built.
  2. To save memory, PyTorch does not keep the backward graph around after a pass, which also means the grad of intermediate nodes is not retained. Setting retain_graph to True keeps the graph alive, and that is what makes calling backward separately for several losses possible.
  3. Note the use of detach, which cuts the computation graph and turns a node into a leaf.
  4. This way of writing a "backward" wrapper is worth borrowing.