Understanding and Using Pretrained VGG16

Source: Internet | Published by: 程序员进公司职业规划 | Editor: 程序博客网 | Time: 2024/06/07 05:23

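Today I came across some material on transfer learning but did not find it very clear. After asking around I got a reasonably clear picture, so I am writing it down here together with related material from the web.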

1. Concept

The goal of transfer learning is to use knowledge learned in one environment to help with a learning task in a new environment.

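2. When transfer learning applies

Most machine learning algorithms assume that the training data and the test data share the same feature distribution. In the real world this assumption often does not hold. Suppose we need to solve a classification task for which data is scarce (in transfer learning this is called the target domain), while a large amount of related training data is available (called the source domain), but that training data follows a different feature distribution from the test data of the task at hand. In speech emotion recognition, for example, speech data may be plentiful for one language while emotion data for the task we actually need to classify is extremely scarce. In such cases a suitable transfer learning method can greatly improve classification results on the task with too few samples. This ability to carry knowledge over to a new environment is what people usually call transfer learning.

When the source and target are not very similar, negative transfer can occur, and in that case transfer is not appropriate.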

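3. Pretrained models

Put simply, a pretrained model is a model that someone else built to solve a similar problem. Instead of training a new model from scratch, you can start from a model that has already been trained on a similar problem. For example, if you want to build a self-driving car, you could spend years building a good image recognition algorithm from scratch, or you could start from the inception model (a pretrained model) that Google trained on the ImageNet dataset and use it to recognize images. A pretrained model may not be a 100% match for your application, but it can save you a great deal of effort.

By using a pretrained model that was trained on a large dataset, we can take its architecture and weights directly and apply them to the problem we are facing. This is called "transfer learning": the pretrained model is "transferred" to our specific problem. You need to choose the pretrained model carefully. If your problem differs greatly from the setting in which the model was trained, its predictions will be very inaccurate.
Because a pretrained model is already well trained, we do not want to change too many of its weights in a short time. When it is used for transfer learning, it is usually only fine-tuned. While modifying the model, we typically use a lower learning rate than we would when training a model from scratch.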

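How to use a pretrained model:

Scenario 1: small dataset, high data similarity (relative to the pretrained model's training data)
Because the data is very similar to the pretrained model's training data, we do not need to retrain the model. We only need to reshape the output layer to fit our problem, and use the pretrained model as a feature extractor. Suppose we use a model trained on ImageNet to tell cats from dogs in a new set of photos. The photos resemble the images in ImageNet, but the output only needs two classes: cat or dog. In this example, all we need to do is change the output of the dense layer and the final softmax layer from 1000 classes to 2 classes.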
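For the Caffe prototxt listed at the end of this post, the change to the output layer might look like the sketch below. It works on the prototxt as plain text and assumes the final classifier is a layer named `fc8` with `num_output: 1000`. The function name `retarget_output_layer` and the new layer name `fc8_catdog` are made up for illustration. The rename matters because Caffe copies pretrained weights by layer name, so a renamed layer is initialized from its fillers instead of from the 1000-class weights.

```python
import re


def retarget_output_layer(prototxt, old_name="fc8", new_name="fc8_catdog", num_classes=2):
    """Rename the final layer and set its num_output (text-level sketch)."""
    # Rename every quoted occurrence: the layer's name and top,
    # plus the bottoms of the loss and accuracy layers that consume it.
    text = prototxt.replace('"%s"' % old_name, '"%s"' % new_name)
    # Change only the first num_output that follows the renamed layer's name.
    pattern = re.compile(
        r'(name:\s*"%s".*?num_output:\s*)\d+' % re.escape(new_name), re.S)
    return pattern.sub(lambda m: m.group(1) + str(num_classes), text, count=1)


sample = (
    'layer { bottom: "fc6" top: "fc7" name: "fc7" type: "InnerProduct" '
    'inner_product_param { num_output: 4096 } } '
    'layer { bottom: "fc7" top: "fc8" name: "fc8" type: "InnerProduct" '
    'inner_product_param { num_output: 1000 } } '
    'layer { bottom: "fc8" bottom: "label" top: "loss" name: "loss" '
    'type: "SoftmaxWithLoss" }'
)
print(retarget_output_layer(sample))
```

A real prototxt parser would be more robust than string replacement. This version is only meant to show which two fields change.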

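Scenario 2: small dataset, low data similarity
Here we can freeze the weights of the first k layers of the pretrained model and retrain the remaining n-k layers. The last layer also has to be changed to match the required output format. Because the data is not very similar, the retraining step becomes crucial. The small size of the new dataset is compensated for by freezing the first k layers of the pretrained model.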

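Scenario 3: large dataset, low data similarity
Because we have a large dataset, training the neural network will be effective. However, since the actual data differs greatly from the pretrained model's training data, using the pretrained weights will not be an efficient approach. The best option is to reinitialize all the weights of the pretrained model and train from scratch on the new dataset.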

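Scenario 4: large dataset, high data similarity
This is the ideal case, and using a pretrained model becomes very efficient. The best approach is to keep the model's original architecture and use the pretrained weights as the starting point, then retrain on the new dataset.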
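The four scenarios above boil down to a lookup on two yes/no questions. A minimal sketch follows. The function name `choose_strategy` and the wording of the returned strings are my own, and in practice "large" and "similar" are judgment calls rather than booleans.

```python
def choose_strategy(large_dataset, similar_data):
    """Map dataset size and data similarity to one of the four scenarios."""
    if not large_dataset and similar_data:
        return "scenario 1: keep the pretrained weights, replace only the output layer"
    if not large_dataset and not similar_data:
        return "scenario 2: freeze the first k layers, retrain the remaining layers"
    if large_dataset and not similar_data:
        return "scenario 3: keep the architecture, reinitialize the weights, train from scratch"
    return "scenario 4: start from the pretrained weights, retrain on the new dataset"


for large in (False, True):
    for similar in (True, False):
        print("large=%s similar=%s -> %s" % (large, similar, choose_strategy(large, similar)))
```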

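Ways to fine-tune a model
How to use a pretrained model is determined by the size of the dataset and by how similar the old and new datasets are (the dataset used for pretraining and the dataset for the problem we want to solve).

Feature extraction
We can use the pretrained model as a feature extractor. Concretely, we remove the output layer and treat the rest of the network as a fixed feature extractor that is applied to the new dataset.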
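The idea can be shown with a toy example in plain Python. It is only a sketch: `FROZEN_W` is a tiny hand-written stand-in for a pretrained network with its output layer removed (not VGG16), and the data, learning rate and step count are invented for illustration. The point is that the extractor's weights are never updated. Only the new output layer (here a logistic regression) is trained on the extracted features.

```python
import math

# Stand-in for a pretrained network without its output layer.
# These weights are frozen: nothing below modifies them.
FROZEN_W = ((1.0, -1.0), (-1.0, 1.0))


def extract_features(x):
    """Fixed feature extractor: a linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, lr=0.1, steps=200):
    """Train only a new logistic-regression output layer on frozen features."""
    feats = [extract_features(x) for x in samples]  # computed once, then reused
    w, b = [0.0] * len(feats[0]), 0.0
    n = len(feats)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for f, y in zip(feats, labels):
            err = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) - y
            grad_w = [g + err * fi / n for g, fi in zip(grad_w, f)]
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(x, w, b):
    f = extract_features(x)
    return 1 if sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) > 0.5 else 0


samples = [(2.0, 0.0), (3.0, 1.0), (0.0, 2.0), (1.0, 3.0)]
labels = [1, 1, 0, 0]
w, b = train_head(samples, labels)
print([predict(x, w, b) for x in samples])
```

With a real network the features would be the activations of the last remaining layer, computed once per image and cached, which is what makes this approach cheap on small datasets.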

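Using the architecture of the pretrained model
We can also reuse the architecture of the pretrained model but first randomize all of its weights, and then train on our own dataset.

Training some layers and freezing the others
Another way to use a pretrained model is to train it partially. Concretely, we keep the weights of the first few layers unchanged and retrain the later layers to obtain new weights. We can try this several times and use the results to find the best split between frozen layers and retrained layers.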
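In Caffe, a layer is frozen by setting its `lr_mult` values to 0, so trying different splits amounts to rewriting those values for the first k learnable layers. The sketch below does this on the prototxt text. The helper name `freeze_first_k` is hypothetical, the layer list is taken from the VGG16 prototxt at the end of this post, and the code assumes each layer starts with the exact text `layer {`.

```python
import re

# Learnable layers of VGG16, in forward order (names from the prototxt in this post).
VGG16_LEARNABLE = [
    "conv1_1", "conv1_2", "conv2_1", "conv2_2",
    "conv3_1", "conv3_2", "conv3_3",
    "conv4_1", "conv4_2", "conv4_3",
    "conv5_1", "conv5_2", "conv5_3",
    "fc6", "fc7", "fc8",
]


def freeze_first_k(prototxt, k, layer_names=VGG16_LEARNABLE):
    """Set lr_mult to 0 in the first k learnable layers (text-level sketch)."""
    frozen = set(layer_names[:k])
    parts = prototxt.split("layer {")  # parts[0] is the text before the first layer
    for i, part in enumerate(parts):
        match = re.search(r'name:\s*"([^"]+)"', part)
        if match and match.group(1) in frozen:
            parts[i] = re.sub(r'lr_mult:\s*[\d.]+', 'lr_mult: 0', part)
    return "layer {".join(parts)


sample = (
    'name: "VGG16" '
    'layer { name: "conv1_1" param { lr_mult: 1 } param { lr_mult: 2 } } '
    'layer { name: "fc8" param { lr_mult: 1 } param { lr_mult: 2 } }'
)
# Freeze all 13 convolutional layers and leave fc6, fc7 and fc8 trainable.
print(freeze_first_k(sample, 13))
```

Running this for several values of k and comparing validation accuracy is one way to search for the split between frozen and retrained layers.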

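4. Classifying images with VGG16 pretrained weights

First, a website for visualizing networks: paste the prototxt file that describes your network into its edit box, press Shift+Enter, and the network structure is displayed as a graph.
The VGG16 prototxt takes up too much space, so it is placed at the end of this post.
I will fill in this part once the network has finished training.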

4.1 Collecting data with a crawler

Create a file named name_lists1.txt and put the keywords you want to search for in it, one per line.

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    import os
    import re
    import urllib.parse
    import urllib.request

    SAVE_ROOT = '/home/andy/xxs/demo/Crawler/name/'
    MAX_IMAGES = 60  # how many images to download per keyword


    def img_spider(name_file):
        user_agent = ("Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36")
        headers = {'User-Agent': user_agent}

        # Read the keyword file and build the list of names, one per line.
        with open(name_file, encoding='utf-8') as f:
            name_list = [line.strip() for line in f if line.strip()]

        # For each name, download images into a folder named after it.
        for name in name_list:
            save_dir = os.path.join(SAVE_ROOT, name)
            # As in the original script, a name whose folder already exists is skipped.
            if os.path.exists(save_dir):
                continue
            os.makedirs(save_dir)

            try:
                # Percent-encode the keyword. Spaces (for example in foreign names)
                # become %20; otherwise the request fails.
                url = ("http://image.baidu.com/search/avatarjson?tn=resultjsonavatarnew"
                       "&ie=utf-8&word=" + urllib.parse.quote(name) + "&cg=girl&rn=60&pn=60")
                req = urllib.request.Request(url, headers=headers)
                page = urllib.request.urlopen(req).read().decode('utf-8', errors='ignore')
                # The JSON response differs from what the browser shows under F12,
                # so match the objURL field. Compare it with the other *URL fields
                # in the response and use whichever one is actually reachable.
                img_srcs = re.findall(r'"objURL":"(.*?)"', page, re.S)
                print(name, len(img_srcs))
            except Exception as err:
                # If the request fails, move on to the next name instead of aborting.
                print(name, "error:", err)
                continue

            j = 1
            saved_srcs = []
            # Fetch each image URL found above and save it locally.
            for src in img_srcs:
                try:
                    print("downloading No.%d" % j)
                    req = urllib.request.Request(src, headers=headers)
                    # Time out after 3 seconds and skip to the next URL,
                    # so the program does not hang on one address.
                    data = urllib.request.urlopen(req, timeout=3).read()
                except Exception:
                    print("No.%d error" % j)
                    continue
                with open(os.path.join(save_dir, '%d.jpg' % j), 'wb') as p:
                    p.write(data)
                saved_srcs.append(src)
                if j == MAX_IMAGES:
                    break
                j += 1

            # Save the source URLs of the downloaded images, one per line.
            with open(os.path.join(save_dir, name + '.txt'), 'w', encoding='utf-8') as p2:
                p2.write(''.join(s + '\n' for s in saved_srcs))
            print("save %s txt done" % name)


    # Entry point: read the keyword file and start crawling.
    if __name__ == '__main__':
        img_spider("name_lists1.txt")

4.2

【1】https://www.zhihu.com/question/41979241
【2】http://blog.csdn.net/SusanZhang1231/article/details/73249978
【3】http://blog.csdn.net/tumi678/article/details/77335731
【4】http://blog.csdn.net/tangwenbo124/article/details/52725263

    name: "VGG16"
    layer {
      name: "data"
      type: "Data"
      top: "data"
      top: "label"
      include { phase: TRAIN }
      # transform_param {
      #   mirror: true
      #   crop_size: 224
      #   mean_file: "data/ilsvrc12_shrt_256/imagenet_mean.binaryproto"
      # }
      transform_param {
        mirror: true
        crop_size: 224
        mean_value: 103.939
        mean_value: 116.779
        mean_value: 123.68
      }
      data_param {
        source: "data/ilsvrc12_shrt_256/ilsvrc12_train_leveldb"
        batch_size: 64
        backend: LEVELDB
      }
    }
    layer {
      name: "data"
      type: "Data"
      top: "data"
      top: "label"
      include { phase: TEST }
      # transform_param {
      #   mirror: false
      #   crop_size: 224
      #   mean_file: "data/ilsvrc12_shrt_256/imagenet_mean.binaryproto"
      # }
      transform_param {
        mirror: false
        crop_size: 224
        mean_value: 103.939
        mean_value: 116.779
        mean_value: 123.68
      }
      data_param {
        source: "data/ilsvrc12_shrt_256/ilsvrc12_val_leveldb"
        batch_size: 50
        backend: LEVELDB
      }
    }
    layer {
      bottom: "data"
      top: "conv1_1"
      name: "conv1_1"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 64
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer { bottom: "conv1_1" top: "conv1_1" name: "relu1_1" type: "ReLU" }
    layer {
      bottom: "conv1_1"
      top: "conv1_2"
      name: "conv1_2"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 64
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer { bottom: "conv1_2" top: "conv1_2" name: "relu1_2" type: "ReLU" }
    layer {
      bottom: "conv1_2"
      top: "pool1"
      name: "pool1"
      type: "Pooling"
      pooling_param { pool: MAX kernel_size: 2 stride: 2 }
    }
    layer {
      bottom: "pool1"
      top: "conv2_1"
      name: "conv2_1"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 128
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer { bottom: "conv2_1" top: "conv2_1" name: "relu2_1" type: "ReLU" }
    layer {
      bottom: "conv2_1"
      top: "conv2_2"
      name: "conv2_2"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 128
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer { bottom: "conv2_2" top: "conv2_2" name: "relu2_2" type: "ReLU" }
    layer {
      bottom: "conv2_2"
      top: "pool2"
      name: "pool2"
      type: "Pooling"
      pooling_param { pool: MAX kernel_size: 2 stride: 2 }
    }
    layer {
      bottom: "pool2"
      top: "conv3_1"
      name: "conv3_1"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 256
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer { bottom: "conv3_1" top: "conv3_1" name: "relu3_1" type: "ReLU" }
    layer {
      bottom: "conv3_1"
      top: "conv3_2"
      name: "conv3_2"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 256
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer { bottom: "conv3_2" top: "conv3_2" name: "relu3_2" type: "ReLU" }
    layer {
      bottom: "conv3_2"
      top: "conv3_3"
      name: "conv3_3"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 256
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer { bottom: "conv3_3" top: "conv3_3" name: "relu3_3" type: "ReLU" }
    layer {
      bottom: "conv3_3"
      top: "pool3"
      name: "pool3"
      type: "Pooling"
      pooling_param { pool: MAX kernel_size: 2 stride: 2 }
    }
    layer {
      bottom: "pool3"
      top: "conv4_1"
      name: "conv4_1"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer { bottom: "conv4_1" top: "conv4_1" name: "relu4_1" type: "ReLU" }
    layer {
      bottom: "conv4_1"
      top: "conv4_2"
      name: "conv4_2"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer { bottom: "conv4_2" top: "conv4_2" name: "relu4_2" type: "ReLU" }
    layer {
      bottom: "conv4_2"
      top: "conv4_3"
      name: "conv4_3"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer { bottom: "conv4_3" top: "conv4_3" name: "relu4_3" type: "ReLU" }
    layer {
      bottom: "conv4_3"
      top: "pool4"
      name: "pool4"
      type: "Pooling"
      pooling_param { pool: MAX kernel_size: 2 stride: 2 }
    }
    layer {
      bottom: "pool4"
      top: "conv5_1"
      name: "conv5_1"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer { bottom: "conv5_1" top: "conv5_1" name: "relu5_1" type: "ReLU" }
    layer {
      bottom: "conv5_1"
      top: "conv5_2"
      name: "conv5_2"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer { bottom: "conv5_2" top: "conv5_2" name: "relu5_2" type: "ReLU" }
    layer {
      bottom: "conv5_2"
      top: "conv5_3"
      name: "conv5_3"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer { bottom: "conv5_3" top: "conv5_3" name: "relu5_3" type: "ReLU" }
    layer {
      bottom: "conv5_3"
      top: "pool5"
      name: "pool5"
      type: "Pooling"
      pooling_param { pool: MAX kernel_size: 2 stride: 2 }
    }
    layer {
      bottom: "pool5"
      top: "fc6"
      name: "fc6"
      type: "InnerProduct"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      inner_product_param {
        num_output: 4096
        weight_filler { type: "gaussian" std: 0.005 }
        bias_filler { type: "constant" value: 0.1 }
      }
    }
    layer { bottom: "fc6" top: "fc6" name: "relu6" type: "ReLU" }
    layer {
      bottom: "fc6"
      top: "fc6"
      name: "drop6"
      type: "Dropout"
      dropout_param { dropout_ratio: 0.5 }
    }
    layer {
      bottom: "fc6"
      top: "fc7"
      name: "fc7"
      type: "InnerProduct"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      inner_product_param {
        num_output: 4096
        weight_filler { type: "gaussian" std: 0.005 }
        bias_filler { type: "constant" value: 0.1 }
      }
    }
    layer { bottom: "fc7" top: "fc7" name: "relu7" type: "ReLU" }
    layer {
      bottom: "fc7"
      top: "fc7"
      name: "drop7"
      type: "Dropout"
      dropout_param { dropout_ratio: 0.5 }
    }
    layer {
      bottom: "fc7"
      top: "fc8"
      name: "fc8"
      type: "InnerProduct"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      inner_product_param {
        num_output: 1000
        weight_filler { type: "gaussian" std: 0.005 }
        bias_filler { type: "constant" value: 0.1 }
      }
    }
    layer {
      name: "accuracy_at_1"
      type: "Accuracy"
      bottom: "fc8"
      bottom: "label"
      top: "accuracy_at_1"
      accuracy_param { top_k: 1 }
      include { phase: TEST }
    }
    layer {
      name: "accuracy_at_5"
      type: "Accuracy"
      bottom: "fc8"
      bottom: "label"
      top: "accuracy_at_5"
      accuracy_param { top_k: 5 }
      include { phase: TEST }
    }
    layer {
      bottom: "fc8"
      bottom: "label"
      top: "loss"
      name: "loss"
      type: "SoftmaxWithLoss"
    }

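As a sanity check on the prototxt above, the number of learnable parameters can be worked out from the `num_output`, `kernel_size` and `crop_size` values alone. The sketch below assumes a 3-channel input (suggested by the three `mean_value` entries) and relies on `pad: 1` with 3x3 kernels keeping the spatial size unchanged, so only the five pooling layers shrink the 224x224 crop. The names `CONV_PLAN`, `FC_PLAN` and `count_vgg16_params` are made up for illustration.

```python
# Output channels of the 3x3 convolutions in order; "pool" marks a 2x2, stride-2 max pool.
CONV_PLAN = [64, 64, "pool", 128, 128, "pool", 256, 256, 256, "pool",
             512, 512, 512, "pool", 512, 512, 512, "pool"]
FC_PLAN = [4096, 4096, 1000]  # fc6, fc7, fc8


def count_vgg16_params(in_channels=3, crop_size=224, kernel=3):
    """Count weights and biases of all Convolution and InnerProduct layers."""
    total, channels, size = 0, in_channels, crop_size
    for item in CONV_PLAN:
        if item == "pool":
            size //= 2  # pooling halves height and width
        else:
            total += kernel * kernel * channels * item + item  # weights + biases
            channels = item
    fan_in = channels * size * size  # flattened pool5 output: 512 * 7 * 7
    for out in FC_PLAN:
        total += fan_in * out + out
        fan_in = out
    return total


print(count_vgg16_params())
```

Most of the total comes from fc6, which is one reason why replacing or retraining only the fully connected layers is a common first step when fine-tuning.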