The goal of transfer learning is to apply knowledge learned in one setting to help with a learning task in a new setting.





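Put simply, a pre-trained model is a model that someone else built to solve a similar problem. Instead of training a new model from scratch, you can start from a model that has already been trained on a similar problem. For example, if you want to build a self-driving car, you could spend years building a high-quality image recognition algorithm from scratch, or you could start from the Inception model (a pre-trained model) that Google trained on the ImageNet dataset to recognize images. A pre-trained model may not be a 100% match for your application, but it can save you a great deal of work.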

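Because a pre-trained model is already well trained, we do not want to change too many of its weights too quickly, so in transfer learning it is usually only fine-tuned. While modifying the model, we typically use a lower learning rate than we would when training a model from scratch.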
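To see why a lower learning rate suits fine-tuning, the toy sketch below runs plain gradient descent on a one-parameter quadratic loss and measures how far the weight drifts from its pre-trained value. The loss, the starting values, the two learning rates, and the function name `sgd_drift` are all illustrative assumptions, not taken from any real model.

```python
def sgd_drift(w_pretrained, w_target, lr, steps):
    """Run gradient descent on the loss 0.5 * (w - w_target) ** 2 and
    return how far the weight moved away from its pre-trained value."""
    w = w_pretrained
    for _ in range(steps):
        grad = w - w_target  # derivative of the quadratic loss
        w -= lr * grad
    return abs(w - w_pretrained)


# Hypothetical values: the pre-trained weight and the optimum on the new data.
pretrained, new_optimum = 1.0, 1.5

# Same number of steps, two learning rates.
drift_scratch_lr = sgd_drift(pretrained, new_optimum, lr=0.1, steps=10)
drift_finetune_lr = sgd_drift(pretrained, new_optimum, lr=0.01, steps=10)

print("drift with lr=0.1:  %.4f" % drift_scratch_lr)
print("drift with lr=0.01: %.4f" % drift_finetune_lr)
```

With the smaller rate, the weight stays much closer to where pre-training left it after the same number of updates, which is the behaviour fine-tuning relies on.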


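Scenario 1: small dataset, high data similarity (relative to the training data of the pre-trained model)
In this case, because the data is very similar to the pre-trained model's training data, we do not need to retrain the model. We only need to adapt the output layer to the structure our problem requires, and we use the pre-trained model as a feature extractor. For example, suppose we use a model trained on ImageNet to recognize cats and dogs in a new set of photos. The images to be recognized are similar to those in ImageNet, but our output only needs two categories: cat or dog. In this example, all we need to do is change the output of the dense layer and the final softmax layer from 1000 classes to 2.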
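In Caffe, the framework that the network definition later in this article is written for, pre-trained weights are copied into layers by name. A common way to swap the output layer is therefore to rename it and change its `num_output`. The sketch below does this on prototxt text with plain string handling. The helper name `retarget_output_layer`, the new layer name `fc8_catdog`, and the shortened sample definition are illustrative assumptions.

```python
import re


def retarget_output_layer(prototxt, old_name, new_name, num_classes):
    """Rename the output layer and change its num_output.

    Every quoted occurrence of old_name (in name, top and bottom fields)
    is renamed, then the first num_output that follows the renamed
    layer's name field is replaced with num_classes.
    """
    renamed = prototxt.replace('"%s"' % old_name, '"%s"' % new_name)
    pattern = re.compile(
        r'(name:\s*"%s".*?num_output:\s*)\d+' % re.escape(new_name), re.S)
    return pattern.sub(lambda m: m.group(1) + str(num_classes),
                       renamed, count=1)


# Shortened, hypothetical excerpt of a network definition.
sample = '''
layer {
  bottom: "fc7"
  top: "fc8"
  name: "fc8"
  type: "InnerProduct"
  inner_product_param {
    num_output: 1000
  }
}
layer {
  bottom: "fc8"
  bottom: "label"
  top: "loss"
  name: "loss"
  type: "SoftmaxWithLoss"
}
'''

new_def = retarget_output_layer(sample, "fc8", "fc8_catdog", 2)
print(new_def)
```

Because no layer called `fc8_catdog` exists in the pre-trained weights, that layer would start from its filler initialization, while all other layers keep their pre-trained values.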






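Another way to use a pre-trained model is to train it partially. Concretely, keep the weights of the first several layers fixed and retrain the later layers to obtain new weights. We can try this several times and use the results to find the best split between frozen layers and retrained layers.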
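With a Caffe network definition, one way to freeze a layer is to set its `lr_mult` values to 0 so that its parameters are no longer updated. The sketch below applies that to a chosen set of layer names by editing prototxt text. The helper name `freeze_layers` and the two-layer sample are illustrative assumptions, and the code assumes Python 3.7 or later, where `re.split` accepts a zero-width pattern.

```python
import re


def freeze_layers(prototxt, frozen_names):
    """Set lr_mult to 0 in every layer whose name is in frozen_names.

    Assumes each layer block starts with 'layer {' and that layer
    names are unique within the definition.
    """
    # Split just before each 'layer {' so every chunk holds one layer.
    chunks = re.split(r'(?=\blayer\s*\{)', prototxt)
    out = []
    for chunk in chunks:
        m = re.search(r'\bname:\s*"([^"]+)"', chunk)
        if m and m.group(1) in frozen_names:
            chunk = re.sub(r'lr_mult:\s*[\d.]+', 'lr_mult: 0', chunk)
        out.append(chunk)
    return ''.join(out)


# Hypothetical two-layer excerpt: freeze the early layer, retrain the last.
sample = '''
layer {
  name: "conv1_1"
  type: "Convolution"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
}
'''

frozen = freeze_layers(sample, {"conv1_1"})
print(frozen)
```

To search for a good split, one could call the helper with progressively larger sets of layer names and compare validation accuracy after each run.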

4. Classifying images with VGG16 pre-trained weights




    #!/usr/bin/env python3
    # encoding: utf-8
    # Ported to Python 3: urllib2 -> urllib.request, print statements -> print().
    import os
    import re
    import urllib.parse
    import urllib.request

    SAVE_ROOT = '/home/andy/xxs/demo/Crawler/name/'
    MAX_IMAGES = 60  # how many images to download per person


    def img_spider(name_file):
        user_agent = "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36"
        headers = {'User-Agent': user_agent}
        # Read the name list (one name per line) from the text file.
        with open(name_file, encoding='utf-8') as f:
            name_list = [name.rstrip() for name in f]
        # For each person, crawl up to MAX_IMAGES images into a folder named after them.
        for name in name_list:
            person_dir = os.path.join(SAVE_ROOT, name)
            # As in the original script, a person whose folder already exists is skipped.
            if os.path.exists(person_dir):
                continue
            os.makedirs(person_dir)
            try:
                # Percent-encode the name (a space becomes %20), otherwise the request fails.
                url = "http://image.baidu.com/search/avatarjson?tn=resultjsonavatarnew&ie=utf-8&word=" + urllib.parse.quote(name) + "&cg=girl&rn=60&pn=60"
                req = urllib.request.Request(url, headers=headers)
                page = urllib.request.urlopen(req).read().decode('utf-8', errors='replace')
                # The response is JSON, so it differs from what the browser's F12 view shows.
                # We match the objURL field; compare it with the other URL fields and use whichever is reachable.
                img_srcs = re.findall('"objURL":"(.*?)"', page, re.S)
                print(name, len(img_srcs))
            except Exception:
                # If the request fails, move on to the next person instead of aborting.
                print(name, "error:")
                continue
            j = 1
            src_txt = ''
            # Download each image URL found above and save it locally.
            for src in img_srcs:
                try:
                    print("downloading No.%d" % j)
                    # Fix: pass the Request (with headers) to urlopen; the original passed the bare URL.
                    req = urllib.request.Request(src, headers=headers)
                    # Time out after 3 seconds so one dead URL cannot stall the crawl.
                    data = urllib.request.urlopen(req, timeout=3).read()
                except Exception:
                    print("No.%d error:" % j)
                    continue
                # Write the file only after a successful download, so no empty .jpg is left behind.
                with open(os.path.join(person_dir, str(j) + '.jpg'), 'wb') as p:
                    p.write(data)
                src_txt = src_txt + src + '\n'
                if j == MAX_IMAGES:
                    break
                j = j + 1
            # Save the downloaded image URLs to a text file, one per line.
            with open(os.path.join(person_dir, name + '.txt'), 'w', encoding='utf-8') as p2:
                p2.write(src_txt)
            print("save %s txt done" % name)


    # Entry point: read the name list and start crawling.
    if __name__ == '__main__':
        name_file = "name_lists1.txt"
        img_spider(name_file)



    name: "VGG16"
    layer {
      name: "data"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }
      # transform_param {
      #   mirror: true
      #   crop_size: 224
      #   mean_file: "data/ilsvrc12_shrt_256/imagenet_mean.binaryproto"
      # }
      transform_param {
        mirror: true
        crop_size: 224
        mean_value: 103.939
        mean_value: 116.779
        mean_value: 123.68
      }
      data_param {
        source: "data/ilsvrc12_shrt_256/ilsvrc12_train_leveldb"
        batch_size: 64
        backend: LEVELDB
      }
    }
    layer {
      name: "data"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TEST
      }
      # transform_param {
      #   mirror: false
      #   crop_size: 224
      #   mean_file: "data/ilsvrc12_shrt_256/imagenet_mean.binaryproto"
      # }
      transform_param {
        mirror: false
        crop_size: 224
        mean_value: 103.939
        mean_value: 116.779
        mean_value: 123.68
      }
      data_param {
        source: "data/ilsvrc12_shrt_256/ilsvrc12_val_leveldb"
        batch_size: 50
        backend: LEVELDB
      }
    }
    layer {
      bottom: "data"
      top: "conv1_1"
      name: "conv1_1"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 64
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      bottom: "conv1_1"
      top: "conv1_1"
      name: "relu1_1"
      type: "ReLU"
    }
    layer {
      bottom: "conv1_1"
      top: "conv1_2"
      name: "conv1_2"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 64
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      bottom: "conv1_2"
      top: "conv1_2"
      name: "relu1_2"
      type: "ReLU"
    }
    layer {
      bottom: "conv1_2"
      top: "pool1"
      name: "pool1"
      type: "Pooling"
      pooling_param { pool: MAX kernel_size: 2 stride: 2 }
    }
    layer {
      bottom: "pool1"
      top: "conv2_1"
      name: "conv2_1"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 128
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      bottom: "conv2_1"
      top: "conv2_1"
      name: "relu2_1"
      type: "ReLU"
    }
    layer {
      bottom: "conv2_1"
      top: "conv2_2"
      name: "conv2_2"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 128
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      bottom: "conv2_2"
      top: "conv2_2"
      name: "relu2_2"
      type: "ReLU"
    }
    layer {
      bottom: "conv2_2"
      top: "pool2"
      name: "pool2"
      type: "Pooling"
      pooling_param { pool: MAX kernel_size: 2 stride: 2 }
    }
    layer {
      bottom: "pool2"
      top: "conv3_1"
      name: "conv3_1"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 256
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      bottom: "conv3_1"
      top: "conv3_1"
      name: "relu3_1"
      type: "ReLU"
    }
    layer {
      bottom: "conv3_1"
      top: "conv3_2"
      name: "conv3_2"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 256
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      bottom: "conv3_2"
      top: "conv3_2"
      name: "relu3_2"
      type: "ReLU"
    }
    layer {
      bottom: "conv3_2"
      top: "conv3_3"
      name: "conv3_3"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 256
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      bottom: "conv3_3"
      top: "conv3_3"
      name: "relu3_3"
      type: "ReLU"
    }
    layer {
      bottom: "conv3_3"
      top: "pool3"
      name: "pool3"
      type: "Pooling"
      pooling_param { pool: MAX kernel_size: 2 stride: 2 }
    }
    layer {
      bottom: "pool3"
      top: "conv4_1"
      name: "conv4_1"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      bottom: "conv4_1"
      top: "conv4_1"
      name: "relu4_1"
      type: "ReLU"
    }
    layer {
      bottom: "conv4_1"
      top: "conv4_2"
      name: "conv4_2"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      bottom: "conv4_2"
      top: "conv4_2"
      name: "relu4_2"
      type: "ReLU"
    }
    layer {
      bottom: "conv4_2"
      top: "conv4_3"
      name: "conv4_3"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      bottom: "conv4_3"
      top: "conv4_3"
      name: "relu4_3"
      type: "ReLU"
    }
    layer {
      bottom: "conv4_3"
      top: "pool4"
      name: "pool4"
      type: "Pooling"
      pooling_param { pool: MAX kernel_size: 2 stride: 2 }
    }
    layer {
      bottom: "pool4"
      top: "conv5_1"
      name: "conv5_1"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      bottom: "conv5_1"
      top: "conv5_1"
      name: "relu5_1"
      type: "ReLU"
    }
    layer {
      bottom: "conv5_1"
      top: "conv5_2"
      name: "conv5_2"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      bottom: "conv5_2"
      top: "conv5_2"
      name: "relu5_2"
      type: "ReLU"
    }
    layer {
      bottom: "conv5_2"
      top: "conv5_3"
      name: "conv5_3"
      type: "Convolution"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
        weight_filler { type: "gaussian" std: 0.01 }
        bias_filler { type: "constant" value: 0 }
      }
    }
    layer {
      bottom: "conv5_3"
      top: "conv5_3"
      name: "relu5_3"
      type: "ReLU"
    }
    layer {
      bottom: "conv5_3"
      top: "pool5"
      name: "pool5"
      type: "Pooling"
      pooling_param { pool: MAX kernel_size: 2 stride: 2 }
    }
    layer {
      bottom: "pool5"
      top: "fc6"
      name: "fc6"
      type: "InnerProduct"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      inner_product_param {
        num_output: 4096
        weight_filler { type: "gaussian" std: 0.005 }
        bias_filler { type: "constant" value: 0.1 }
      }
    }
    layer {
      bottom: "fc6"
      top: "fc6"
      name: "relu6"
      type: "ReLU"
    }
    layer {
      bottom: "fc6"
      top: "fc6"
      name: "drop6"
      type: "Dropout"
      dropout_param { dropout_ratio: 0.5 }
    }
    layer {
      bottom: "fc6"
      top: "fc7"
      name: "fc7"
      type: "InnerProduct"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      inner_product_param {
        num_output: 4096
        weight_filler { type: "gaussian" std: 0.005 }
        bias_filler { type: "constant" value: 0.1 }
      }
    }
    layer {
      bottom: "fc7"
      top: "fc7"
      name: "relu7"
      type: "ReLU"
    }
    layer {
      bottom: "fc7"
      top: "fc7"
      name: "drop7"
      type: "Dropout"
      dropout_param { dropout_ratio: 0.5 }
    }
    layer {
      bottom: "fc7"
      top: "fc8"
      name: "fc8"
      type: "InnerProduct"
      param { lr_mult: 1 decay_mult: 1 }
      param { lr_mult: 2 decay_mult: 0 }
      inner_product_param {
        num_output: 1000
        weight_filler { type: "gaussian" std: 0.005 }
        bias_filler { type: "constant" value: 0.1 }
      }
    }
    layer {
      name: "accuracy_at_1"
      type: "Accuracy"
      bottom: "fc8"
      bottom: "label"
      top: "accuracy_at_1"
      accuracy_param { top_k: 1 }
      include {
        phase: TEST
      }
    }
    layer {
      name: "accuracy_at_5"
      type: "Accuracy"
      bottom: "fc8"
      bottom: "label"
      top: "accuracy_at_5"
      accuracy_param { top_k: 5 }
      include {
        phase: TEST
      }
    }
    layer {
      bottom: "fc8"
      bottom: "label"
      top: "loss"
      name: "loss"
      type: "SoftmaxWithLoss"
    }