[Face Recognition] Fine-tuning the VGG Face Model on Your Own Dataset

Source: Internet  Editor: 程序博客网  Time: 2024/05/16 09:55

Keywords: face recognition, Caffe, VGG Face Model

I. Preparing the Dataset

  A raw, unprocessed dataset typically looks like the figure below.
  [Figure: the dataset root, with one folder per class]


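  Each folder in the figure holds all images of one class. Every class has to be split into a training set and a test set. A script can do the split; the ratio is up to you, but training : test = 4:1 is a common choice.
  Note that only two folders should come out of this: train and val. The train folder contains all training images; in other words, every training image sits directly in that one folder. The same goes for val.
  While the script splits the dataset, it also has to create two text files, train.txt and val.txt, which record the name and class of every image in the training and test sets, as in the figure below:
  [Figure: sample lines of train.txt, one image name and class index per line]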
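The split described above can be sketched in Python as follows. This is only one possible script, not the author's: the function name split_dataset, the 0.8 default ratio, the fixed random seed, and the choice to prefix each copied file with its class name (so that files from different classes cannot collide in the flat train and val folders) are all assumptions made for illustration.

```python
import os
import random
import shutil


def split_dataset(src_root, dst_root, train_ratio=0.8, seed=0):
    """Split the class folders under src_root into flat train/ and val/ folders.

    Writes train.txt and val.txt into dst_root, one "<file name> <label>" per line.
    Returns the sorted class names; a class's index in that list is its label.
    """
    rng = random.Random(seed)
    classes = sorted(
        d for d in os.listdir(src_root)
        if os.path.isdir(os.path.join(src_root, d))
    )
    for subset in ("train", "val"):
        os.makedirs(os.path.join(dst_root, subset), exist_ok=True)

    with open(os.path.join(dst_root, "train.txt"), "w") as train_list, \
         open(os.path.join(dst_root, "val.txt"), "w") as val_list:
        for label, cls in enumerate(classes):  # labels start at 0
            images = sorted(os.listdir(os.path.join(src_root, cls)))
            rng.shuffle(images)
            n_train = int(len(images) * train_ratio)
            for i, name in enumerate(images):
                if i < n_train:
                    subset, listing = "train", train_list
                else:
                    subset, listing = "val", val_list
                # Assumption: class and file names contain no spaces, because
                # the list files use a single space as the separator.
                flat_name = "{}_{}".format(cls, name)
                shutil.copy(os.path.join(src_root, cls, name),
                            os.path.join(dst_root, subset, flat_name))
                listing.write("{} {}\n".format(flat_name, label))
    return classes
```

Calling split_dataset("raw_faces", "faces") (both paths hypothetical) would produce faces/train, faces/val, faces/train.txt and faces/val.txt. Classes with very few images may end up with no training images at all, so it is worth checking the per-class counts afterwards.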


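  One very important point: in train.txt and val.txt the name and the class are separated by exactly one space, and class indices must start from 0; otherwise later processing will fail.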
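Both rules are easy to check before moving on. The following is a minimal sketch, assuming the list file has already been read into a list of lines; the helper name check_label_file and its return shape are made up for this example.

```python
def check_label_file(lines):
    """Check the lines of a train.txt / val.txt style list.

    Returns (num_classes, problems), where num_classes is the largest label
    plus one and problems is a list of human-readable messages.
    """
    problems = []
    labels = set()
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        parts = line.rsplit(" ", 1)  # the label is whatever follows the last space
        well_formed = (
            len(parts) == 2
            and parts[0] != ""
            and not parts[0].endswith(" ")  # rejects two or more spaces
            and parts[1].isdigit()
        )
        if not well_formed:
            problems.append(
                "line {}: expected '<name> <label>' with one space".format(lineno))
            continue
        labels.add(int(parts[1]))
    if labels and min(labels) != 0:
        problems.append("labels start at {}, not 0".format(min(labels)))
    num_classes = max(labels) + 1 if labels else 0
    return num_classes, problems
```

For a file on disk, something like check_label_file(open("train.txt").readlines()) would do. The returned num_classes is the smallest output size that covers every label, which is the number the final layer of the network needs later on.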

II. Data Processing

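  After the previous step we have two folders, train and val, plus the matching train.txt and val.txt.
  Next, this dataset has to be converted into LMDB, a data format Caffe can read. There is a standard script for this step.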

#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs

EXAMPLE=    # edit this
DATA=       # edit this
TOOLS=      # edit this

TRAIN_DATA_ROOT=$DATA/train/
VAL_DATA_ROOT=$DATA/val/

DBTYPE=lmdb

# Set RESIZE=true to resize the images to 224x224. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=224
  RESIZE_WIDTH=224
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/train_lmdb

echo "Creating val lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $DATA/val.txt \
    $EXAMPLE/val_lmdb

echo "Computing image mean..."

$TOOLS/compute_image_mean -backend=$DBTYPE \
  $EXAMPLE/train_lmdb $EXAMPLE/mean.binaryproto

echo "Done."

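  The places you need to change are:
  1. The path after EXAMPLE; preferably set it to the parent directory of the train and val folders from step I.
  2. DATA is the parent directory of the train and val folders from step I.
  3. TOOLS normally has the form (caffe path)/build/tools.
  4. TRAIN_DATA_ROOT and VAL_DATA_ROOT need no change; they point to the train and val folders respectively.
  For the meaning of the other parameters, see: http://www.cnblogs.com/dupuleng/articles/4370236.html
  Once these paths are set correctly, drag the script into a terminal to run it.
  When it finishes, the EXAMPLE path you set will contain two folders, train_lmdb and val_lmdb, and the file mean.binaryproto.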
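Before moving on, it can help to confirm that all three outputs really exist. The sketch below is an illustration only: the helper name missing_outputs is made up, and it relies on an LMDB database directory storing its data in a file named data.mdb.

```python
import os


def missing_outputs(example_dir):
    """Return the expected output files that are absent under example_dir."""
    expected = [
        os.path.join("train_lmdb", "data.mdb"),
        os.path.join("val_lmdb", "data.mdb"),
        "mean.binaryproto",
    ]
    return [rel for rel in expected
            if not os.path.isfile(os.path.join(example_dir, rel))]
```

An empty list means every output is in place; anything else names what the conversion script failed to produce, which usually points back to a wrong EXAMPLE, DATA or TOOLS path.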

III. Editing the Caffe Training Configuration Files

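  Download the VGG Face Model, extract it, and move it into (caffe path)/models:
  [Figure: the extracted vgg_face_caffe folder inside models]

  Next, we need to modify the VGG_FACE_deploy.prototxt file in vgg_face_caffe. I recommend simply copying the file below.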

name: "VGG_FACE_16_Net"
layer {
  name: "data"
  type: "Data"  # note this
  top: "data"
  top: "label"
  data_param {
    source: "$/train_lmdb"   # edit this
    backend: LMDB
    batch_size: 100   # edit this
  }
  transform_param {
    mean_file: "$/mean.binaryproto"   # edit this
    mirror: true
  }
  include: { phase: TRAIN }
}

layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  data_param {
    source: "$/val_lmdb"  # edit this
    backend: LMDB
    batch_size: 25   # edit this
  }
  transform_param {
    mean_file: "$/mean.binaryproto"   # edit this
    mirror: true
  }
  include: {
    phase: TEST
  }
}

layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1_1"
  type: "ReLU"
  bottom: "conv1_1"
  top: "conv1_1"
}
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1_2"
  type: "ReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1_2"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2_1"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 128
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu2_1"
  type: "ReLU"
  bottom: "conv2_1"
  top: "conv2_1"
}
layer {
  name: "conv2_2"
  type: "Convolution"
  bottom: "conv2_1"
  top: "conv2_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 128
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu2_2"
  type: "ReLU"
  bottom: "conv2_2"
  top: "conv2_2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2_2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv3_1"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3_1"
  type: "ReLU"
  bottom: "conv3_1"
  top: "conv3_1"
}
layer {
  name: "conv3_2"
  type: "Convolution"
  bottom: "conv3_1"
  top: "conv3_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3_2"
  type: "ReLU"
  bottom: "conv3_2"
  top: "conv3_2"
}
layer {
  name: "conv3_3"
  type: "Convolution"
  bottom: "conv3_2"
  top: "conv3_3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3_3"
  type: "ReLU"
  bottom: "conv3_3"
  top: "conv3_3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3_3"
  top: "pool3"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv4_1"
  type: "Convolution"
  bottom: "pool3"
  top: "conv4_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu4_1"
  type: "ReLU"
  bottom: "conv4_1"
  top: "conv4_1"
}
layer {
  name: "conv4_2"
  type: "Convolution"
  bottom: "conv4_1"
  top: "conv4_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu4_2"
  type: "ReLU"
  bottom: "conv4_2"
  top: "conv4_2"
}
layer {
  name: "conv4_3"
  type: "Convolution"
  bottom: "conv4_2"
  top: "conv4_3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu4_3"
  type: "ReLU"
  bottom: "conv4_3"
  top: "conv4_3"
}
layer {
  name: "pool4"
  type: "Pooling"
  bottom: "conv4_3"
  top: "pool4"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv5_1"
  type: "Convolution"
  bottom: "pool4"
  top: "conv5_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu5_1"
  type: "ReLU"
  bottom: "conv5_1"
  top: "conv5_1"
}
layer {
  name: "conv5_2"
  type: "Convolution"
  bottom: "conv5_1"
  top: "conv5_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu5_2"
  type: "ReLU"
  bottom: "conv5_2"
  top: "conv5_2"
}
layer {
  name: "conv5_3"
  type: "Convolution"
  bottom: "conv5_2"
  top: "conv5_3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu5_3"
  type: "ReLU"
  bottom: "conv5_3"
  top: "conv5_3"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5_3"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  # Note that lr_mult can be set to 0 to disable any fine-tuning of this, and any other, layer
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8_flickr"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_flickr"
  # lr_mult is set to higher than for other layers, because this layer is starting from random while the others are already trained
  propagate_down: false
  inner_product_param {
    num_output: 356   # edit this
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8_flickr"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8_flickr"
  bottom: "label"
  top: "loss"
}

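  If you copied the configuration above, seven places still need editing: lines 8, 10, 13, 25, 27, 30 and 611 of the file, each marked with an edit comment.
  Lines 8 and 25 (source) take the paths of the train_lmdb and val_lmdb folders produced in step II, and lines 13 and 30 (mean_file) take the path of mean.binaryproto.
  Lines 10 and 27 (batch_size) depend on your GPU memory; try 128, 64, 32, 16 in turn. For what batch_size means, consult other material. Note that if the value is too large, training will report that memory is insufficient. The two batch_size values on lines 10 and 27 do not have to be the same.
  Line 611 (num_output of the fc8_flickr layer) must be changed to the number of image classes you are training on.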
  After editing VGG_FACE_deploy.prototxt, one more file has to be added: solver.prototxt.
  You can copy models/finetune_flickr_style/solver.prototxt into the models/vgg_face_caffe folder and adapt it to this task. The main changes are:

net: "models/vgg_face_caffe/VGG_FACE_deploy.prototxt"
test_iter: 100
test_interval: 1000
# lr for fine-tuning should be lower than when starting from scratch
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 20000
display: 20
max_iter: 100000
momentum: 0.9
weight_decay: 0.0005
snapshot: 10000
snapshot_prefix: "models/vgg"
# uncomment the following to default to CPU mode solving
#solver_mode: CPU

  This configuration file can be copied as is. For the meaning of each parameter, see:
  http://www.cnblogs.com/denny402/p/5074049.html
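Two of the solver settings are easier to judge with a little arithmetic. With lr_policy "step", Caffe multiplies base_lr by gamma once every stepsize iterations, and test_iter is the number of test batches run per evaluation, so test_iter times the test batch_size is the number of validation images seen each time. The sketch below only reproduces that arithmetic; the function names are made up, and the defaults are the values from the solver above.

```python
import math


def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=20000):
    """Learning rate under the "step" policy: base_lr * gamma^floor(iter / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


def required_test_iter(num_val_images, test_batch_size):
    """Smallest test_iter whose batches cover every validation image once."""
    return math.ceil(num_val_images / test_batch_size)
```

With the values above, the learning rate falls from 0.001 to 0.0001 at iteration 20000, and test_iter: 100 combined with a test batch_size of 25 evaluates 2500 images per test pass. If your val set has a different size, required_test_iter gives a starting point for adjusting test_iter.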

IV. Training

  Enter the following two commands in a terminal:

cd (caffe directory)
./build/tools/caffe train -solver models/vgg_face_caffe/solver.prototxt -weights models/vgg_face_caffe/VGG_FACE.caffemodel -gpu 0

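  The path after -solver is the solver file we copied and edited above. The path after -weights is the pretrained VGG Face model whose weights initialize the fine-tuning, not the model that training produces; the fine-tuned snapshots are written under the snapshot_prefix set in solver.prototxt.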
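To see where the fine-tuned model ends up, note that Caffe names each snapshot by appending _iter_ plus the iteration number and .caffemodel to snapshot_prefix. The short sketch below lists the files that the solver settings in this post should produce; the function name is made up, and the defaults are taken from the solver.prototxt shown earlier.

```python
def snapshot_paths(prefix="models/vgg", snapshot=10000, max_iter=100000):
    """Model files Caffe writes when snapshotting every `snapshot` iterations."""
    return ["{}_iter_{}.caffemodel".format(prefix, it)
            for it in range(snapshot, max_iter + 1, snapshot)]
```

With those settings the last entry is models/vgg_iter_100000.caffemodel, relative to the directory the training command was run from. Each model file is accompanied by a .solverstate file with the same stem, which can be used to resume training.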
  That is how to fine-tune the VGG Face Model on your own dataset. For how to test images with the trained model, see the next post.
