Deep Learning Study Notes - Face Recognition with the VGG-Face Network

Source: Internet | Published by: 施耐庵子孙哑巴知乎 | Editor: 程序博客网 | Time: 2024/06/06 00:23

These are notes on an experiment using VGGNet for face recognition.

Dataset: a training set of 90,000+ face images covering 10,000 identities (IDs).

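1. Dataset preparation

Split the dataset images into a training set and a test set, and generate label files stored as .txt files. The ratio of training to test images is 4:1.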

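import os
import shutil

path = 'train'  # path to the original image data, one sub-directory per ID

# Make sure the output directories exist before copying into them
for out_dir in ('trainset', 'testset'):
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)

with open('train.txt', 'w') as trainFile, open('test.txt', 'w') as testFile:
    for d in os.listdir(path):
        print(d)
        files = os.listdir(path + '/' + d)
        for i, f in enumerate(files):
            if i < 8:
                # record format: image name, a single space, then the label
                trainFile.write(f + ' ' + d + '\n')
                shutil.copy(path + '/' + d + '/' + f, 'trainset/' + f)  # training set
            else:
                testFile.write(f + ' ' + d + '\n')
                shutil.copy(path + '/' + d + '/' + f, 'testset/' + f)  # test set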
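
Caffe's convert_imageset tool reads each label as an integer, so the directory names written as labels above must already be numeric IDs. If they are not, one option is to map each directory name to a consecutive integer first. The following is a minimal sketch, assuming one sub-directory per identity; the helper names build_label_map and split_entries are hypothetical and not part of the original script:

```python
import os


def build_label_map(root):
    """Map each identity sub-directory of `root` to a consecutive integer label."""
    ids = sorted(d for d in os.listdir(root)
                 if os.path.isdir(os.path.join(root, d)))
    return {name: index for index, name in enumerate(ids)}


def split_entries(root, label_map, train_per_id=8):
    """Return (train, test) lists of 'relative/path label' lines.

    The first `train_per_id` images of each identity go to the training
    list and the rest to the test list, mirroring the `i < 8` rule above.
    """
    train, test = [], []
    for name, label in sorted(label_map.items()):
        files = sorted(os.listdir(os.path.join(root, name)))
        for i, f in enumerate(files):
            line = '%s/%s %d' % (name, f, label)
            (train if i < train_per_id else test).append(line)
    return train, test
```

Keeping the identity directory in each path avoids file-name collisions between identities. The trade-off is that the conversion step would then need to point at the original directory tree rather than a flat copy.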

The dataset should be as large as possible; otherwise the final training results may be poor.

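2. Convert the images to LMDB format and compute the mean

Download the Caffe source code and build it. In the Caffe directory, create a folder named vggface, and move the training set trainset, the test set testset, and the label files train.txt and test.txt into it.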

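Next, write a script that converts the images to LMDB format. Create vggface.sh; the script was adapted from one found online, so it is pasted here as-is. Images are resized to 224x224, the input size that VGG expects. Once the script is written, run sh vggface.sh to generate the two LMDB databases face_train_lmdb and face_test_lmdb.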

#!/usr/bin/env sh
# Create the face lmdb inputs
# N.B. set the paths to the train + test data dirs

EXAMPLE=vggface
DATA=vggface
TOOLS=./build/tools

TRAIN_DATA_ROOT=vggface/trainset/
VAL_DATA_ROOT=vggface/testset/

# Set RESIZE=true to resize the images to 224x224. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=224
  RESIZE_WIDTH=224
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in vggface.sh to the path" \
       "where the training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in vggface.sh to the path" \
       "where the test data is stored."
  exit 1
fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset.bin \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/face_train_lmdb

echo "Creating test lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset.bin \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $DATA/test.txt \
    $EXAMPLE/face_test_lmdb

echo "Done."
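
Before running the conversion, it can help to confirm that every image named in a label file exists under the data root and that every label parses as an integer. The following is a small sketch of such a check; the function name check_label_file is hypothetical:

```python
import os


def check_label_file(label_path, data_root):
    """Return a list of problems found in a 'filename label' list file."""
    problems = []
    with open(label_path) as fh:
        for line_no, line in enumerate(fh, 1):
            line = line.rstrip('\n')
            if not line:
                continue
            # Split on the last space so file names containing spaces survive
            name, sep, label = line.rpartition(' ')
            if not sep or not name:
                problems.append('line %d: expected "filename label"' % line_no)
                continue
            if not label.isdigit():
                problems.append('line %d: label %r is not a non-negative integer'
                                % (line_no, label))
            if not os.path.isfile(os.path.join(data_root, name)):
                problems.append('line %d: missing image %s' % (line_no, name))
    return problems
```

For example, check_label_file('vggface/train.txt', 'vggface/trainset/') should return an empty list when the label file and the image folder agree.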

Next, write a script that computes the mean. Create vgg_mean.sh; this script was also adapted from one found online. Then run sh vgg_mean.sh to generate face_mean.binaryproto.

#!/usr/bin/env sh
# Compute the mean image from the face training lmdb

EXAMPLE=vggface
DATA=vggface
TOOLS=./build/tools

$TOOLS/compute_image_mean $EXAMPLE/face_train_lmdb \
  $DATA/face_mean.binaryproto

echo "Done."
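
Conceptually, the mean file stores the element-wise average of all training images: each pixel position in each channel is averaged over the whole database. The pure-Python sketch below illustrates that computation on tiny nested lists. It is an illustration only, not how compute_image_mean is implemented, and the function name mean_image is hypothetical:

```python
def mean_image(images):
    """Element-wise mean of equally shaped images.

    Each image is a nested list indexed as [channel][row][col].
    """
    if not images:
        raise ValueError('need at least one image')
    count = float(len(images))
    channels = len(images[0])
    rows = len(images[0][0])
    cols = len(images[0][0][0])
    return [[[sum(img[c][r][x] for img in images) / count
              for x in range(cols)]
             for r in range(rows)]
            for c in range(channels)]
```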

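3. Training the network

First, download the VGG Caffe model from the official website. The download includes a .caffemodel file, so it can be used directly for fine-tuning.

You need to set up the network .prototxt file and the solver file yourself first.

Start by creating a .prototxt file named vggface_train_test.prototxt in the vggface folder. A .prototxt file mainly describes the structure of a network. The model itself is already fixed, so there is no need to change its structure; just change the input and let training learn new parameters. To do this, copy the entire contents of the original VGG_FACE_deploy.prototxt and then add the data layers.

The data-layer code to add is shown below. Delete lines 2-6 of the copied file and substitute your own dataset.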

name: "vggface_train_test.prototxt"
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 224
    mean_value: 104
    mean_value: 117
    mean_value: 123
  }
  image_data_param {
    source: "train.txt"
    batch_size: 64
    shuffle: true
    new_height: 256
    new_width: 256
  }
}
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: true
    crop_size: 224
    mean_value: 104
    mean_value: 117
    mean_value: 123
  }
  image_data_param {
    source: "test.txt"
    batch_size: 32
    shuffle: true
    new_height: 256
    new_width: 256
  }
}
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param {
    lr_mult: 0
    decay_mult: 0
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
  }
}
# ... middle layers omitted; keep them as given in the official file ...
layer {
  name: "fc8_10000"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_10000"
  param {
    lr_mult: 10
    decay_mult: 1
  }
  param {
    lr_mult: 20
    decay_mult: 0
  }
  inner_product_param {
    num_output: 10000
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8_10000"
  bottom: "label"
  top: "loss"
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8_10000"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}

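In addition, at the very bottom of the file there is a layer with num_output: 2622. This is the final fully connected layer, whose outputs feed the softmax classifier. VGG trained this network on 2622 people, hence the value 2622; set it to the number of identities in your own data. I used 10000 people, so I changed num_output: 2622 to num_output: 10000 and renamed the layer (fc8_10000 in the listing above) so that it is not initialised from the pretrained 2622-way weights.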
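
The rename matters because Caffe copies pretrained weights into layers by matching layer names. A layer whose name also exists in the .caffemodel receives the pretrained weights, and a shape mismatch there is an error. A layer with a new name keeps its freshly initialised weights. The toy sketch below mimics that rule with plain dictionaries; it is not Caffe's API, and the names copy_pretrained, pretrained, and target are illustrative assumptions:

```python
def copy_pretrained(pretrained, target):
    """Copy weights by layer name; return the names left at their initial values.

    Both arguments map layer name -> (shape, weights). Layers present in both
    must have identical shapes, otherwise ValueError is raised.
    """
    untouched = []
    for name, (shape, _) in list(target.items()):
        if name not in pretrained:
            untouched.append(name)  # new layer: trained from scratch
            continue
        src_shape, src_weights = pretrained[name]
        if src_shape != shape:
            raise ValueError('shape mismatch for layer %r: %r vs %r'
                             % (name, src_shape, shape))
        target[name] = (shape, src_weights)
    return sorted(untouched)
```

In this model, keeping the old layer name while changing num_output to 10000 would raise the shape-mismatch error, whereas a new name simply leaves the layer to be learned during fine-tuning.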

Next, edit the solver file. If you do not have one, create a new solver file and save it as solver.prototxt.

net: "vggface_train_test.prototxt"
test_iter: 500
test_interval: 500
test_initialization: false
display: 40
average_loss: 40
base_lr: 0.00005
lr_policy: "step"
stepsize: 320000
gamma: 0.96
max_iter: 1000
momentum: 0.9
weight_decay: 0.0002
snapshot: 500
snapshot_prefix: "mymodel"
solver_mode: GPU
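
A few consequences of these values can be worked out directly. Under Caffe's "step" policy the learning rate is base_lr * gamma^floor(iter / stepsize), so with stepsize 320000 and max_iter 1000 the rate never leaves base_lr. The sketch below computes that, plus the number of images processed, assuming the batch sizes 64 (train) and 32 (test) from the network definition above; the function names are hypothetical:

```python
def step_lr(base_lr, gamma, stepsize, iteration):
    """Learning rate under the "step" policy at a given iteration."""
    return base_lr * gamma ** (iteration // stepsize)


def images_seen(batch_size, iterations):
    """Number of images processed in `iterations` batches."""
    return batch_size * iterations


if __name__ == '__main__':
    print(step_lr(0.00005, 0.96, 320000, 999))  # no step boundary reached yet
    print(images_seen(64, 1000))  # training images over max_iter
    print(images_seen(32, 500))   # test images per test pass (test_iter)
```

With 64,000 training images processed in total, 1000 iterations cover less than one pass over a 90,000-image dataset split 4:1, so max_iter may need to be raised for a full fine-tuning run.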

Finally, write a training script named vgg_training.sh, as follows:

#!/usr/bin/env sh
../build/tools/caffe train \
    --solver=solver.prototxt \
    --weights=VGG_FACE.caffemodel

Here --weights=VGG_FACE.caffemodel means the network is fine-tuned starting from VGG_FACE.caffemodel.
Run sh vgg_training.sh to start fine-tuning on VGG-Face; the resulting model snapshots are also saved in the vggface folder.

To use the model, a MATLAB example routine is available.
Figure: screenshot of the training process.

1 0