YOLO v2 Face Detection, by zhangzexuan



1. Data Preparation

YOLO itself is trained on the VOC dataset, so to train YOLO v2 on another dataset you can either adapt that dataset starting from VOC, or, put differently, build your dataset following the structure and format of the VOC dataset.

The dataset I use here is CelebA (Large-scale CelebFaces Attributes Dataset), a large-scale annotated celebrity face dataset, with images in .jpg format. CelebA images are uniformly named 000001.jpg through 202599.jpg, and the bounding-box information is stored in list_bbox_celeba.txt, in the following format:

--------------------------------------------------list_bbox_celeba.txt-----------------------------------------------------

First line: total number of images

Second line: format (header) information

All remaining lines: < xxxxxx.jpg > < [x1] [y1] [width] [height] >

--------------------------------------------------list_bbox_celeba.txt-----------------------------------------------------

Here, x1 and y1 are the coordinates of the bounding box's top-left corner, and width and height are the bounding box's width and height.

In YOLO, every image needs a corresponding label file: a .txt file with the same base name as the image. Each line has the form < class id > < [x] [y] [width] [height] >, where the class id is a zero-based integer that indexes the object's class name in the .names file. The coordinates are normalized: x = (bounding-box center x) / (image width), y = (bounding-box center y) / (image height), width = (bounding-box width) / (image width), height = (bounding-box height) / (image height).
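The normalization above can be sketched as a small helper. The function name and the sample numbers are my own illustration, not part of CelebA's or darknet's tooling:

```python
# Hypothetical helper: convert one CelebA box (x1, y1, width, height, in pixels)
# into a YOLO label line with center and size normalized by the image dimensions.
def celeba_bbox_to_yolo(x1, y1, w, h, img_w, img_h, class_id=0):
    x = (x1 + w / 2.0) / img_w   # normalized center x
    y = (y1 + h / 2.0) / img_h   # normalized center y
    return "%d %f %f %f %f" % (class_id, x, y, w / float(img_w), h / float(img_h))

# A 50x100 box at the origin of a 100x200 image:
print(celeba_bbox_to_yolo(0, 0, 50, 100, 100, 200))
# -> 0 0.250000 0.250000 0.500000 0.500000
```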

2. Data Processing

Training YOLOv2 requires:

1. A train.txt listing the absolute paths of the training images, and a val.txt listing the absolute paths of the validation images.

2. A label text file for every image; in the VOC dataset these live in the VOC2007/labels folder.

3. The training-data configuration file voc.data.

4. The network configuration file; here we use tiny-yolo.cfg.

5. The class-name list file voc.names.

Items 1 and 2 we have to generate ourselves. The author provides a Python script (darknet/scripts/voc_label.py) that generates all the label files and the image-path files, but it assumes the data is laid out in VOC format. Here I use the Python script from http://blog.csdn.net/minstyrain/article/details/77888176?locationNum=4&fps=1 (which I named celebA2YOLO.py) to convert the CelebA dataset into VOC format. The output paths in the code need to be adapted to your own setup; my modified version is as follows:

----------------------------------------------celebA2YOLO.py----------------------------------------------

import cv2
import h5py
import os
import numpy as np
from xml.dom.minidom import Document
import progressbar

rootdir = "../"
imgdir = rootdir + "Img/img_celeba"
landmarkpath = rootdir + "Anno/list_landmarks_celeba.txt"
bboxpath = rootdir + "Anno/list_bbox_celeba.txt"
vocannotationdir = rootdir + "VOCdevkit/VOC2007/" + "Annotations"
labelsdir = rootdir + "VOCdevkit/VOC2007/" + "labels"
convert2yoloformat = True
convert2vocformat = True
resized_dim = (48, 48)
datasetprefix = "/home/scw4750/zhangzexuan/CelebA/VOCdevkit/VOC2007/JPEGImages/"

progress = progressbar.ProgressBar(widgets=[
    progressbar.Percentage(),
    ' (', progressbar.SimpleProgress(), ') ',
    ' (', progressbar.Timer(), ') ',
    ' (', progressbar.ETA(), ') ',
])


def drawbboxandlandmarks(img, bbox, landmark):
    cv2.rectangle(img, (bbox[0], bbox[1]), (bbox[0] + bbox[2], bbox[1] + bbox[3]), (0, 255, 0))
    for i in range(int(len(landmark) / 2)):
        cv2.circle(img, (int(landmark[2 * i]), int(landmark[2 * i + 1])), 2, (0, 0, 255))


def loadgt():
    # The first two lines of each annotation file are the image count and the header.
    imgpaths = []
    landmarks = []
    bboxes = []
    with open(landmarkpath) as landmarkfile:
        for line in landmarkfile.readlines()[2:]:
            landmarkline = line.split()
            imgpaths.append(landmarkline[0])
            landmarks.append([int(pt) for pt in landmarkline[1:]])
    with open(bboxpath) as bboxfile:
        for line in bboxfile.readlines()[2:]:
            bboxline = line.split()
            bboxes.append([int(bb) for bb in bboxline[1:]])
    return imgpaths, bboxes, landmarks


def generate_hdf5():
    # Crop every face, resize it, and store the faces plus
    # (bbox, normalized landmarks) labels in train.h5.
    imgpaths, bboxes, landmarks = loadgt()
    faces = []
    labels = []
    for i in range(len(imgpaths)):
        imgpath = imgdir + "/" + imgpaths[i]
        print(i)
        bbox = bboxes[i]
        landmark = landmarks[i]
        img = cv2.imread(imgpath)
        if bbox[2] <= 0 or bbox[3] <= 0:
            continue
        face = img[bbox[1]:bbox[1] + bbox[3], bbox[0]:bbox[0] + bbox[2]]
        face = cv2.resize(face, resized_dim)
        faces.append(face)
        label = [1]
        for b in bbox:
            label.append(b)
        for j in range(len(landmark)):
            lm = landmark[j]
            if j % 2 == 0:
                lm = (lm - bbox[0]) * 1.0 / bbox[2]
            else:
                lm = (lm - bbox[1]) * 1.0 / bbox[3]
            label.append(lm)
        labels.append(label)
    faces = np.asarray(faces)
    labels = np.asarray(labels)
    f = h5py.File('train.h5', 'w')
    f['data'] = faces.astype(np.float32)
    f['labels'] = labels.astype(np.float32)
    f.close()


def viewginhdf5():
    f = h5py.File('train.h5', 'r')
    faces = f['data'][:]
    labels = f['labels'][:]
    for i in range(len(faces)):
        print(i)
        face = faces[i].astype(np.uint8)
        label = labels[i]
        landmark = label[5:]  # label layout: [1, x1, y1, w, h, landmarks...]
        for j in range(int(len(landmark) / 2)):
            cv2.circle(face, (int(landmark[2 * j] * resized_dim[0]),
                              int(landmark[2 * j + 1] * resized_dim[1])), 1, (0, 0, 255))
        cv2.imshow("img", face)
        cv2.waitKey()
    f.close()


def showgt():
    # Walk the annotations, draw them for visual checking, and emit YOLO label
    # files and/or VOC-style XML annotations for every image.
    landmarkfile = open(landmarkpath)
    bboxfile = open(bboxpath)
    numofimgs = int(landmarkfile.readline())
    _ = landmarkfile.readline()
    _ = bboxfile.readline()
    _ = bboxfile.readline()
    pbar = progress.start()
    if convert2yoloformat:
        if not os.path.exists(labelsdir):
            os.mkdir(labelsdir)
    if convert2vocformat:
        if not os.path.exists(vocannotationdir):
            os.mkdir(vocannotationdir)
    for i in pbar(range(numofimgs)):
        landmarkline = landmarkfile.readline().split()
        filename = landmarkline[0]
        imgpath = imgdir + "/" + filename
        img = cv2.imread(imgpath)
        landmark = [int(pt) for pt in landmarkline[1:]]
        bboxline = bboxfile.readline().split()
        bbox = [int(bb) for bb in bboxline[1:]]
        drawbboxandlandmarks(img, bbox, landmark)
        if convert2yoloformat:
            # One label file per image: "<class> <xc> <yc> <w> <h>", all normalized.
            height = img.shape[0]
            width = img.shape[1]
            txtpath = labelsdir + "/" + filename
            txtpath = txtpath[:-3] + "txt"
            ftxt = open(txtpath, 'w')
            xcenter = (bbox[0] + bbox[2] * 0.5) / width
            ycenter = (bbox[1] + bbox[3] * 0.5) / height
            wr = bbox[2] * 1.0 / width
            hr = bbox[3] * 1.0 / height
            ftxt.write("0 " + str(xcenter) + " " + str(ycenter) + " " + str(wr) + " " + str(hr) + "\n")
            ftxt.close()
        if convert2vocformat:
            # One PASCAL-VOC-style XML annotation per image.
            xmlpath = vocannotationdir + "/" + filename
            xmlpath = xmlpath[:-3] + "xml"
            doc = Document()
            annotation = doc.createElement('annotation')
            doc.appendChild(annotation)
            folder = doc.createElement('folder')
            folder.appendChild(doc.createTextNode('CelebA'))
            annotation.appendChild(folder)
            filenamenode = doc.createElement('filename')
            filenamenode.appendChild(doc.createTextNode(filename))
            annotation.appendChild(filenamenode)
            source = doc.createElement('source')
            annotation.appendChild(source)
            database = doc.createElement('database')
            database.appendChild(doc.createTextNode('CelebA Database'))
            source.appendChild(database)
            annotation_s = doc.createElement('annotation')
            annotation_s.appendChild(doc.createTextNode('PASCAL VOC2007'))
            source.appendChild(annotation_s)
            image = doc.createElement('image')
            image.appendChild(doc.createTextNode('flickr'))
            source.appendChild(image)
            flickrid = doc.createElement('flickrid')
            flickrid.appendChild(doc.createTextNode('-1'))
            source.appendChild(flickrid)
            owner = doc.createElement('owner')
            annotation.appendChild(owner)
            flickrid_o = doc.createElement('flickrid')
            flickrid_o.appendChild(doc.createTextNode('tdr'))
            owner.appendChild(flickrid_o)
            name_o = doc.createElement('name')
            name_o.appendChild(doc.createTextNode('yanyu'))
            owner.appendChild(name_o)
            size = doc.createElement('size')
            annotation.appendChild(size)
            width = doc.createElement('width')
            width.appendChild(doc.createTextNode(str(img.shape[1])))
            height = doc.createElement('height')
            height.appendChild(doc.createTextNode(str(img.shape[0])))
            depth = doc.createElement('depth')
            depth.appendChild(doc.createTextNode(str(img.shape[2])))
            size.appendChild(width)
            size.appendChild(height)
            size.appendChild(depth)
            segmented = doc.createElement('segmented')
            segmented.appendChild(doc.createTextNode('0'))
            annotation.appendChild(segmented)
            objects = doc.createElement('object')
            annotation.appendChild(objects)
            object_name = doc.createElement('name')
            object_name.appendChild(doc.createTextNode('face'))
            objects.appendChild(object_name)
            pose = doc.createElement('pose')
            pose.appendChild(doc.createTextNode('Unspecified'))
            objects.appendChild(pose)
            truncated = doc.createElement('truncated')
            truncated.appendChild(doc.createTextNode('1'))
            objects.appendChild(truncated)
            difficult = doc.createElement('difficult')
            difficult.appendChild(doc.createTextNode('0'))
            objects.appendChild(difficult)
            bndbox = doc.createElement('bndbox')
            objects.appendChild(bndbox)
            xmin = doc.createElement('xmin')
            xmin.appendChild(doc.createTextNode(str(bbox[0])))
            bndbox.appendChild(xmin)
            ymin = doc.createElement('ymin')
            ymin.appendChild(doc.createTextNode(str(bbox[1])))
            bndbox.appendChild(ymin)
            xmax = doc.createElement('xmax')
            xmax.appendChild(doc.createTextNode(str(bbox[0] + bbox[2])))
            bndbox.appendChild(xmax)
            ymax = doc.createElement('ymax')
            ymax.appendChild(doc.createTextNode(str(bbox[1] + bbox[3])))
            bndbox.appendChild(ymax)
            f = open(xmlpath, "w")
            f.write(doc.toprettyxml(indent=''))
            f.close()
        cv2.imshow("img", img)
        cv2.waitKey(1)
    pbar.finish()


def generatetxt(trainratio=0.7, valratio=0.2, testratio=0.1):
    # train.txt / val.txt / test.txt with absolute image paths, split by ratio.
    files = os.listdir(labelsdir)
    ftrain = open(rootdir + "VOCdevkit/VOC2007/" + "train.txt", "w")
    fval = open(rootdir + "VOCdevkit/VOC2007/" + "val.txt", "w")
    ftrainval = open(rootdir + "VOCdevkit/VOC2007/" + "trainval.txt", "w")
    ftest = open(rootdir + "VOCdevkit/VOC2007/" + "test.txt", "w")
    for i in range(len(files)):
        filename = datasetprefix + files[i][:-3] + "jpg" + "\n"
        if i < trainratio * len(files):
            ftrain.write(filename)
            ftrainval.write(filename)
        elif i < (trainratio + valratio) * len(files):
            fval.write(filename)
            ftrainval.write(filename)
        elif i < (trainratio + valratio + testratio) * len(files):
            ftest.write(filename)
    ftrain.close()
    fval.close()
    ftrainval.close()
    ftest.close()


def generatevocsets(trainratio=0.7, valratio=0.2, testratio=0.1):
    # ImageSets/Main lists contain bare image ids (no extension).
    if not os.path.exists(rootdir + "VOCdevkit/VOC2007/ImageSets"):
        os.mkdir(rootdir + "VOCdevkit/VOC2007/ImageSets")
    if not os.path.exists(rootdir + "VOCdevkit/VOC2007/ImageSets/Main"):
        os.mkdir(rootdir + "VOCdevkit/VOC2007/ImageSets/Main")
    ftrain = open(rootdir + "VOCdevkit/VOC2007/ImageSets/Main/train.txt", 'w')
    fval = open(rootdir + "VOCdevkit/VOC2007/ImageSets/Main/val.txt", 'w')
    ftrainval = open(rootdir + "VOCdevkit/VOC2007/ImageSets/Main/trainval.txt", 'w')
    ftest = open(rootdir + "VOCdevkit/VOC2007/ImageSets/Main/test.txt", 'w')
    files = os.listdir(labelsdir)
    for i in range(len(files)):
        imgfilename = files[i][:-4]
        ftrainval.write(imgfilename + "\n")
        if i < int(len(files) * trainratio):
            ftrain.write(imgfilename + "\n")
        elif i < int(len(files) * (trainratio + valratio)):
            fval.write(imgfilename + "\n")
        else:
            ftest.write(imgfilename + "\n")
    ftrain.close()
    fval.close()
    ftrainval.close()
    ftest.close()


if __name__ == "__main__":
    showgt()
    generatevocsets()
    generatetxt()
    #generate_hdf5()
    #viewginhdf5()

----------------------------------------------celebA2YOLO.py----------------------------------------------


celebA2YOLO.py expects the CelebA data and the script itself to be laid out in the following directory structure:

----------------------------------------------CelebA directory structure----------------------------------------------

--CelebA

         --Anno

                   list_bbox_celeba.txt

                   list_landmarks_celeba.txt

         --Img

                   celebA2YOLO.py

                   --img_celeba

                            {all of the CelebA pics}

         --VOCdevkit

                   --VOC2007

----------------------------------------------CelebA directory structure----------------------------------------------

Note that all of the directories and files above must already exist.

After running celebA2YOLO.py, the directory structure looks like this:

----------------------------------------------CelebA directory structure----------------------------------------------

--CelebA

         --Anno

                   list_bbox_celeba.txt

                   list_landmarks_celeba.txt

         --Img

                   celebA2YOLO.py

                   --img_celeba

                            {all of the CelebA pics}

         --VOCdevkit

                   --VOC2007

                            --Annotations

                                     {all of the XMLs}

                            --ImageSets

                                     --Main

                                               train.txt

                                               val.txt

                                               trainval.txt

                                               test.txt

                            train.txt    // these four files share their names with the four under ImageSets/Main/; from here on, train.txt, val.txt, trainval.txt, and test.txt always refer to these four files, not the ones in ImageSets/Main/

                            val.txt

                            trainval.txt

                            test.txt

----------------------------------------------CelebA directory structure----------------------------------------------

Next, I create a JPEGImages directory under CelebA/VOCdevkit/VOC2007/, move all the images from /CelebA/Img/img_celeba/ into /CelebA/VOCdevkit/VOC2007/JPEGImages/, and copy the voc_label.py provided by the YOLO author into CelebA/.
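The move itself can be scripted. This is a minimal sketch (the function is my own, and the commented-out call assumes the directory layout above, run from inside CelebA/):

```python
import os
import shutil

def move_images(src, dst):
    """Move every .jpg from src into dst, creating dst if needed; return the count."""
    os.makedirs(dst, exist_ok=True)
    moved = 0
    for name in os.listdir(src):
        if name.endswith(".jpg"):
            shutil.move(os.path.join(src, name), os.path.join(dst, name))
            moved += 1
    return moved

# With this article's layout, run from CelebA/:
# move_images("Img/img_celeba", "VOCdevkit/VOC2007/JPEGImages")
```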

Run voc_label.py. The resulting directory structure is:

----------------------------------------------CelebA directory structure----------------------------------------------

--CelebA

         voc_label.py

         2007_train.txt    // we do not use these two files; we use the train.txt, test.txt, and val.txt that celebA2YOLO.py generated under VOC2007

         train.all.txt

         --Anno

                   list_bbox_celeba.txt

                   list_landmarks_celeba.txt

         --Img

                   celebA2YOLO.py

                   --img_celeba

         --VOCdevkit

                   --VOC2007

                            --JPEGImages

                                     {all of the CelebA pics}

                            --Annotations

                                     {all of the XMLs}

                            --ImageSets

                                     --Main

                                               train.txt

                                               val.txt

                                               trainval.txt

                                               test.txt

                            --labels

                                     {all of the label .txt files for yolov2}

                            train.txt

                            val.txt

                            trainval.txt

                            test.txt

----------------------------------------------CelebA directory structure----------------------------------------------

With that, we have the files required by items 1 and 2, and the data preparation and layout are complete.

3. Modifying the YOLOv2 Config Files

Next, let's take care of item 3.

First, the voc.data file, located in darknet/cfg/. The original voc.data reads:

-----------------------------------------------voc.data contents---------------------------------------------------

classes = 20

train = /home/pjreddie/data/voc/train.txt

valid = /home/pjreddie/data/voc/2007_test.txt

names = data/voc.names

backup = backup

-----------------------------------------------voc.data contents---------------------------------------------------

Here, train and valid point to the lists of training and validation images, i.e., to train.txt and val.txt respectively. names points to the class-name list, voc.names. backup is the directory where weight files are checkpointed during training.

I modify voc.data as follows, and also create a backup directory under zhangzexuan/:

-----------------------------------------------voc.data contents---------------------------------------------------

classes = 1

train = /home/scw4750/zhangzexuan/CelebA/train.txt

valid = /home/scw4750/zhangzexuan/CelebA/val.txt    // I use the train.txt and val.txt generated by celebA2YOLO.py for training and validation

names = data/voc.names

backup = /home/scw4750/zhangzexuan/backup

-----------------------------------------------voc.data contents---------------------------------------------------

Next, modify the network configuration file tiny-yolo.cfg, located in darknet/cfg/, as follows:

1. Lines 3 and 4: remove the leading "#" and space before "batch" and "subdivisions".

2. Lines 6 and 7: prepend a "#".

3. Line 4: change it to "subdivisions=8".

4. Line 125: change it to "classes=1".

5. Line 119: change it to "filters=30". // This value is not arbitrary: YOLOv2 predicts 5 boxes per location, and each box carries tx, ty, tw, th, one objectness confidence, and the per-class probabilities (one class here), so filters = 5 * (4 + 1 + 1) = 30.
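The same arithmetic, written out as a quick check (the helper name is mine; the constants come from the YOLOv2 region layer: 5 anchor boxes, 4 box coordinates, and 1 objectness score per box):

```python
def yolov2_region_filters(num_classes, num_anchors=5):
    # Each anchor predicts tx, ty, tw, th (4), one objectness confidence (1),
    # and one probability per class.
    return num_anchors * (4 + 1 + num_classes)

print(yolov2_region_filters(1))   # face only: 5 * (4 + 1 + 1) = 30
print(yolov2_region_filters(20))  # original VOC config: 5 * (4 + 1 + 20) = 125
```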

Finally, edit the class-name list file darknet/data/voc.names. This file simply stores the class names in order; clear it and write "face" on the first line (since we used 0 to stand for face in the label files xxxxxx.txt).

4. Training

Before reading this article, you should already have gone through the tutorial on the YOLO author's site and cloned darknet onto your system.

With all the previous steps done, training can begin. Note that darknet's training command has shorthand forms that omit certain arguments; since I did not modify the darknet source, I use the full command:

Train: ./darknet detector train cfg/voc.data cfg/tiny-yolo.cfg -gpus 0,1,2,3

Test: ./darknet detector test cfg/voc.data cfg/tiny-yolo.cfg [weights file] [image]

That's it. Enjoy the training!
