Training Faster-RCNN on Your Own Data

Source: Internet | Published by: 淘宝菲戈体育 | Editor: 程序博客网 | Time: 2024/04/30 03:27

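1. Preparing your own data

1.1 Folder layout

Keep the Annotations, ImageSets and JPEGImages folders under py-faster-rcnn/data/VOCdevkit2007/VOC2007/, but delete their contents, and delete the SegmentationClass and SegmentationObject folders. Annotations holds the xml files converted from the txt labels. ImageSets only needs its Main folder, which holds train.txt, trainval.txt, test.txt and val.txt; nothing else in it is used. JPEGImages holds the images used for training.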
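The folder layout described above can be sketched in Python. This is only an illustration: the helper name make_voc_skeleton is hypothetical, and the demo runs in a temporary directory rather than in a real py-faster-rcnn checkout.

```python
import os
import tempfile

# Sub-folders that the text says are kept under VOC2007.
VOC_SUBDIRS = ["Annotations", os.path.join("ImageSets", "Main"), "JPEGImages"]


def make_voc_skeleton(voc_root):
    """Create the empty folder layout under voc_root and return the paths."""
    created = []
    for sub in VOC_SUBDIRS:
        path = os.path.join(voc_root, sub)
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    # Demonstrated in a temporary directory so no real data is touched.
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "VOCdevkit2007", "VOC2007")
        for p in make_voc_skeleton(root):
            print(p)
```

To use it on a real checkout, pass the path of data/VOCdevkit2007/VOC2007 instead of the temporary directory.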

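1.2 Creating the xml data

First annotate the regions of interest in each image yourself; software such as Photoshop can be used to read off the coordinates. Then create a .txt file for each image with contents like this:

    3-1.jpg s 176 293 215 329

The first field is the image name, the second is your own class name, and the remaining four fields are the region coordinates XMin, YMin, XMax, YMax.

The MATLAB code below then generates the matching xml files. In the directory containing the MATLAB code, create a JPEGImages folder for the images, a labels folder for the txt files, and an Annotations folder for the converted xml files. The code is: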

    % writeanno.m
    path_image = 'JPEGImages/';
    path_label = 'labels/';  % folder holding the txt files
    files_all = dir(path_image);
    for i = 3:length(files_all)  % start at 3 to skip the '.' and '..' entries
        msg = textread(strcat(path_label, files_all(i).name(1:end-4), '.txt'), '%s');
        clear rec;
        path = ['./Annotations/' files_all(i).name(1:end-4) '.xml'];
        disp(path)
        fid = fopen(path, 'w');
        rec.annotation.folder = 'VOC2007';  % dataset name
        rec.annotation.filename = files_all(i).name(1:end-4);  % image name
        rec.annotation.source.database = 'The VOC2007 Database';  % arbitrary
        rec.annotation.source.annotation = 'PASCAL VOC2007';  % arbitrary
        rec.annotation.source.image = 'flickr';  % arbitrary
        rec.annotation.source.flickrid = '0';  % arbitrary
        rec.annotation.owner.flickrid = 'I do not know';  % arbitrary
        rec.annotation.owner.name = 'I do not know';  % arbitrary
        img = imread(['./JPEGImages/' files_all(i).name]);
        rec.annotation.size.width = int2str(size(img, 2));
        rec.annotation.size.height = int2str(size(img, 1));
        rec.annotation.size.depth = int2str(size(img, 3));
        rec.annotation.segmented = '0';  % not used for segmentation
        rec.annotation.object.name = 'spine';  % class name
        rec.annotation.object.pose = 'Left';  % pose; not relevant here
        rec.annotation.object.truncated = '1';  % truncation flag; VOC uses '0' for "not truncated"
        rec.annotation.object.difficult = '0';  % not a difficult object
        rec.annotation.object.bndbox.xmin = msg{3};  % x1
        rec.annotation.object.bndbox.ymin = msg{4};  % y1
        rec.annotation.object.bndbox.xmax = msg{5};  % x2
        rec.annotation.object.bndbox.ymax = msg{6};  % y2
        writexml(fid, rec, 0);
        fclose(fid);
    end

    % writexml.m
    % Recursively writes the struct rec as tab-indented XML to file handle fid.
    function writexml(fid, rec, depth)
    fn = fieldnames(rec);
    for i = 1:length(fn)
        f = rec.(fn{i});
        if ~isempty(f)
            if isstruct(f)
                % Nested struct: open the tag, recurse one level deeper, close the tag.
                for j = 1:length(f)
                    fprintf(fid, '%s', repmat(char(9), 1, depth));
                    fprintf(fid, '<%s>\n', fn{i});
                    writexml(fid, rec.(fn{i})(j), depth+1);
                    fprintf(fid, '%s', repmat(char(9), 1, depth));
                    fprintf(fid, '</%s>\n', fn{i});
                end
            else
                % Leaf value: write <tag>value</tag> on one line.
                if ~iscell(f)
                    f = {f};
                end
                for j = 1:length(f)
                    fprintf(fid, '%s', repmat(char(9), 1, depth));
                    fprintf(fid, '<%s>', fn{i});
                    if ischar(f{j})
                        fprintf(fid, '%s', f{j});
                    elseif isnumeric(f{j}) && numel(f{j}) == 1
                        fprintf(fid, '%s', num2str(f{j}));
                    else
                        error('unsupported type');
                    end
                    fprintf(fid, '</%s>\n', fn{i});
                end
            end
        end
    end
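For readers without MATLAB, the same txt-to-xml conversion can be sketched in Python with the standard library. This is a minimal illustration rather than a port of the code above: the function name label_line_to_xml is hypothetical, the image size is passed in by the caller instead of being read from the image, the class name is taken from the txt line rather than hard-coded, and only a subset of the VOC fields is written.

```python
import xml.etree.ElementTree as ET


def label_line_to_xml(line, width, height, depth=3):
    """Convert 'name class xmin ymin xmax ymax' into a VOC-style XML string."""
    name, cls, xmin, ymin, xmax, ymax = line.split()

    ann = ET.Element("annotation")
    ET.SubElement(ann, "folder").text = "VOC2007"
    ET.SubElement(ann, "filename").text = name

    # Image size is supplied by the caller (an assumption of this sketch).
    size = ET.SubElement(ann, "size")
    for tag, value in (("width", width), ("height", height), ("depth", depth)):
        ET.SubElement(size, tag).text = str(value)
    ET.SubElement(ann, "segmented").text = "0"

    # One object per label line, with its bounding box.
    obj = ET.SubElement(ann, "object")
    ET.SubElement(obj, "name").text = cls
    ET.SubElement(obj, "difficult").text = "0"
    box = ET.SubElement(obj, "bndbox")
    for tag, value in (("xmin", xmin), ("ymin", ymin),
                       ("xmax", xmax), ("ymax", ymax)):
        ET.SubElement(box, tag).text = value

    return ET.tostring(ann, encoding="unicode")


if __name__ == "__main__":
    # 640x480 is a made-up image size used only for the demo.
    print(label_line_to_xml("3-1.jpg s 176 293 215 329", width=640, height=480))
```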

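1.3 ImageSets

Only the Main folder inside ImageSets is needed, and within Main four files are used:
- train.txt lists the file names of the images used for training
- trainval.txt lists the file names of the images used for training and validation
- val.txt lists the file names of the images used for validation
- test.txt lists the file names of the images used for testing

The split into training, validation and test sets should be random. The following code picks the sample sets at random and writes the txt files: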

    % writetxt.m
    file = dir('Annotations');
    len = length(file) - 2;  % skip the '.' and '..' entries
    % trainval takes 9/10 of all data; adjust as needed
    num_trainval = sort(randsample(len, floor(9*len/10)));
    % train takes 5/6 of trainval; adjust as needed
    num_train = sort(num_trainval(randsample(length(num_trainval), floor(5*length(num_trainval)/6))));
    num_val = setdiff(num_trainval, num_train);  % the rest of trainval becomes val
    num_test = setdiff(1:len, num_trainval);  % the rest of the data becomes test
    path = 'ImageSets/Main/';
    % Note: 'a+' appends, so delete old txt files before rerunning.
    fid = fopen(strcat(path, 'trainval.txt'), 'a+');
    for i = 1:length(num_trainval)
        s = sprintf('%s', file(num_trainval(i)+2).name);
        fprintf(fid, '%s\n', s(1:length(s)-4));  % file name without extension
    end
    fclose(fid);
    fid = fopen(strcat(path, 'train.txt'), 'a+');
    for i = 1:length(num_train)
        s = sprintf('%s', file(num_train(i)+2).name);
        fprintf(fid, '%s\n', s(1:length(s)-4));
    end
    fclose(fid);
    fid = fopen(strcat(path, 'val.txt'), 'a+');
    for i = 1:length(num_val)
        s = sprintf('%s', file(num_val(i)+2).name);
        fprintf(fid, '%s\n', s(1:length(s)-4));
    end
    fclose(fid);
    fid = fopen(strcat(path, 'test.txt'), 'a+');
    for i = 1:length(num_test)
        s = sprintf('%s', file(num_test(i)+2).name);
        fprintf(fid, '%s\n', s(1:length(s)-4));
    end
    fclose(fid);

Compared with the original code, randperm has been replaced with randsample.
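The same 9/10 and 5/6 split can be sketched in Python. The function names split_names and write_splits are hypothetical, and unlike the MATLAB version this sketch opens the files in write mode, so rerunning it overwrites the lists instead of appending to them.

```python
import os
import random


def split_names(names, seed=None):
    """Randomly split names: trainval = 9/10 of all, train = 5/6 of trainval."""
    rng = random.Random(seed)
    names = sorted(names)
    trainval = sorted(rng.sample(names, len(names) * 9 // 10))
    train = sorted(rng.sample(trainval, len(trainval) * 5 // 6))
    val = sorted(set(trainval) - set(train))   # rest of trainval
    test = sorted(set(names) - set(trainval))  # rest of all data
    return {"trainval": trainval, "train": train, "val": val, "test": test}


def write_splits(splits, main_dir):
    """Write one <split>.txt per split, one image name (no extension) per line."""
    for split_name, items in splits.items():
        with open(os.path.join(main_dir, split_name + ".txt"), "w") as f:
            for item in items:
                f.write(item + "\n")
```

A typical call would build names from the xml files in Annotations, for example [os.path.splitext(n)[0] for n in os.listdir("Annotations")], and pass ImageSets/Main as main_dir.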

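2. Modifying the files used for training

Modify the configuration files that belong to the model you want to use. For the ZF model, for example, five files in models/pascal_voc/ZF/faster_rcnn_alt_opt need changing: stage1_fast_rcnn_train.pt, stage2_fast_rcnn_train.pt, stage1_rpn_train.pt, stage2_rpn_train.pt and faster_rcnn_test.pt.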

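2.1 stage1_fast_rcnn_train.pt and stage2_fast_rcnn_train.pt

Set num_classes: 2 (1 object class + 1 background class), num_output: 2 in the cls_score layer, and num_output: 8 in the bbox_pred layer. The last value is (number of object classes + 1) × 4, that is, four box coordinates per class.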
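The arithmetic behind these values can be written out as a small helper. This is a sketch; the function name layer_outputs is hypothetical and is not part of py-faster-rcnn.

```python
def layer_outputs(num_object_classes):
    """Return the values to put in the prototxt files for a given class count."""
    num_classes = num_object_classes + 1  # object classes + background
    return {
        "num_classes": num_classes,    # param of the Python data layers
        "cls_score": num_classes,      # one score per class
        "bbox_pred": 4 * num_classes,  # four box coordinates per class
    }


if __name__ == "__main__":
    print(layer_outputs(1))   # the single-class case used in this article
    print(layer_outputs(20))  # the 20 PASCAL VOC classes
```

For one object class this gives 2, 2 and 8, matching the values above; for the 20 PASCAL VOC classes it gives 21, 21 and 84.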

2.2 stage1_rpn_train.pt and stage2_rpn_train.pt

Set num_classes: 2 (1 object class + 1 background class).

2.3 faster_rcnn_test.pt

Set num_output: 2 in the cls_score layer and num_output: 8 in the bbox_pred layer. The latter is (number of object classes + 1) × 4, that is, four box coordinates per class.

2.4 Modifying lib/datasets/pascal_voc.py

List your own class names:

    self._classes = ('__background__',  # always index 0
                     'spine')

2.5 Modifying lib/datasets/imdb.py

Below the line boxes[:, 2] = widths[i] - oldx1 - 1, add the following code:

    for b in range(len(boxes)):
        if boxes[b][2] < boxes[b][0]:
            boxes[b][0] = 0
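To show what the added lines do, here is a plain-Python sketch of the horizontal flip followed by the same clamp. It works on lists rather than NumPy arrays, the function name flip_boxes is hypothetical, and the second box in the demo is deliberately malformed (x1 greater than x2) just to trigger the clamp branch.

```python
def flip_boxes(boxes, width):
    """Mirror [x1, y1, x2, y2] boxes horizontally inside an image of given width."""
    flipped = []
    for x1, y1, x2, y2 in boxes:
        new_x1 = width - x2 - 1
        new_x2 = width - x1 - 1
        # Same rule as the patch: if the flipped box is inverted, reset x1 to 0.
        if new_x2 < new_x1:
            new_x1 = 0
        flipped.append([new_x1, y1, new_x2, y2])
    return flipped


if __name__ == "__main__":
    print(flip_boxes([[10, 20, 110, 120]], width=500))  # well-formed box
    print(flip_boxes([[300, 10, 100, 50]], width=500))  # malformed box, clamped
```

Note that the clamp only keeps the x2 >= x1 check from failing; it does not repair the underlying annotation, so it is still worth checking the label files for boxes that trigger it.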

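After modifying pascal_voc.py and imdb.py, go to lib/datasets, delete the old pascal_voc.pyc and imdb.pyc, and regenerate them. These are the compiled versions of the Python files, and the system loads them directly.
In a terminal, change to the lib/datasets directory and enter:

    python
    (the Python version banner should appear here)
    >>> import py_compile
    >>> py_compile.compile(r'imdb.py')
    >>> py_compile.compile(r'pascal_voc.py')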

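3. Training the model

One more thing to check before training: if you have previously trained on the official VOC2007 dataset or on another dataset, cached files will have been generated and can cause problems. It is advisable to delete the following before training on new data:

    py-faster-rcnn/output
    py-faster-rcnn/data/cache
    py-faster-rcnn/data/VOCdevkit2007/annotations_cache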
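The clean-up can also be scripted. The sketch below is an assumption-laden helper, not part of py-faster-rcnn: the name clear_caches is hypothetical, and it defaults to a dry run that only reports what would be deleted.

```python
import os
import shutil

# The three locations listed above, relative to the py-faster-rcnn root.
CACHE_DIRS = [
    "output",
    os.path.join("data", "cache"),
    os.path.join("data", "VOCdevkit2007", "annotations_cache"),
]


def clear_caches(repo_root, dry_run=True):
    """Return the cache folders found; delete them unless dry_run is True."""
    found = []
    for rel in CACHE_DIRS:
        path = os.path.join(repo_root, rel)
        if os.path.isdir(path):
            if not dry_run:
                shutil.rmtree(path)
            found.append(path)
    return found
```

Calling clear_caches("py-faster-rcnn") lists the folders that exist; add dry_run=False once the list looks right. Keep in mind that output also holds previously trained models.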

3.1 Setting the training parameters

Set the training parameters in the py-faster-rcnn/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage*_fast_rcnn_solver*.pt files, for example:

    base_lr: 0.001
    lr_policy: "step"
    gamma: 0.1
    stepsize: 3000
    display: 20
    average_loss: 100
    momentum: 0.9
    weight_decay: 0.0005

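The iteration counts are changed on line 80 of py-faster-rcnn/tools/train_faster_rcnn_alt_opt.py:

    max_iters = [80000, 40000, 80000, 40000]

These are the iteration counts for RPN stage 1, Fast R-CNN stage 1, RPN stage 2 and Fast R-CNN stage 2, in that order. Adjust them as needed, but make sure none of them is smaller than the stepsize set in the solver files above. Because this machine has only 4 GB of GPU memory, I changed the iteration count to 1000.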
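The rule that no stage should run for fewer iterations than stepsize can be checked with a few lines of Python. This is a sketch under simple assumptions: the solver file is treated as plain text with one "key: value" pair per line, and the helper names read_stepsize and stages_below_stepsize are hypothetical.

```python
import re


def read_stepsize(solver_text):
    """Return the integer stepsize from solver text, or None if it is absent."""
    match = re.search(r"^\s*stepsize\s*:\s*(\d+)", solver_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def stages_below_stepsize(max_iters, stepsize):
    """Return the indices of stages whose iteration count is below stepsize."""
    return [i for i, n in enumerate(max_iters) if n < stepsize]


if __name__ == "__main__":
    solver = 'base_lr: 0.001\nlr_policy: "step"\ngamma: 0.1\nstepsize: 3000\n'
    step = read_stepsize(solver)
    print(step, stages_below_stepsize([80000, 40000, 80000, 40000], step))
```

An empty list means every stage reaches at least one learning-rate step; if a stage index is returned, either raise that stage's iteration count or lower stepsize in the matching solver file.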

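3.2 Running the training

    cd py-faster-rcnn
    ./experiments/scripts/faster_rcnn_alt_opt.sh 0 ZF pascal_voc

Note that the model name must be in upper case. The arguments select the first GPU (0), the ZF model, and pascal_voc (VOC2007) as the training data.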

4. Troubleshooting

4.1 TypeError: 'numpy.float64' object cannot be interpreted as an index

Fix:

    sudo pip install -U numpy==1.11.0

4.2 RuntimeError: module compiled against API version 0xb but this version of numpy is 0xa

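Changing the NumPy version earlier breaks many libraries that were compiled against NumPy. On this machine the Python libraries are installed in /usr/local/lib/python2.7/dist-packages/. If sudo pip uninstall cannot remove a module, move that module's files and the matching egg-info out of this directory, then run pip show <module name> to check whether it is still present; once it is gone, reinstall it. The module that needed reinstalling here was scikit-image. After that, ImportError: numpy.core.multiarray failed to import appeared; reinstall whichever module raises it. In this case those were matplotlib and scipy.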

If you use pandas, it needs reinstalling too. Unlike the three modules above, a plain uninstall/install does not work for pandas; the error still appears afterwards:

    RuntimeError: module compiled against API version 0xb but this version of numpy is 0xa
    Traceback (most recent call last):
      File "train_model.py", line 6, in <module>
        import pandas as pd
      File "/usr/local/lib/python2.7/dist-packages/pandas/__init__.py", line 35, in <module>
        "the C extensions first.".format(module))
    ImportError: C extension: iNaT not built. If you want to import pandas from the source directory, you may need to run 'python setup.py build_ext --inplace --force' to build the C extensions first.

In this case you need to download the source and build it yourself. After uninstalling pandas, download the source from GitHub and unpack it, enter the directory, and run:

    python setup.py build_ext --inplace --force
    sudo pip install .  # don't forget the dot

Then rerun the code that needs pandas.
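To find out which packages still fail to import after a NumPy change, a short check along these lines can help. It is a sketch: the function name find_broken_imports is hypothetical, and the module list in the demo is only an example (scikit-image is imported as skimage).

```python
import importlib


def find_broken_imports(module_names):
    """Try to import each module and return {name: error message} for failures."""
    broken = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # ImportError, or RuntimeError from an ABI mismatch
            broken[name] = "%s: %s" % (type(exc).__name__, exc)
    return broken


if __name__ == "__main__":
    for name, err in find_broken_imports(
            ["numpy", "scipy", "matplotlib", "skimage", "pandas"]).items():
        print(name, "->", err)
```

Any module printed by the demo is a candidate for reinstalling or rebuilding as described above.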

Also note that Makefile.config, the build configuration of the Caffe bundled with Faster-RCNN, contains a NumPy include path. After changing NumPy, make sure this path still matches the installation:

    # NOTE: this is required only if you will compile the python interface.
    # We need to be able to find Python.h and numpy/arrayobject.h.
    PYTHON_INCLUDE := /usr/include/python2.7 \
        /usr/local/lib/python2.7/dist-packages/numpy/core/include
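One way to check the path is to extract the PYTHON_INCLUDE entries from Makefile.config and compare them with what NumPy itself reports. The parser below is a simplified sketch (the name python_include_dirs is hypothetical): it only handles a backslash directly followed by a newline as a line continuation and returns the first uncommented PYTHON_INCLUDE assignment.

```python
def python_include_dirs(makefile_text):
    """Return the directories assigned to PYTHON_INCLUDE in Makefile.config text."""
    # Join lines that end with a backslash continuation.
    joined = makefile_text.replace("\\\n", " ")
    for line in joined.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip commented-out alternatives
        if stripped.startswith("PYTHON_INCLUDE") and ":=" in stripped:
            return stripped.split(":=", 1)[1].split()
    return []
```

With NumPy installed, numpy.get_include() returns the directory that should appear in this list; if it does not, update Makefile.config and rebuild.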

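Because further errors involving files under lib followed, it is best to download the source code again, copy over the files that were modified for training, and then recompile.

For other errors, see: py-faster-rcnn 训练常见错误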

5. Running predictions on your own data with demo.py

After training, copy the final model ZF_faster_rcnn_final.caffemodel from output/faster_rcnn_alt_opt/voc_2007_trainval to data/faster_rcnn_models (delete any similar model generated earlier), then modify tools/demo.py as follows:

    CLASSES = ('__background__',
               'spine')  # your own class names

    NETS = {'vgg16': ('VGG16',
                      'VGG16_faster_rcnn_final.caffemodel'),
            'zf': ('ZF',
                   'ZF_faster_rcnn_final.caffemodel')}  # network name and its model file

    im_names = ['3-38.jpg', '3-10.jpg', '3-6.jpg',
                '3-36.jpg', '3-1.jpg']  # list of test images

Put the test images in py-faster-rcnn/data/demo, with file names matching the entries in im_names.
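Before running the demo, it can be worth confirming that every entry in im_names really exists in the demo folder. A minimal sketch, with the hypothetical helper name missing_demo_images:

```python
import os


def missing_demo_images(demo_dir, im_names):
    """Return the names from im_names that are not files inside demo_dir."""
    return [name for name in im_names
            if not os.path.isfile(os.path.join(demo_dir, name))]


if __name__ == "__main__":
    names = ['3-38.jpg', '3-10.jpg', '3-6.jpg', '3-36.jpg', '3-1.jpg']
    # Path assumes the script is run from the py-faster-rcnn root.
    print(missing_demo_images(os.path.join("data", "demo"), names))
```

An empty list means all test images are in place.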

Run demo.py to test:

    ./tools/demo.py --net zf

When the program finishes, it displays the prediction results.

6. References

This article draws on the following articles:

Py-faster-rcnn实现自己的数据train和demo

py-faster-rcnn + ZF 实现自己的数据训练与检测(二)

RCNN系列实验的PASCAL VOC数据集格式设置

使用Faster-Rcnn进行目标检测(实践篇)
