caffe + python + MNIST: from training on images to testing a single image

Source: Internet. Editor: 程序博客网. Time: 2024/05/17 21:46

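Environment: Caffe already installed, GPU training mode, Ubuntu 14.

1. Start from a dataset in image format. I downloaded MNIST as image files from: http://download.csdn.net/download/magicarcher/9529956

After extracting, put it in caffe-master/data/Mnist_image. MNIST is a database of handwritten digits with 60,000 training samples and 10,000 test samples.

2. Data preparation: convert to lmdb format

First, create a script create_filelist.sh in caffe-master/data/Mnist_image to generate the label files for the training and test data (plain txt files that state which image belongs to which class):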

#!/usr/bin/env sh
# The ../../ prefix lets the script run directly from caffe-master/data/Mnist_image
DATA_TRAIN=../../data/Mnist_image/train
DATA_TEST=../../data/Mnist_image/test
MY=../../data/Mnist_image

echo "Create train.txt..."
# Remove any existing train.txt so that re-running does not append duplicates
rm -rf $MY/train.txt
for i in 0 1 2 3 4 5 6 7 8 9
do
    # Split each path on "/" and keep fields 6-7 as the image name in train.txt,
    # then append the label (one of 0-9)
    find $DATA_TRAIN/$i/ -name "*.png" | cut -d '/' -f6-7 | sed "s/$/ $i/" >> $MY/train.txt
done

echo "Create test.txt..."
rm -rf $MY/test.txt
for i in 0 1 2 3 4 5 6 7 8 9
do
    find $DATA_TEST/$i/ -name "*.png" | cut -d '/' -f6-7 | sed "s/$/ $i/" >> $MY/test.txt
done
echo "All done"

Explanation of -f6-7:

For example, find returns a path such as ../../data/Mnist_image/train/0/0_1.png. Fields 6-7 are the 6th and 7th segments when the path is split on /, that is, 0/0_1.png.
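To make the field counting concrete, here is a small Python sketch that mimics cut -d '/' -f6-7 on the example path. The helper name cut_fields is made up for illustration and is not part of the original script.

```python
def cut_fields(path, first, last, delim="/"):
    """Mimic `cut -d delim -f first-last` (fields are 1-indexed, inclusive)."""
    parts = path.split(delim)
    return delim.join(parts[first - 1:last])


example = "../../data/Mnist_image/train/0/0_1.png"
name = cut_fields(example, 6, 7)
print(name)
```

Note that the field numbers depend on how many path segments precede the class folder, so a different DATA_TRAIN prefix would need different -f values.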

Run the script from caffe-master/data/Mnist_image:

sh create_filelist.sh

This produces train.txt and test.txt.
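Each line of train.txt and test.txt should have the form "<digit>/<file>.png <label>". As a hedged illustration, the same kind of list could be produced in Python. The function build_filelist and the throwaway demo directory below are assumptions made for this sketch, not part of the original workflow.

```python
import os
import tempfile


def build_filelist(split_dir, out_path):
    """Write '<digit>/<file>.png <digit>' lines for every PNG under split_dir/0..9."""
    count = 0
    with open(out_path, "w") as out:
        for label in range(10):
            class_dir = os.path.join(split_dir, str(label))
            if not os.path.isdir(class_dir):
                continue
            for fname in sorted(os.listdir(class_dir)):
                if fname.endswith(".png"):
                    out.write("%d/%s %d\n" % (label, fname, label))
                    count += 1
    return count


# Tiny demo on a temporary directory holding two empty stand-in images
demo_root = tempfile.mkdtemp()
for label, fname in [(0, "0_1.png"), (3, "3_7.png")]:
    os.makedirs(os.path.join(demo_root, str(label)))
    open(os.path.join(demo_root, str(label), fname), "w").close()

list_path = os.path.join(demo_root, "train.txt")
n_written = build_filelist(demo_root, list_path)
with open(list_path) as f:
    demo_lines = f.read().splitlines()
print(demo_lines)
```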

Then create a folder Mnist_image under caffe-master/examples, and inside it create the script create_lmdb.sh:

#!/usr/bin/env sh
# Create the lmdb inputs (adapted from Caffe's create_imagenet.sh)
# N.B. set the paths to the train + test data dirs
set -e

# Where the generated lmdb and the trained model are stored
EXAMPLE=../../examples/Mnist_image
# Where the data is read from. Note that the images in this MNIST set are all
# single-channel (checking the image shape in Python gives (20, 20))
DATA=../../data/Mnist_image
# Path to the Caffe tools used for the format conversion
TOOLS=../../build/tools

# Root directories of the images
TRAIN_DATA_ROOT=$DATA/train/
TEST_DATA_ROOT=$DATA/test/

rm -rf $EXAMPLE/number_train_lmdb
rm -rf $EXAMPLE/number_test_lmdb

# Resizing is not really needed here: the images in the dataset are already 20x20.
# Set RESIZE=true to resize the images. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=20
  RESIZE_WIDTH=20
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_lmdb.sh to the path" \
       "where the training data is stored."
  exit 1
fi

if [ ! -d "$TEST_DATA_ROOT" ]; then
  echo "Error: TEST_DATA_ROOT is not a path to a directory: $TEST_DATA_ROOT"
  echo "Set the TEST_DATA_ROOT variable in create_lmdb.sh to the path" \
       "where the test data is stored."
  exit 1
fi

echo "Creating train lmdb..."

# Usage of convert_imageset: options, image root directory, list file, output lmdb.
# The training data is grayscale, so --gray=true is required; the default is
# false, which would produce a 3-channel lmdb.
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    --gray=true \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/number_train_lmdb

echo "Creating test lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    --gray=true \
    $TEST_DATA_ROOT \
    $DATA/test.txt \
    $EXAMPLE/number_test_lmdb

echo "Done."

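This generates the two lmdb folders, number_train_lmdb and number_test_lmdb.

3. Compute and save the mean

Subtracting the mean from the images before training usually improves training speed and accuracy, so this step is common.
Caffe provides a tool for computing the mean, compute_image_mean.cpp, which we can use directly:

sudo build/tools/compute_image_mean examples/Mnist_image/number_train_lmdb examples/Mnist_image/mean.binaryproto

This generates the mean file mean.binaryproto, although in my case it seemed to be written under the Caffe root directory by default.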
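Conceptually, the mean file holds the per-pixel average over all training images. The NumPy sketch below illustrates the idea on random stand-in data. It is not how compute_image_mean is implemented, and the array names are made up for the example.

```python
import numpy as np

# Stand-in for the training set: 100 grayscale images, 1 channel, 20x20 pixels
rng = np.random.RandomState(0)
images = rng.randint(0, 256, size=(100, 1, 20, 20)).astype(np.float32)

# Per-pixel mean over the image axis -> shape (1, 20, 20)
mean_image = images.mean(axis=0)

# Subtracting it centres every pixel position around zero
centered = images - mean_image
print(mean_image.shape, float(abs(centered.mean(axis=0)).max()))
```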

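4. Create the model and modify the configuration files

The model is the one shipped with the examples, located in examples/mnist. Copy the two configuration files we need, lenet_solver.prototxt and lenet_train_test.prototxt, into examples/Mnist_image/ and rename them solver.prototxt and train_val.prototxt. In solver.prototxt only two paths need to change (net and snapshot_prefix); the other parameters stay as they are. The net entry must name the copied network file. That single file defines both the TRAIN and the TEST phase, so there is no separate training net to specify: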

# The train/test net protocol buffer definition
# (location of the network definition file)
net: "examples/Mnist_image/train_val.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/Mnist_image/caffenet_train"
# solver mode: CPU or GPU
solver_mode: GPU
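Two of the solver settings are easy to check by hand. With the "inv" policy Caffe computes the learning rate as base_lr * (1 + gamma * iter) ^ (-power), and test_iter times the TEST batch size gives the number of test images covered per evaluation. The Python sketch below simply evaluates those formulas with the values from the solver; the function name inv_lr is illustrative.

```python
def inv_lr(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under Caffe's "inv" policy."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


lr_start = inv_lr(0)
lr_end = inv_lr(10000)          # learning rate at max_iter
images_per_test = 100 * 100     # test_iter * TEST batch_size

print(lr_start, lr_end, images_per_test)
```

So the learning rate decays smoothly from 0.01 to a little under 0.006 over the 10,000 iterations.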

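In train_val.prototxt likewise only the paths need to change (the source of the two Data layers); none of the other parameters need changing.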

name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param { scale: 0.00390625 }
  data_param {
    source: "examples/Mnist_image/number_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TEST }
  transform_param { scale: 0.00390625 }
  data_param {
    source: "examples/Mnist_image/number_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 500
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 10
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include { phase: TEST }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
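Because the images here are 20x20 rather than the usual 28x28, it is worth checking that the feature maps stay large enough for the 5x5 kernels. The sketch below walks the spatial size through the layers using the standard output-size formulas (convolution rounds down, Caffe pooling rounds up). The helper names are made up for this example.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution layer (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Spatial output size of a pooling layer (Caffe rounds up)."""
    return -(-(size - kernel) // stride) + 1


shapes = {}
size = 20                                   # input images are 20x20
size = shapes["conv1"] = conv_out(size, 5)
size = shapes["pool1"] = pool_out(size, 2, 2)
size = shapes["conv2"] = conv_out(size, 5)
size = shapes["pool2"] = pool_out(size, 2, 2)

# ip1 sees 50 feature maps of pool2's spatial size
ip1_inputs = 50 * size * size
print(shapes, ip1_inputs)
```

The network still fits, but pool2 is only 2x2, so inputs much smaller than 20x20 would not work with this architecture.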

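5. Training

Similarly, copy train_lenet.sh from examples/mnist to examples/Mnist_image, rename it train.sh, and change the path:

#!/usr/bin/env sh
set -e

./build/tools/caffe train --solver=examples/Mnist_image/solver.prototxt $@

Then run examples/Mnist_image/train.sh from the caffe-master directory. Training starts and eventually produces caffenet_train_iter_10000.caffemodel. That completes the training process; its whole purpose is to obtain this caffemodel. Next, we try the caffemodel on an arbitrary single image to see whether the prediction is correct.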
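The name of the resulting model follows from the solver settings: Caffe appends _iter_<N>.caffemodel to snapshot_prefix at each snapshot. A small sketch listing the snapshot files one would expect from snapshot: 5000 and max_iter: 10000 (the function name is illustrative, and the matching .solverstate files are left out):

```python
def snapshot_names(prefix, snapshot_every, max_iter):
    """Model files expected at each snapshot interval up to max_iter."""
    return ["%s_iter_%d.caffemodel" % (prefix, it)
            for it in range(snapshot_every, max_iter + 1, snapshot_every)]


expected = snapshot_names("examples/Mnist_image/caffenet_train", 5000, 10000)
for path in expected:
    print(path)
```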

6. Generate deploy.prototxt with deploy.py

Create deploy.py in examples/Mnist_image:

# -*- coding: utf-8 -*-
import sys

caffe_root = '/home/cvlab01/2016liulu/caffe-master/'
sys.path.insert(0, caffe_root + 'python')
from caffe import layers as L, params as P, to_proto

root = '/home/cvlab01/2016liulu/caffe-master/'
# Where the generated file is saved
deploy = '/home/cvlab01/2016liulu/caffe-master/examples/Mnist_image/deploy.prototxt'


def create_deploy():
    # The first layer (the data layer) is left out here
    conv1 = L.Convolution(name='conv1', bottom='data', kernel_size=5, stride=1,
                          num_output=20, pad=0, weight_filler=dict(type='xavier'))
    pool1 = L.Pooling(conv1, name='pool1', pool=P.Pooling.MAX, kernel_size=2, stride=2)
    conv2 = L.Convolution(pool1, name='conv2', kernel_size=5, stride=1,
                          num_output=50, pad=0, weight_filler=dict(type='xavier'))
    pool2 = L.Pooling(conv2, name='pool2', top='pool2', pool=P.Pooling.MAX,
                      kernel_size=2, stride=2)
    fc3 = L.InnerProduct(pool2, name='ip1', num_output=500,
                         weight_filler=dict(type='xavier'))
    relu3 = L.ReLU(fc3, name='relu1', in_place=True)
    fc4 = L.InnerProduct(relu3, name='ip2', num_output=10,
                         weight_filler=dict(type='xavier'))
    # There is no accuracy layer at the end, but there is a Softmax layer
    prob = L.Softmax(fc4, name='prob')
    return to_proto(prob)


def write_deploy():
    with open(deploy, 'w') as f:
        f.write('name:"LeNet"\n')
        f.write('layer {\n')
        f.write('name:"data"\n')
        f.write('type:"Input"\n')
        f.write('top:"data"\n')
        f.write('input_param { shape : {')
        # Grayscale 20x20 input: batch 1, 1 channel, height 20, width 20
        f.write('dim:1 ')
        f.write('dim:1 ')
        f.write('dim:20 ')
        f.write('dim:20 ')
        f.write('} }\n')
        f.write('}\n\n')  # close the Input layer
        f.write(str(create_deploy()))


if __name__ == '__main__':
    write_deploy()

Running deploy.py generates deploy.prototxt. With explanatory comments added, it looks like this:

name: "LeNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  # Grayscale images: the channel dim is 1, do not get this wrong
  input_param { shape: { dim: 1 dim: 1 dim: 20 dim: 20 } }
}
# The weight learning rate, bias learning rate and bias initialization of the
# convolution and fully connected layers are dropped, because these values are
# already provided by the caffemodel file
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler { type: "xavier" }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler { type: "xavier" }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  inner_product_param {
    num_output: 500
    weight_filler { type: "xavier" }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 10
    weight_filler { type: "xavier" }
  }
}
# The accuracy layer of the original test phase has been removed
# The output layer type changes from SoftmaxWithLoss to Softmax:
# training outputs a loss, deployment outputs prob
layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip2"
  top: "prob"
}
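The final Softmax layer turns the ten raw ip2 scores into probabilities that sum to 1, which is what the classification script later ranks. A plain-Python sketch of that computation follows; the input scores are made-up numbers, not output from the trained model.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical ip2 outputs for one image (one score per digit 0-9)
ip2_scores = [0.5, -1.2, 0.1, 4.0, 0.0, 1.1, -0.3, 0.2, 0.9, -2.0]
prob = softmax(ip2_scores)
best = prob.index(max(prob))
print(best, round(prob[best], 4))
```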

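7. Prepare the mean file meanfile.npy and synset_words.txt

The test interface caffe.Classifier used in classify.py takes the mean of the training images as an input parameter, but the LeNet-5 training above does not actually use a mean file, so here we create an all-zero mean file to pass in. Write a file zeronp.py for this (the original listing was posted as an image and is not preserved here), then run:

python zeronp.py

This generates the mean file meanfile.npy.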
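Since the zeronp.py listing is not preserved, here is a minimal sketch of what such a script might look like. The (1, 20, 20) shape is an assumption, chosen to match the single-channel 20x20 input in deploy.prototxt and the .mean(1).mean(1) reduction that classifymnist.py applies to the loaded file; the original script may have differed.

```python
import numpy as np

# All-zero mean with shape (channels, height, width) = (1, 20, 20).
# Assumption: matches the 1-channel 20x20 network input.
zeros = np.zeros((1, 20, 20), dtype=np.float32)
np.save('meanfile.npy', zeros)

# classifymnist.py reduces the file to one value per channel like this
mean = np.load('meanfile.npy').mean(1).mean(1)
print(mean.shape, mean)
```

After running it, copy or write meanfile.npy to examples/Mnist_image, which is where the --mean_file default in classifymnist.py looks for it.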

Create synset_words.txt in examples/Mnist_image:

0 zero
1 one
2 two
3 three
4 four
5 five
6 six
7 seven
8 eight
9 nine
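Each line pairs a class id with a readable name, separated by a single space. As a rough illustration of how such a file maps network outputs to names, here is a plain-Python parser; parse_labels is a hypothetical helper, and the classification script itself uses pandas for this step.

```python
def parse_labels(lines):
    """Map class id -> name from 'id name' lines, e.g. '3 three'."""
    labels = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        class_id, name = line.split(" ", 1)
        labels[int(class_id)] = name
    return labels


sample_lines = ["0 zero", "1 one", "2 two", "3 three"]
label_map = parse_labels(sample_lines)
print(label_map[3])
```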

8. Modify classify.py and save it as classifymnist.py

The directory caffe-master/python contains classify.py. Make a copy, rename it classifymnist.py, and modify it as follows:

#!/usr/bin/env python
# coding: utf-8
"""
classify.py is an out-of-the-box image classifier callable from the command line.

By default it configures and runs the Caffe reference ImageNet model.
"""
import sys

caffe_root = '/home/cvlab01/2016liulu/caffe-master/'
sys.path.insert(0, caffe_root + 'python')

import numpy as np
import os
import argparse
import glob
import time
import pandas as pd  # data-analysis package, used for the label table

import caffe


def main(argv):
    pycaffe_dir = os.path.dirname(__file__)

    parser = argparse.ArgumentParser()
    # Required arguments: input and output files.
    parser.add_argument(
        "input_file",
        help="Input image, directory, or npy."
    )
    parser.add_argument(
        "output_file",
        help="Output npy filename."
    )
    # Optional arguments.
    parser.add_argument(
        "--model_def",
        # Location of deploy.prototxt
        default=os.path.join(pycaffe_dir,
                "../examples/Mnist_image/deploy.prototxt"),
        help="Model definition file."
    )
    parser.add_argument(
        "--pretrained_model",
        # Location of the caffemodel, i.e. the model we trained above
        default=os.path.join(pycaffe_dir,
                "../examples/Mnist_image/caffenet_train_iter_10000.caffemodel"),
        help="Trained model weights file."
    )
    # ------- added: start -------
    parser.add_argument(
        "--labels_file",
        # File that maps output indices to class names
        default=os.path.join(pycaffe_dir,
                "../examples/Mnist_image/synset_words.txt"),
        help="mnist result words file"
    )
    parser.add_argument(
        "--force_grayscale",
        # Force the input image to grayscale, because LeNet-5 was trained on
        # grayscale images
        action='store_true',
        help="Converts RGB images down to single-channel grayscale versions," +
             "useful for single-channel networks like MNIST."
    )
    parser.add_argument(
        "--print_results",
        # Print the results to the terminal
        action='store_true',
        help="Write output text to stdout rather than serializing to a file."
    )
    # ------- added: end -------
    parser.add_argument(
        "--gpu",
        action='store_true',
        help="Switch for gpu computation."
    )
    parser.add_argument(
        "--center_only",
        action='store_true',
        help="Switch for prediction from center crop alone instead of " +
             "averaging predictions across crops (default)."
    )
    parser.add_argument(
        "--images_dim",
        default='20,20',  # image height and width
        help="Canonical 'height,width' dimensions of input images."
    )
    parser.add_argument(
        "--mean_file",
        # Location of the mean file
        default=os.path.join(pycaffe_dir,
                             '../examples/Mnist_image/meanfile.npy'),
        help="Data set image mean of [Channels x Height x Width] dimensions " +
             "(numpy array). Set to '' for no mean subtraction."
    )
    parser.add_argument(
        "--input_scale",
        type=float,
        help="Multiply input features by this scale to finish preprocessing."
    )
    parser.add_argument(
        "--raw_scale",
        type=float,
        default=255.0,
        help="Multiply raw input by this scale before preprocessing."
    )
    parser.add_argument(
        "--channel_swap",
        default='2,1,0',
        help="Order to permute input channels. The default converts " +
             "RGB -> BGR since BGR is the Caffe default by way of OpenCV."
    )
    parser.add_argument(
        "--ext",
        default='jpg',
        help="Image file extension to take as input when a directory " +
             "is given as the input file."
    )
    args = parser.parse_args()

    image_dims = [int(s) for s in args.images_dim.split(',')]

    mean, channel_swap = None, None
    if args.mean_file:
        mean = np.load(args.mean_file).mean(1).mean(1)
    if args.channel_swap:
        channel_swap = [int(s) for s in args.channel_swap.split(',')]

    if args.gpu:
        caffe.set_mode_gpu()
        print("GPU mode")
    else:
        caffe.set_mode_cpu()
        print("CPU mode")

    # Make classifier. channel_swap is None because the input has one channel.
    classifier = caffe.Classifier(args.model_def, args.pretrained_model,
            image_dims=image_dims, mean=mean,
            input_scale=args.input_scale, raw_scale=args.raw_scale,
            channel_swap=None)

    # Load numpy array (.npy), directory glob (*.jpg), or image file.
    args.input_file = os.path.expanduser(args.input_file)
    if args.input_file.endswith('npy'):
        print("Loading file: %s" % args.input_file)
        inputs = np.load(args.input_file)
    elif os.path.isdir(args.input_file):
        print("Loading folder: %s" % args.input_file)
        inputs = [caffe.io.load_image(im_f)
                  for im_f in glob.glob(args.input_file + '/*.' + args.ext)]
    else:
        print("Loading file: %s" % args.input_file)
        # Force the image to grayscale when --force_grayscale is given
        inputs = [caffe.io.load_image(args.input_file, not args.force_grayscale)]

    print("Classifying %d inputs." % len(inputs))

    # Classify.
    start = time.time()
    scores = classifier.predict(inputs, not args.center_only).flatten()
    print("Done in %.2f s." % (time.time() - start))

    # ------- added: print the results to the terminal (start) -------
    if args.print_results:
        with open(args.labels_file) as f:
            labels_df = pd.DataFrame([
                {'synset_id': l.strip().split(' ')[0],
                 'name': ' '.join(l.strip().split(' ')[1:]).split(',')[0]}
                for l in f.readlines()])
        # sort_values replaces the older DataFrame.sort used originally
        labels = labels_df.sort_values('synset_id')['name'].values

        indices = (-scores).argsort()[:5]
        predictions = labels[indices]
        print(predictions)
        print(scores)

        meta = [(p, '%.5f' % scores[i]) for i, p in zip(indices, predictions)]
        print(meta)
    # ------- added: print the results to the terminal (end) -------

    # Save the raw scores; the label list only exists when --print_results is set
    print("Saving results into %s" % args.output_file)
    np.save(args.output_file, scores)


if __name__ == '__main__':
    main(sys.argv)
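The ranking step, (-scores).argsort()[:5], sorts the class indices by descending score and keeps the first five. The same logic in plain Python, on made-up scores, looks roughly like this (top_k is a hypothetical helper, not part of the script):

```python
def top_k(scores, labels, k=5):
    """Return (label, formatted score) pairs for the k highest scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return [(labels[i], "%.5f" % scores[i]) for i in order]


digit_names = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]
# Hypothetical probabilities for one image, one per digit
example_scores = [0.01, 0.02, 0.05, 0.80, 0.005, 0.04, 0.015, 0.03, 0.025, 0.005]

meta = top_k(example_scores, digit_names)
print(meta)
```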

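9. Test. Prepare a grayscale image 3.jpg the same size as the dataset images (the command below looks for it in examples/Mnist_image), then run the following from the directory containing classifymnist.py:

python classifymnist.py --print_results --force_grayscale --center_only --labels_file ../examples/Mnist_image/synset_words.txt ../examples/Mnist_image/3.jpg resultsfile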
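One detail worth knowing when looking for the output afterwards: np.save appends the .npy extension when the given name lacks one, so the command above should leave its results in resultsfile.npy. A small sketch of that behaviour, using a temporary directory and made-up values:

```python
import os
import tempfile

import numpy as np

out_dir = tempfile.mkdtemp()
target = os.path.join(out_dir, "resultsfile")   # no extension, as in the command

np.save(target, np.array([0.1, 0.9], dtype=np.float32))

saved_files = os.listdir(out_dir)               # np.save added ".npy"
loaded = np.load(target + ".npy")
print(saved_files, loaded)
```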

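This post draws on a series of articles by the author of http://blog.csdn.net/lanxuecc/article/details/52485077, with thanks. It is only a record of my own learning process; my apologies if it infringes on anything.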

0 0