Caffe (2): Testing Your Own Handwritten Character Images with a Trained MNIST Model

Source: Internet | Editor: 程序博客网 | Time: 2024/05/17 23:51

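In the previous post I used Caffe to train lenet_iter_10000.caffemodel on the MNIST dataset. Since this model has to be applied to my own task, this post records how to use the trained lenet_iter_10000.caffemodel to test my own handwritten character images.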

Preparing the test image

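The test images used in this post are the character images that ship with the generator program for the 大神符 (the "big rune" task of the 2017 Robomasters robotics competition). To make later testing easier, the images are uniformly renamed 1.png, 2.png, …, N.png. Here the first image in the folder for the character 5 is chosen as the test image, shown below:
[Image: the selected test image]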

deploy.prototxt: the model description file

deploy.prototxt is similar to lenet_train_test.prototxt and can be obtained by rewriting the latter, as follows:

name: "LeNet"
input: "data"
input_dim: 1
input_dim: 1
input_dim: 28
input_dim: 28
layer {
    name: "conv1"
    type: "Convolution"
    bottom: "data"
    top: "conv1"
    convolution_param {
        num_output: 20
        kernel_size: 5
        stride: 1
        weight_filler {
            type: "xavier"
        }
    }
}
layer {
    name: "pool1"
    type: "Pooling"
    bottom: "conv1"
    top: "pool1"
    pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
    }
}
layer {
    name: "conv2"
    type: "Convolution"
    bottom: "pool1"
    top: "conv2"
    convolution_param {
        num_output: 50
        kernel_size: 5
        stride: 1
        weight_filler {
            type: "xavier"
        }
    }
}
layer {
    name: "pool2"
    type: "Pooling"
    bottom: "conv2"
    top: "pool2"
    pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
    }
}
layer {
    name: "ip1"
    type: "InnerProduct"
    bottom: "pool2"
    top: "ip1"
    inner_product_param {
        num_output: 500
        weight_filler {
            type: "xavier"
        }
    }
}
layer {
    name: "relu1"
    type: "ReLU"
    bottom: "ip1"
    top: "ip1"
}
layer {
    name: "ip2"
    type: "InnerProduct"
    bottom: "ip1"
    top: "ip2"
    inner_product_param {
        num_output: 10
        weight_filler {
            type: "xavier"
        }
    }
}
layer {
    name: "prob"
    type: "Softmax"
    bottom: "ip2"
    top: "prob"
}

The num_output: 10 of the second fully connected layer (ip2) means that there are 10 classes.
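To see how the 28x28 input shrinks on its way to the fully connected layers, the layer parameters above can be pushed through the usual output-size formula. This is only a sketch: the helper names OutputSize and Ip1InputCount are mine, and it assumes no padding, which matches the deploy.prototxt above.

```cpp
// Output side length of a convolution or pooling window applied without
// padding. Integer division is exact for the LeNet sizes used here.
int OutputSize(int input, int kernel, int stride) {
    return (input - kernel) / stride + 1;
}

// Number of values fed into ip1 for a square input of the given side length,
// following the conv1 -> pool1 -> conv2 -> pool2 chain in deploy.prototxt.
int Ip1InputCount(int side) {
    int s = OutputSize(side, 5, 1);  // conv1: kernel_size 5, stride 1
    s = OutputSize(s, 2, 2);         // pool1: kernel_size 2, stride 2
    s = OutputSize(s, 5, 1);         // conv2: kernel_size 5, stride 1
    s = OutputSize(s, 2, 2);         // pool2: kernel_size 2, stride 2
    const int conv2_channels = 50;   // num_output of conv2
    return conv2_channels * s * s;
}
```

For a side length of 28 the chain goes 28, 24, 12, 8, 4, so ip1 receives 50 x 4 x 4 = 800 values, which it maps to 500 outputs before ip2 reduces them to the 10 class scores.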
At first I wrote the data layer like this:

layer {
    name: "data"
    type: "Input"
    top: "data"
    input_param { shape: { dim: 1 dim: 1 dim: 28 dim: 28 } }
}

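Running it produced the following error:
[Image: screenshot of the error message]
It only ran correctly after I switched to the input / input_dim form shown above. This may be a Caffe version compatibility issue; I do not know the exact cause yet.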

lenet_iter_10000.caffemodel: the model weights file

This is the model file generated during training; it can be found in the corresponding folder.

synset_words.txt: the label file

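[Image: contents of synset_words.txt]
Note: do not add a line break after the 9, otherwise the following error occurs:
[Image: screenshot of the error message]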
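The classifier reads the label file line by line with std::getline and then checks that the number of labels equals the output dimension of the network, so every extra line that is read counts as an extra label. The sketch below reproduces only that reading loop on an in-memory string, to show how different file endings are counted. ReadLabels and CountLabels are hypothetical helper names, and how a particular editor actually saves the last line is something to verify on your own machine.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Same loop as in the classifier constructor: one label per line read.
std::vector<std::string> ReadLabels(std::istream& in) {
    std::vector<std::string> labels;
    std::string line;
    while (std::getline(in, line)) {
        labels.push_back(line);
    }
    return labels;
}

// Convenience wrapper that treats a string as the contents of a label file.
std::size_t CountLabels(const std::string& file_contents) {
    std::istringstream in(file_contents);
    return ReadLabels(in).size();
}
```

With standard std::getline, a blank line at the end of the file is read as one more (empty) label, which is enough to make the label count differ from the 10 outputs of ip2.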

mean.binaryproto: the binary mean file

The MNIST training example that ships with Caffe does not use a binary mean file, so this file is not needed for testing either, and this post does not go into it.
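One related preprocessing detail may be worth checking. The stock examples/mnist/lenet_train_test.prototxt data layers set transform_param { scale: 0.00390625 }, i.e. they multiply every pixel by 1/256. If the model was trained with that setting (an assumption about the training configuration, not something stated in this post), the test-time input probably needs the same scaling. A minimal sketch of the arithmetic, with ScalePixels as a hypothetical helper name:

```cpp
#include <vector>

// Multiply every 8-bit pixel by a constant scale factor, as a data layer
// with transform_param { scale: ... } would do during training.
std::vector<float> ScalePixels(const std::vector<unsigned char>& pixels,
                               float scale) {
    std::vector<float> out;
    out.reserve(pixels.size());
    for (unsigned char p : pixels) {
        out.push_back(static_cast<float>(p) * scale);
    }
    return out;
}
```

With scale = 0.00390625f the pixel values 0, 128 and 255 become 0, 0.5 and 0.99609375. In OpenCV the same effect could be obtained by passing the scale as the alpha argument of cv::Mat::convertTo.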

classification.bin: the classifier

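Start from examples/cpp_classification/classification.cpp and rewrite it into the cpp file you need. Because no binary mean file is used, the parts that deal with the mean file have to be rewritten. Then compile and link to produce classification.bin, which is the classifier we need.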

#include <caffe/caffe.hpp>
#define USE_OPENCV
#ifdef USE_OPENCV
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#endif
#include <algorithm>
#include <fstream>
#include <iomanip>
#include <iosfwd>
#include <iostream>
#include <memory>
#include <string>
#include <utility>
#include <vector>

#ifdef USE_OPENCV
using namespace caffe;
using std::string;

// A prediction is a (label, confidence) pair.
typedef std::pair<string, float> Prediction;

class Classifier {
 public:
  Classifier(const string& model_file, const string& trained_file,
             const string& mean_file, const string& label_file);
  std::vector<Prediction> Classify(const cv::Mat& img, int N = 5);

 private:
  std::vector<float> Predict(const cv::Mat& img);
  void WrapInputLayer(std::vector<cv::Mat>* input_channels);
  void Preprocess(const cv::Mat& img, std::vector<cv::Mat>* input_channels);

 private:
  shared_ptr<Net<float> > net_;
  cv::Size input_geometry_;
  int num_channels_;
  std::vector<string> labels_;
};

// mean_file is kept in the signature but is not used: no mean is subtracted.
Classifier::Classifier(const string& model_file, const string& trained_file,
                       const string& mean_file, const string& label_file) {
#ifdef CPU_ONLY
  Caffe::set_mode(Caffe::CPU);
#else
  Caffe::set_mode(Caffe::GPU);
#endif
  // Load the network definition and the trained weights.
  net_.reset(new Net<float>(model_file, TEST));
  net_->CopyTrainedLayersFrom(trained_file);
  CHECK_EQ(net_->num_inputs(), 1) << "Network should have exactly one input.";
  CHECK_EQ(net_->num_outputs(), 1) << "Network should have exactly one output.";

  Blob<float>* input_layer = net_->input_blobs()[0];
  num_channels_ = input_layer->channels();
  CHECK(num_channels_ == 3 || num_channels_ == 1)
      << "Input layer should have 1 or 3 channels.";
  input_geometry_ = cv::Size(28, 28);

  // Load the labels, one per line.
  std::ifstream labels(label_file.c_str());
  CHECK(labels) << "Unable to open labels file " << label_file;
  string line;
  while (std::getline(labels, line))
    labels_.push_back(string(line));

  Blob<float>* output_layer = net_->output_blobs()[0];
  CHECK_EQ(labels_.size(), output_layer->channels())
      << "Number of labels is different from the output layer dimension.";
}

static bool PairCompare(const std::pair<float, int>& lhs,
                        const std::pair<float, int>& rhs) {
  return lhs.first > rhs.first;
}

// Return the indices of the top N values of vector v.
static std::vector<int> Argmax(const std::vector<float>& v, int N) {
  std::vector<std::pair<float, int> > pairs;
  for (size_t i = 0; i < v.size(); i++)
    pairs.push_back(std::make_pair(v[i], i));
  std::partial_sort(pairs.begin(), pairs.begin() + N, pairs.end(), PairCompare);

  std::vector<int> result;
  for (int i = 0; i < N; i++)
    result.push_back(pairs[i].second);
  return result;
}

// Return the top N predictions.
std::vector<Prediction> Classifier::Classify(const cv::Mat& img, int N) {
  std::vector<float> output = Predict(img);
  N = std::min<int>(labels_.size(), N);
  std::vector<int> maxN = Argmax(output, N);
  std::vector<Prediction> predictions;
  for (int i = 0; i < N; i++) {
    int idx = maxN[i];
    predictions.push_back(std::make_pair(labels_[idx], output[idx]));
  }
  return predictions;
}

std::vector<float> Classifier::Predict(const cv::Mat& img) {
  Blob<float>* input_layer = net_->input_blobs()[0];
  input_layer->Reshape(1, num_channels_,
                       input_geometry_.height, input_geometry_.width);
  // Forward the dimension change to all layers.
  net_->Reshape();

  std::vector<cv::Mat> input_channels;
  WrapInputLayer(&input_channels);
  Preprocess(img, &input_channels);

  // ForwardPrefilled() is the call used by the Caffe version in this post;
  // newer versions may require Forward() instead.
  net_->ForwardPrefilled();

  // Copy the output layer to a std::vector.
  Blob<float>* output_layer = net_->output_blobs()[0];
  const float* begin = output_layer->cpu_data();
  const float* end = begin + output_layer->channels();
  return std::vector<float>(begin, end);
}

// Wrap the input layer in cv::Mat objects (one per channel), so that
// writing to them writes directly into the input blob.
void Classifier::WrapInputLayer(std::vector<cv::Mat>* input_channels) {
  Blob<float>* input_layer = net_->input_blobs()[0];
  int width = input_layer->width();
  int height = input_layer->height();
  float* input_data = input_layer->mutable_cpu_data();
  for (int i = 0; i < input_layer->channels(); i++) {
    cv::Mat channel(height, width, CV_32FC1, input_data);
    input_channels->push_back(channel);
    input_data += width * height;
  }
}

void Classifier::Preprocess(const cv::Mat& img,
                            std::vector<cv::Mat>* input_channels) {
  // Convert the input image to the channel layout of the network.
  cv::Mat sample;
  if (img.channels() == 3 && num_channels_ == 1)
    cv::cvtColor(img, sample, cv::COLOR_BGR2GRAY);
  else if (img.channels() == 4 && num_channels_ == 1)
    cv::cvtColor(img, sample, cv::COLOR_BGRA2GRAY);
  else if (img.channels() == 4 && num_channels_ == 3)
    cv::cvtColor(img, sample, cv::COLOR_BGRA2BGR);
  else if (img.channels() == 1 && num_channels_ == 3)
    cv::cvtColor(img, sample, cv::COLOR_GRAY2BGR);
  else
    sample = img;

  cv::Mat sample_resized;
  if (sample.size() != input_geometry_)
    cv::resize(sample, sample_resized, input_geometry_);
  else
    sample_resized = sample;

  cv::Mat sample_float;
  if (num_channels_ == 3)
    sample_resized.convertTo(sample_float, CV_32FC3);
  else
    sample_resized.convertTo(sample_float, CV_32FC1);

  // No mean subtraction: write the float image straight into the input blob.
  cv::split(sample_float, *input_channels);

  CHECK(reinterpret_cast<float*>(input_channels->at(0).data)
        == net_->input_blobs()[0]->cpu_data())
      << "Input channels not wrapping the input layer of the network.";
}

int main(int argc, char** argv) {
  if (argc != 5) {
    std::cerr << "Usage: " << argv[0]
              << " deploy.prototxt network.caffemodel"
              << " labels.txt img.jpg" << std::endl;
    return 1;
  }

  ::google::InitGoogleLogging(argv[0]);

  string model_file = argv[1];
  string trained_file = argv[2];
  string mean_file = "";  // no mean file is used
  string label_file = argv[3];
  Classifier classifier(model_file, trained_file, mean_file, label_file);

  string file = argv[4];
  std::cout << "-------------------- Prediction for " << file
            << " -----------------" << std::endl;

  cv::Mat img = cv::imread(file, -1);
  CHECK(!img.empty()) << "Unable to decode image " << file;
  std::vector<Prediction> predictions = classifier.Classify(img);

  // Print the top N predictions.
  for (size_t i = 0; i < predictions.size(); i++) {
    Prediction p = predictions[i];
    std::cout << std::fixed << std::setprecision(4) << p.second
              << " - \"" << p.first << "\"" << std::endl;
  }
}
#else
int main(int argc, char** argv) {
  LOG(FATAL) << "this example requires opencv; compile with USE_OPENCV.";
}
#endif
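The top-N selection in Argmax can be tried out without Caffe or OpenCV. The sketch below isolates that step with std::partial_sort. TopN is a hypothetical stand-alone name, and unlike the classifier code it clamps n itself instead of relying on the caller; it assumes a C++11 compiler.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Return the indices of the n largest scores, highest score first.
std::vector<int> TopN(const std::vector<float>& scores, int n) {
    std::vector<std::pair<float, int> > pairs;
    for (std::size_t i = 0; i < scores.size(); ++i) {
        pairs.push_back(std::make_pair(scores[i], static_cast<int>(i)));
    }
    // Never ask for more entries than there are scores.
    n = std::min<int>(n, static_cast<int>(pairs.size()));
    // Only the first n positions need to end up sorted.
    std::partial_sort(pairs.begin(), pairs.begin() + n, pairs.end(),
                      [](const std::pair<float, int>& a,
                         const std::pair<float, int>& b) {
                          return a.first > b.first;
                      });
    std::vector<int> result;
    for (int i = 0; i < n; ++i) {
        result.push_back(pairs[i].second);
    }
    return result;
}
```

For the scores {0.1, 0.7, 0.05, 0.15} and n = 2 this yields the indices 1 and 3. In the classifier those indices are then used to look up both the label and the softmax probability.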

Testing

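The test was run on the CPU of a server cluster (the GPUs were being used by someone else). Remember to edit the Makefile.config configuration file: remove the comment marker # in front of CPU_ONLY := 1 and add one in front of USE_CUDNN := 1. The command is as follows: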

$ srun -p K15G12 -J MNIST -c 4 /lustre1/hw/yingjia/caffe-test/build/examples/cpp_classification/classification.bin /lustre1/hw/yingjia/caffe-test/examples/mnist/deploy.prototxt /lustre1/hw/yingjia/caffe-test/examples/mnist/lenet_iter_10000.caffemodel /lustre1/hw/yingjia/caffe-test/examples/mnist/synset_words.txt /lustre1/hw/yingjia/caffe-test/examples/mnist/5/1.png

The result is as follows:
[Image: screenshot of the classification output]

Other things that came up

1. Since every image in the test set eventually has to be tested, I wrote the script piliang.sh to run the tests in batch, as follows:

#!/bin/bash
echo "this script is test model"
# Classify every image in the folder for the character 5.
for numberFile in /lustre1/hw/yingjia/caffe-test/examples/mnist/image/5/*
do
    srun -p K15G12 -J MNIST -c 4 /lustre1/hw/yingjia/caffe-test/build/examples/cpp_classification/classification.bin /lustre1/hw/yingjia/caffe-test/examples/mnist/deploy.prototxt /lustre1/hw/yingjia/caffe-test/examples/mnist/lenet_iter_10000.caffemodel /lustre1/hw/yingjia/caffe-test/examples/mnist/synset_words.txt "$numberFile"
done

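2. After testing, the results need to be tallied. classification.cpp prints "error" whenever a prediction is wrong, so the output can be redirected to a .log file and the occurrences of "error" in that file counted.
The redirection command is: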

$ ./piliang.sh >> stderr.log 2>&1

The command that counts the occurrences of "error" in stderr.log is:

$ awk -v RS='error' 'END {print --NR}' stderr.log
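The awk one-liner splits the file into records at every "error" and prints the record count minus one; a multi-character RS may not be supported by every awk implementation. As a rough equivalent, the same count can be done in C++ on the log contents. CountOccurrences is a hypothetical helper name, and reading the file into a string is left to the caller.

```cpp
#include <cstddef>
#include <string>

// Count non-overlapping occurrences of word in text.
std::size_t CountOccurrences(const std::string& text, const std::string& word) {
    if (word.empty()) {
        return 0;  // an empty search word would otherwise match everywhere
    }
    std::size_t count = 0;
    std::size_t pos = text.find(word);
    while (pos != std::string::npos) {
        ++count;
        pos = text.find(word, pos + word.size());
    }
    return count;
}
```

Dividing the number of "error" lines by the number of tested images then gives the error rate for that folder.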

3. Converting the string label of Prediction p1 = predictions[0] into the integer tempt:

std::stringstream sstr;
sstr << p1.first;
int tempt;
sstr >> tempt;

4. Converting an integer to a string:

int i = 1;
std::stringstream sstr1;
sstr1 << i;
string str1;
sstr1 >> str1;
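If the compiler supports C++11 (an assumption; the post itself only uses std::stringstream), the two conversions can also be written with std::stoi and std::to_string. LabelToInt and IntToLabel are hypothetical wrapper names used only for illustration.

```cpp
#include <string>

// String label such as "5" -> integer 5.
// std::stoi throws std::invalid_argument if the label is not numeric.
int LabelToInt(const std::string& label) {
    return std::stoi(label);
}

// Integer 1 -> string "1".
std::string IntToLabel(int value) {
    return std::to_string(value);
}
```

The stringstream version silently leaves the target variable unusable when the text is not a number, while std::stoi reports the problem with an exception, so which one fits better depends on how malformed labels should be handled.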