Learning Caffe, Part 3: The .m File Workflow (Training or Testing)

Source: Internet. Published by: 查询的sql语句. Editor: 程序博客网. Time: 2024/06/05 11:02


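These notes walk through classification_demo.m, found under /caffe-master/matlab/demo.

Caffe's MATLAB interface:

classification_demo.m in caffe/matlab/demo is a good learning resource. Once you understand this code, you can use a trained model to classify input images from within MATLAB.

function [scores, maxlabel] = classification_demo(im, use_gpu)

This demo targets ImageNet image classification (1000 classes). It loads a trained model, classifies the input image, and returns the classification result.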

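To run the demo you need the following files, some of which must be downloaded from the Caffe website:

1. Model description file: deploy.prototxt (already included, so nothing to do)

2. The model itself: bvlc_reference_caffenet.caffemodel

3. Label file: describes what each of the 1000 classes is

4. Mean file: the mean of the training samples, ilsvrc_2012_mean.mat, which already exists in caffe-master/matlab/+caffe/imagenet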

% IMPORTANT: before you run this demo, you should download BVLC CaffeNet
% from Model Zoo (http://caffe.berkeleyvision.org/model_zoo.html)
%
% For detailed documentation and usage on Caffe's Matlab interface, please
% refer to Caffe Interface Tutorial at
% http://caffe.berkeleyvision.org/tutorial/interfaces.html#matlab

From the original file:

% input
%   im       color image as uint8 HxWx3
%   use_gpu  1 to use the GPU, 0 to use the CPU
%
% output
%   scores   1000-dimensional ILSVRC score vector
%   maxlabel the label of the highest score

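Explanation:

There are two input arguments. The first is the image, a single image (to test several, loop over them with for). The second selects between CPU and GPU.

There are also two outputs. The first is the scores, i.e. the probability distribution of the image over all classes. The second is the class with the highest score, which is an index into the label file.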
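To make the relationship between the two outputs and the label file concrete, here is a minimal sketch that does not need Caffe. It uses a made-up 4-class score vector instead of the real 1000-class one, and the `labels` cell array is a hypothetical stand-in for the contents of the label file.

```matlab
% Toy stand-in for the demo's outputs: 4 classes instead of 1000.
labels = {'cat', 'dog', 'car', 'tree'};  % hypothetical label list, one name per class
scores = [0.10; 0.70; 0.15; 0.05];       % made-up class probabilities

% max returns the highest score and its 1-based index; the index is the label.
[best_score, maxlabel] = max(scores);

% Look up the human-readable name for the winning class.
fprintf('class %d (%s), score %.2f\n', maxlabel, labels{maxlabel}, best_score);
```

With the real demo, `scores` and `maxlabel` come from classification_demo, and `labels` would be read from the label file, one entry per class in the same order the model was trained with.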

MATLAB-related setup from the original file:

% You may need to do the following before you start matlab:
%  $ export LD_LIBRARY_PATH=/opt/intel/mkl/lib/intel64:/usr/local/cuda-5.5/lib64
%  $ export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libstdc++.so.6
% Or the equivalent based on where things are installed on your system

Original:

if exist('../+caffe', 'dir')
  addpath('..');
else
  error('Please run this demo from caffe/matlab/demo');
end

Explanation:

This adds the parent directory to the path so that caffe-master/matlab/+caffe can be found.

Original:

% Set caffe mode
if exist('use_gpu', 'var') && use_gpu
  caffe.set_mode_gpu();
  gpu_id = 0;  % we will use the first gpu in this demo
  caffe.set_device(gpu_id);
else
  caffe.set_mode_cpu();
end

Explanation: this selects CPU or GPU mode.

Original:

model_dir = '../../models/bvlc_reference_caffenet/';  % directory containing the model
net_model = [model_dir 'deploy.prototxt'];  % model description file; a deploy file contains no data layers
net_weights = [model_dir 'bvlc_reference_caffenet.caffemodel'];  % path to the trained weights
phase = 'test';  % run with phase test (so that dropout isn't applied)
if ~exist(net_weights, 'file')
  error('Please download CaffeNet from Model Zoo before you run this demo');
end

% Initialize a network
net = caffe.Net(net_model, net_weights, phase);

% If classification_demo is called with fewer than one input argument, i.e. with
% none, it defaults to the image caffe/examples/images/cat.jpg
if nargin < 1
  % For demo purposes we will use the cat image
  fprintf('using caffe/examples/images/cat.jpg as input image\n');
  im = imread('../../examples/images/cat.jpg');
end

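% Caffe stores image data in BGR order while MATLAB uses RGB, so the input image
% has to be converted. The prepare_image function converts RGB to BGR, and also
% resizes the image, crops it, and subtracts the mean.
% The demo's own comment describes input_data as Height x Width x Channel x Num
% (prepare_image itself works in W x H x C order; the crops are square, so the
% shape is the same either way).
tic;
input_data = {prepare_image(im)};  % prepare_image is explained further below
toc;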

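% do forward pass to get scores
% scores are now Channels x Num, where Channels == 1000, i.e. 1000 x 10 here
tic;
scores = net.forward(input_data);  % feed the data to the network and run a forward pass to get the scores
toc;

scores = scores{1};  % a 1000 x 10 matrix; each column holds the class scores for one of the 10 crops
% Equivalent to prob = net.blobs('prob').get_data(), which reads the blob's data
% after the forward pass; the last layer is prob
scores = mean(scores, 2);  % average over the 10 crops

[~, maxlabel] = max(scores);  % then pick the class with the highest averaged score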

% call caffe.reset_all() to reset caffe
caffe.reset_all();
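The averaging step near the end of the listing above can be tried without Caffe. The following is a small sketch with made-up numbers: 3 classes and 10 crops instead of the demo's 1000 x 10 matrix. The variable names num_classes, num_crops, and avg_scores are chosen here for illustration and do not appear in the demo.

```matlab
% Toy scores: 3 classes (rows) x 10 crops (columns); the demo has 1000 x 10.
num_classes = 3;
num_crops = 10;
scores = zeros(num_classes, num_crops);
scores(1, :) = 0.2;                                        % class 1 scores 0.2 on every crop
scores(2, :) = [0.7 0.5 0.6 0.7 0.5 0.6 0.7 0.5 0.6 0.6];  % class 2 varies from crop to crop
scores(3, :) = 1 - scores(1, :) - scores(2, :);            % remainder, so each column sums to 1

% Average along dimension 2 (across crops): one score per class.
avg_scores = mean(scores, 2);

% The predicted label is the row index of the largest averaged score.
[~, maxlabel] = max(avg_scores);
fprintf('predicted class %d\n', maxlabel);
```

Averaging across crops smooths out crops where the object is partly cut off: class 2 wins here even though its per-crop score ranges from 0.5 to 0.7.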

% ------------------------------------------------------------------------
function crops_data = prepare_image(im)
% ------------------------------------------------------------------------
% caffe/matlab/+caffe/imagenet/ilsvrc_2012_mean.mat contains mean_data that
% is already in W x H x C with BGR channels
d = load('../+caffe/imagenet/ilsvrc_2012_mean.mat');  % load the mean file; its channel order is already BGR
mean_data = d.mean_data;

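% Size the image is resized to. The original note suggests using the image's
% smallest dimension, but the mean subtraction below only works if the resized
% image has the same size as mean_data.
IMAGE_DIM = 256;
CROPPED_DIM = 227;  % the model's input is 227 x 227, so the final crops must be 227 x 227

% Convert an image returned by Matlab's imread to im_data in caffe's data
% format: W x H x C with BGR channels
im_data = im(:, :, [3, 2, 1]);  % RGB to BGR
im_data = permute(im_data, [2, 1, 3]);  % transpose: swap height and width, giving W x H x C
im_data = single(im_data);  % convert from uint8 to single
im_data = imresize(im_data, [IMAGE_DIM IMAGE_DIM], 'bilinear');  % resize to IMAGE_DIM x IMAGE_DIM with bilinear interpolation
im_data = im_data - mean_data;  % subtract the mean (already in W x H x C, BGR)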

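% oversample (4 corners, center, and their x-axis flips)
% Oversampling is a data augmentation technique described by Alex Krizhevsky et
% al. in 2012. Here, five 227 x 227 crops are taken from the four corners and
% the center of the image (256 x 256 at this point), and each crop is mirrored
% along the x-axis, giving ten 227 x 227 images as input to the model.
crops_data = zeros(CROPPED_DIM, CROPPED_DIM, 3, 10, 'single');
indices = [0 IMAGE_DIM-CROPPED_DIM] + 1;
n = 1;
% Note that the two for loops do not run over 1:indices. Each one takes
% indices(1) = 1 first and then indices(2) = 30, so each loop level runs twice,
% reading the CROPPED_DIM x CROPPED_DIM crop at each of the four corners.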

for i = indices
  for j = indices
    crops_data(:, :, :, n) = im_data(i:i+CROPPED_DIM-1, j:j+CROPPED_DIM-1, :);  % take a corner crop
    crops_data(:, :, :, n+5) = crops_data(end:-1:1, :, :, n);  % mirror that crop along the x-axis
    n = n + 1;
  end
end
center = floor(indices(2) / 2) + 1;
crops_data(:,:,:,5) = ...
  im_data(center:center+CROPPED_DIM-1,center:center+CROPPED_DIM-1,:);
crops_data(:,:,:,10) = crops_data(end:-1:1, :, :, 5);
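The channel swap, transpose, and cropping can be checked by hand on a tiny image. The sketch below repeats the same steps with IMAGE_DIM = 6 and CROPPED_DIM = 4 instead of 256 and 227, and skips the resize and mean subtraction. The pixel values are made up so that each value encodes its original column and channel.

```matlab
IMAGE_DIM = 6;     % toy size (the demo uses 256)
CROPPED_DIM = 4;   % toy size (the demo uses 227)

% Toy RGB image, H x W x 3 like imread returns; values encode the column index.
im = zeros(IMAGE_DIM, IMAGE_DIM, 3, 'uint8');
for c = 1:IMAGE_DIM
  im(:, c, 1) = c;        % R channel: 1..6
  im(:, c, 2) = 10 + c;   % G channel: 11..16
  im(:, c, 3) = 20 + c;   % B channel: 21..26
end

im_data = im(:, :, [3, 2, 1]);          % RGB -> BGR, so channel 1 now holds B
im_data = permute(im_data, [2, 1, 3]);  % H x W x C -> W x H x C
im_data = single(im_data);              % uint8 -> single

% Same cropping logic as the demo: 4 corners, center, and their mirrors.
crops_data = zeros(CROPPED_DIM, CROPPED_DIM, 3, 10, 'single');
indices = [0 IMAGE_DIM-CROPPED_DIM] + 1;   % corner start positions: [1 3]
n = 1;
for i = indices
  for j = indices
    crops_data(:, :, :, n) = im_data(i:i+CROPPED_DIM-1, j:j+CROPPED_DIM-1, :);
    crops_data(:, :, :, n+5) = crops_data(end:-1:1, :, :, n);  % reverse the width dimension
    n = n + 1;
  end
end
center = floor(indices(2) / 2) + 1;        % center start position: 2
crops_data(:, :, :, 5) = ...
  im_data(center:center+CROPPED_DIM-1, center:center+CROPPED_DIM-1, :);
crops_data(:, :, :, 10) = crops_data(end:-1:1, :, :, 5);

disp(size(crops_data));
```

Because the data is stored as W x H x C after the permute, reversing the first dimension with end:-1:1 mirrors the picture left to right. With the demo's real sizes the same formulas give indices = [1 30] and center = 16.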

0 0