Caffe: ImageNet


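The ImageNet dataset is rather large, close to 140 GB, and my copy is still downloading. I will test on the full set once the download finishes. For now I am making do with two classes of images I grabbed from ImageNet myself. The steps are almost the same; the only real difference is the data. ~~I can't shake the feeling that the title is a bit of clickbait.~~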

1. Preparing the data

Because this is my own data, there is no ready-made name list for it, so I had to write a small program to generate one. I found this snippet online:

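#include <io.h>       // _findfirst, _findnext, _findclose (Windows only)
#include <cstdint>    // intptr_t
#include <cstring>    // strcmp
#include <string>
#include <vector>
using namespace std;

// Recursively collect the paths of all files and subdirectories under `path`.
void GetAllFiles(string path, vector<string>& files)
{
    // Search handle. _findfirst returns intptr_t; storing it in a long
    // truncates the handle on 64-bit builds.
    intptr_t hFile = 0;
    // File information
    struct _finddata_t fileinfo;
    string p;
    if ((hFile = _findfirst(p.assign(path).append("\\*").c_str(), &fileinfo)) != -1)
    {
        do
        {
            if (fileinfo.attrib & _A_SUBDIR)
            {
                // Skip the "." and ".." entries, then descend into the subdirectory.
                if (strcmp(fileinfo.name, ".") != 0 && strcmp(fileinfo.name, "..") != 0)
                {
                    files.push_back(p.assign(path).append("\\").append(fileinfo.name));
                    GetAllFiles(p.assign(path).append("\\").append(fileinfo.name), files);
                }
            }
            else
            {
                files.push_back(p.assign(path).append("\\").append(fileinfo.name));
            }
        } while (_findnext(hFile, &fileinfo) == 0);
        _findclose(hFile);
    }
}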

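Adapt it as needed; a quick search will turn up plenty of ready-made programs for this. The result is two txt files containing the image names. Next, convert the data to leveldb format by editing create_imagenet.sh. Mine looks like this: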

#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs
cd ../../

EXAMPLE=examples/test
DATA=data/test
TOOLS=bin

# TRAIN_DATA_ROOT is the folder that holds the training images;
# VAL_DATA_ROOT is the same for the validation images.
TRAIN_DATA_ROOT=../../../../../Dataset/Train/
VAL_DATA_ROOT=../../../../../Dataset/Val/

# The images need to be resized to 256x256.
# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

# The two checks below verify that each data root is an existing directory
# and exit with an error if it is not.
if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi

# Run the conversion. Before running, make sure convert_imageset.exe exists
# (build it if it does not) and that train.txt and val.txt are in the
# expected location.
echo "Creating train leveldb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/ilsvrc12_train_leveldb

echo "Creating val leveldb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $DATA/val.txt \
    $EXAMPLE/ilsvrc12_val_leveldb

echo "Done."
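One detail worth spelling out about train.txt and val.txt: convert_imageset expects each line of the list file to contain an image path relative to the data root, followed by a space and an integer class label. Below is a minimal C++ sketch of producing that text. The helper name BuildListFile and the convention that the index of each class is its label are assumptions for illustration, not something taken from the original program.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: build the contents of a list file such as train.txt.
// classes[i] holds the image paths (relative to the data root) of class i,
// and i is used as the integer label for those images.
std::string BuildListFile(const std::vector<std::vector<std::string>>& classes)
{
    std::string out;
    for (std::size_t label = 0; label < classes.size(); ++label)
    {
        for (const std::string& path : classes[label])
        {
            // An empty path would produce a malformed "<path> <label>" line.
            assert(!path.empty());
            out += path + " " + std::to_string(label) + "\n";
        }
    }
    return out;
}
```

The returned string can then be written to data/test/train.txt with a std::ofstream. With only two classes, the labels are simply 0 and 1.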

2. Training

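Before training, run make_imagenet_mean.sh to compute the mean of the data. Here is my make_imagenet_mean.sh (as long as the paths are set correctly, this step generally goes through without errors):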

cd ../../

EXAMPLE=examples/test
DATA=data/test
TOOLS=bin

echo "Computing mean of imageset..."

$TOOLS/compute_image_mean $EXAMPLE/ilsvrc12_train_leveldb \
  $DATA/imagenet_mean.binaryproto

echo "Done."
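Conceptually, the mean file holds the average value at each pixel position across all training images. The following is a minimal C++ sketch of that idea only, not Caffe's actual implementation. The function name PixelwiseMean and the representation of an image as a flat vector of bytes are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: average each pixel position over a set of images.
// Every image is a flat array of bytes, and all images must have the same size
// (which is why the images are resized to 256x256 beforehand).
std::vector<float> PixelwiseMean(const std::vector<std::vector<unsigned char>>& images)
{
    assert(!images.empty());
    // Accumulate in double to avoid losing precision over many images.
    std::vector<double> sum(images[0].size(), 0.0);
    for (const std::vector<unsigned char>& img : images)
    {
        assert(img.size() == sum.size());
        for (std::size_t i = 0; i < img.size(); ++i)
        {
            sum[i] += img[i];
        }
    }
    std::vector<float> mean(sum.size());
    for (std::size_t i = 0; i < sum.size(); ++i)
    {
        mean[i] = static_cast<float>(sum[i] / images.size());
    }
    return mean;
}
```

During training the mean is subtracted from each input image, so it has to be computed from the same training database that the network will read.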

Once imagenet_mean.binaryproto has been generated, train_caffenet.sh should in principle just run. When I ran it, however, I got this error: Check failed: error == cudaSuccess (11 vs. 0)

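To be honest, few people seem to have run into this problem, and nothing I found online explains clearly what causes it; it is probably GPU-related. After I changed batch_size from 256 to 30, the problem went away on its own. I will update this post later with the cause of the bug and why the fix works. ~~Digging a hole to fill in later, haha.~~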
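The batch-size change can at least be put in perspective with some arithmetic: the input data for one batch grows linearly with batch_size. The sketch below computes the size in bytes of one batch of float input data. The 3x227x227 input size used in the usage note is an assumption (the usual CaffeNet crop), the helper name BatchInputBytes is hypothetical, and whether memory pressure is what actually triggered the error is not confirmed here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes needed to hold one batch of input images
// stored as 32-bit floats (batch x channels x height x width).
std::uint64_t BatchInputBytes(std::uint64_t batch, std::uint64_t channels,
                              std::uint64_t height, std::uint64_t width)
{
    assert(batch > 0 && channels > 0 && height > 0 && width > 0);
    return batch * channels * height * width * sizeof(float);
}
```

Under those assumptions, BatchInputBytes(256, 3, 227, 227) is 158,297,088 bytes (about 151 MiB), while BatchInputBytes(30, 3, 227, 227) is 18,550,440 bytes (about 18 MiB). This counts the input blob only; the intermediate activations of every layer scale with batch_size in the same way.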

0 0