Caffe workflow, plus installing and running DIGITS on Windows

Source: Internet | Published by: 杭州米趣网络上市 | Editor: 程序博客网 | Date: 2024/06/05 07:06

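1. Basic components of a model

To train a Caffe model you configure two files, one for each of its two parts: the network definition (*.prototxt) and the solver parameters (****_solver.prototxt).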

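Anatomy of the Caffe model files:

Building a leveldb from the preprocessed images

Input: a batch of images and their labels (items 2 and 3 below)
Output: a leveldb (item 4 below)
The command carries the following information:

1. convert_imageset (the executable that builds the leveldb)
2. train/ (the directory holding the JPEG or other-format images to process)
3. label.txt (image file names and their labels)
4. The name of the output leveldb folder
5. CPU/GPU (whether the code runs on the CPU or on the GPU)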
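The label.txt file in the list above holds one image per line, as a path relative to the image directory followed by an integer label. A minimal sketch of generating those lines is shown here; it assumes a layout with one subfolder per class, and the function name `build_label_lines` and the folder names are hypothetical, not part of Caffe.

```python
import os


def build_label_lines(root_dir, class_names):
    """Return '<relative path> <label>' lines for images under root_dir/<class>/."""
    lines = []
    for label, cls in enumerate(class_names):
        cls_dir = os.path.join(root_dir, cls)
        for name in sorted(os.listdir(cls_dir)):
            # Keep only common image extensions; other files are skipped.
            if name.lower().endswith((".jpg", ".jpeg", ".png")):
                lines.append("%s/%s %d" % (cls, name, label))
    return lines
```

Writing the result with `"\n".join(lines)` to label.txt gives a file in the form described in item 3.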

CNN network configuration files

Imagenet_solver.prototxt (the global solver parameters)
Imagenet.prototxt (the training network)
Imagenet_val.prototxt (the test network)

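Network definition:

DATA: the input layer. There are usually two of them, one for training data and one for test data. It specifies source (the path to the data) and batch_size (the number of samples per batch). scale: 0.00390625 is 1/256, which maps 8-bit pixel values into the range [0, 1).

Training data layer: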

layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}

Test data layer:

layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
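The scale value used in both data layers can be checked with a little arithmetic. The sketch below only illustrates the multiplication; `scale_pixels` is a hypothetical helper, not a Caffe function.

```python
def scale_pixels(pixels, scale=0.00390625):
    """Multiply raw 8-bit pixel values by the transform_param scale."""
    return [p * scale for p in pixels]


# 0.00390625 is exactly 1/256, so the largest 8-bit value (255) maps just below 1.
scaled = scale_pixels([0, 128, 255])
```

Here `scaled` is `[0.0, 0.5, 0.99609375]`, so the inputs stay within [0, 1).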

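CONVOLUTION: the convolution layer. blobs_lr: 1 and blobs_lr: 2 (written as lr_mult in the newer syntax used below) are the learning-rate multipliers for the weights and the bias. The weights use the learning rate defined in solver.prototxt and the bias uses twice that rate, which usually gives good convergence speed.

num_output is the number of filters, kernel_size is the filter size, stride is the step size, and weight_filler selects how the weights are initialized.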

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1  # learning-rate multiplier for the weights
  }
  param {
    lr_mult: 2  # learning-rate multiplier for the bias, usually twice the weights' value
  }
  convolution_param {
    num_output: 20  # number of filters
    kernel_size: 5
    stride: 1  # step size
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}

POOLING: the pooling layer

layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
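To see what kernel_size and stride do to the data, the spatial size after conv1 and pool1 can be worked out by hand. The sketch below assumes 28x28 MNIST inputs and no padding; the two function names are hypothetical. Caffe rounds down for convolution and rounds up for pooling, which the two helpers mirror.

```python
import math


def conv_output_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution layer (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_output_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a pooling layer (rounds up)."""
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1


conv1 = conv_output_size(28, kernel=5, stride=1)     # 28 -> 24
pool1 = pool_output_size(conv1, kernel=2, stride=2)  # 24 -> 12
```

With num_output: 20, the conv1 blob is therefore 20 x 24 x 24 per image and the pool1 blob is 20 x 12 x 12.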

INNER_PRODUCT: this is simply the fully connected layer; don't let the name mislead you

layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}

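RELU: the activation function, a nonlinear layer computing max(0, x). It usually appears paired with a CONVOLUTION layer; in the example below it follows ip1 and works in place, since bottom and top are both "ip1".

layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}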

SOFTMAX:

layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
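As a rough illustration of what the SoftmaxWithLoss layer computes for a single sample, the sketch below turns the scores from ip2 into probabilities and takes the negative log of the probability assigned to the true label. It is plain Python for one sample, not Caffe's implementation, and the function names are hypothetical; the layer itself works on whole batches.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss(scores, label):
    """Negative log-probability of the true class for one sample."""
    return -math.log(softmax(scores)[label])
```

For example, two equal scores give each class probability 0.5, so the loss is log 2, about 0.693.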

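Solver configuration file:

The ***_solver.prototxt file defines the parameters needed during training, such as the learning rate, the weight-decay coefficient, the number of iterations, and whether to use the GPU or the CPU.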

# The train/test net protocol buffer definition
net: "examples/mnist/lenet_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# solver mode: CPU or GPU
solver_mode: GPU
device_id: 0  # with the cmdcaffe interface GPU indices start at 0, so with a single GPU use device_id: 0
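The "inv" policy in the solver above decays the learning rate as base_lr * (1 + gamma * iter)^(-power). A small sketch of that formula with the solver's values as defaults follows; the function name is hypothetical.

```python
def inv_learning_rate(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under the "inv" policy at a given iteration."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


# Number of test images covered per test pass: test_iter * test batch_size.
images_per_test_pass = 100 * 100
```

With these values the rate starts at 0.01 and has fallen to roughly 0.0059 by iteration 10000 (max_iter), and each test pass covers all 10,000 test images.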

The trained model is saved as *.caffemodel for later use.
A complete network should look like this:
[Figure: the complete network]

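Steps

Data preparation
Prepare three sets of data:
1. Training Set: used to train the network
2. Validation Set: used to check the network's accuracy during training
3. Test Set: used to measure the final accuracy once training has finished
Build the lmdb/leveldb files. Caffe supports three input data formats: images, leveldb, and lmdb.

Although lmdb uses about 1.1 times as much memory as leveldb, it is 10% to 15% faster and, more importantly, it lets several training models read the same dataset at the same time.
For these reasons lmdb replaced leveldb as Caffe's default dataset format.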
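One way to obtain the three sets described in the data-preparation step is to shuffle the sample list once and slice it. The sketch below is only an illustration: the 10% validation and 10% test fractions and the name `split_dataset` are assumptions, not values from this article.

```python
import random


def split_dataset(items, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle items reproducibly and return (train, val, test) lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split repeatable
    n_val = int(len(items) * val_fraction)
    n_test = int(len(items) * test_fraction)
    val = items[:n_val]
    test = items[n_val:n_val + n_test]
    train = items[n_val + n_test:]
    return train, val, test
```

Each of the three lists can then be written to its own label file and converted to its own lmdb.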

Define the name.prototxt and name_solver.prototxt files
Train the model
Training on Windows is a real hassle, because it depends on being able to run .sh files under Windows.

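Using .sh files on Windows

Install cygwin.
Packages are installed from within its setup program. If a package is reported as missing, reopen setup and download it again; this usually works, and any other problem can be dealt with as it comes up.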

Testing with .bat files

Download the MNIST dataset from the official site http://yann.lecun.com/exdb/mnist/ and extract it to C:\caffe-master\data\mnist.
In the Caffe root directory, create create_mnist.bat containing the script below. This step can fail because the extracted file may be named train-images-idx3-ubyte rather than train-images.idx3-ubyte; adjust the names so that they match.
.\Build\x64\Release\convert_mnist_data.exe .\data\mnist\mnist_train_lmdb\train-images.idx3-ubyte .\data\mnist\mnist_train_lmdb\train-labels.idx1-ubyte .\examples\mnist\mnist_train_lmdb
echo.
.\Build\x64\Release\convert_mnist_data.exe .\data\mnist\mnist_test_lmdb\t10k-images.idx3-ubyte .\data\mnist\mnist_test_lmdb\t10k-labels.idx1-ubyte .\examples\mnist\mnist_test_lmdb
pause
Then double-click the script to run it. The lmdb data files are generated under examples\mnist in the Caffe root (E:\caffe\examples\mnist here).
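The file-name mismatch mentioned above can also be fixed by renaming the extracted files instead of editing the script. This is a minimal sketch under the assumption that the only difference is "-idx" versus ".idx"; the helper name `normalize_mnist_names` is hypothetical.

```python
import os

# File names the .bat script expects.
MNIST_FILES = [
    "train-images.idx3-ubyte",
    "train-labels.idx1-ubyte",
    "t10k-images.idx3-ubyte",
    "t10k-labels.idx1-ubyte",
]


def normalize_mnist_names(data_dir):
    """Rename 'xxx-idxN-ubyte' files to 'xxx.idxN-ubyte'; return the new names."""
    renamed = []
    for expected in MNIST_FILES:
        alternative = expected.replace(".idx", "-idx")
        src = os.path.join(data_dir, alternative)
        dst = os.path.join(data_dir, expected)
        # Rename only when the hyphenated file exists and the target does not.
        if os.path.isfile(src) and not os.path.exists(dst):
            os.rename(src, dst)
            renamed.append(expected)
    return renamed
```

Call it on whichever folder actually holds the extracted files before running create_mnist.bat.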
In the Caffe root directory, create train_mnist.bat with the following script:

.\Build\x64\Release\caffe.exe train --solver=.\examples\mnist\lenet_solver.prototxt
pause

Double-click it to start training. When training finishes, it reports the resulting accuracy and loss.

Next, install DIGITS:

Just follow the instructions here:

https://github.com/NVIDIA/DIGITS/blob/digits-5.0/docs/BuildDigitsWindows.md

Finally, run python -m digits in the digits directory.
Quite a few bugs turned up along the way.

bug1

pycaffe cannot be found.
This usually means the caffe package is not visible to Python. Copying the CAFFE_ROOT\Build\x64\Release\pycaffe\caffe folder into Anaconda's site-packages fixes it.
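Before or after copying the folder, it can help to check which directories on Python's search path actually contain the package. The sketch below is an assumption-laden helper (the name `find_package_dirs` is hypothetical) that simply looks for caffe/__init__.py in each search-path entry.

```python
import os
import sys


def find_package_dirs(package, search_paths=None):
    """Return the search-path entries that contain <package>/__init__.py."""
    if search_paths is None:
        search_paths = sys.path
    hits = []
    for path in search_paths:
        if os.path.isfile(os.path.join(path, package, "__init__.py")):
            hits.append(path)
    return hits
```

If `find_package_dirs("caffe")` returns an empty list, `import caffe` will fail; after the copy described above, the site-packages directory should appear in the result.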

bug2

The error pkg_resources._vendor.packaging.version.InvalidVersion: Invalid version: 'CAFFE_VERSION' appears.
Open caffe.py under \DIGITS-master\digits\config
and modify it where the inline notes in the listing below indicate a change.

from __future__ import absolute_import

import imp
import os
import platform
import re
import subprocess
import sys

from . import option_list
from digits import device_query
from digits.utils import parse_version


def load_from_envvar(envvar):
    """
    Load information from an installation indicated by an environment variable
    """
    value = os.environ[envvar].strip().strip("\"' ")

    # NOTE: the paths below must be changed so that they match CAFFE_HOME
    if platform.system() == 'Windows':
        # executable_dir = os.path.join(value, 'install', 'bin')
        executable_dir = os.path.join(value)
        # python_dir = os.path.join(value, 'install', 'python')
        python_dir = os.path.join(value, 'pycaffe')
    else:
        executable_dir = os.path.join(value, 'build', 'tools')
        python_dir = os.path.join(value, 'python')

    try:
        executable = find_executable_in_dir(executable_dir)
        if executable is None:
            raise ValueError('Caffe executable not found at "%s"'
                             % executable_dir)
        if not is_pycaffe_in_dir(python_dir):
            raise ValueError('Pycaffe not found in "%s"'
                             % python_dir)
        import_pycaffe(python_dir)
        version, flavor = get_version_and_flavor(executable)
    except:
        print ('"%s" from %s does not point to a valid installation of Caffe.'
               % (value, envvar))
        print 'Use the envvar CAFFE_ROOT to indicate a valid installation.'
        raise
    return executable, version, flavor


def load_from_path():
    """
    Load information from an installation on standard paths (PATH and PYTHONPATH)
    """
    try:
        executable = find_executable_in_dir()
        if executable is None:
            raise ValueError('Caffe executable not found in PATH')
        if not is_pycaffe_in_dir():
            raise ValueError('Pycaffe not found in PYTHONPATH')
        import_pycaffe()
        version, flavor = get_version_and_flavor(executable)
    except:
        print 'A valid Caffe installation was not found on your system.'
        print 'Use the envvar CAFFE_ROOT to indicate a valid installation.'
        raise
    return executable, version, flavor


def find_executable_in_dir(dirname=None):
    """
    Returns the path to the caffe executable at dirname
    If dirname is None, search all directories in sys.path
    Returns None if not found
    """
    if platform.system() == 'Windows':
        exe_name = 'caffe.exe'
    else:
        exe_name = 'caffe'

    if dirname is None:
        dirnames = [path.strip("\"' ") for path in os.environ['PATH'].split(os.pathsep)]
    else:
        dirnames = [dirname]

    for dirname in dirnames:
        path = os.path.join(dirname, exe_name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


def is_pycaffe_in_dir(dirname=None):
    """
    Returns True if you can "import caffe" from dirname
    If dirname is None, search all directories in sys.path
    """
    old_path = sys.path
    if dirname is not None:
        sys.path = [dirname]  # temporarily replace sys.path
    try:
        imp.find_module('caffe')
    except ImportError:
        return False
    finally:
        sys.path = old_path
    return True


def import_pycaffe(dirname=None):
    """
    Imports caffe
    If dirname is not None, prepend it to sys.path first
    """
    if dirname is not None:
        sys.path.insert(0, dirname)
        # Add to PYTHONPATH so that build/tools/caffe is aware of python layers there
        os.environ['PYTHONPATH'] = '%s%s%s' % (
            dirname, os.pathsep, os.environ.get('PYTHONPATH'))

    # Suppress GLOG output for python bindings
    GLOG_minloglevel = os.environ.pop('GLOG_minloglevel', None)
    # Show only "ERROR" and "FATAL"
    os.environ['GLOG_minloglevel'] = '2'

    # for Windows environment, loading h5py before caffe solves the issue mentioned in
    # https://github.com/NVIDIA/DIGITS/issues/47#issuecomment-206292824
    import h5py  # noqa
    try:
        import caffe
    except ImportError:
        print 'Did you forget to "make pycaffe"?'
        raise

    # Strange issue with protocol buffers and pickle - see issue #32
    sys.path.insert(0, os.path.join(
        os.path.dirname(caffe.__file__), 'proto'))

    # Turn GLOG output back on for subprocess calls
    if GLOG_minloglevel is None:
        del os.environ['GLOG_minloglevel']
    else:
        os.environ['GLOG_minloglevel'] = GLOG_minloglevel


def get_version_and_flavor(executable):
    """
    Returns (version, flavor)
    Should be called after import_pycaffe()
    """
    version_string = get_version_from_pycaffe()
    if version_string is None:
        version_string = get_version_from_cmdline(executable)
    if version_string is None:
        version_string = get_version_from_soname(executable)

    if version_string is None:
        raise ValueError('Could not find version information for Caffe build ' +
                         'at "%s". Upgrade your installation' % executable)

    # NOTE: this part is not needed and triggers the bug, so it is commented out
    # version = parse_version(version_string)
    # if parse_version(0, 99, 0) > version > parse_version(0, 9, 0):
    #     flavor = 'NVIDIA'
    #     minimum_version = '0.11.0'
    #     if version < parse_version(minimum_version):
    #         raise ValueError(
    #             'Required version "%s" is greater than "%s". Upgrade your installation.'
    #             % (minimum_version, version_string))
    # else:
    #     flavor = 'BVLC'
    flavor = 'BVLC'

    return version_string, flavor


def get_version_from_pycaffe():
    try:
        from caffe import __version__ as version
        return version
    except ImportError:
        return None


def get_version_from_cmdline(executable):
    command = [executable, '-version']
    p = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if p.wait():
        print p.stderr.read().strip()
        raise RuntimeError('"%s" returned error code %s' % (command, p.returncode))

    pattern = 'version'
    for line in p.stdout:
        if pattern in line:
            return line[line.find(pattern) + len(pattern) + 1:].strip()
    return None


def get_version_from_soname(executable):
    command = ['ldd', executable]
    p = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if p.wait():
        print p.stderr.read().strip()
        raise RuntimeError('"%s" returned error code %s' % (command, p.returncode))

    # Search output for caffe library
    libname = 'libcaffe'
    caffe_line = None
    for line in p.stdout:
        if libname in line:
            caffe_line = line
            break

    if caffe_line is None:
        raise ValueError('libcaffe not found in linked libraries for "%s"'
                         % executable)

    # Read the symlink for libcaffe from ldd output
    symlink = caffe_line.split()[2]
    filename = os.path.basename(os.path.realpath(symlink))

    # parse the version string
    match = re.match(r'%s(.*)\.so\.(\S+)$' % (libname), filename)
    if match:
        return match.group(2)
    else:
        return None


# NOTE: look here, this is a path issue.
# Declare an environment variable, either CAFFE_ROOT or CAFFE_HOME,
# pointing at the compiled Caffe output directory ./Build/x64/Release
if 'CAFFE_ROOT' in os.environ:
    executable, version, flavor = load_from_envvar('CAFFE_ROOT')
elif 'CAFFE_HOME' in os.environ:
    executable, version, flavor = load_from_envvar('CAFFE_HOME')
else:
    executable, version, flavor = load_from_path()

option_list['caffe'] = {
    'executable': executable,
    'version': version,
    'flavor': flavor,
    'multi_gpu': (flavor == 'BVLC' or parse_version(version) >= parse_version(0, 12)),
    'cuda_enabled': (len(device_query.get_devices()) > 0),
}
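Judging from the error message, the string that reaches the version parser is the literal placeholder 'CAFFE_VERSION'. Instead of commenting out the flavor check entirely, one could guard it so that placeholder strings skip parsing. The sketch below is a standalone illustration and not DIGITS code: `looks_like_version` and `choose_flavor` are hypothetical names, and the (0, 9, 0) to (0, 99, 0) range for the NVIDIA flavor is taken from the commented-out block in the listing.

```python
import re


def looks_like_version(text):
    """True if text starts with dotted numbers, e.g. '0.15.14' or '1.0.0-rc3'."""
    return bool(re.match(r"^\d+(\.\d+)*", text.strip()))


def choose_flavor(version_string, default="BVLC"):
    """Return (version_string, flavor), skipping parsing for placeholder strings."""
    if not looks_like_version(version_string):
        # e.g. the unexpanded placeholder 'CAFFE_VERSION'
        return version_string, default
    numbers = [int(n) for n in re.findall(r"\d+", version_string)[:3]]
    numbers += [0] * (3 - len(numbers))  # pad '0.15' to (0, 15, 0)
    version = tuple(numbers)
    if (0, 9, 0) < version < (0, 99, 0):
        return version_string, "NVIDIA"
    return version_string, default
```

With this guard, 'CAFFE_VERSION' falls back to the BVLC flavor without raising, while a string such as '0.15.14' would still be classified as NVIDIA.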

Running it again:
[Figure: screenshot]

Training:
Official tutorial

[Figure: screenshot]

Perfect~
