1. Basic model components
To train a Caffe model you need to configure two files covering two parts, the network definition and the solver parameters, which correspond to the *.prototxt and *_solver.prototxt files respectively.
Anatomy of the Caffe model files:
Building a leveldb of preprocessed images
Input: a batch of images and their labels
Output: a leveldb
The command includes the following information:
- convert_imageset (the executable that builds the leveldb)
- train/ (the directory holding the JPEG or other-format images to process)
- label.txt (the image file names and their labels)
- the name of the output leveldb folder
- CPU/GPU (whether to run the code on the CPU or the GPU)
CNN network configuration files
- Imagenet_solver.prototxt (configuration of the global solver parameters)
- Imagenet.prototxt (configuration of the training network)
- Imagenet_val.prototxt (configuration of the test network)
Network model:
DATA: generally comes in two types, a training-data layer and a test-data layer. This is the input layer; it specifies source (the path to the data), batch_size (the mini-batch size), and scale, which maps the data into [0, 1): 0.00390625 is exactly 1/256.
Training data layer:
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param { scale: 0.00390625 }
  data_param {
    source: "examples/mnist/mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
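To make the scale field concrete, here is a tiny plain-Python sketch (not Caffe code) of what the transform does to raw 8-bit pixel values:

```python
# transform_param { scale: 0.00390625 } multiplies every raw pixel value
# (0..255) by 1/256, mapping it into the half-open interval [0, 1).
SCALE = 0.00390625  # exactly 1/256

def scale_pixels(pixels):
    """Apply Caffe's scale transform to a list of raw pixel values."""
    return [p * SCALE for p in pixels]

row = [0, 128, 255]
print(scale_pixels(row))  # [0.0, 0.5, 0.99609375]
```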
Test data layer:
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TEST }
  transform_param { scale: 0.00390625 }
  data_param {
    source: "examples/mnist/mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
CONVOLUTION: the convolution layer. The two lr_mult values, 1 and 2 (blobs_lr in older Caffe versions), are the learning-rate multipliers for the weights and the bias respectively: the weights use the base learning rate defined in solver.prototxt, while the bias uses twice that rate, which usually gives good convergence.
num_output is the number of filters, kernel_size is the filter size, stride is the step size, and weight_filler is the weight-initialization type.
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 }  # weight learning-rate multiplier
  param { lr_mult: 2 }  # bias learning-rate multiplier, usually twice the weights'
  convolution_param {
    num_output: 20   # number of filters
    kernel_size: 5
    stride: 1        # step size
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
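As a sanity check on kernel_size and stride, the spatial size of a convolution's output can be computed by hand. A small sketch (the helper name is made up):

```python
def conv_output_size(input_size, kernel_size, stride=1, pad=0):
    """Output height/width of a convolution: (in + 2*pad - kernel) / stride + 1."""
    return (input_size + 2 * pad - kernel_size) // stride + 1

# conv1 above on a 28x28 MNIST image: 20 filters of size 5, stride 1,
# so the layer outputs 20 feature maps of size 24x24.
print(conv_output_size(28, kernel_size=5, stride=1))  # 24
```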
POOLING: the pooling layer
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
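A minimal sketch of what this MAX pooling layer computes, using a plain 2D list instead of a Caffe blob (hypothetical helper name):

```python
def max_pool_2x2(matrix):
    """2x2 max pooling with stride 2 over a 2D list, as in pool1 above."""
    h, w = len(matrix), len(matrix[0])
    return [[max(matrix[r][c], matrix[r][c + 1],
                 matrix[r + 1][c], matrix[r + 1][c + 1])
             for c in range(0, w - 1, 2)]
            for r in range(0, h - 1, 2)]

m = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
print(max_pool_2x2(m))  # [[6, 8], [14, 16]]
```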
INNER_PRODUCT: despite the name, this is actually the fully connected layer.
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 500
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" }
  }
}
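A fully connected layer reduces to a matrix-vector product plus a bias; a sketch with made-up toy values:

```python
def inner_product(x, weights, bias):
    """A fully connected layer is just y = W x + b."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
            for row, b in zip(weights, bias)]

x = [1.0, 2.0]
W = [[0.5, 0.5],    # 2 outputs, 2 inputs
     [1.0, -1.0]]
b = [0.0, 0.5]
print(inner_product(x, W, b))  # [1.5, -0.5]
```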
RELU: the activation function, the nonlinearity max(0, x); usually paired with a CONVOLUTION layer.
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
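The same max(0, x) rule in a couple of lines of plain Python:

```python
def relu(values):
    """ReLU applied elementwise: negatives become 0, positives pass through."""
    return [max(0.0, v) for v in values]

print(relu([-2.0, 0.0, 3.5]))  # [0.0, 0.0, 3.5]
```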
SOFTMAX: the loss layer; SoftmaxWithLoss combines a softmax with the multinomial logistic loss.
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
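What SoftmaxWithLoss computes can be sketched in plain Python: a numerically stable softmax over the class scores, then the cross-entropy of the true label (helper names are illustrative):

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of class scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def softmax_loss(z, label):
    """Cross-entropy of the true class, as in SoftmaxWithLoss."""
    return -math.log(softmax(z)[label])

scores = [2.0, 1.0, 0.1]
print(softmax(scores))           # probabilities summing to 1
print(softmax_loss(scores, 0))   # small loss: class 0 already scores highest
```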
Solver configuration file:
The *_solver.prototxt file defines the parameters needed during training, such as the learning rate, the weight-decay coefficient, the number of iterations, and whether to use the GPU or the CPU.
# The train/test net protocol buffer definition
net: "examples/mnist/lenet_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
# The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 10000
# snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
# solver mode: CPU or GPU
solver_mode: GPU
device_id: 0  # under the cmdcaffe interface GPU ids start at 0; with a single GPU, use device_id: 0
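With lr_policy: "inv", Caffe decays the learning rate as base_lr * (1 + gamma * iter)^(-power). Plugging in the values from the solver above:

```python
def inv_lr(base_lr, gamma, power, iteration):
    """Learning rate under Caffe's "inv" policy."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)

# base_lr: 0.01, gamma: 0.0001, power: 0.75 from the solver above
for it in (0, 5000, 10000):
    print(it, inv_lr(0.01, 0.0001, 0.75, it))  # the rate shrinks as training proceeds
```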
The trained model is saved as a *.caffemodel file for later use.
A complete workflow looks like this:
Steps
- Data preparation
Prepare three datasets:
- Training Set: used to train the network
- Validation Set: used to test the network's accuracy during training
- Test Set: used to measure the final accuracy once training is complete
- Build the lmdb/leveldb files. Caffe supports three input data formats: images, leveldb, and lmdb.
Although lmdb consumes about 1.1 times the memory of leveldb, it is 10% to 15% faster, and, more importantly, lmdb allows multiple training models to read the same dataset at the same time.
For these reasons lmdb has replaced leveldb as Caffe's default dataset format.
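The three-way split above can be sketched with the standard library; the fractions and the helper name are illustrative, not part of Caffe:

```python
import random

def split_dataset(items, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle and split a list of samples into train/val/test sets."""
    items = list(items)
    random.Random(seed).shuffle(items)  # deterministic shuffle for reproducibility
    n = len(items)
    n_val = int(n * val_frac)
    n_test = int(n * test_frac)
    return (items[n_val + n_test:],       # training set
            items[:n_val],                # validation set
            items[n_val:n_val + n_test])  # test set

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 80 10 10
```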
Running .sh scripts on Windows
Install Cygwin.
It can be installed through its setup program; if a package turns out to be missing, rerun setup and download it. This generally works; just fix problems as they come up.
Use a .bat file instead for testing.
- Download the MNIST dataset from http://yann.lecun.com/exdb/mnist/ and extract it to C:\caffe-master\data\mnist.
In the Caffe root directory, create a create_mnist.bat file containing the script below. One possible pitfall: the extracted files may be named train-images-idx3-ubyte
instead of train-images.idx3-ubyte,
so adjust the names accordingly.
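The renaming pitfall can also be handled with a short Python sketch; the helper name is made up, and the demo runs on a throwaway temp directory rather than on data\mnist:

```python
import os
import tempfile

def normalize_mnist_name(directory, dotted, dashed):
    """Rename the dashed spelling (train-images-idx3-ubyte) to the dotted one
    (train-images.idx3-ubyte) that the .bat script expects, if necessary."""
    dotted_path = os.path.join(directory, dotted)
    dashed_path = os.path.join(directory, dashed)
    if not os.path.exists(dotted_path) and os.path.exists(dashed_path):
        os.rename(dashed_path, dotted_path)
    return os.path.exists(dotted_path)

# demo on a temp directory containing only the dashed spelling
d = tempfile.mkdtemp()
open(os.path.join(d, 'train-images-idx3-ubyte'), 'wb').close()
ok = normalize_mnist_name(d, 'train-images.idx3-ubyte', 'train-images-idx3-ubyte')
print(ok)  # True
```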
.\Build\x64\Release\convert_mnist_data.exe .\data\mnist\train-images.idx3-ubyte .\data\mnist\train-labels.idx1-ubyte .\examples\mnist\mnist_train_lmdb
echo.
.\Build\x64\Release\convert_mnist_data.exe .\data\mnist\t10k-images.idx3-ubyte .\data\mnist\t10k-labels.idx1-ubyte .\examples\mnist\mnist_test_lmdb
pause
Then double-click the script to run it; the corresponding lmdb data files are generated under examples\mnist in the Caffe directory.
- In the Caffe root directory, create train_mnist.bat and put the following script in it:
.\Build\x64\Release\caffe.exe train --solver=.\examples\mnist\lenet_solver.prototxt
pause
Then double-click it to run; training starts, and when it finishes you get the final accuracy and loss.
Next, install DIGITS:
Just follow the instructions here:
https://github.com/NVIDIA/DIGITS/blob/digits-5.0/docs/BuildDigitsWindows.md
Finally, run python -m digits in the DIGITS directory,
and that is it.
Quite a few bugs came up along the way.
Bug 1
pycaffe cannot be found.
This usually means Python cannot locate the caffe package; simply copy the CAFFE_ROOT\Build\x64\Release\pycaffe\caffe folder into Anaconda's site-packages directory.
Bug 2
pkg_resources._vendor.packaging.version.InvalidVersion: Invalid version: 'CAFFE_VERSION'
Find caffe.py under \DIGITS-master\digits\config
and modify it to match the version below (the key change is hard-coding flavor = 'BVLC' in get_version_and_flavor instead of deriving the flavor from the version string).
from __future__ import absolute_import

import imp
import os
import platform
import re
import subprocess
import sys

from . import option_list
from digits import device_query
from digits.utils import parse_version


def load_from_envvar(envvar):
    """
    Load information from an installation indicated by an environment variable
    """
    value = os.environ[envvar].strip().strip("\"' ")

    if platform.system() == 'Windows':
        executable_dir = os.path.join(value)
        python_dir = os.path.join(value, 'pycaffe')
    else:
        executable_dir = os.path.join(value, 'build', 'tools')
        python_dir = os.path.join(value, 'python')

    try:
        executable = find_executable_in_dir(executable_dir)
        if executable is None:
            raise ValueError('Caffe executable not found at "%s"' % executable_dir)
        if not is_pycaffe_in_dir(python_dir):
            raise ValueError('Pycaffe not found in "%s"' % python_dir)
        import_pycaffe(python_dir)
        version, flavor = get_version_and_flavor(executable)
    except:
        print ('"%s" from %s does not point to a valid installation of Caffe.'
               % (value, envvar))
        print 'Use the envvar CAFFE_ROOT to indicate a valid installation.'
        raise
    return executable, version, flavor


def load_from_path():
    """
    Load information from an installation on standard paths (PATH and PYTHONPATH)
    """
    try:
        executable = find_executable_in_dir()
        if executable is None:
            raise ValueError('Caffe executable not found in PATH')
        if not is_pycaffe_in_dir():
            raise ValueError('Pycaffe not found in PYTHONPATH')
        import_pycaffe()
        version, flavor = get_version_and_flavor(executable)
    except:
        print 'A valid Caffe installation was not found on your system.'
        print 'Use the envvar CAFFE_ROOT to indicate a valid installation.'
        raise
    return executable, version, flavor


def find_executable_in_dir(dirname=None):
    """
    Returns the path to the caffe executable at dirname
    If dirname is None, search all directories in sys.path
    Returns None if not found
    """
    if platform.system() == 'Windows':
        exe_name = 'caffe.exe'
    else:
        exe_name = 'caffe'

    if dirname is None:
        dirnames = [path.strip("\"' ") for path in os.environ['PATH'].split(os.pathsep)]
    else:
        dirnames = [dirname]

    for dirname in dirnames:
        path = os.path.join(dirname, exe_name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


def is_pycaffe_in_dir(dirname=None):
    """
    Returns True if you can "import caffe" from dirname
    If dirname is None, search all directories in sys.path
    """
    old_path = sys.path
    if dirname is not None:
        sys.path = [dirname]
    try:
        imp.find_module('caffe')
    except ImportError:
        return False
    finally:
        sys.path = old_path
    return True


def import_pycaffe(dirname=None):
    """
    Imports caffe
    If dirname is not None, prepend it to sys.path first
    """
    if dirname is not None:
        sys.path.insert(0, dirname)
        os.environ['PYTHONPATH'] = '%s%s%s' % (
            dirname, os.pathsep, os.environ.get('PYTHONPATH'))

    # suppress GLOG output while importing caffe, then restore it afterwards
    GLOG_minloglevel = os.environ.pop('GLOG_minloglevel', None)
    os.environ['GLOG_minloglevel'] = '2'

    import h5py  # loading h5py before caffe works around a Windows import issue
    try:
        import caffe
    except ImportError:
        print 'Did you forget to "make pycaffe"?'
        raise

    sys.path.insert(0, os.path.join(
        os.path.dirname(caffe.__file__), 'proto'))

    if GLOG_minloglevel is None:
        del os.environ['GLOG_minloglevel']
    else:
        os.environ['GLOG_minloglevel'] = GLOG_minloglevel


def get_version_and_flavor(executable):
    """
    Returns (version, flavor)
    Should be called after import_pycaffe()
    """
    version_string = get_version_from_pycaffe()
    if version_string is None:
        version_string = get_version_from_cmdline(executable)
    if version_string is None:
        version_string = get_version_from_soname(executable)
    if version_string is None:
        raise ValueError('Could not find version information for Caffe build ' +
                         'at "%s". Upgrade your installation' % executable)
    flavor = 'BVLC'  # hard-coded instead of parsed from the version string
    return version_string, flavor


def get_version_from_pycaffe():
    try:
        from caffe import __version__ as version
        return version
    except ImportError:
        return None


def get_version_from_cmdline(executable):
    command = [executable, '-version']
    p = subprocess.Popen(command,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    if p.wait():
        print p.stderr.read().strip()
        raise RuntimeError('"%s" returned error code %s' % (command, p.returncode))

    pattern = 'version'
    for line in p.stdout:
        if pattern in line:
            return line[line.find(pattern) + len(pattern) + 1:].strip()
    return None


def get_version_from_soname(executable):
    command = ['ldd', executable]
    p = subprocess.Popen(command,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    if p.wait():
        print p.stderr.read().strip()
        raise RuntimeError('"%s" returned error code %s' % (command, p.returncode))

    # search the ldd output for the caffe library
    libname = 'libcaffe'
    caffe_line = None
    for line in p.stdout:
        if libname in line:
            caffe_line = line
            break
    if caffe_line is None:
        raise ValueError('libcaffe not found in linked libraries for "%s"' % executable)

    # read the symlink target and parse the version from the soname
    symlink = caffe_line.split()[2]
    filename = os.path.basename(os.path.realpath(symlink))
    match = re.match(r'%s(.*)\.so\.(\S+)$' % (libname), filename)
    if match:
        return match.group(2)
    else:
        return None


if 'CAFFE_ROOT' in os.environ:
    executable, version, flavor = load_from_envvar('CAFFE_ROOT')
elif 'CAFFE_HOME' in os.environ:
    executable, version, flavor = load_from_envvar('CAFFE_HOME')
else:
    executable, version, flavor = load_from_path()

option_list['caffe'] = {
    'executable': executable,
    'version': version,
    'flavor': flavor,
    'multi_gpu': (flavor == 'BVLC' or parse_version(version) >= parse_version(0, 12)),
    'cuda_enabled': (len(device_query.get_devices()) > 0),
}
Run it again:
For training, just follow the official tutorial.
Works perfectly!