caffe入门----模型

来源:互联网 发布:免费音乐制作软件 知乎 编辑:程序博客网 时间:2024/06/06 06:35

一个深度学习系统最核心的两个方面就是数据和模型啦~数据的读取我们就略过了((●′ω`●) 自己看看把,注意bp的时候参数不变就好了)
一个深度学习模型通常由三个部分参数组成:

  • 可学习参数:
    模型训练就是训练这些可学习参数啦,又称可训练参数、神经网络权系数等。在内存中使用Blob对象保持,必要时以二进制Protobuffer文件(*.caffemodel)形态序列化并存储于磁盘上。
  • 结构参数
    包括卷积层/全连接层/下采样层数目、卷积核数目、卷积核大小等描述网络结构的参数。这个部分就是凸显一个思想了。看你网络怎么设计了。结构参数使用ProtoBuffer文本格式(*.prototxt)描述,网络初始化通过该描述文件构建Net对象、Layer对象等。
  • 训练超参数
    同样使用ProtoBuffer文本格式(*.prototxt)描述,训练阶段利用该描述文件构建求解(Solver)对象,该对象按照一定规则在训练网路时自动调节这些超参数值。

结构参数prototxt大概模样:

name: "LeNet"layer {  name: "mnist"  type: "Data"  top: "data"  top: "label"  include {    phase: TRAIN  }  transform_param {    scale: 0.00390625  }  data_param {    source: "examples/mnist/mnist_train_lmdb"    batch_size: 64    backend: LMDB  }}layer {  name: "mnist"  type: "Data"  top: "data"  top: "label"  include {    phase: TEST  }  transform_param {    scale: 0.00390625  }  data_param {    source: "examples/mnist/mnist_test_lmdb"    batch_size: 100    backend: LMDB  }}layer {  name: "conv1"  type: "Convolution"  bottom: "data"  top: "conv1"  param {    lr_mult: 1  }  param {    lr_mult: 2  }  convolution_param {    num_output: 20    kernel_size: 5    stride: 1    weight_filler {      type: "xavier"    }    bias_filler {      type: "constant"    }  }}layer {  name: "pool1"  type: "Pooling"  bottom: "conv1"  top: "pool1"  pooling_param {    pool: MAX    kernel_size: 2    stride: 2  }}layer {  name: "conv2"  type: "Convolution"  bottom: "pool1"  top: "conv2"  param {    lr_mult: 1  }  param {    lr_mult: 2  }  convolution_param {    num_output: 50    kernel_size: 5    stride: 1    weight_filler {      type: "xavier"    }    bias_filler {      type: "constant"    }  }}layer {  name: "pool2"  type: "Pooling"  bottom: "conv2"  top: "pool2"  pooling_param {    pool: MAX    kernel_size: 2    stride: 2  }}layer {  name: "ip1"  type: "InnerProduct"  bottom: "pool2"  top: "ip1"  param {    lr_mult: 1  }  param {    lr_mult: 2  }  inner_product_param {    num_output: 500    weight_filler {      type: "xavier"    }    bias_filler {      type: "constant"    }  }}layer {  name: "relu1"  type: "ReLU"  bottom: "ip1"  top: "ip1"}layer {  name: "ip2"  type: "InnerProduct"  bottom: "ip1"  top: "ip2"  param {    lr_mult: 1  }  param {    lr_mult: 2  }  inner_product_param {    num_output: 10    weight_filler {      type: "xavier"    }    bias_filler {      type: "constant"    }  }}layer {  name: "accuracy"  type: "Accuracy"  bottom: "ip2"  bottom: "label"  top: "accuracy"  include {    phase: TEST  }}layer {  name: "loss"  type: "SoftmaxWithLoss"  bottom: "ip2"  bottom: "label"  top: "loss"}

训练超参数大概长这样:

# The train/test net protocol buffer definitionnet: "examples/mnist/lenet_train_test.prototxt"# test_iter specifies how many forward passes the test should carry out.# In the case of MNIST, we have test batch size 100 and 100 test iterations,# covering the full 10,000 testing images.test_iter: 100# Carry out testing every 500 training iterations.test_interval: 500# The base learning rate, momentum and the weight decay of the network.base_lr: 0.01momentum: 0.9weight_decay: 0.0005# The learning rate policylr_policy: "inv"gamma: 0.0001power: 0.75# Display every 100 iterationsdisplay: 100# The maximum number of iterationsmax_iter: 10000# snapshot intermediate resultssnapshot: 5000snapshot_prefix: "examples/mnist/lenet"# solver mode: CPU or GPUsolver_mode: GPU

参考:
赵永科 《深度学习 21天实战Caffe》

0 0