Learning pycaffe


Environment: Ubuntu 14.04 64-bit, Python 2.7 (not Anaconda), and the latest Caffe at the time of writing.

First, here is how the official Caffe site introduces pycaffe:

caffe.Net is the central interface for loading, configuring, and running models. caffe.Classifier and caffe.Detector provide convenience interfaces for common tasks.
caffe.SGDSolver exposes the solving interface.
caffe.io handles input / output with preprocessing and protocol buffers.
caffe.draw visualizes network architectures.
Caffe blobs are exposed as numpy ndarrays for ease-of-use and efficiency.
As you can see, the pycaffe interface is quite compact.

1. How do you use Caffe from Python?

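Step one is of course to build pycaffe. There are plenty of tutorials for that, so I won't go into it.
Step two is to make pycaffe importable. The simplest way is to add the following code to the Python file that uses Caffe: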

import sys
import os

# Add caffe package
caffe_python_dir = "caffe-master/python"
sys.path.append(caffe_python_dir)
if not os.path.exists(caffe_python_dir):
    print("python caffe not found.")
    sys.exit(-1)
import caffe
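The same idea can be wrapped in a small helper that checks the directory before touching sys.path and avoids adding it twice. This is only a sketch: the function name add_to_sys_path is my own, and "caffe-master/python" is the same placeholder path as above, so point it at the python/ folder of your own build.

```python
import os
import sys


def add_to_sys_path(package_dir):
    """Put package_dir at the front of sys.path; return False if it is missing."""
    package_dir = os.path.abspath(package_dir)
    if not os.path.isdir(package_dir):
        return False
    if package_dir not in sys.path:
        sys.path.insert(0, package_dir)
    return True


# Placeholder location (an assumption); replace with your own build directory.
if add_to_sys_path("caffe-master/python"):
    print("pycaffe directory added; `import caffe` can follow")
else:
    print("python caffe not found.")
```

Inserting at the front of sys.path rather than appending means this build wins over any other caffe package that happens to be installed, which may or may not be what you want.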

Learning the caffe.Net class

As the introduction above shows, caffe.Net is one of Caffe's core classes and wraps most of Caffe's functionality.
Let's start with the output of help(caffe.Net):

Help on Net in module caffe._caffe object:

class Net(Boost.Python.instance)
 |  Method resolution order:
 |      Net
 |      Boost.Python.instance
 |      __builtin__.object
 |
 |  Methods defined here:
 |
 |  __init__(...)
 |      __init__( (object)arg1, (str)arg2, (int)arg3) -> object :
 |
 |          C++ signature :
 |              void* __init__(boost::python::api::object,std::string,int)
 |
 |      __init__( (object)arg1, (str)arg2, (str)arg3, (int)arg4) -> object :
 |
 |          C++ signature :
 |              void* __init__(boost::python::api::object,std::string,std::string,int)
 |
 |  __reduce__ = <unnamed Boost.Python function>(...)
 |
 |  backward = _Net_backward(self, diffs=None, start=None, end=None, **kwargs)
 |      Backward pass: prepare diffs and run the net backward.
 |
 |      Parameters
 |      ----------
 |      diffs : list of diffs to return in addition to bottom diffs.
 |      kwargs : Keys are output blob names and values are diff ndarrays.
 |              If None, top diffs are taken from forward loss.
 |      start : optional name of layer at which to begin the backward pass
 |      end : optional name of layer at which to finish the backward pass
 |          (inclusive)
 |
 |      Returns
 |      -------
 |      outs: {blob name: diff ndarray} dict.
 |
 |  copy_from(...)
 |      copy_from( (Net)arg1, (str)arg2) -> None :
 |
 |          C++ signature :
 |              void copy_from(caffe::Net<float> {lvalue},std::string)
 |
 |  forward = _Net_forward(self, blobs=None, start=None, end=None, **kwargs)
 |      Forward pass: prepare inputs and run the net forward.
 |
 |      Parameters
 |      ----------
 |      blobs : list of blobs to return in addition to output blobs.
 |      kwargs : Keys are input blob names and values are blob ndarrays.
 |               For formatting inputs for Caffe, see Net.preprocess().
 |               If None, input is taken from data layers.
 |      start : optional name of layer at which to begin the forward pass
 |      end : optional name of layer at which to finish the forward pass
 |            (inclusive)
 |
 |      Returns
 |      -------
 |      outs : {blob name: blob ndarray} dict.
 |
 |  forward_all = _Net_forward_all(self, blobs=None, **kwargs)
 |      Run net forward in batches.
 |
 |      Parameters
 |      ----------
 |      blobs : list of blobs to extract as in forward()
 |      kwargs : Keys are input blob names and values are blob ndarrays.
 |               Refer to forward().
 |
 |      Returns
 |      -------
 |      all_outs : {blob name: list of blobs} dict.
 |
 |  forward_backward_all = _Net_forward_backward_all(self, blobs=None, diffs=None, **kwargs)
 |      Run net forward + backward in batches.
 |
 |      Parameters
 |      ----------
 |      blobs: list of blobs to extract as in forward()
 |      diffs: list of diffs to extract as in backward()
 |      kwargs: Keys are input (for forward) and output (for backward) blob names
 |              and values are ndarrays. Refer to forward() and backward().
 |              Prefilled variants are called for lack of input or output blobs.
 |
 |      Returns
 |      -------
 |      all_blobs: {blob name: blob ndarray} dict.
 |      all_diffs: {blob name: diff ndarray} dict.
 |
 |  reshape(...)
 |      reshape( (Net)arg1) -> None :
 |
 |          C++ signature :
 |              void reshape(caffe::Net<float> {lvalue})
 |
 |  save(...)
 |      save( (Net)arg1, (str)arg2) -> None :
 |
 |          C++ signature :
 |              void save(caffe::Net<float>,std::string)
 |
 |  set_input_arrays = _Net_set_input_arrays(self, data, labels)
 |      Set input arrays of the in-memory MemoryDataLayer.
 |      (Note: this is only for networks declared with the memory data layer.)
 |
 |  share_with(...)
 |      share_with( (Net)arg1, (Net)arg2) -> None :
 |
 |          C++ signature :
 |              void share_with(caffe::Net<float> {lvalue},caffe::Net<float> const*)
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  blob_loss_weights
 |      An OrderedDict (bottom to top, i.e., input to output) of network
 |      blob loss weights indexed by name
 |
 |  blobs
 |      An OrderedDict (bottom to top, i.e., input to output) of network
 |      blobs indexed by name
 |
 |  bottom_names
 |
 |  inputs
 |
 |  layers
 |
 |  outputs
 |
 |  params
 |      An OrderedDict (bottom to top, i.e., input to output) of network
 |      parameters indexed by name; each is a list of multiple blobs (e.g.,
 |      weights and biases)
 |
 |  top_names
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from Boost.Python.instance:
 |
 |  __dict__
 |
 |  __weakref__
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes inherited from Boost.Python.instance:
 |
 |  __new__ = <built-in method __new__ of Boost.Python.class object>
 |      T.__new__(S, ...) -> a new object with type S, a subtype of T

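The help text is fairly detailed. __init__ is the constructor. In the form used here it takes three arguments: the path to the network definition file, the path to the model, and the run mode, TRAIN or TEST. (The help output also lists a shorter form with only one string argument and the mode.)
In TEST mode you have to set the input data yourself first.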

import numpy as np
from PIL import Image
import caffe

# load image, switch to BGR, subtract mean, and make dims C x H x W for Caffe
im = Image.open('VOC2012/JPEGImages/2007_000129.jpg')
in_ = np.array(im, dtype=np.float32)
in_ = in_[:,:,::-1]
in_ -= np.array((104.00698793,116.66876762,122.67891434))
in_ = in_.transpose((2,0,1))

# load net
net = caffe.Net('deploy.prototxt', 'your.caffemodel', caffe.TEST)

# shape for input (data blob is N x C x H x W), set data
net.blobs['data'].reshape(1, *in_.shape)
net.blobs['data'].data[...] = in_

Learning caffe.io

Help on module caffe.io in caffe:

NAME
    caffe.io

FILE
    caffe-master/python/caffe/io.py

CLASSES
    Transformer

    class Transformer
     |  Transform input for feeding into a Net.
     |
     |  Note: this is mostly for illustrative purposes and it is likely better
     |  to define your own input preprocessing routine for your needs.
     |
     |  Parameters
     |  ----------
     |  net : a Net for which the input should be prepared
     |
     |  Methods defined here:
     |
     |  __init__(self, inputs)
     |
     |  deprocess(self, in_, data)
     |      Invert Caffe formatting; see preprocess().
     |
     |  preprocess(self, in_, data)
     |      Format input for Caffe:
     |      - convert to single
     |      - resize to input dimensions (preserving number of channels)
     |      - transpose dimensions to K x H x W
     |      - reorder channels (for instance color to BGR)
     |      - scale raw input (e.g. from [0, 1] to [0, 255] for ImageNet models)
     |      - subtract mean
     |      - scale feature
     |
     |      Parameters
     |      ----------
     |      in_ : name of input blob to preprocess for
     |      data : (H' x W' x K) ndarray
     |
     |      Returns
     |      -------
     |      caffe_in : (K x H x W) ndarray for input to a Net
     |
     |  set_channel_swap(self, in_, order)
     |      Set the input channel order for e.g. RGB to BGR conversion
     |      as needed for the reference ImageNet model.
     |      N.B. this assumes the channels are the first dimension AFTER transpose.
     |
     |      Parameters
     |      ----------
     |      in_ : which input to assign this channel order
     |      order : the order to take the channels.
     |          (2,1,0) maps RGB to BGR for example.
     |
     |  set_input_scale(self, in_, scale)
     |      Set the scale of preprocessed inputs s.t. the blob = blob * scale.
     |      N.B. input_scale is done AFTER mean subtraction and other preprocessing
     |      while raw_scale is done BEFORE.
     |
     |      Parameters
     |      ----------
     |      in_ : which input to assign this scale factor
     |      scale : scale coefficient
     |
     |  set_mean(self, in_, mean)
     |      Set the mean to subtract for centering the data.
     |
     |      Parameters
     |      ----------
     |      in_ : which input to assign this mean.
     |      mean : mean ndarray (input dimensional or broadcastable)
     |
     |  set_raw_scale(self, in_, scale)
     |      Set the scale of raw features s.t. the input blob = input * scale.
     |      While Python represents images in [0, 1], certain Caffe models
     |      like CaffeNet and AlexNet represent images in [0, 255] so the raw_scale
     |      of these models must be 255.
     |
     |      Parameters
     |      ----------
     |      in_ : which input to assign this scale factor
     |      scale : scale coefficient
     |
     |  set_transpose(self, in_, order)
     |      Set the input channel order for e.g. RGB to BGR conversion
     |      as needed for the reference ImageNet model.
     |
     |      Parameters
     |      ----------
     |      in_ : which input to assign this channel order
     |      order : the order to transpose the dimensions

FUNCTIONS
    array_to_blobproto(arr, diff=None)
        Converts a N-dimensional array to blob proto. If diff is given, also
        convert the diff. You need to make sure that arr and diff have the same
        shape, and this function does not do sanity check.

    array_to_datum(arr, label=0)
        Converts a 3-dimensional array to datum. If the array has dtype uint8,
        the output data will be encoded as a string. Otherwise, the output data
        will be stored in float format.

    arraylist_to_blobprotovecor_str(arraylist)
        Converts a list of arrays to a serialized blobprotovec, which could be
        then passed to a network for processing.

    blobproto_to_array(blob, return_diff=False)
        Convert a blob proto to an array. In default, we will just return the data,
        unless return_diff is True, in which case we will return the diff.

    blobprotovector_str_to_arraylist(str)
        Converts a serialized blobprotovec to a list of arrays.

    datum_to_array(datum)
        Converts a datum to an array. Note that the label is not returned,
        as one can easily get it by calling datum.label.

    load_image(filename, color=True)
        Load an image converting from grayscale or alpha as needed.

        Parameters
        ----------
        filename : string
        color : boolean
            flag for color format. True (default) loads as RGB while False
            loads as intensity (if image is already grayscale).

        Returns
        -------
        image : an image with type np.float32 in range [0, 1]
            of size (H x W x 3) in RGB or
            of size (H x W x 1) in grayscale.

    oversample(images, crop_dims)
        Crop images into the four corners, center, and their mirrored versions.

        Parameters
        ----------
        image : iterable of (H x W x K) ndarrays
        crop_dims : (height, width) tuple for the crops.

        Returns
        -------
        crops : (10*N x H x W x K) ndarray of crops for number of inputs N.

    resize_image(im, new_dims, interp_order=1)
        Resize an image array with interpolation.

        Parameters
        ----------
        im : (H x W x K) ndarray
        new_dims : (height, width) tuple of new dimensions.
        interp_order : interpolation order, default is linear.

        Returns
        -------
        im : resized ndarray with shape (new_dims[0], new_dims[1], K)

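The Transformer class mainly converts data in other formats into the format Caffe expects. Note that an image read with PIL is normally in RGB order, whereas Caffe reads images through OpenCV, which gives BGR, so the channel order has to be swapped.
(It seems I have just copied some help output without contributing much. Oh well, on with the copying.)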
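To make the order of operations in the preprocess() docstring concrete, here is a plain NumPy sketch of the same steps, minus the resize. It is not the Transformer implementation: preprocess_sketch, its parameter names, and the rounded mean values are my own assumptions for illustration.

```python
import numpy as np


def preprocess_sketch(image, transpose=(2, 0, 1), channel_swap=(2, 1, 0),
                      raw_scale=255.0, mean=None, input_scale=None):
    """Apply the documented steps, minus resizing, to one H x W x K image."""
    out = image.astype(np.float32, copy=True)   # convert to single precision
    out = out.transpose(transpose)              # H x W x K -> K x H x W
    out = out[list(channel_swap), :, :]         # reorder channels, e.g. RGB -> BGR
    if raw_scale is not None:
        out *= raw_scale                        # applied BEFORE mean subtraction
    if mean is not None:
        out -= mean
    if input_scale is not None:
        out *= input_scale                      # applied AFTER mean subtraction
    return out


# A 2 x 3 RGB image in [0, 1], the range documented for caffe.io.load_image.
rgb = np.zeros((2, 3, 3), dtype=np.float32)
rgb[..., 0] = 1.0                               # pure red

# Illustrative per-channel mean in BGR order, shaped to broadcast over K x H x W.
mean_bgr = np.array([104.0, 117.0, 123.0], dtype=np.float32).reshape(3, 1, 1)

blob = preprocess_sketch(rgb, mean=mean_bgr)
print(blob.shape)       # (3, 2, 3)
print(blob[:, 0, 0])    # red lands in the last channel: 255 - 123 = 132
```

The channel swap indexes the first axis, which only makes sense after the transpose; that is the same caveat the set_channel_swap docstring spells out.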

caffe.SGDSolver

help(caffe.SGDSolver)

Help on class SGDSolver in module caffe._caffe:

class SGDSolver(Solver)
 |  Method resolution order:
 |      SGDSolver
 |      Solver
 |      Boost.Python.instance
 |      __builtin__.object
 |
 |  Methods defined here:
 |
 |  __init__(...)
 |      __init__( (object)arg1, (str)arg2) -> None :
 |
 |          C++ signature :
 |              void __init__(_object*,std::string)
 |
 |  __reduce__ = <unnamed Boost.Python function>(...)
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |
 |  __instance_size__ = 32
 |
 |  ----------------------------------------------------------------------
 |  Methods inherited from Solver:
 |
 |  restore(...)
 |      restore( (Solver)arg1, (str)arg2) -> None :
 |
 |          C++ signature :
 |              void restore(caffe::Solver<float> {lvalue},char const*)
 |
 |  snapshot(...)
 |      snapshot( (Solver)arg1) -> None :
 |
 |          C++ signature :
 |              void snapshot(caffe::Solver<float> {lvalue})
 |
 |  solve(...)
 |      solve( (Solver)arg1 [, (str)arg2]) -> None :
 |
 |          C++ signature :
 |              void solve(caffe::Solver<float> {lvalue} [,char const*])
 |
 |  step(...)
 |      step( (Solver)arg1, (int)arg2) -> None :
 |
 |          C++ signature :
 |              void step(caffe::Solver<float> {lvalue},int)
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from Solver:
 |
 |  iter
 |
 |  net
 |
 |  test_nets
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from Boost.Python.instance:
 |
 |  __dict__
 |
 |  __weakref__
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes inherited from Boost.Python.instance:
 |
 |  __new__ = <built-in method __new__ of Boost.Python.class object>
 |      T.__new__(S, ...) -> a new object with type S, a subtype of T

Everything related to the solver lives in this class.
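The help output lists step(int) and an iter descriptor, which suggests a simple pattern: advance training a few iterations at a time and look at the state in between. The sketch below shows that loop. FakeSolver is a stand-in I made up so the example runs without Caffe, and step_in_chunks is likewise a hypothetical helper; with a real build you would presumably pass a caffe.SGDSolver instance instead.

```python
class FakeSolver(object):
    """Stand-in for caffe.SGDSolver, for illustration only; it trains nothing."""

    def __init__(self):
        self.iter = 0

    def step(self, n):
        # The real solver would run n training iterations here.
        self.iter += n


def step_in_chunks(solver, total_iters, chunk):
    """Call solver.step() until total_iters are done; return iter after each call."""
    checkpoints = []
    done = 0
    while done < total_iters:
        n = min(chunk, total_iters - done)
        solver.step(n)
        done += n
        checkpoints.append(solver.iter)
    return checkpoints


print(step_in_chunks(FakeSolver(), total_iters=250, chunk=100))  # [100, 200, 250]
```

Between calls to step() is where you could inspect solver.net, for example to log a loss blob, instead of handing the whole run to solve().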
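Finally, a complaint about CSDN (not for the first time): I accidentally deleted the article and there was no way to recover it, which left me truly speechless.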

1 0