Caffe Tutorial in Brief



1. First, a look at Blobs, Layers, and Nets

Caffe stores, communicates, and manipulates the information as blobs: the blob is the standard array and unified memory interface for the framework. The layer comes next as the foundation of both model and computation. The net follows as the collection and connection of layers. The details of blob describe how information is stored and communicated in and across layers and nets.

A Blob holds two parts: the data itself, and the diff, i.e. the gradient propagated backward.

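In pycaffe both parts are exposed as NumPy arrays, which makes this easy to see. A minimal sketch, in which the prototxt path and the blob name 'conv1' are placeholders:

```python
import caffe

net = caffe.Net('deploy.prototxt', caffe.TEST)  # placeholder model definition
acts = net.blobs['conv1'].data    # the data part: values from the forward pass
grads = net.blobs['conv1'].diff   # the diff part: gradients from the backward pass
```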

…Then these gradients are scaled by the learning rate α and the update to subtract is stored in each parameter Blob’s diff field. Finally, the Blob::Update method is called on each parameter blob, which performs the final update (subtracting the Blob’s diff from its data).
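For plain SGD the net effect per parameter blob is roughly the following; this is a pycaffe-flavored sketch of the bookkeeping just described, not the solver's actual C++ code path (`net` and `lr` are assumed to exist):

```python
# After the backward pass, each parameter blob's diff holds its gradient.
for name, params in net.params.items():  # params is e.g. [weights, bias]
    for p in params:
        p.diff[...] *= lr      # scale the gradient by the learning rate alpha
        p.data[...] -= p.diff  # Blob::Update: subtract diff from data
```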


Layers have quite a lot of jobs. The layer is the essence of a model and the fundamental unit of computation. Layers convolve filters, pool, take inner products, apply nonlinearities like rectified-linear and sigmoid and other elementwise transformations, normalize, load data, and compute losses like softmax and hinge. See the layer catalogue for all operations. Most of the types needed for state-of-the-art deep learning tasks are there.

The layer catalogue covers all the layer types; I'll read it more closely later.


Every layer does three things: setup, forward, and backward. Moreover, the Forward and Backward functions each have both a GPU and a CPU implementation.


More specifically, there will be two Forward and Backward functions implemented, one for CPU and one for GPU. If you do not implement a GPU version, the layer will fall back to the CPU functions as a backup option. This may come in handy if you would like to do quick experiments, although it may come with an additional data transfer cost (its inputs will be copied from GPU to CPU, and its outputs will be copied back from CPU to GPU).
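The same Setup/Forward/Backward contract shows up in pycaffe's Python layers, which are a quick way to see the three jobs without reading the C++. A minimal pass-through layer as a sketch (the class name is mine):

```python
import caffe

class PassThrough(caffe.Layer):  # hypothetical illustrative layer
    def setup(self, bottom, top):
        pass  # one-time initialization, e.g. parameter checks

    def reshape(self, bottom, top):
        top[0].reshape(*bottom[0].data.shape)  # size the output like the input

    def forward(self, bottom, top):
        top[0].data[...] = bottom[0].data  # compute the output from the input

    def backward(self, top, propagate_down, bottom):
        if propagate_down[0]:
            bottom[0].diff[...] = top[0].diff  # pass the gradient back unchanged
```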


The net is defined as a set of layers and their connections in a plaintext modeling language.

The net jointly defines a function and its gradient by composition and auto-differentiation. The composition of every layer’s output computes the function to do a given task, and the composition of every layer’s backward computes the gradient from the loss to learn the task. Caffe models are end-to-end machine learning engines.

The net is a set of layers connected in a computation graph – a directed acyclic graph (DAG) to be exact. Caffe does all the bookkeeping for any DAG of layers to ensure correctness of the forward and backward passes. A typical net begins with a data layer that loads from disk and ends with a loss layer that computes the objective for a task such as classification or reconstruction.

A simple logistic regression classifier is defined by the prototxt below.
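Reproduced from the tutorial as best I recall it (the Data layer's source path is a placeholder):

```
name: "LogReg"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  data_param {
    source: "input_leveldb"
    batch_size: 64
  }
}
layer {
  name: "ip"
  type: "InnerProduct"
  bottom: "data"
  top: "ip"
  inner_product_param {
    num_output: 2
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip"
  bottom: "label"
  top: "loss"
}
```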

There is also some model initialization machinery that I won't go into here. Model initialization is handled by Net::Init() ….


Finally, and importantly, there is the model format, which uses Google Protocol Buffers.

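Concretely, the architecture is written as a human-readable .prototxt and the learned weights are serialized to a binary .caffemodel, both protobuf; loading the pair from pycaffe looks like this (file names are placeholders):

```python
import caffe

# .prototxt: text-format protobuf describing the architecture
# .caffemodel: binary protobuf holding the learned parameters
net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)
```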

2. Forward and Backward

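In brief: the forward pass computes the output from the input, flowing bottom-to-top through the composed layers, and the backward pass computes the gradient of the loss with respect to every blob and parameter by the chain rule, flowing top-to-bottom. In pycaffe each pass is a single call (a sketch; `net` is assumed to be a loaded caffe.Net):

```python
outputs = net.forward()  # bottom-to-top: fills each blob's data, returns the output blobs
net.backward()           # top-to-bottom: fills each blob's and parameter's diff
```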

3. Loss

This section introduces the concept of a loss weight.

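The idea: any layer's top can contribute to the total objective through a loss_weight coefficient. Layers whose type ends in Loss get an implicit loss_weight: 1, and every other layer defaults to 0. A prototxt sketch (the blob names are mine):

```
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "pred"
  bottom: "label"
  top: "loss"
  loss_weight: 1  # default for *Loss layers; change it to rebalance multiple losses
}
```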

4. Solver

The solver orchestrates model optimization by coordinating the network’s forward inference and backward gradients to form parameter updates that attempt to improve the loss.


The solver follows the same design as the rest of the framework: like Caffe models, Caffe solvers run in CPU / GPU modes.


It then introduces several gradient descent methods (SGD, Adam, …); I won't go through the details here and will look them up when needed.
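For reference, a solver definition is itself a small prototxt, along the lines of the tutorial's examples; the paths and values here are placeholders:

```
net: "train_val.prototxt"   # model to optimize
base_lr: 0.01               # the learning rate alpha from the update rule
lr_policy: "step"           # drop the rate by gamma every stepsize iterations
gamma: 0.1
stepsize: 100000
momentum: 0.9
weight_decay: 0.0005
max_iter: 350000
snapshot: 5000              # write a snapshot every 5000 iterations
snapshot_prefix: "train"
solver_mode: GPU            # or CPU
```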

Snapshotting makes it possible to resume training.

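Resuming works by pointing the caffe tool's --snapshot flag at a saved .solverstate; a usage sketch with placeholder file names:

```
caffe train --solver=solver.prototxt --snapshot=train_iter_5000.solverstate
```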


5. Layer Catalogue


It describes the various state-of-the-art layers; I'll consult it as needed.

6. Interfaces

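Caffe has three interfaces: the caffe command-line tool, pycaffe for Python, and matcaffe for MATLAB. Typical command-line usage, with placeholder file names:

```
caffe train --solver=solver.prototxt
caffe test --model=train_val.prototxt --weights=weights.caffemodel --iterations=100
caffe time --model=deploy.prototxt
```

From Python, `import caffe` gives direct access to nets, blobs, layers, and solvers.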

7. Data

Data flows through Caffe as Blobs.
Mean subtraction, scaling, and other preprocessing can be handled right in the data layer configuration.

Common transformations like mean-subtraction and feature-scaling are done by data layer configuration. New input types are supported by developing a new data layer – the rest of the Net follows by the modularity of the Caffe layer catalogue.
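For instance, mean subtraction, scaling, mirroring, and cropping are all specified through the data layer's transform_param; a prototxt sketch with placeholder paths:

```
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  transform_param {
    scale: 0.00390625               # 1/255: rescale pixels to [0, 1]
    mean_file: "mean.binaryproto"   # mean subtraction
    mirror: true                    # random horizontal flips
    crop_size: 227                  # random crops while training
  }
  data_param {
    source: "train_lmdb"
    backend: LMDB
    batch_size: 64
  }
}
```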


Data layers have no bottom blobs, because they take no input.

Some of the data layer's features: common input preprocessing (mean subtraction, scaling, random cropping, and mirroring) is available through its transformation parameters, as sketched above.

See the layer catalogue for more types of data layers.



Finally, I read Yangqing Jia's "Convolution in Caffe: a memo". The gist: pressed for time, when he first had to handle convolution in Caffe he deliberately set out to reduce it to an already-solved problem. So he simply used im2col to unroll the images and filters into matrices and let BLAS do the multiplication, and to his surprise it worked very well, until around 2014 when Alex Krizhevsky and colleagues beat it. Of course, thanks to the open-source community's efforts, Caffe's convolution has since been heavily optimized and is a different beast today.
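The trick itself is easy to sketch: im2col unrolls every receptive field into a column, after which the whole convolution becomes a single matrix multiplication that BLAS handles. A minimal NumPy sketch (one image, stride 1, no padding; not Caffe's actual code):

```python
import numpy as np

def im2col(x, k):
    """Unroll k-by-k patches of x (C, H, W) into columns (C*k*k, out_h*out_w)."""
    C, H, W = x.shape
    out_h, out_w = H - k + 1, W - k + 1
    cols = np.empty((C * k * k, out_h * out_w))
    for i in range(out_h):
        for j in range(out_w):
            cols[:, i * out_w + j] = x[:, i:i + k, j:j + k].ravel()
    return cols

x = np.random.randn(3, 8, 8)        # a C=3, 8x8 input
w = np.random.randn(16, 3, 5, 5)    # 16 filters of shape 3x5x5
cols = im2col(x, 5)                 # (75, 16) column matrix
out = (w.reshape(16, -1) @ cols).reshape(16, 4, 4)  # convolution as one GEMM
```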
