Caffe Loss Layers

Source: Internet | Published by: vs.php for vs2015 | Editor: 程序博客网 | Time: 2024/05/21 18:46

HingeLossLayer

Computes the hinge loss for a one-of-many classification task.

Hinge Loss Concept

The hinge loss is defined as:
E(z) = max(0, 1 − z)

It is commonly used for maximum-margin classification, most notably in SVMs.

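For an intended output t = ±1 and a classifier score y, the hinge loss of the prediction y is:
l(y) = max(0, 1 − t∗y)
Here y should be the raw output of the classifier's decision function, not the final predicted class label.
For example,
in a linear SVM, y = w∗x + b, where (w, b) are the parameters of the hyperplane and x is the point to classify.
When t and y have the same sign (y predicts the correct class) and |y| >= 1, the hinge loss is l(y) = 0;
when t and y have opposite signs, l(y) grows linearly with |y| (one-sided error). A smaller penalty, 1 − |y|, also applies when the signs agree but |y| < 1, that is, a correct prediction that falls inside the margin.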
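The three cases above (confidently correct, correct but inside the margin, wrong side) can be sketched in a few lines of plain Python. The hyperplane `w`, `b` and the sample points are hypothetical values chosen only for illustration, and the helper names `linear_score` and `hinge_loss` are not part of Caffe.

```python
def linear_score(w, x, b):
    """Raw decision value y = w . x + b of a linear classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def hinge_loss(t, y):
    """Binary hinge loss l(y) = max(0, 1 - t*y) for a label t in {+1, -1}."""
    return max(0.0, 1.0 - t * y)


# Hypothetical hyperplane and points, for illustration only.
w, b = [2.0, -1.0], 0.5
confident = linear_score(w, [1.0, 0.5], b)      # y = 2.0
inside_margin = linear_score(w, [0.0, 0.0], b)  # y = 0.5
wrong_side = linear_score(w, [-1.0, 0.5], b)    # y = -2.0

print(hinge_loss(+1, confident))      # 0.0: right sign and |y| >= 1
print(hinge_loss(+1, inside_margin))  # 0.5: right sign but inside the margin
print(hinge_loss(+1, wrong_side))     # 3.0: wrong sign, grows with |y|
```

Note that the loss is computed from the raw score `y`, not from `sign(y)`, which is why the second point is still penalized.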

Figure: for t = 1, the hinge loss (blue line) vs. the zero-one loss (cyan, misclassification) as a function of y.
Note that the hinge loss penalizes predictions y < 1, corresponding to the notion of a margin in a support vector machine (SVM).
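Since the plot itself is not reproduced here, the comparison it draws can be tabulated instead. The following is a minimal sketch for t = 1; treating y = 0 as a misclassification in `zero_one_loss` is an assumed convention, and all function names are illustrative.

```python
def hinge_loss(t, y):
    """Hinge loss max(0, 1 - t*y)."""
    return max(0.0, 1.0 - t * y)


def zero_one_loss(t, y):
    """1 if the sign of y disagrees with t, else 0.
    Assumption: y == 0 is counted as an error."""
    return 0.0 if t * y > 0 else 1.0


def compare_losses(ys, t=1):
    """Return (y, hinge, zero_one) triples for each raw score y."""
    return [(y, hinge_loss(t, y), zero_one_loss(t, y)) for y in ys]


for y, h, z in compare_losses([-1.0, 0.0, 0.5, 1.0, 2.0]):
    print(f"y={y:+.1f}  hinge={h:.1f}  zero-one={z:.0f}")
```

At y = 0.5 the zero-one loss is already 0 while the hinge loss is still 0.5, which is the margin penalty described above. On these sample points the hinge loss is never smaller than the zero-one loss.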

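Caffe Layer Parameters

HingeLossLayer parameters:
bottom - input Blob vector (length 2)
- a. Predictions t of size (N∗C∗H∗W), where each value is the predicted score for one of the K = CHW classes. In an SVM, t is the inner product X^T W of the D-dimensional features X ∈ R^{D×N} and the learned hyperplane parameters W ∈ R^{D×K}. A net in which an InnerProductLayer (num_output = K, one output per class) feeds its predictions to a HingeLossLayer, with no other learnable parameters or losses, is therefore equivalent to a linear SVM (one fully connected layer plus a hinge loss acts as a linear SVM).
- b. Labels l of size (N∗1∗1∗1), integer-valued with l_n ∈ {0, 1, 2, ..., K−1}, giving the correct class label among the K classes.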
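top - output Blob vector (length 1)
- (1∗1∗1∗1), the computed hinge loss:
  $$E = \frac{1}{N} \sum_{n=1}^{N} \sum_{k=1}^{K} \left[\max\left(0,\ 1 - \sigma(l_n = k)\, t_{nk}\right)\right]^p$$
  - p selects the L^p norm: p = 1 gives the L1 norm; p = 2 gives the L2 norm, as in an L2-SVM;
  - σ(condition) = 1 if the condition holds; otherwise σ(condition) = −1.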
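The loss formula above can be checked by hand with a small pure-Python sketch. This is not Caffe's implementation, only a direct transcription of the sum under the definitions given here; the function name `caffe_style_hinge_loss` and the sample scores are hypothetical.

```python
def caffe_style_hinge_loss(scores, labels, p=1):
    """E = (1/N) * sum_n sum_k max(0, 1 - s(l_n == k) * t_nk) ** p,
    where s(cond) is +1 if cond holds and -1 otherwise.

    scores: N rows of K raw class scores t_nk.
    labels: N integer labels l_n in {0, ..., K-1}.
    p:      1 for the L1 norm, 2 for the L2 norm.
    """
    if p not in (1, 2):
        raise ValueError("p must be 1 (L1) or 2 (L2)")
    total = 0.0
    for t_n, l_n in zip(scores, labels):
        for k, t_nk in enumerate(t_n):
            sign = 1.0 if k == l_n else -1.0
            total += max(0.0, 1.0 - sign * t_nk) ** p
    return total / len(scores)


# Hypothetical scores for N = 2 samples and K = 3 classes.
scores = [[2.0, -1.5, -0.5],   # sample 0, true class 0
          [0.5, 0.0, -2.0]]    # sample 1, true class 1
labels = [0, 1]

print(caffe_style_hinge_loss(scores, labels, p=1))  # 1.5
print(caffe_style_hinge_loss(scores, labels, p=2))  # 1.75
```

Each class contributes its own one-vs-rest term: the true class is pushed above +1 and every other class below −1, so sample 0 is penalized only for the score −0.5, which is not yet below −1.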

Caffe prototxt Definition

......
layer {
  name: "fc8voc"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8voc"
  param {
    lr_mult: 10
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 20
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "HingeLossMultiLabel"
  bottom: "fc8voc"
  bottom: "label"
  top: "loss"
}
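To make the `fc8voc` settings concrete, here is a rough pure-Python analogue of what an InnerProduct layer with `num_output: 20`, a gaussian weight filler (`std: 0.01`) and a constant bias filler (`value: 0`) computes in its forward pass. It is a sketch, not Caffe code: the helper names, the fixed seed, and the input size of 8 are assumptions (the real size of `fc7` is not given here), and the `lr_mult` / `decay_mult` settings only affect training, so they are not modeled.

```python
import random


def init_inner_product(num_input, num_output, std=0.01, bias_value=0.0, seed=0):
    """Gaussian-initialised weights (num_output x num_input), constant bias."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    weights = [[rng.gauss(0.0, std) for _ in range(num_input)]
               for _ in range(num_output)]
    bias = [bias_value] * num_output
    return weights, bias


def inner_product_forward(x, weights, bias):
    """One raw score per output unit: score_k = w_k . x + b_k."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


# Hypothetical 8-dimensional input standing in for the fc7 activations.
weights, bias = init_inner_product(num_input=8, num_output=20)
scores = inner_product_forward([1.0] * 8, weights, bias)
print(len(scores))  # 20, one raw score per class
```

These 20 raw scores are what the loss layer receives as its first bottom blob, alongside `label` as the second.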

Reference

Hinge loss - Wikipedia
HingeLossLayer - Caffe
Analyzing Classifiers: Fisher Vectors and Deep Neural Networks [caffemodel] [prototxt]
