Deep Relative Attributes


Relative attributes are modeled by ranking functions that describe the strength of an attribute in an image. Most existing methods learn a linear ranking function for each attribute on top of hand-crafted visual features. This paper proposes the deep relative attributes (DRA) algorithm, which learns the visual features together with a nonlinear ranking function that describes the relative attributes of image pairs; the features and the ranking function are learned jointly. The DRA model consists of 5 convolutional layers, 5 fully connected layers, and a relative loss function that contains a contrastive constraint and a similar constraint, corresponding to ordered image pairs and un-ordered image pairs, respectively. To train DRA effectively, knowledge transferred from large-scale visual recognition on ImageNet is exploited. Experiments on three datasets show that DRA outperforms state-of-the-art relative attribute learning methods, with accuracy gains of 8%, 9%, and 14%, respectively.

Most existing relative attribute learning algorithms are built on the SVM framework, where the learned ranking score indicates how strongly an image exhibits an attribute relative to other images. Relative attribute learning can be improved in the following three respects:
1. Existing relative attribute methods rely on traditional hand-crafted features, such as the gist descriptor and color histograms, which may not optimally capture the most appropriate visual features for describing relative attributes.
2. Most relative attribute learning methods learn only a linear or shallow ranking function to obtain the relative score of an image pair for a specific attribute. Linear or shallow models are simple and may not best represent the mapping from the visual features of an image pair to the relative score of an attribute.
3. Existing relative attribute learning methods perform feature extraction and ranking function learning separately, so the extracted features may not be the most useful ones for describing the visual attributes of images.

For each convolutional layer, the figure shows the size and number of the convolutional filters; for each fully connected layer, it shows the dimension of the output feature vector. The pooling, normalization, and dropout layers after the convolutional and fully connected layers are not shown. All convolutional layers adopt the same manner of pooling and normalization as AlexNet. Dropout is applied only after the F6 and F7 layers.
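To make the structure concrete, here is a minimal sketch of the described network, assuming PyTorch. The five convolutional layers mirror torchvision's AlexNet filter shapes, and the dimensions of F8–F10 are assumptions, since the text does not list the exact numbers; local response normalization is omitted for brevity.

```python
import torch
import torch.nn as nn

class DRANet(nn.Module):
    def __init__(self):
        super().__init__()
        # C1-C5: five convolutional layers, pooled in the AlexNet manner.
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True), nn.MaxPool2d(3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True), nn.MaxPool2d(3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True), nn.MaxPool2d(3, stride=2),
        )
        # F6-F10: five fully connected layers; dropout only after F6 and F7.
        self.ranker = nn.Sequential(
            nn.Linear(256 * 6 * 6, 4096), nn.ReLU(inplace=True), nn.Dropout(),
            nn.Linear(4096, 4096), nn.ReLU(inplace=True), nn.Dropout(),
            nn.Linear(4096, 1024), nn.ReLU(inplace=True),
            nn.Linear(1024, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 1),  # F10: scalar ranking score f(x)
        )

    def forward(self, x):          # x: (N, 3, 224, 224)
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.ranker(x)      # (N, 1) attribute strength
```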

As the figure shows, the value of the relative attribute function is computed by the last fully connected layer. In the DRA model, the visual features and the ranking function are learned jointly in a unified convolutional neural network framework, so they can benefit each other: more effective visual features improve the ranking accuracy of the relative attribute, while a better ranking function guides the learning of more appropriate visual features. The model has millions of parameters (the convolutional filters of the convolutional layers and the weights and biases of the fully connected layers), which normally demands large-scale labeled training data; yet among the three datasets used in the experiments, the largest contains only on the order of 10,000 labeled images. To address this, knowledge transferred from large-scale visual recognition on ImageNet is exploited: a trained image classification model initializes the lower layers of DRA, which is then trained on the relative attribute datasets with the labeled image pairs.
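A hedged sketch of this warm start, assuming torchvision's pretrained AlexNet and the `DRANet` sketch above; the one-to-one pairing of the five convolutional layers is an assumption based on their matching shapes.

```python
import torch.nn as nn
from torchvision.models import alexnet

def init_from_imagenet(dra_net: nn.Module) -> None:
    # Transfer the convolutional filters of an ImageNet classifier into
    # the lower layers of DRA.
    pretrained = alexnet(pretrained=True)
    src = [m for m in pretrained.features if isinstance(m, nn.Conv2d)]
    dst = [m for m in dra_net.features if isinstance(m, nn.Conv2d)]
    for s, d in zip(src, dst):          # copy the five convolutional layers
        d.weight.data.copy_(s.weight.data)
        d.bias.data.copy_(s.bias.data)
    # The fully connected ranking layers keep their random initialization
    # and are trained on the labeled image pairs of the attribute dataset.
```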

The DRA model:

A. Problem Description

Traditional attribute learning → DRA:
- hand-crafted features → learned visual features
- linear f(x) → nonlinear f(x)

B. Deep Network Structure
Training phase:
Input: an image pair (x, y) with a relative attribute assignment l, which denotes the label of the image pair.
During forward propagation, the two different input images are processed with the same parameters at every layer.
Through the contrastive constraint and the similar constraint in the loss function, an ordered image pair (l = 1) is constrained to produce different outputs, while an un-ordered image pair (l = 0) is constrained to produce identical or similar outputs.
Testing phase:
A single CNN with the learned parameters is used to predict the strength value of any individual image with regard to the attribute in question (see the sketch below).
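A short sketch of this Siamese arrangement, reusing the assumed `DRANet` from above: training pushes both images of a pair through one shared set of parameters, while testing needs only a single forward pass.

```python
net = DRANet()

def forward_pair(x, y):
    # One set of parameters, two forward passes (shared-weight branches).
    return net(x), net(y)

def predict_strength(image):           # image: (1, 3, 224, 224)
    with torch.no_grad():
        return net(image).item()       # scalar strength of the attribute
```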

Convolutional layers:
For the m-th layer, the output is

x^(m) = s(W^(m) ∗ x^(m−1) + b^(m)),  m = 1, …, 5,

where ∗ denotes the convolution operation, W^(m) the convolutional filters, b^(m) the biases, and s(x) = max(0, x) the activation function.

Fully connected layers:
For the m-th layer, the output is

x^(m) = s(W^(m) x^(m−1) + b^(m)),  m = 6, …, 10,

where W^(m) is the weight matrix (applied by matrix multiplication, not convolution), b^(m) the bias, and s(x) = max(0, x) the activation function.

Relative loss function
The loss over all training pairs is

L(Θ) = Σ_{(x_i, y_i, l_i) ∈ G} [ l_i · max(0, τ − (f(x_i) − f(y_i))) + (1 − l_i) · (f(x_i) − f(y_i))² ] + λ‖Θ‖²,

where G = P ∪ Q contains all ordered and un-ordered image pairs, Θ denotes all model parameters, and λ weights the regularization term.
Two observations about this loss:
(1) If l_i = 0, then x_i and y_i have the same attribute strength. The contrastive constraint vanishes, and the similar constraint (f(x_i) − f(y_i))² attains its minimum 0 when f(x_i) = f(y_i), pulling the two outputs together.
(2) If l_i = 1, then x_i exhibits the attribute more strongly than y_i. The similar constraint vanishes, and minimizing the contrastive constraint max(0, τ − (f(x_i) − f(y_i))) pushes f(x_i) above f(y_i). Two cases arise:
(a) If f(x_i) ≥ f(y_i) + τ, the loss is 0. This is exactly the desired situation, so no further minimization is needed and no penalty is incurred.
(b) If f(x_i) < f(y_i) + τ, the loss is τ − (f(x_i) − f(y_i)). Minimization drives this term toward 0 until f(x_i) ≥ f(y_i) + τ.
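A minimal sketch of this loss in code, mirroring the formula above: ordered pairs (l_i = 1) incur the contrastive hinge, un-ordered pairs (l_i = 0) the similar constraint. `tau` is the margin; the λ‖Θ‖² term is typically delegated to the optimizer as weight decay rather than computed in the loss.

```python
def relative_loss(fx, fy, l, tau=1.0):
    # fx, fy: (N,) ranking scores; l: (N,) float labels in {0, 1}
    diff = fx - fy                                  # f(x_i) - f(y_i)
    contrastive = torch.clamp(tau - diff, min=0.0)  # max(0, tau - diff)
    similar = diff.pow(2)                           # (f(x_i) - f(y_i))^2
    return (l * contrastive + (1.0 - l) * similar).mean()
```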

C. Optimization
The optimization of DRA is similar to that of a standard convolutional neural network: stochastic gradient descent is used, and the parameters are updated via forward and backward propagation.
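A sketch of this training loop under the same PyTorch assumption; the hyperparameters and the `pair_loader` name are illustrative, not values from the paper. `weight_decay` plays the role of the λ‖Θ‖² regularizer.

```python
optimizer = torch.optim.SGD(net.parameters(), lr=1e-3,
                            momentum=0.9, weight_decay=1e-4)
for x, y, l in pair_loader:        # assumed loader of (x_i, y_i, l_i) batches
    optimizer.zero_grad()
    fx, fy = net(x), net(y)        # forward pass with shared parameters
    loss = relative_loss(fx.squeeze(1), fy.squeeze(1), l.float())
    loss.backward()                # back-propagate through both branches
    optimizer.step()
```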
