Describing People: A Poselet-Based Approach to Attribute Classification



1. Abstract

The paper uses a part-based approach built on poselets. (Poselets were proposed by Lubomir Bourdev and Jitendra Malik in 2009.)

2. Introduction

The paper casts fine-grained recognition as an attribute classification problem. Deciding a single attribute requires combining many cues. For classification, detecting and aligning the parts is very important, but localizing body parts is a hard task.

The training input is a set of images in which the people of interest are specified by their visible bounds and the values of their attributes. The method uses a three-layer feed-forward network. The three layers correspond to three processing steps; a "layer" here is not a layer in the deep-learning sense.

In the first layer (first step), predict 9 attributes (is-male, has-hat, has-t-shirt, …) for each human part.

In the second layer (second step), combine the information from all such predictions, for example the gender given the face, the legs, and other parts, into a single attribute classification.

In the third layer (third step), leverage dependencies between different attributes, such as the fact that gender is correlated with the presence of long hair.

In fact, this article regards poselets as a general tool for decomposing the viewpoint and pose.

3. Algorithm

Step 1

Detect the poselets on the test image and obtain qi, the detection probability of poselet type i.

Step 2

For each poselet type i, extract a feature vector consisting of HOG cells at three scales, a color histogram, and skin-mask features.
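To make the shape of this feature vector concrete, here is a minimal sketch of the concatenation only. It assumes the HOG cells and skin-mask features have already been computed elsewhere; the function names, the bin count, and the per-channel histogram layout are illustrative choices, not taken from the paper.

```python
def color_histogram(pixels, bins=4):
    """Normalized per-channel histogram of (r, g, b) pixels with values in 0..255.

    The bin count and layout are assumptions made for this sketch.
    """
    hist = [0.0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            index = min(value * bins // 256, bins - 1)
            hist[channel * bins + index] += 1.0
    total = float(len(pixels)) or 1.0  # avoid division by zero on an empty patch
    return [count / total for count in hist]


def poselet_feature(hog_cells, pixels, skin_features, bins=4):
    """Concatenate the three feature groups into one vector for a poselet patch."""
    return list(hog_cells) + color_histogram(pixels, bins) + list(skin_features)


# Toy patch: four pixels, two precomputed HOG values, one skin-mask value.
patch_pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (10, 200, 30)]
feature = poselet_feature([0.1, 0.3], patch_pixels, [0.5])
```

The resulting vector has one fixed length per poselet type, which is what lets a separate classifier be trained for each poselet type in the next step.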

Step 3 (first layer)

For each poselet type i and each attribute j, evaluate a classifier rij for attribute j conditioned on poselet i. These classifiers are called poselet-level attribute classifiers. Each one is a linear SVM followed by a logistic function g. (What is the relationship between the SVM and the logistic here?)
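One plausible reading of "a linear SVM followed by a logistic" is that the SVM produces an unbounded score and the logistic squashes it into (0, 1), so that scores from different poselets are comparable. The sketch below follows that reading. The scale and offset parameters `a` and `b` are hypothetical (they resemble Platt-style calibration) and are not specified in these notes.

```python
import math


def svm_score(weights, bias, features):
    """Raw linear SVM output: w . x + b (unbounded, sign gives the class)."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def logistic(score, a=1.0, b=0.0):
    """Logistic g mapping a raw score to (0, 1); a and b are assumed parameters."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))


def poselet_level_score(weights, bias, features, a=1.0, b=0.0):
    """r_ij for one poselet type i and one attribute j under this reading."""
    return logistic(svm_score(weights, bias, features), a, b)


# A sample on the decision boundary maps to 0.5; a positive margin maps above it.
on_boundary = poselet_level_score([1.0, -1.0], 0.0, [2.0, 2.0])
positive = poselet_level_score([1.0, -1.0], 0.0, [3.0, 1.0])
```

Under this reading the SVM decides which side of the boundary a sample falls on, and the logistic only rescales that decision into a probability-like value.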

Step 4 (second layer)

The outputs of the poselet-level attribute classifiers are zero-centered (shifted so that their center is at zero) and modulated by the poselet detection probabilities qi (multiplied by qi). The result is the input to a second classifier, the person-level attribute classifier, whose goal is to combine the evidence from all body parts.
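The zero-centering and modulation can be sketched as follows. This assumes that the poselet-level outputs lie in (0, 1), so that zero-centering means subtracting 0.5; that constant and the function name are assumptions made for illustration.

```python
def person_level_input(r_j, q, center=0.5):
    """Build the person-level feature vector for one attribute j.

    r_j[i] is the poselet-level score of poselet type i for attribute j,
    q[i] is the detection probability of poselet type i.
    center=0.5 is an assumption: it is the midpoint of a logistic output.
    """
    return [q_i * (r_ij - center) for q_i, r_ij in zip(q, r_j)]


# Three poselet types: confident detection, weaker detection, not detected.
scores = [0.9, 0.5, 0.2]
detections = [1.0, 0.8, 0.0]
features = person_level_input(scores, detections)
```

With this construction, a poselet that was not detected (qi = 0) contributes exactly zero, and an undecided poselet (score 0.5) also contributes zero, so only confident evidence from detected parts reaches the person-level classifier.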

Step 5 (third layer)

For each attribute j, evaluate a third classifier, the context-level attribute classifier. Its input feature vector consists of the scores sj of the person-level classifiers for all attributes. The classifier is an SVM with a quadratic kernel.
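A quadratic kernel is a natural fit here because expanding it yields products of pairs of input scores, which lets the classifier use interactions between attributes (such as gender and long hair). The sketch below shows the standard degree-2 polynomial kernel and the usual kernel SVM decision function. The offset `c`, the support vectors, and the coefficients are made-up illustrative values, not trained ones.

```python
def quadratic_kernel(x, y, c=1.0):
    """Degree-2 polynomial kernel: (x . y + c) ** 2. The offset c is assumed."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + c) ** 2


def kernel_svm_decision(x, support_vectors, dual_coefs, bias, c=1.0):
    """Kernel SVM decision value: sum_k alpha_k * K(sv_k, x) + bias.

    dual_coefs[k] already includes the label sign of support vector k.
    """
    return sum(
        alpha * quadratic_kernel(sv, x, c)
        for sv, alpha in zip(support_vectors, dual_coefs)
    ) + bias


# Toy example: x holds person-level scores for two attributes.
decision = kernel_svm_decision(
    x=[2.0, 0.0],
    support_vectors=[[1.0, 0.0], [0.0, 1.0]],
    dual_coefs=[1.0, -1.0],
    bias=0.0,
)
```

In the pipeline described above, one such classifier would be evaluated per attribute j, each taking the same vector of person-level scores as input.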

0 0