HOGDescriptor with SVM



I am working on Traffic Sign Recognition (TSR) and use an SVM with HOG features for the detection step. This post shows how to use HOGDescriptor with a (2-class) linear ml::SVM.

At first I used my own sliding-window approach, but it was horribly slow: it recalculated the HOG features for every window, which is not always necessary, so avoiding that promised a potential speedup. A look at the implementation of detect() and detectMultiScale() shows that they avoid exactly this overhead by using HOGCache, which calculates all gradients at the beginning and caches the calculated blocks. Indeed, this gave a remarkable speedup. To use it, HOGDescriptor needs a linear SVM, more precisely the weight vector w and the bias b of the formula x⋅w+b, where x is the feature vector.


First I wanted to calculate the weight vector from the support vectors returned by getSupportVectors(). Surprisingly, there was only one support vector. If you dump your ml::SVM with save(), you get an XML file which you can examine. For example, my (truncated) file looks like this:

<opencv_storage>
<my_svm type_id="opencv-ml-svm">
  <svm_type>C_SVC</svm_type>
  <kernel><type>LINEAR</type></kernel>
  <C>10.</C>
  <term_criteria><epsilon>2.2204460492503131e-016</epsilon>
    <iterations>50000</iterations></term_criteria>
  <var_all>144</var_all>
  <var_count>144</var_count>
  <class_count>2</class_count>
  <class_labels type_id="opencv-matrix">
    <rows>1</rows>
    <cols>2</cols>
    <dt>i</dt>
    <data>
      -1 1</data></class_labels>
  <class_weights type_id="opencv-matrix">
    <rows>2</rows>
    <cols>1</cols>
    <dt>d</dt>
    <data>
      10. 30.</data></class_weights>
  <sv_total>1</sv_total>
  <support_vectors>
    <_>
      -2.57132626e+000 -8.15989792e-001 1.58552456e+000 -5.26909530e-001
      -6.96959376e-001 9.07045364e-001 -7.77327597e-001 3.29408097e+000
      2.92618185e-001 1.78208619e-001 -6.62242770e-001 -6.59358799e-001
      2.16543388e+000 -2.22054377e-001 4.07920361e-001 1.70213091e+000
      2.24114108e+000 -2.56618428e+000 9.89693940e-001 1.56920981e+000
      -3.85006189e+000 6.49856925e-002 2.41270334e-001 -6.43376589e-001
      1.96081090e+000 -2.11759996e+000 -4.94708754e-002 2.42956370e-001
      -4.60045755e-001 6.22926056e-001 2.83989429e-001 -4.40556228e-001
      -1.50898278e+000 -2.26646304e+000 -4.36711870e-002 1.32744241e+000
      -9.02386189e-001 1.30066350e-002 1.12155938e+000 1.37151927e-001
      -2.92353183e-001 2.49987912e+000 -6.72911823e-001 -2.27644324e+000
      5.26569605e-001 -3.93141210e-001 8.90563369e-001 -1.07311404e+000
      1.58201587e+000 1.19251311e+000 -4.85848077e-002 4.37769890e-001
      -3.00920457e-001 7.63926983e-001 -4.32711601e-001 1.00920767e-001
      -2.59097457e+000 -6.18790209e-001 -8.82605672e-001 5.52388787e-001
      3.47640872e+000 -4.16076452e-001 1.02774036e+000 1.82222977e-001
      -1.70404565e+000 3.11814260e+000 -1.07632530e+000 -2.18639016e+000
      -1.35766029e+000 -5.38780069e+000 -1.29287338e+000
      -7.45145977e-002 -4.60754931e-001 -7.52632737e-001 6.04343474e-001
      -3.39522898e-001 4.91919637e-001 1.28550231e+000 -3.89712167e+000
      2.19123673e+000 7.59627251e-003 -7.87921786e-001 -1.32802498e+000
      -1.70838583e+000 -7.53632784e-001 -2.63046646e+000 6.94700837e-001
      4.22459871e-001 1.30146229e+000 -1.60435414e+000 5.47433674e-001
      9.33782995e-001 -1.95213962e+000 -1.90054512e+000 -2.69299173e+000
      -6.60706580e-001 1.32844961e+000 -3.59237552e-001 -2.80351186e+000
      -2.83979464e+000 3.12810087e+000 2.46888375e+000 8.63000929e-001
      2.05874011e-001 -7.29870796e-001 -2.80619979e+000 -2.23102331e+000
      2.13059950e+000 2.41010904e-001 1.96904325e+000 2.20617986e+000
      -2.80920058e-001 8.72003436e-001 -3.65087181e-001 -3.07571912e+000
      -1.88158214e+000 -8.56038868e-001 -1.01106215e+000 1.56111702e-001
      -5.05603218e+000 -1.28347707e+000 -2.08585191e+000
      -5.91017485e-001 3.29026270e+000 6.63230896e-001 -8.63312602e-001
      2.22314811e+000 -3.48826337e+000 -2.45112991e+000 3.43967009e+000
      2.75579151e-002 4.56622928e-001 1.60914242e+000 1.57043099e+000
      -1.47175753e+000 4.31882322e-001 -2.55531788e-001 1.50610006e+000
      -4.29869771e-001 9.34177816e-001 1.75518441e+000 -1.45684707e+000
      -5.95927775e-001 1.45646954e+000</_></support_vectors>
  <decision_functions>
    <_>
      <sv_count>1</sv_count>
      <rho>-4.2610261838202392e+000</rho>
      <alpha>1.</alpha>
      <index>0</index>

At the top are the settings of my ml::SVM and at the bottom are the support vectors. The entry <sv_total>1</sv_total> shows that there is only one support vector; I was not even sure a machine could theoretically have fewer than two. After checking the train() function of ml::SVM it became clear that a linear machine is optimized automatically after training, see optimize_linear_svm(): all support vectors are compressed into a single weight vector. That is exactly what we want, so all we need to do is call getSupportVectors().

We now have a weight vector that could be passed to HOGDescriptor, but what about the bias b? This is not clear from the documentation either. We can get it from ml::SVM::getDecisionFunction(), which returns ρ (rho); ρ equals b, which can be deduced from the code, for example from PredictBody::operator(). Inspecting the code of HOGDescriptor::detect() shows that we can pass the bias in the weight vector at the last position, so we place b at weightvector[n], where n is again the number of features in the weight vector. Both implementations call b rho, with the important difference that PredictBody::operator() subtracts rho from the sum while HOGDescriptor::detect() adds it.

With an SVM, we can predict the class to which a sample x belongs like this:
 
        const float* pSVData = SVM.get_support_vector(0);
        float rho = -4.26f;                    // taken from the saved .xml
        Mat sv(1, dims, CV_32FC1);
        for (int i = 0; i < dims; i++)
            sv.at<float>(0, i) = pSVData[i];
        Mat res = sv * x.t();
        float distance = res.at<float>(0, 0) - rho;
        int c;
        if (distance < 0)
            c = 1;     // positive class
        else
            c = -1;    // negative class

We can verify the result by directly invoking the SVM's built-in methods:

      float distance = SVM.predict(x, true);    // decision value
      float c = SVM.predict(x, false);          // class label


 The calculation in HOGDescriptor::detect() is

    x⋅w + b > threshold  ⇒  positive class

where threshold can be given as a function parameter and is usually 0.

During training, an SVM tries to find a separating hyperplane such that the training examples of the two classes lie on different sides. There can be many such hyperplanes (or none), so to select the "best" one we look for the hyperplane whose distance to the closest examples of both classes is maximized. Indeed, the further away from the hyperplane a point is located, the more confident we are in the decision, so what we are interested in is the distance to the hyperplane.

As per the OpenCV documentation, CvSVM::predict has a default second argument which specifies what to return. By default it returns the classification label, but if you pass true it returns the distance. The distance itself is quite usable, but if you want a confidence value in the range (0, 1), you can apply a sigmoid function to the result; one such function is the logistic function.

    float decision = svmob.predict(testData, true);
    float confidence = 1.0 / (1.0 + exp(-decision));

Note:
svm.predict computes alpha*sv*x - rho, and a negative value indicates a positive sample, whereas the HOG detection function computes rho + alpha*sv*x, and a positive value indicates a positive sample.