Dlib Facial Landmark Detection (Speed Optimization)

Source: Internet | Published by: 手机团购软件 | Editor: 程序博客网 | Time: 2024/05/16 04:52

http://blog.csdn.net/leo_812/article/details/51945743

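I have recently been doing face-related research. Face recognition is made up of many parts, and every stage affects the overall result. Since most of my effort goes into recognition, I did not spend much time on the earlier face detection and landmark extraction stages, and at first I simply used the interface dlib provides for face alignment. The results are good, but the drawback is just as obvious: dlib's face detection is really slow, taking about 0.15 s for a 320*240 image.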
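I looked at many possible solutions. The discussion at https://github.com/cmusatyalab/openface/issues/85 mentions speeding up dlib with tbb and speeding up detection through tracking. There is also a rather nice paper, https://arxiv.org/ftp/arxiv/papers/1508/1508.01292.pdf , which reports detection speeds of 28fps at 4K and 54fps at 1080p (the source code is not public). The method seems credible, and I definitely want to try it when I have time.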
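All of the above are good approaches, but I don't have time to try each one, so here I use a simpler method: detect the face with an OpenCV detector, then call Dlib's landmark detection on the detected rectangle and align the face.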
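I had held off on trying this because I expected that using a different detector might affect landmark extraction. I ran the experiment today and the result was surprisingly decent. Still, in my opinion, if you want good accuracy the landmark model should be trained against your own detector. Dlib uses the method from the paper "One Millisecond Face Alignment with an Ensemble of Regression Trees". I don't know whether Dlib exposes a training interface, but the detection source code is all there and can be studied. I had originally planned to use "Face Alignment at 3000 FPS via Regressing Local Binary Features" for face alignment, but I saw others say that it is hard to reach the paper's results when training it yourself, so I did not try it.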
Here is the code:
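The program first detects the face with OpenCV and then uses Dlib for face alignment. OpenCV's detector has many parameters that can be tuned to speed up detection as needed. Here, detecting the largest face in a 640*480 image and aligning it takes only about 70 ms in total.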
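Timings such as the 70 ms figure above can be reproduced with a small wall-clock helper. The following is only a sketch using the C++ standard library; `elapsed_ms` and `mean_elapsed_ms` are hypothetical names introduced here and are not part of the original program, OpenCV, or Dlib.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Hypothetical helper: runs `fn` once and returns the elapsed
// wall-clock time in milliseconds.
template <typename Fn>
double elapsed_ms(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical helper: averages the time of `runs` calls, which smooths
// out one-off costs such as cache warm-up on the first call.
template <typename Fn>
double mean_elapsed_ms(Fn&& fn, int runs) {
    assert(runs > 0);
    double total = 0.0;
    for (int i = 0; i < runs; ++i) {
        total += elapsed_ms(fn);
    }
    return total / runs;
}
```

In the program below, the lambda passed to `mean_elapsed_ms` would wrap the `detectMultiScale` call together with the `sp(img, det)` call, so that the measurement covers both detection and landmark fitting.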
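I still don't know dlib very well, and I am not sure how the alignment code below is implemented internally. However, the landmarks can also be converted to OpenCV types and the face aligned with the getAffineTransform and warpAffine functions. The basic routine follows.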

#include <dlib/image_processing/frontal_face_detector.h>
#include <dlib/image_processing/render_face_detections.h>
#include <dlib/image_processing.h>
#include <dlib/gui_widgets.h>
#include <dlib/image_io.h>
#include <iostream>
#include "opencv2/opencv.hpp"
#include "opencv2/core/core.hpp"
#include "time.h"

using namespace dlib;
using namespace std;
// using namespace cv;  // conflicts with the dlib namespace

// ----------------------------------------------------------------------------------------

int main(int argc, char** argv)
{
    image_window win, win_faces;

    // An LBP cascade is used here because it is faster than the Haar cascade.
    // If you don't have it, the Haar cascade bundled with OpenCV also works.
    string face_cascade_name = "/home/f/ClionProjects/test_cython/koestinger_cascade_aflw_lbp.xml";
    cv::CascadeClassifier face_cascade;
    if (!face_cascade.load(face_cascade_name)) {
        cerr << "failed to load cascade: " << face_cascade_name << endl;
        return 1;
    }

    // The model can be downloaded from:
    // http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
    shape_predictor sp;
    string shape_model = "/home/f/ClionProjects/test_cython/shape_predictor_68_face_landmarks.dat";
    deserialize(shape_model) >> sp;

    // The image is loaded twice: once for dlib, once for OpenCV.
    string img_path = "/home/f/ClionProjects/test_cython/faces/1.jpg";
    array2d<rgb_pixel> img;
    load_image(img, img_path);
    cv::Mat face = cv::imread(img_path);

    std::vector<cv::Rect> faces;
    cv::Mat face_gray;
    cv::cvtColor(face, face_gray, cv::COLOR_BGR2GRAY);  // BGR to grayscale
    cv::equalizeHist(face_gray, face_gray);             // histogram equalization

    // Only the largest face is returned.
    face_cascade.detectMultiScale(face_gray, faces, 1.2, 2,
                                  0 | cv::CASCADE_FIND_BIGGEST_OBJECT, cv::Size(20, 20));
    if (faces.empty()) {
        cerr << "no face detected" << endl;
        return 1;
    }

    // Convert the rectangle found by OpenCV into the structure dlib expects.
    // dlib::rectangle stores inclusive right/bottom edges, hence the -1.
    dlib::rectangle det(faces[0].x, faces[0].y,
                        faces[0].x + faces[0].width - 1,
                        faces[0].y + faces[0].height - 1);

    // Now we will go ask the shape_predictor to tell us the pose of
    // each face we detected.
    std::vector<full_object_detection> shapes;
    full_object_detection shape = sp(img, det);
    cout << "number of parts: " << shape.num_parts() << endl;
    cout << "pixel position of first part:  " << shape.part(0) << endl;
    cout << "pixel position of second part: " << shape.part(1) << endl;
    shapes.push_back(shape);

    // Now let's view our face poses on the screen.
    win.clear_overlay();
    win.set_image(img);
    win.add_overlay(render_face_detections(shapes));

    // We can also extract copies of each face that are cropped, rotated upright,
    // and scaled to a standard size as shown here:
    dlib::array<array2d<rgb_pixel> > face_chips;
    extract_image_chips(img, get_face_chip_details(shapes), face_chips);
    win_faces.set_image(tile_images(face_chips));

    // Keep the windows open until a character is entered.
    char pause;
    cin >> pause;
    return 0;
}
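The getAffineTransform route mentioned earlier fits a 2x3 affine matrix to three point correspondences. As a rough illustration of that fit, here is a standard-library-only sketch that solves for the matrix with Cramer's rule. `Point2`, `Affine`, `det3`, `affine_from_3_points` and `apply_affine` are hypothetical names introduced for this example. Which three landmarks to use (for instance two eye corners and the nose tip) and where they should land in the aligned image are assumptions that the original post does not specify.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Point2 { double x, y; };

// Row-major 2x3 matrix [a b c; d e f] that maps
// (x, y) to (a*x + b*y + c, d*x + e*y + f).
using Affine = std::array<double, 6>;

// Determinant of the 3x3 matrix whose i-th row is (a[i], b[i], c[i]).
double det3(const double a[3], const double b[3], const double c[3]) {
    return a[0] * (b[1] * c[2] - b[2] * c[1])
         - b[0] * (a[1] * c[2] - a[2] * c[1])
         + c[0] * (a[1] * b[2] - a[2] * b[1]);
}

// Fits the affine map that sends src[i] to dst[i] for i = 0, 1, 2.
Affine affine_from_3_points(const std::array<Point2, 3>& src,
                            const std::array<Point2, 3>& dst) {
    const double xs[3] = {src[0].x, src[1].x, src[2].x};
    const double ys[3] = {src[0].y, src[1].y, src[2].y};
    const double ones[3] = {1.0, 1.0, 1.0};
    const double us[3] = {dst[0].x, dst[1].x, dst[2].x};
    const double vs[3] = {dst[0].y, dst[1].y, dst[2].y};

    const double d = det3(xs, ys, ones);
    assert(std::fabs(d) > 1e-12 && "source points must not be collinear");

    // Cramer's rule, once per output coordinate.
    return Affine{
        det3(us, ys, ones) / d, det3(xs, us, ones) / d, det3(xs, ys, us) / d,
        det3(vs, ys, ones) / d, det3(xs, vs, ones) / d, det3(xs, ys, vs) / d};
}

// Applies the affine map to a single point.
Point2 apply_affine(const Affine& m, const Point2& p) {
    return Point2{m[0] * p.x + m[1] * p.y + m[2],
                  m[3] * p.x + m[4] * p.y + m[5]};
}
```

In a real pipeline the three source points would come from `shape.part(i)`, the matrix would be obtained from `cv::getAffineTransform`, and the image would be resampled with `cv::warpAffine`. The sketch only shows the arithmetic behind the fit.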

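A few words about the CMakeLists.txt below, since the program will not build if it is wrong.
You need to bring in your own dlib files, so adjust the paths to match your setup. One annoyance with Dlib is that support for loading jpg and png images has to be added manually. To read JPEG files, for example, you have to link against jpeg.so yourself and also declare the support at compile time.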

cmake_minimum_required(VERSION 3.5)
project(face_align)

# The build type is a CMake variable, not a compiler flag.
if(NOT CMAKE_BUILD_TYPE)
    set(CMAKE_BUILD_TYPE Release)
endif()
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -DDLIB_JPEG_SUPPORT")

find_package(OpenCV REQUIRED)

include_directories(${CMAKE_SOURCE_DIR})
include_directories(/home/f/downLoad/dlib/)          # dlib.h....
link_directories(/usr/lib/x86_64-linux-gnu/)         # jpeg.so
link_directories(/home/f/downLoad/dlib/build/dlib/)  # dlib.so

set(SOURCE_FILES main.cpp)
add_executable(face_align ${SOURCE_FILES})
target_link_libraries(face_align ${OpenCV_LIBS} dlib jpeg)

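Today I tested how this method affects recognition results under Openface, and took another look at the alignment quality. The alignment really is not very accurate: the original face recognition model reaches 93% accuracy on the LFW database, and with this alignment method it drops to 87.6%. It seems that to get good results, no stage of the pipeline can be neglected.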

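I recently came across some very nice code. The method is simple and it is fast. The code is at https://github.com/mc-jesus/face_detect_n_track
According to the author's flow chart, the first frame goes through an ordinary full-image search. If a face is found, the detector is then run again only inside a region twice the size of that bounding box. If that detection fails, template matching is used instead, and if nothing is detected within 2 s the algorithm is reset. When I ran it, I found that template matching was hardly ever used. Besides, since this has to work together with the landmark detection algorithm, template matching cannot be used to locate the face here. So the algorithm can be simplified: searching only within the doubled search box already gives decent results. I modified the Dlib version of the code along these lines, and face detection in a 320*240 image now takes under 30 ms. At 320*240 without any scaling, faces about 3 meters away can be detected. This can be tuned further for the application scenario, and the current speed is not actually that fast, as there are still many simple optimizations available.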
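The doubled search box can be sketched as follows. This is only an illustration of the idea, not code from the linked repository. `Box` is a hypothetical stand-in for `cv::Rect`, and `search_window` and `to_image_coords` are names introduced here. The sketch assumes the previous detection lies inside the image.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for cv::Rect: top-left corner plus size, in pixels.
struct Box { int x, y, w, h; };

// Returns a window `scale` times the size of the last detection, centred on
// it and clipped to the image bounds. The detector is then run only there.
Box search_window(const Box& last, int img_w, int img_h, double scale = 2.0) {
    assert(last.w > 0 && last.h > 0 && img_w > 0 && img_h > 0);
    const int w = static_cast<int>(last.w * scale);
    const int h = static_cast<int>(last.h * scale);
    const int cx = last.x + last.w / 2;
    const int cy = last.y + last.h / 2;
    const int x0 = std::max(0, cx - w / 2);
    const int y0 = std::max(0, cy - h / 2);
    const int x1 = std::min(img_w, cx + w / 2);
    const int y1 = std::min(img_h, cy + h / 2);
    return Box{x0, y0, x1 - x0, y1 - y0};
}

// A detection found inside the cropped window is relative to the window's
// origin, so shift it back into full-image coordinates before passing it
// to the landmark predictor.
Box to_image_coords(const Box& found, const Box& window) {
    return Box{found.x + window.x, found.y + window.y, found.w, found.h};
}
```

Per frame, the loop would crop the grayscale image to `search_window(last, ...)`, run the cascade on that crop, and fall back to a full-image search whenever the crop yields no face. Because the crop is a quarter of the frame or less once a face is being tracked, the cascade has far fewer positions to scan.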

0 0