Face detection -- Face Detection with End-to-End Integration of a ConvNet and a 3D Model
Source: Internet. Time: 2024/04/29 22:43
Face Detection with End-to-End Integration of a ConvNet and a 3D Model
ECCV 2016
MXNet code: https://github.com/tfwu/FaceDetection-ConvNet-3D
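Faster R-CNN performs very well on object detection. This paper applies it to face detection and modifies it to account for the specific properties of faces.
The paper proposes a simple and effective method that combines a ConvNet and a 3D model for end-to-end face detection.
Applying Faster R-CNN to face detection raises two problems:
1) RPNs need a predefined set of anchor boxes, which may introduce extra parameter tuning in training and instability in detection.
2) The RoI pooling layer does not exploit the underlying object structural configurations. Objects of many categories may share no common structure, but a face has a fixed structure that is relatively easy to extract and exploit.
There are two main modifications: 1) the heuristic design of predefined anchor boxes in the RPN is dropped and replaced with a 3D mean face model; 2) the generic RoI (Region-of-Interest) pooling layer is replaced with a configuration pooling layer based on the structure of the face.
The paper performs face detection on top of facial key-point detection.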
1.2 Method Overview
The paper uses ten facial key-points: "LeftEyeLeftCorner", "RightEyeRightCorner", "LeftEar", "NoseLeft", "NoseRight", "RightEar", "MouthLeftCorner", "MouthRightCorner", "ChinCenter", "CenterBetweenEyes"
First, a figure to build some intuition: [figure not preserved]
3 The Proposed Method
3.1 3D Mean Face Model and Face Representation
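In this paper, a 3D mean face model is represented by n 3D facial key-points, i.e. mathematically by an n × 3 matrix. The 3D mean face model is taken from the AFLW dataset, where it has 21 key-points; 10 of them are selected here.
Suppose a face f is described by its 3D transformation parameters, which encode its rotation and translation. The corresponding 2D key-points are extracted from the image, and these 2D key-points are in correspondence with the key-points of the 3D model.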
A CNN is learned to estimate these 3D transformation parameters. In the paper's words:
The key idea is to learn a ConvNet to (i) estimate the 3D transformation parameters (rotation and translation) w.r.t. the 3D mean face model for each detected facial key-point so that we can generate face bounding box proposals and (ii) predict facial key-points for each face instance more accurately.
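To make the link between transformation parameters and box proposals concrete, here is a minimal NumPy sketch. It assumes a weak-perspective (scaled orthographic) projection written as a 2 × 3 matrix plus a 2D translation, which happens to give 8 numbers. The paper's exact parameterization may differ, and the function names, the layout of `theta`, and the toy values are all hypothetical.

```python
import numpy as np

def project_mean_face(mean_face_3d, theta):
    """Project an (n, 3) mean face onto the image plane.

    theta holds 8 numbers (assumed layout): the 6 entries of a 2x3
    matrix in row-major order, followed by a 2D translation.
    Returns an (n, 2) array of 2D key-points.
    """
    mean_face_3d = np.asarray(mean_face_3d, dtype=float)
    theta = np.asarray(theta, dtype=float)
    A = theta[:6].reshape(2, 3)   # scaled rotation, first two rows only
    t = theta[6:8]                # 2D translation
    return mean_face_3d @ A.T + t

def bounding_box(points_2d):
    """Tight axis-aligned box (x_min, y_min, x_max, y_max) around 2D points."""
    x_min, y_min = points_2d.min(axis=0)
    x_max, y_max = points_2d.max(axis=0)
    return float(x_min), float(y_min), float(x_max), float(y_max)

# Toy 3-point "mean face" (placeholder values, not the AFLW model).
toy_face = [[0.0, 0.0, 0.0], [1.0, 0.0, 5.0], [0.0, 2.0, -3.0]]
# Scale by 2, no rotation, translate by (10, 20).
toy_theta = [2, 0, 0, 0, 2, 0, 10, 20]
toy_points = project_mean_face(toy_face, toy_theta)
toy_box = bounding_box(toy_points)
```

Under this reading, each detected key-point with its estimated parameters yields a full set of projected key-points, and the box around them serves as a face proposal.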
3.2 The Architecture of Our ConvNet
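Breakdown of the network architecture:
1) Convolution, ReLU and MaxPooling Layers: follows the VGG design, with 5 groups, where each group has 3 consecutive convolution and ReLU layers followed by a MaxPooling layer, except for the 5th group. The final feature maps are 16 times smaller than the input.
2) An Upsampling Layer: the predicted key-point locations are compared with the detected ones, so the feature maps are enlarged to retain more spatial information. The feature maps are upsampled to 8 times bigger in size using deconvolution, which takes them from 1/16 to 1/2 of the input resolution.
3) A Facial Key-point Label Prediction Layer: 11 labels (10 facial key-points and 1 background class), used to compute the classification Softmax loss.
4) A 3D Transformation Parameter Estimation Layer: estimates the 3D model parameters (8 parameters).
5) A Face Proposal Layer: extracts face candidate regions.
6) A Configuration Pooling Layer: pools by combining the information of the ten facial key-points of a face.
7) A Face Bounding Box Regression Layer: regresses the face bounding box position to refine it.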
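The configuration pooling step in item 6 can be pictured as gathering one feature vector per key-point and concatenating them. The sketch below is only one possible reading, assuming nearest-pixel sampling on a single `(C, H, W)` feature map. The function name and the sampling scheme are assumptions, not the paper's layer definition.

```python
import numpy as np

def configuration_pool(feature_map, keypoints_xy):
    """Concatenate the feature vectors found at K key-point locations.

    feature_map:  (C, H, W) array.
    keypoints_xy: (K, 2) array of (x, y) positions in feature-map pixels.
    Returns a 1D array of length K * C, ordered key-point by key-point.
    """
    feature_map = np.asarray(feature_map)
    keypoints_xy = np.asarray(keypoints_xy, dtype=float)
    c, h, w = feature_map.shape
    # Round to the nearest pixel and clamp to the map borders.
    xs = np.clip(np.rint(keypoints_xy[:, 0]).astype(int), 0, w - 1)
    ys = np.clip(np.rint(keypoints_xy[:, 1]).astype(int), 0, h - 1)
    per_point = feature_map[:, ys, xs]      # shape (C, K)
    return per_point.T.reshape(-1)          # shape (K * C,)

# Tiny example: 2 channels, 3x4 map, 2 key-points.
toy_map = np.arange(2 * 3 * 4).reshape(2, 3, 4)
toy_pooled = configuration_pool(toy_map, [[0, 0], [3, 2]])
```

With ten key-points the output has 10 × C entries per face, independent of the size of the proposal box, which is what lets a fixed-size regression layer follow.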
3.3 The End-to-End Training
This section mainly describes how the loss functions are defined.
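These notes do not reproduce the loss definitions. As an illustration of just the classification term mentioned in Section 3.2 (a Softmax loss over 11 labels), here is a minimal NumPy sketch. The function name, the averaging over samples, and the toy inputs are assumptions, and the other loss terms and how they are combined are not shown.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy.

    logits: (N, K) array of raw scores, here K = 11
            (10 facial key-points + 1 background class).
    labels: (N,) integer array with values in [0, K).
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    picked = log_probs[np.arange(labels.shape[0]), labels]
    return float(-picked.mean())

# With all-zero scores every label has probability 1/11.
toy_loss = softmax_cross_entropy(np.zeros((4, 11)), [0, 3, 10, 5])
```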
4 Experiments
Results on FDDB
AFW dataset