Face Alignment at 3000 FPS: Study Notes Series (2)
Source: Internet. Editor: 程序博客网. Date: 2024/05/18 01:55
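Preparing the initial data
mean_shape
mean_shape is the average of the ground-truth points of all training images. How exactly is it computed? Can we simply sum the landmarks and take the mean?
Clearly that would be hasty and inaccurate. Faces differ from image to image and are affected by lighting, pose, and other factors, so the average should be taken within a relatively uniform frame. The MATLAB code is given first: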
function mean_shape = calc_meanshape(shapepathlistfile)
% Compute the mean shape from a text file listing one .pts path per line.
fid = fopen(shapepathlistfile);
shapepathlist = textscan(fid, '%s', 'delimiter', '\n');
fclose(fid);

% textscan always returns a 1x1 cell, so test its content
if isempty(shapepathlist{1})
    error('no shape file found');
end

shape_header = loadshape(shapepathlist{1}{1});
if isempty(shape_header)
    error('invalid shape file');
end

mean_shape = zeros(size(shape_header));
num_shapes = 0;
for i = 1:length(shapepathlist{1})
    shape_i = double(loadshape(shapepathlist{1}{i}));
    if isempty(shape_i)
        continue;
    end
    shape_min = min(shape_i, [], 1);
    shape_max = max(shape_i, [], 1);

    % translate to origin point
    shape_i = bsxfun(@minus, shape_i, shape_min);

    % resize shape
    shape_i = bsxfun(@rdivide, shape_i, shape_max - shape_min);

    mean_shape = mean_shape + shape_i;
    num_shapes = num_shapes + 1;
end

mean_shape = mean_shape ./ num_shapes;

% draw the mean shape on a white 500x500 canvas
img = 255 * ones(500, 500, 3);
drawshapes(img, 50 + 400 * mean_shape);
end

function shape = loadshape(path)
% function: load shape from pts file
file = fopen(path);
if file == -1
    shape = [];
    return;    % fopen failed, so there is nothing to close
end
shape = textscan(file, '%d16 %d16', 'HeaderLines', 3, 'CollectOutput', 2);
fclose(file);
shape = shape{1};
end
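Explanation: each shape is first translated so that the top-left corner of its own bounding box sits at the origin, and is then divided by the bounding-box width and height, so every landmark falls inside [0, 1] × [0, 1]. Only after this normalization are the shapes averaged.
As a formula, with $S_i$ the $i$-th ground-truth shape, $N$ the number of shapes, and the min and max taken per coordinate over the landmarks of $S_i$ (the division is element-wise):
$$\bar{S} = \frac{1}{N}\sum_{i=1}^{N}\frac{S_i - \min(S_i)}{\max(S_i) - \min(S_i)}$$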
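The normalization in calc_meanshape can also be sketched in a few lines of plain Python. This is a minimal illustration, not a port of the MATLAB file: the function names normalize_shape and mean_shape and the toy landmark lists are assumptions made for the example, and shapes are plain lists of (x, y) tuples.

```python
def normalize_shape(shape):
    """Map landmarks into the unit square using the shape's own bounding box."""
    xs = [x for x, _ in shape]
    ys = [y for _, y in shape]
    x_min, y_min = min(xs), min(ys)
    width, height = max(xs) - x_min, max(ys) - y_min
    if width == 0 or height == 0:
        raise ValueError("degenerate shape: zero width or height")
    return [((x - x_min) / width, (y - y_min) / height) for x, y in shape]


def mean_shape(shapes):
    """Average the bounding-box-normalized shapes landmark by landmark."""
    normalized = [normalize_shape(s) for s in shapes]
    n = len(normalized)
    num_points = len(normalized[0])
    return [
        (sum(s[k][0] for s in normalized) / n,
         sum(s[k][1] for s in normalized) / n)
        for k in range(num_points)
    ]


# Two toy "faces" (hypothetical data): same layout, different position and scale.
shape_a = [(0.0, 0.0), (2.0, 0.0), (1.0, 4.0)]
shape_b = [(10.0, 10.0), (30.0, 10.0), (20.0, 50.0)]
print(mean_shape([shape_a, shape_b]))
```

Averaging the raw coordinates of shape_a and shape_b would mix position and scale into the result; after normalization both shapes coincide, so the mean is the shared layout.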
Preparing $\Delta S^t$
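We know that the core idea of 3000FPS is the stage-wise update
$$S^t = S^{t-1} + \Delta S^t, \qquad \Delta S^t = W^t\,\Phi^t(I, S^{t-1}),$$
where $S^{t-1}$ is the shape estimated at the previous stage, $\Phi^t$ is the local binary feature mapping, $W^t$ is the global linear regression matrix, and $\Delta S^t$ is the shape increment to be regressed. Training therefore needs the target increment, that is, the residual between the ground-truth shape and the current shape, prepared for every training sample.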
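The additive, stage-by-stage structure of cascaded shape regression can be illustrated with a toy loop. In this sketch the per-stage regressor is a stand-in: halfway is a hypothetical function that simply moves each landmark halfway toward a fixed target, whereas in the real method the increment comes from learned features and a learned regression matrix. The names run_cascade, halfway, and target are assumptions for the example.

```python
def run_cascade(initial_shape, stage_regressors, image):
    """Apply S^t = S^(t-1) + delta^t for each stage, in order."""
    shape = list(initial_shape)
    for regress in stage_regressors:
        delta = regress(image, shape)  # increment predicted from the current shape
        shape = [(x + dx, y + dy) for (x, y), (dx, dy) in zip(shape, delta)]
    return shape


# Hypothetical stand-in for a learned stage: move halfway toward a known target.
target = [(4.0, 8.0), (6.0, 2.0)]


def halfway(image, shape):
    return [((tx - x) / 2, (ty - y) / 2) for (x, y), (tx, ty) in zip(shape, target)]


result = run_cascade([(0.0, 0.0), (2.0, 2.0)], [halfway] * 3, image=None)
print(result)
```

Each stage only has to predict the remaining residual, which is why the training targets discussed next are residuals rather than absolute landmark positions.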
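The similarity transformation is used as follows:
First, the mean shape is mapped into the face bounding box; then the similarity transform between the centered current shape and the centered mean shape is estimated; finally, the normalized residual is transformed into the mean-shape coordinate frame.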
The code first, starting at line 103 of train_model.m:
Param.meanshape = S0(Param.ind_usedpts, :);   % keep only the landmarks that are used
dbsize = length(Data);
% load('Ts_bbox.mat');
augnumber = Param.augnumber;                  % number of initial shapes per face

for i = 1:dbsize
    % initialize the shape of current face image by randomly selecting multiple shapes from other face images
    % indice = ceil(dbsize*rand(1, augnumber));
    indice_rotate = ceil(dbsize*rand(1, augnumber));
    indice_shift  = ceil(dbsize*rand(1, augnumber));
    scales        = 1 + 0.2*(rand([1 augnumber]) - 0.5);

    Data{i}.intermediate_shapes = cell(1, Param.max_numstage);   % intermediate shapes
    Data{i}.intermediate_bboxes = cell(1, Param.max_numstage);

    % 68 x 2 x augnumber (augnumber = number of initial shapes for image i)
    Data{i}.intermediate_shapes{1} = zeros([size(Param.meanshape), augnumber]);
    % augnumber x 4
    Data{i}.intermediate_bboxes{1} = zeros([augnumber, size(Data{i}.bbox_gt, 2)]);

    % shape residuals, 68 x 2 x augnumber
    Data{i}.shapes_residual = zeros([size(Param.meanshape), augnumber]);
    Data{i}.tf2meanshape = cell(augnumber, 1);
    Data{i}.meanshape2tf = cell(augnumber, 1);

    % if Data{i}.isdet == 1
    %     Data{i}.bbox_facedet = Data{i}.bbox_facedet*ts_bbox;
    % end

    % If augnumber = 1, each image has a single initial shape, so it is simply
    % set to the mean shape. Data{i}.tf2meanshape{1} is then just the identity,
    % because it maps the mean shape to the mean shape. Later stages differ.
    % For augnumber > 1, the other initial shapes are produced by translation,
    % rotation, and scaling; they could also be picked from the shapes of other images.
    for sr = 1:params.augnumber
        if sr == 1
            % estimate the similarity transformation from initial shape to mean shape
            % Data{i}.intermediate_shapes{1}(:,:, sr) = resetshape(Data{i}.bbox_gt, Param.meanshape);
            % Data{i}.intermediate_bboxes{1}(sr, :) = Data{i}.bbox_gt;
            Data{i}.intermediate_shapes{1}(:,:, sr) = resetshape(Data{i}.bbox_facedet, Param.meanshape);
            Data{i}.intermediate_bboxes{1}(sr, :) = Data{i}.bbox_facedet;

            % project the mean shape onto the face detection bbox
            meanshape_resize = resetshape(Data{i}.intermediate_bboxes{1}(sr, :), Param.meanshape);
            % meanshape_resize is identical to Data{i}.intermediate_shapes{1}(:,:, sr)

            % similarity transform between the current shape and the mean shape
            Data{i}.tf2meanshape{1} = fitgeotrans( ...
                bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
                (bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :)))), ...
                'NonreflectiveSimilarity');
            Data{i}.meanshape2tf{1} = fitgeotrans( ...
                (bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :)))), ...
                bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
                'NonreflectiveSimilarity');

            % calculate the residual shape from initial shape to groundtruth shape under normalization scale
            shape_residual = bsxfun(@rdivide, Data{i}.shape_gt - Data{i}.intermediate_shapes{1}(:,:, 1), ...
                [Data{i}.intermediate_bboxes{1}(1, 3) Data{i}.intermediate_bboxes{1}(1, 4)]);

            % transform the shape residual in the image coordinate to the mean shape coordinate
            [u, v] = transformPointsForward(Data{i}.tf2meanshape{1}, shape_residual(:, 1)', shape_residual(:, 2)');
            Data{i}.shapes_residual(:, 1, 1) = u';
            Data{i}.shapes_residual(:, 2, 1) = v';
        else
            % randomly rotate the shape
            % shape = resetshape(Data{i}.bbox_gt, Param.meanshape);     % Data{indice_rotate(sr)}.shape_gt
            shape = resetshape(Data{i}.bbox_facedet, Param.meanshape);  % Data{indice_rotate(sr)}.shape_gt

            % build a new initial shape from the randomly chosen scale, rotation,
            % and translation, then project it onto the bbox
            if params.augnumber_scale ~= 0
                shape = scaleshape(shape, scales(sr));
            end
            if params.augnumber_rotate ~= 0
                shape = rotateshape(shape);
            end
            if params.augnumber_shift ~= 0
                shape = translateshape(shape, Data{indice_shift(sr)}.shape_gt);
            end

            Data{i}.intermediate_shapes{1}(:, :, sr) = shape;
            Data{i}.intermediate_bboxes{1}(sr, :) = getbbox(shape);

            % project the mean shape onto the bbox of the new shape
            meanshape_resize = resetshape(Data{i}.intermediate_bboxes{1}(sr, :), Param.meanshape);

            Data{i}.tf2meanshape{sr} = fitgeotrans( ...
                bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, sr), mean(Data{i}.intermediate_shapes{1}(1:end,:, sr))), ...
                bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :))), ...
                'NonreflectiveSimilarity');
            Data{i}.meanshape2tf{sr} = fitgeotrans( ...
                bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :))), ...
                bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, sr), mean(Data{i}.intermediate_shapes{1}(1:end,:, sr))), ...
                'NonreflectiveSimilarity');

            shape_residual = bsxfun(@rdivide, Data{i}.shape_gt - Data{i}.intermediate_shapes{1}(:,:, sr), ...
                [Data{i}.intermediate_bboxes{1}(sr, 3) Data{i}.intermediate_bboxes{1}(sr, 4)]);

            [u, v] = transformPointsForward(Data{i}.tf2meanshape{1}, shape_residual(:, 1)', shape_residual(:, 2)');
            Data{i}.shapes_residual(:, 1, sr) = u';
            Data{i}.shapes_residual(:, 2, sr) = v';
            % Data{i}.shapes_residual(:, :, sr) = tformfwd(Data{i}.tf2meanshape{sr}, shape_residual(:, 1), shape_residual(:, 2));
        end
    end
end
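The helpers resetshape, scaleshape, rotateshape, translateshape, and getbbox are called above but not listed in this post. The Python sketch below shows one plausible reading of what such helpers do; it is not the repository's implementation. It assumes a bbox is (x, y, width, height), which matches the listing's use of columns 3 and 4 as divisors, and that scaling and rotation are taken about the shape centroid. The rotation range and shift values are made up for the example; only the scale range 1 + 0.2*(rand - 0.5) comes from the listing.

```python
import math
import random


def get_bbox(shape):
    """Axis-aligned bounding box of a shape as (x, y, width, height)."""
    xs = [x for x, _ in shape]
    ys = [y for _, y in shape]
    return (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))


def reset_shape(bbox, mean_shape):
    """Stretch mean_shape so that its bounding box coincides with bbox."""
    mx, my, mw, mh = get_bbox(mean_shape)
    bx, by, bw, bh = bbox
    return [(bx + (x - mx) * bw / mw, by + (y - my) * bh / mh)
            for x, y in mean_shape]


def perturb(shape, scale, angle_rad, shift):
    """Scale and rotate about the centroid, then translate by shift."""
    cx = sum(x for x, _ in shape) / len(shape)
    cy = sum(y for _, y in shape) / len(shape)
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    out = []
    for x, y in shape:
        dx, dy = scale * (x - cx), scale * (y - cy)
        out.append((cx + c * dx - s * dy + shift[0],
                    cy + s * dx + c * dy + shift[1]))
    return out


mean_shape = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]   # toy 3-point mean shape
bbox = (100.0, 50.0, 40.0, 80.0)                    # hypothetical detection box
initial = reset_shape(bbox, mean_shape)             # first initial shape (sr == 1)

random.seed(0)
augmented = []
for _ in range(3):                                  # further initial shapes (sr > 1)
    scale = 1 + 0.2 * (random.random() - 0.5)       # same range as in the listing
    angle = math.radians(random.uniform(-10, 10))   # assumed rotation range
    shift = (random.uniform(-4, 4), random.uniform(-4, 4))  # assumed shift range
    augmented.append(perturb(initial, scale, angle, shift))
```

Each perturbed shape gets its own bounding box via get_bbox, which is why the listing recomputes meanshape_resize and the transforms for every sr.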
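Understanding this code requires the article 《人脸配准坐标变换解析》 (an analysis of coordinate transformations in face alignment) referred to above.
According to 《人脸配准坐标变换解析》,
and therefore, from
The situation here is a little special, however, and needs one extra step:
From: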
% project the mean shape onto the face detection bbox
meanshape_resize = resetshape(Data{i}.intermediate_bboxes{1}(sr, :), Param.meanshape);
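the definition of resetshape shows that meanshape is mapped into intermediate_bboxes, so that the resized mean shape meanshape_resize fits that bounding box.
Computing in the same way as above:
the calculation gives
which is exactly the code above: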
Data{i}.tf2meanshape{1} = fitgeotrans( ...
    bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
    (bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :)))), ...
    'NonreflectiveSimilarity');
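Data{i}.tf2meanshape{1} is the similarity transform computed here: it maps the centered current shape onto the centered, resized mean shape.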
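To make the fitting step concrete: a non-reflective similarity consists of a rotation, a uniform scale, and a translation. Because both point sets are centered before the fit, the translation drops out, and the remaining rotation and scale can be written as one complex multiplier. The sketch below is a plain least-squares fit under that assumption; it is not MATLAB's fitgeotrans, and the function names and toy points are illustrative.

```python
def center(shape):
    """Subtract the centroid from every landmark."""
    n = len(shape)
    cx = sum(x for x, _ in shape) / n
    cy = sum(y for _, y in shape) / n
    return [(x - cx, y - cy) for x, y in shape]


def fit_similarity(src, dst):
    """Least-squares complex a with dst ~= a * src, for centered point sets.

    abs(a) is the uniform scale and the argument of a is the rotation angle.
    """
    p = [complex(x, y) for x, y in src]
    q = [complex(x, y) for x, y in dst]
    num = sum(pk.conjugate() * qk for pk, qk in zip(p, q))
    den = sum(abs(pk) ** 2 for pk in p)
    return num / den


def apply_similarity(a, points):
    """Rotate and scale points by the complex multiplier a."""
    out = []
    for x, y in points:
        z = a * complex(x, y)
        out.append((z.real, z.imag))
    return out


# Toy data: dst is src rotated by 90 degrees, scaled by 2, and moved elsewhere.
src = center([(3.0, 2.0), (2.0, 3.0), (1.0, 2.0), (2.0, 1.0)])
dst = center([(10.0, 12.0), (8.0, 10.0), (10.0, 8.0), (12.0, 10.0)])

to_mean = fit_similarity(src, dst)     # plays the role of tf2meanshape
from_mean = 1 / to_mean                # plays the role of meanshape2tf
print(to_mean, apply_similarity(from_mean, dst))
```

The listing fits the inverse direction with a second fitgeotrans call; for an exact similarity, as in this toy case, that second fit coincides with the reciprocal used here.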
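But what we want is the shape residual expressed in the mean-shape coordinate frame,
which is what the following code computes: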
% similarity transform between the current shape and the mean shape
Data{i}.tf2meanshape{1} = fitgeotrans( ...
    bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
    (bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :)))), ...
    'NonreflectiveSimilarity');
Data{i}.meanshape2tf{1} = fitgeotrans( ...
    (bsxfun(@minus, meanshape_resize(1:end, :), mean(meanshape_resize(1:end, :)))), ...
    bsxfun(@minus, Data{i}.intermediate_shapes{1}(1:end,:, 1), mean(Data{i}.intermediate_shapes{1}(1:end,:, 1))), ...
    'NonreflectiveSimilarity');

% calculate the residual shape from initial shape to groundtruth shape under normalization scale
shape_residual = bsxfun(@rdivide, Data{i}.shape_gt - Data{i}.intermediate_shapes{1}(:,:, 1), ...
    [Data{i}.intermediate_bboxes{1}(1, 3) Data{i}.intermediate_bboxes{1}(1, 4)]);

% transform the shape residual in the image coordinate to the mean shape coordinate
[u, v] = transformPointsForward(Data{i}.tf2meanshape{1}, shape_residual(:, 1)', shape_residual(:, 2)');
Data{i}.shapes_residual(:, 1, 1) = u';
Data{i}.shapes_residual(:, 2, 1) = v';
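The last two steps of this snippet, dividing the residual by the bounding-box size and then moving it into the mean-shape frame, can be sketched in Python as follows. The sketch assumes the bbox layout (x, y, width, height) and represents the transform as a complex multiplier (scale times rotation) with no translation, which is reasonable only because the shapes were centered before the fit. The names and numbers are made up for illustration.

```python
def normalized_residual(shape_gt, shape_init, bbox):
    """Per-landmark (gt - init), divided by bbox width (x) and height (y)."""
    _, _, width, height = bbox
    return [((gx - x) / width, (gy - y) / height)
            for (gx, gy), (x, y) in zip(shape_gt, shape_init)]


def to_mean_shape_frame(a, residual):
    """Rotate and scale displacement vectors by the complex multiplier a."""
    out = []
    for dx, dy in residual:
        z = a * complex(dx, dy)
        out.append((z.real, z.imag))
    return out


shape_gt = [(110.0, 60.0), (150.0, 70.0)]      # hypothetical ground truth
shape_init = [(100.0, 50.0), (140.0, 50.0)]    # hypothetical initial shape
bbox = (100.0, 50.0, 40.0, 80.0)               # (x, y, width, height)

residual = normalized_residual(shape_gt, shape_init, bbox)

# With an identity transform (the sr == 1 case described in the listing's
# comment) the residual is unchanged.
unchanged = to_mean_shape_frame(1 + 0j, residual)

# With a 90-degree rotation the displacement vectors rotate with it.
rotated = to_mean_shape_frame(1j, residual)
print(residual, rotated)
```

Normalizing by the box size makes residuals comparable across faces of different sizes, and expressing them in the mean-shape frame removes the effect of each initial shape's rotation and scale from the regression targets.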