RGB-D SLAM——匹配篇（一）

来源：互联网发布：linux命令配置ip地址编辑：程序博客网时间：2024/05/01 15:24

之前跑过高博的一起做RGB-D SLAM，趁着放假，也稍微整理一下，顺便理一下思路，由于本人是在Ubuntu上跑的代码，所以用的工具和IDE是CMake+QtCreator，先稍微介绍下如果用QtCreate建立CMake工程，正确来说应该是如何导入CMake工程，然后再进行RGB-D SLAM的搭建。

1.通过QtCreator导入CMake工程

在任意一个Directory中创建一个package，分别在package中创建如下空间：

这里写图片描述

如上图所示：在C_lib package中包含CMakeLists.txt 和 src package （该Directory必须包含这俩个），然后进去 src 后，在该Directory再创建一个CMakeLists.txt 和任意一个 hello.cpp 文件，必须文档到这里结束。

其中，俩个CMakeList.txt 需要包含的内容有：

C_lib 目录下的CMakeLists

CMAKE_MINIMUM_REQUIRED( VERSION 3.5.1 )   #这里我用的CMake是3.5.1的PROJECT( C_lib)    #工程名字#SET(CMAKE_CXX_COMPILER "g++")    #设定编译器SET( CMAKE_BUILD_TYPE Debug  )    #编译类型SET(EXECUTABLE_OUTPUT_PATH ${PROJECT_SOURCE_DIR}/bin) #将执行二进制文件输出到bin文件夹中SET(LIBRARY_OUTPUT_PATH ${PROJECT_SOURCE_DIR}/lib) #生成的库输出到lib文件夹中INCLUDE_DIRECTORIES( ${PROJECT_SOURCE_DIR}/include ) #头文件放在include文件夹中ADD_SUBDIRECTORY( ${PROJECT_SOURCE_DIR}/src )   #增加子文件，也就是进入源码文件夹中继续构建

src 目录下的CMakeLists：

CMAKE_MINIMUM_REQUIRED( VERSION 3.5.1 )  版本号PROJECT( C_lib ) 工程名set(CMAKE_CXX_FLAGS "-std=c++11 ")add_executable(hello hello.cpp)  #生成可执行文件

src目录下的hello.cpp ：

#include<iostream>int main(){  std::cout<<"hello"<<std::endl;  return 0;}

至此，一个最简单的工程创建完毕，紧接着，打开QTCreator，file->open file or project ,然后选中 C_lib 下的CMakelists，至此则导入成功，然后再后面需要什么再往里加就行了。

2.相机模型及其公式推导

这里写图片描述

一个空间点[X,Y,Z]和它在图像中的像坐标[u,v,d],有以下公式：

$\begin{bmatrix} u\\ v\\ 1 \end{bmatrix}=K\begin{bmatrix} \frac{X}{Z}\\ \frac{Y}{Z}\\ 1 \end{bmatrix}=K\begin{bmatrix} f_{x} & 0 & c_{x}\\ 0& f_{y} & c_{y}\\ 0 & 0 & 1 \end{bmatrix}$

其中 K 为相机内参，k= $\begin{bmatrix} f_{x} & 0 &c_{x} \\ 0& f_{y}& c_{y}\\ 0& 0 & 1 \end{bmatrix}$

从而得出以下公式，即三维空间点到像素平面点的转换关系：

$Z=\frac{d}{scale}$

$u=\frac{f_{x}*X}{Z}+c_{x}$
$v=\frac{f_{y}*Y}{Z}+c_{y}$

其中 d（mm极）为深度图深度，scale为尺度因子，其他为内参。

对于俩副图片的位姿，我们可通过以下公式来求得：

$\begin{bmatrix} u\\ v\\ 1 \end{bmatrix}=K(R\begin{bmatrix} X\\ Y\\ Z \end{bmatrix}+t)$ ,在这个公式中，我们把Z理解为空间的点，而不是深度图。

3.从图片生成点云

这里我们先给出封装的函数，到后面我们再将其结合起来。。。

#include<iostream>#include<string>using namespace std#include<opencv2/core/pcd_io.h>#include<opencv2/point_types.h>typedef pcl::PointXYZRGBA pointT;typedef pcl::PointCloud<PointXYZRGBA> PointCloudPointCloud::Ptr image2PointCloud( cv::Mat& rgb, cv::Mat& depth, CAMERA_INTRINSIC_PARAMETERS& camera )  //创建一个PointCloud::Ptr智能指针类型的函数，方便管理内存，传入的参数有rgb图，depth图，以及相机内参{  PointCloud::Ptr cloud(new PointCloud); //创建一个PointCloud对象，顺便在堆区创建内存用来存储，返回指针给主函数     for(int i=0;i<depth.rows;i++)  //i是y轴的值，j是x轴的值         for(int j=0;j<depth.cols;j++)         {      //遍历传进的深度图             ushort d=depth.ptr<ushort> (i)[j];//为什么用ushort，在下面有说明，通过depth.ptr<ushort>能读出i行j列的深度值             if(d==0)  //如果是hole，则直接跳过                 continue;             PointT p;  //声明一个             p.z=double(d)/camera.scale;  //先求出空间点Z             p.x=(j-camera.cx)*p.z/camera.fx; //求出空间点X             p.y=(i-camera.cy)*p.z/camera.fy; //求出空间点Y             p.b=rgb.ptr<uchar>(i)[3*j];  //无符号字符，第i行第j列，注意这里，rgb是连着一起放的。             p.g=rgb.ptr<uchar>(i)[3*j+1];              p.r=rgb.ptr<uchar>(i)[3*j+2];             cloud->points.push_back(p); //将p储存到cloud中         }     cloud->height=1; //设点云数据     cloud->width=cloud->points.size();     cloud->is_dense=false;     return cloud;}

这里有些知识点需要清楚：
1.在OpenCV中，图像是以矩阵（cv::Mat）作为基本数据的，Mat结构既可以帮你管理内存、像素信息，还支持常见的矩阵运算。
2.彩色图含有R,G,B三个通道，每个通道占8个bit（unsigned char）,故称为8UC3（8位unsigned char 3）结构。而深度图则是单通道的图像，每个像素由16个bit组成（也就是C++里的unsigned short），像素值代表该点离传感器的距离，注意尺度。因此，rgb是8UC3彩色通道，depth是16US1

4.图像配准，估计俩帧之间的位姿

假设我们有俩个帧：F1和F2，并且，我们已经提取了俩组一一对应的特征点：

$P=\begin{Bmatrix} p_{1},p_{2},...,p_{n} \end{Bmatrix}\in F_{1} , Q=\begin{Bmatrix} q_{2},q_{2},...,q_{n} \end{Bmatrix}\in F_{2}$

我们都是为了求出一个旋转矩阵R和位移矢量t，使得

$\forall i,p_{i}=Rq_{i}+t$

但是实际上是不可能完全相等的，这里我们可以建立一个最小二乘，来进行优化，得到R，t，其中R，t是决策变量，使得下面的误差最小：

$\underset{R,t}{min}\sum_{i=1}^{N}\left | p_{i}- (Rq_{i}+t)\right ||$

上面的式子是将所有的匹配特征点（是一个向量来着）求欧几里得距离，然后再求和。可以用ICP算法进行求解。

代码如下，在这里我们封装了俩个函数 computerKeyPointAndDesp（）和estimateMotion（），其中computerKeyPointAndDesp（）为计算每一帧的关键点，描述子，而estimationMotion（）则是通过关键点描述子来进行匹配，筛选最合适匹配点，通过PnP来求出R和t。

（1）computerKeyPointAndDesp（）函数代码如下：

void computeKeyPointsAndDesp(FRAME& frame) //函数传入为frame参数{//声明一个FeatureDetector类型的智能指针，用来指向存放在堆区的keypoints,特征器    cv::Ptr<cv::FeatureDetector> _detector; //声明一个DescriptorExtractor类型的智能指针，用来指向存放在堆区的描述子，提取器    cv::Ptr<cv::DescriptorExtractor> _descriptor;    _detector=cv::FeatureDetector::create("ORB"); //ORB是特征器的类型    _descriptor=cv::DescriptorExtractor::create("ORB");  //ORB同上    if(!_detector||!_descriptor)    {        cout<<"unknow types"<<endl;        return;    }    _detector->detect(frame.rgb,frame.kp);//设定好特征器类型后计算特征点，这里的提取keypoint存储到frame.kp中，frame.kp为vector<KeyPoint>类型的    _descriptor->compute(frame.rgb,frame.kp,frame.desp); //描述器，通过图像和特征点提取描述子，存储在frame.desp,这里的description是Mat类型的}

这段程序需要知道的知识：
1.要在一个图像提取特征，第一步是计算“关键点”，然后针对这些关键点周围的像素，计算其“描述子”。在OpenCV中，分别由FeatureDetector （API）和 DescritorExtractors （API）来实现，其中KeyPoint 包含了半径、角度等参数（可自行到响应的API查看）

对于上面的程序，我们进去看一下DescriptorExtractor类里面的内容（建议看一下函数的API）：

class CV_EXPORTS DescriptorExtractor{public:    virtual ~DescriptorExtractor();    void compute( const Mat& image, vector<KeyPoint>& keypoints,                  Mat& descriptors ) const;    //对应frame.rgb、frame.kp、frame.desp    void compute( const vector<Mat>& images, vector<vector<KeyPoint> >& keypoints,                  vector<Mat>& descriptors ) const;  //我们不取这个    virtual void read( const FileNode& );    virtual void write( FileStorage& ) const;    virtual int descriptorSize() const = 0;    virtual int descriptorType() const = 0;    static Ptr<DescriptorExtractor> create( const string& descriptorExtractorType );   //构建描述子提取器protected:    ...};

（2）estimatemotion()函数代码如下：

首先封装一个数据结构：struct RESULT_OF_PNP{  Mat tves,rves;       //由于计算PNP的时候，我们得到旋转向量和平移向量还有内点数量，所以我们可以建一个数据结构  int inliers;}

RESULT_OF_PNP estimateMotion(FRAME & frame1,FRAME&frame2,CAMERA_INTRINSIC_PARAMETERS &camera){    vector<cv::DMatch> matches; //matches是一个结构类型对象，包含很多数据类型    cv::DescriptionMatcher matcher; //关于DescriptionMatcher这个类的详细见官网API    matcher.match(frame1.desp,frame2.desp,matches); //match是一个函数成员见下面    cout<<"find total"<<matches.size()<<"matches."<<endl;    Mat imgMatchers;cv::drawMatches(frame1.rgb,frame1.kp,frame2.rgb,frame2.kp,matches,imgMatchers); //drawMatches函数，其中imgMatchers为Mat类型输出，matchers为vector<DMatch>类型    cv::imshow("good matches",imgMatchers);//显示图片，匹配到的图片    waitKey(0);    int min_distance=999;    vector<cv::DMatch> goodmatches;      double good_match_threshold=10;    for(size_t i=0;i<matches.size();i++) //遍历所有matches        if(matches[i].distance<min_distance)        {            min_distance=matches[i].distance;  //得出最小距离的matches        }    for(size_t j=0;j<matches.size();j++)  //对于所有的matches，如果距离是小于最小距离的10倍的话，则被归为goodmatches        if(matches[j].distance<good_match_threshold*min_distance)        {            goodmatches.push_back(matches[j]);        }    cout<<"good matches:"<<goodmatches.size()<<endl;    cv::drawMatches(frame1.rgb,frame1.kp,frame2.rgb,frame2.kp,goodmatches,imgMatchers);    imshow("good matches",imgMatchers);    waitKey(0);    vector<cv::Point3f> pts_obj;    vector<cv::Point2f> pts_msg;    for(size_t k=0;k<goodmatches.size();k++) //遍历所有的goodmatches    {        cv::Point2f p=frame1.kp[goodmatches[k].queryIdx].pt;//这里的.pt是Keypoint的一个数据成员，类型为Point2f，第一幅图片中对应的像素上的点goodmatches，queryIdx为源特征的索引，也就是第一张图片。        pts_msg.push_back(Point2f(frame2.kp[goodmatches[k].trainIdx].pt));        //第二幅图片中goodmatches对应的像素上的点        ushort d=frame1.depth.ptr<ushort>(int(p.y))[int(p.x)];        Point3f pt(p.x,p.y,d);        Point3f pd=point2dto3d(pt,camera);//将第一副图像转换为空间点        pts_obj.push_back(pd);        cout<<pts_obj.size()<<endl;    }    Mat cameraMatrix(3,3,CV_64F);    Mat distortion(1,5,CV_64F,cv::Scalar::all(0));    cameraMatrix.at<double>(0,0)=camera.fx;    cameraMatrix.at<double>(0,2)=camera.cx;    cameraMatrix.at<double>(1,1)=camera.fy;    cameraMatrix.at<double>(1,2)=camera.cy;    cameraMatrix.at<double>(2,2)=1;    Mat rvec,tvec, inliers;    solvePnPRansac(pts_obj,pts_msg,cameraMatrix,distortion,rvec,tvec,false,100,1.0,100,inliers);//PnP求解，第一个参数为第一幅图像映射到空间点的结果，第二个参数为第二幅图像的二维空间点，参数在后面具体解释    RESULT_OF_PNP result;    result.rvec=rvec;    result.tvec=tvec;    result.inliers=inliers.rows;    vector<DMatch> matcheShow;    for(size_t i=0;i<inliers.rows;i++)//遍历内点，找出内点的goodmatches    {        matcheShow.push_back(goodmatches[inliers.ptr<int>(i)[0]]);    }    cv::drawMatches(frame1.rgb,frame1.kp,frame2.rgb,frame2.kp,matcheShow,imgMatchers);    imshow("inliers",imgMatchers);    waitKey(0);    return result;}

这段程序需要的知识：
1. _descriptor->compute(frame.rgb,frame.kp,frame.desp);
在keypoint计算描述子。描述子是一个cv::Mat 的矩阵结构，每一行代表一个对应于Keypoint的特征向量。当俩个描述子越相似，则关键点越像。
2.理解solvePnPRansac（）：

CV_EXPORTS_W void solvePnPRansac( InputArray objectPoints,//objectpoints（路标）空间点矩阵，3xN/NX3,N代表空间点的数量，空间点类型Pointf                                  InputArray imagePoints,//空间点对应的图像坐标系的二维点矩阵，是2xN/Nx2矩阵，同理N为特征点数量，Poin2f类型                                  InputArray cameraMatrix,                                  //相机内参                                  InputArray distCoeffs,                                  //相机畸变                                  OutputArray rvec,//旋转向量，brings points from the model coordinate system(理解为世界坐标系) to the camera coordinate system（相机坐标系），即从空间坐标系的点转化到相机坐标系的点                                  OutputArray tvec,                                 //平移向量                                  bool useExtrinsicGuess = false,                                  int iterationsCount = 100,                                  //迭代次数                                  float reprojectionError = 8.0,//重投影误差，RANSAC程序使用了inlier阈值，这个值是允许观测（即匹配）和计算（即计算重投影）差的最大值，如果小于这个值，则认为是内点                                  int minInliersCount = 100, //内点的数量，如果算法在一些状态找到比mininlierscount内点更多的点，则结束                                  OutputArray inliers = noArray(),    //输出的矩阵包含了在空间点(Point3f)和二维像素点（Point2f）内点的索引，就是说通过这个矩阵，可以找到空间点和二维像素点的内点索引                                  int flags = ITERATIVE);                                  //解决PnP问题的方法，这里是迭代的！

（3）紧接着，我们使用数据集（可自行下载）来验证我们的程序：完整代码如下（省去上述已经封装好的函数）：

#include<iostream>#include<string>#include<fstream>#include<map>#include<vector>using namespace std;#include<opencv2/highgui/highgui.hpp>#include <opencv2/features2d/features2d.hpp>#include <opencv2/calib3d/calib3d.hpp>#include <opencv2/core/core.hpp>using namespace cv;#include<pcl/point_types.h>#include<pcl/common/transforms.h>#include <pcl/visualization/cloud_viewer.h>#include <pcl/io/pcd_io.h>//#include <pcl/filters/voxel_grid.h>typedef pcl::PointXYZRGBA PointT;typedef pcl::PointCloud<PointT> PointCloud;struct CAMERA_INTRINSIC_PARAMETERS{    double fx,fy,cx,cy,scale;};struct FRAME{  Mat rgb,depth;  Mat desp;  vector<cv::KeyPoint> kp;};struct RESULT_OF_PNP{    Mat rvec,tvec;    int inliers;};//声明函数void computeKeyPointsAndDesp(FRAME& frame);  RESULT_OF_PNP estimateMotion(FRAME& frame1,FRAME& frame2,CAMERA_INTRINSIC_PARAMETERS &camera);Point3f point2dto3d(Point3f& pt,CAMERA_INTRINSIC_PARAMETERS &camera);CAMERA_INTRINSIC_PARAMETERS camera;  //声明一个相机内参的对象int main(){    camera.fx=325.5;    camera.fy=253.5;    camera.cx=518.0;    camera.cy=519.0;    camera.scale=1000;    FRAME frame1,frame2;    frame1.rgb=imread("/home/hansry/1.png");    frame1.depth=imread("/home/hansry/C_lib/data/depth_png/1.png",-1);    frame2.rgb=imread("/home/hansry/2.png");    frame2.depth=imread("/home/hansry/C_lib/data/depth_png/2.png",-1); //“-1”，不改变图像载入    computeKeyPointsAndDesp(frame1);    computeKeyPointsAndDesp(frame2);    RESULT_OF_PNP result=estimateMotion(frame1,frame2,camera);    cout<<"inliers:"<<" "<<result.inliers<<endl;    cout<<"t:"<<" "<<result.tvec<<endl;    cout<<"R:"<<" "<<result.rvec<<endl;    return  0;}