ARKit and ARCore: Analysis, Structure, and Principles


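ARKit and ARCore both consist of three parts: camera pose estimation, environment understanding (plane estimation), and light estimation.
Part of the ARCore source code: https://github.com/google-ar/arcore-unity-sdk/tree/master/Assets/GoogleARCore/SDK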
ARKit API: https://developer.apple.com/documentation/arkit
ARCore API: https://developers.google.com/ar/reference/

-Camera pose estimation

For motion tracking, both use VIO (visual-inertial odometry).
ARKit uses a feature-point method with a sparse point cloud: https://www.youtube.com/watch?v=rCknUayCsjk

“ARKit recognizes notable features in the scene image, tracks differences in the positions of those features across video frames, and compares that information with motion sensing data.” (source: https://developer.apple.com/documentation/arkit/about_augmented_reality_and_arkit)

There has been speculation that ARCore estimates a semi-dense point cloud with a direct method. Google itself, however, describes feature points, so its point cloud is presumably sparse as well:

“ARCore detects visually distinct features in the captured camera image called feature points and uses these points to compute its change in location.” (source: https://developers.google.com/ar/discover/concepts)

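-Environment understanding (plane detection)

I am not very familiar with the underlying principles, so here are some excerpts from the official documentation.
ARKit: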

“…can track and place objects on smaller feature points as well…”
“…Use hit-testing methods (see the ARHitTestResult class) to find real-world surfaces corresponding to a point in the camera image….”
“You can use hit-test results or detected planes to place or interact with virtual content in your scene.”
In the ARKit version covered here, vertical planes cannot be detected.

ARCore:

“ARCore looks for clusters of feature points that appear to lie on common horizontal surfaces…”
“…can also determine each plane’s boundary…”
“…flat surfaces without texture, such as a white desk, may not be detected properly…”
Demo video: https://www.youtube.com/watch?v=aSKgJEt9l-0
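
The ARCore quote above speaks of clusters of feature points that lie on a common horizontal surface. As a rough illustration of that idea only (this is not ARCore's algorithm, and the class name, bin size, and threshold are assumptions made for the example), the following Java sketch bins 3D points by height and reports the densest bin as a candidate plane:

```java
import java.util.Arrays;
import java.util.List;

/** Toy horizontal-plane finder: bins points by height (y) and keeps the densest bin. */
final class HorizontalPlaneSketch {

    /** Returns the mean height of the densest bin, or NaN if it holds fewer than minPoints points. */
    static double findPlaneHeight(List<double[]> points, double binSize, int minPoints) {
        if (points.isEmpty()) return Double.NaN;
        double minY = Double.POSITIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
        for (double[] p : points) {
            minY = Math.min(minY, p[1]);
            maxY = Math.max(maxY, p[1]);
        }
        int bins = (int) Math.floor((maxY - minY) / binSize) + 1;
        int[] count = new int[bins];      // number of points per height bin
        double[] sum = new double[bins];  // sum of heights per bin, used for the mean
        for (double[] p : points) {
            int b = (int) Math.floor((p[1] - minY) / binSize);
            count[b]++;
            sum[b] += p[1];
        }
        int best = 0;
        for (int b = 1; b < bins; b++) {
            if (count[b] > count[best]) best = b;
        }
        return count[best] >= minPoints ? sum[best] / count[best] : Double.NaN;
    }

    public static void main(String[] args) {
        // Five points near y = -1 (a floor) plus two outliers; values are made up.
        List<double[]> pts = Arrays.asList(
            new double[] {0.1, -1.00, 0.5}, new double[] {0.4, -1.01, 0.2},
            new double[] {0.7, -0.99, 0.9}, new double[] {0.2, -1.00, 0.1},
            new double[] {0.9, -1.00, 0.3}, new double[] {0.5, 0.30, 0.4},
            new double[] {0.3, 0.70, 0.6});
        System.out.println("estimated plane height: " + findPlaneHeight(pts, 0.05, 3));
    }
}
```

A textureless surface yields few feature points, so no bin reaches the threshold; this is consistent with the white-desk caveat quoted above.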

-Light estimation

Not yet studied.

ARKit

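Reference blog post: http://blog.csdn.net/u013263917/article/details/72903174
The iPhone X adds a TrueDepth camera, which ARKit can also use.

I. Introduction

The ARKit framework

Augmented reality on a 3D scene (SceneKit), the mainstream option
Augmented reality on a 2D scene (SpriteKit)
The relationship between ARKit and SceneKit
ARKit does not render anything by itself; it has to be paired with a renderer, which is normally SceneKit (SpriteKit or a custom Metal renderer are the alternatives).
1. ARKit captures images of the real world through the camera and recovers the 3D world
2. SceneKit displays the virtual 3D models in the image

Here we focus on ARKit.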

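II. ARKit

ARSCNView, the ARKit view that displays 3D augmented-reality content, inherits from SCNView in SceneKit, which in turn inherits from UIView in UIKit.
In a complete augmented-reality experience, ARKit is only responsible for turning the real-world picture into a 3D scene. This conversion has two stages: ARCamera captures the camera image, and ARSession builds the 3D scene.
ARSCNView and ARCamera are not directly related. The bridge between them is the AR session, that is, ARSession, one of the most important classes in ARKit.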
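To run an ARSession you must specify a session tracking configuration object, ARSessionConfiguration (renamed ARConfiguration in the released iOS 11 SDK). Its main job is to track the camera's position in the 3D world and to capture certain scene features (for example, planes). The class itself is simple, but it plays a large role.

ARSessionConfiguration is a base class. For a better augmented-reality result, Apple recommends its subclass ARWorldTrackingSessionConfiguration, which is supported only on devices with an A9 chip or later, that is, the iPhone 6s and newer.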

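2.1. ARWorldTrackingSessionConfiguration and ARFrame

ARSession acts as a bridge between two main participants: ARWorldTrackingSessionConfiguration and ARFrame.
ARWorldTrackingSessionConfiguration (the session tracking configuration) tracks the device's orientation and position and detects real-world surfaces seen by the device camera. Internally it runs a large body of algorithms and uses the iPhone's sensors to detect how the phone moves and rotates, including roll.

ARWorldTrackingSessionConfiguration is where the VIO system lives.
ARWorldTrackingSessionConfiguration, as named in the referenced post, was deprecated in iOS 11 beta 8, so the text below uses ARWorldTrackingConfiguration instead.
When ARWorldTrackingConfiguration has computed the camera's position in the 3D world, it does not hold that data itself. It hands the data to ARSession to manage (echoing the earlier point about the session managing memory), and the class that carries the camera position data is ARFrame.
ARSession has a property called currentFrame, which holds this ARFrame object.
ARCamera only captures images and takes no part in processing the data. It is one element of the 3D scene: every 3D scene has a camera, which determines the field of view through which we see objects.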

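2.2. ARSession

ARSession provides the camera position data in two main ways:

Push: the camera position is delivered continuously in real time, with ARSession actively notifying the user. Implement the ARSession delegate method - (void)session:(ARSession *)session didUpdateFrame:(ARFrame *)frame to receive it.
Pull: the user fetches the data on demand through the currentFrame property of ARSession.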
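
The push and pull styles described above are a general pattern rather than anything ARKit-specific. As a language-neutral illustration in Java (ARKit itself is used from Swift or Objective-C; `FrameSourceSketch`, `Frame`, and `publish` are hypothetical names invented for this sketch and are not ARKit API), a session-like object can offer both a listener and a "latest frame" getter:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for an AR session that supports push and pull access to frames. */
final class FrameSourceSketch {

    /** Minimal frame: a timestamp plus the camera position (x, y, z). */
    static final class Frame {
        final double timestamp;
        final double[] cameraPosition;
        Frame(double timestamp, double[] cameraPosition) {
            this.timestamp = timestamp;
            this.cameraPosition = cameraPosition;
        }
    }

    private final List<Consumer<Frame>> listeners = new ArrayList<>();
    private Frame currentFrame; // null until the first frame is published

    /** Push: register a callback that is invoked for every new frame. */
    void addListener(Consumer<Frame> listener) { listeners.add(listener); }

    /** Pull: return the most recent frame, whenever the caller wants it. */
    Frame getCurrentFrame() { return currentFrame; }

    /** Called by the (imaginary) tracker whenever a new frame is ready. */
    void publish(Frame frame) {
        currentFrame = frame;
        for (Consumer<Frame> l : listeners) l.accept(frame);
    }

    public static void main(String[] args) {
        FrameSourceSketch session = new FrameSourceSketch();
        session.addListener(f -> System.out.println("pushed frame at t=" + f.timestamp));
        session.publish(new Frame(0.016, new double[] {0, 0, 0}));
        session.publish(new Frame(0.033, new double[] {0, 0, 0.01}));
        System.out.println("pulled frame at t=" + session.getCurrentFrame().timestamp);
    }
}
```

Push suits code that must react to every frame, such as rendering; pull suits occasional queries, such as reading the camera pose when the user taps.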

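2.3. The complete ARKit workflow

ARSCNView loads the scene, an SCNScene
SCNScene starts the camera, ARCamera, which begins capturing the scene
After capture, ARSCNView hands the scene data to the session
The session tracks the scene by managing an ARSessionConfiguration and returns an ARFrame
A child node (the 3D object model) is added to the scene of the ARSCNView

The point of having ARSessionConfiguration track the camera's 3D position is that, when a 3D object model is added, the model's actual transform matrix relative to the camera can be computed.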
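
To make "the model's matrix relative to the camera" concrete: if both the camera pose and the model pose are rigid transforms in world space, the model's pose in camera space is inverse(camera) * model. The Java sketch below shows this under stated assumptions: matrices are 4x4 arrays in column-major order, the helper names are made up for the example, and nothing here is ARKit code.

```java
/** Rigid-transform helpers on 4x4 column-major matrices: element (row, col) is m[col * 4 + row]. */
final class RelativePoseSketch {

    /** Inverts a rigid transform [R | t]; the result is [R^T | -R^T t]. */
    static float[] invertRigid(float[] m) {
        float[] inv = new float[16];
        for (int r = 0; r < 3; r++) {
            for (int c = 0; c < 3; c++) {
                inv[c * 4 + r] = m[r * 4 + c]; // transpose the 3x3 rotation block
            }
        }
        for (int r = 0; r < 3; r++) {
            inv[12 + r] = -(inv[r] * m[12] + inv[4 + r] * m[13] + inv[8 + r] * m[14]);
        }
        inv[15] = 1f;
        return inv;
    }

    /** Returns the matrix product a * b. */
    static float[] multiply(float[] a, float[] b) {
        float[] out = new float[16];
        for (int c = 0; c < 4; c++) {
            for (int r = 0; r < 4; r++) {
                float s = 0f;
                for (int k = 0; k < 4; k++) s += a[k * 4 + r] * b[c * 4 + k];
                out[c * 4 + r] = s;
            }
        }
        return out;
    }

    /** Builds a pure translation matrix. */
    static float[] translation(float x, float y, float z) {
        return new float[] {1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  x, y, z, 1};
    }

    public static void main(String[] args) {
        float[] cameraInWorld = translation(0f, 0f, 2f);  // camera 2 m along +z
        float[] modelInWorld = translation(0f, 0f, -1f);  // model 1 m along -z
        float[] modelInCamera = multiply(invertRigid(cameraInWorld), modelInWorld);
        System.out.println("model z in camera space: " + modelInCamera[14]);
    }
}
```

The inverse of the camera pose is what renderers usually call the view matrix.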


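III. API analysis

-AROrientationTrackingConfiguration
“tracks the device’s movement with three degrees of freedom (3DOF): specifically, the three rotation axes”. It tracks only the three rotations, so it is less capable than the configuration below.
-[ARWorldTrackingConfiguration](https://developer.apple.com/documentation/arkit/arworldtrackingconfiguration)
Responsible for tracking the camera and detecting planes: “tracks the device’s movement with six degrees of freedom (6DOF)”.
Most of the SLAM work is presumably done inside this class.
Yet starting the simplest AR session takes only: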

let configuration = ARWorldTrackingConfiguration()
configuration.planeDetection = .horizontal
sceneView.session.run(configuration)

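The actual implementation is encapsulated.
Next come the interfaces of the important ARSession and ARSCNView classes.
There are many of them, so only some are listed here; for details see the official documentation or the reference blog: http://blog.csdn.net/u013263917/article/category/6959089
Apple also has a distinctive feature of its own, the face-based AR experience, which uses the TrueDepth camera to track facial expression, pose, topology, and so on. The world-based AR experience is the counterpart of other vendors' SDKs, such as ARCore.
Some commonly needed pieces of information, for example:
Virtual object position: ARFrame -> ARAnchor has a transform
Camera pose: ARFrame -> ARCamera has a transform that contains the pose (displayTransform, by contrast, is an image-coordinate transform on ARFrame)
View matrix and projection matrix: ARFrame -> ARCamera has projectionMatrix and viewMatrixForOrientation:
-ARSession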

| Topic | Symbol | Description |
| --- | --- | --- |
| Configuring and Running a Session | func run(ARConfiguration, options: ARSession.RunOptions =) | Starts AR processing for the session with the specified configuration and options. |
| Configuring and Running a Session | var configuration: ARConfiguration | An object that defines motion and scene tracking behaviors for the session. |
| Responding to AR Updates | var delegate: ARSessionDelegate | An object you provide to receive captured video images and tracking information, or to respond to changes in session status. |
| Responding to AR Updates | protocol ARSessionDelegate | Methods you can implement to receive captured video frame images and tracking state from an AR session. |
| Displaying and Interacting with AR Content | var currentFrame: ARFrame | |
| Displaying and Interacting with AR Content | func add(anchor: ARAnchor) | Adds the specified anchor to be tracked by the session. |

-ARSCNView
ARSCNView is the view the user actually sees.

| Topic | Symbol | Description |
| --- | --- | --- |
| | var session: ARSession | The AR session that manages motion tracking and camera image processing for the view’s contents. |
| | var scene: SCNScene | The SceneKit scene to be displayed in the view. |
| Responding to AR Updates | protocol ARSCNViewDelegate | Methods you can implement to mediate the automatic synchronization of SceneKit content with an AR session. |
| Responding to AR Updates | func hitTest(CGPoint, types: ARHitTestResult.ResultType) | Searches for real-world objects or AR anchors in the captured camera image corresponding to a point in the SceneKit view. |

-ARCamera
The ARCamera class has many related Topics:

| Topic | Symbol | Description |
| --- | --- | --- |
| Handling Tracking Status | trackingState | The general quality of position tracking available when the camera captured a frame. |
| Handling Tracking Status | ARTrackingState | Possible values for position tracking quality. |
| Examining Imaging Parameters | imageResolution | The width and height, in pixels, of the captured camera image. |
| Applying Camera Geometry | projectionMatrix | A transform matrix appropriate for rendering 3D content to match the image captured by the camera. |
| Applying Camera Geometry | projectionMatrixForOrientation: | Returns a transform matrix appropriate for rendering 3D content to match the image captured by the camera, using the specified parameters. |
| Applying Camera Geometry | viewMatrixForOrientation: | Returns a transform matrix for converting from world space to camera space. |

-ARFrame
The Topics of the ARFrame class show the input and output interfaces of SLAM.

| Topic | Symbol | Description |
| --- | --- | --- |
| Accessing Captured Video Frames | capturedImage | A pixel buffer containing the image captured by the camera. |
| Accessing Captured Video Frames | capturedDepthData | The depth map, if any, captured along with the video frame. |
| Examining Scene Parameters | camera | Information about the camera position, orientation, and imaging parameters used to capture the frame. |
| Examining Scene Parameters | lightEstimate | An estimate of lighting conditions based on the camera image. |
| Examining Scene Parameters | displayTransformForOrientation: | Returns an affine transform for converting between normalized image coordinates and a coordinate space appropriate for rendering the camera image onscreen. |
| Tracking and Finding Objects | anchors | The list of anchors representing positions tracked or objects detected in the scene. |
| Tracking and Finding Objects | hitTest:types: | Searches for real-world objects or AR anchors in the captured camera image. |
| Debugging Scene Detection | rawFeaturePoints | The current intermediate results of the scene analysis ARKit uses to perform world tracking. |
| Debugging Scene Detection | ARPointCloud | A collection of points in the world coordinate space of the AR session. |

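As for feature points, ARPointCloud exposes the number of feature points and their identifiers.
Which kind of feature point is actually used? One might need to look at information such as the dimensionality of the identifiers; a SIFT descriptor, for comparison, has 128 dimensions.

-ARLightEstimate
Omitted.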

ARCore

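Google has released ARCore for Android Studio, Unity, Unreal, and the Web.
Official API reference: https://developers.google.com/ar/reference/
SDK on GitHub: https://github.com/google-ar
Update, 2017.12.15: Google added support for Java and C. (The former Android Studio part was moved under Java.)
ARCore's structure is fairly flexible.
In AR, before drawing a virtual object in a real scene, the developer must know the following:

The pose (position and orientation) of the virtual object
The view matrix and projection matrix
The lighting information for the virtual object

All of this can be obtained from ARCore. Let us first look at the sample Google provides for the Java platform.
First, the drawing logic in the sample. The classes involved in drawing are:
BackgroundRenderer draws the data captured by the camera.
VirtualObject draws the Android robot.
VirtualObjectShadow draws the shadow of the Android robot.
PlaneRenderer draws the planes detected by the SDK.
PointCloud draws the point cloud detected by the SDK.
Drawing uses OpenGL ES: configure a GLSurfaceView and implement the GLSurfaceView.Renderer interface. Everything related to drawing can be replaced, for example with a more modern 3D graphics framework.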

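A note on the Renderer interface:
public interface Renderer {
    void onSurfaceCreated(GL10 gl, EGLConfig config);
    void onSurfaceChanged(GL10 gl, int width, int height);
    void onDrawFrame(GL10 gl);
}
onSurfaceCreated is called when the drawable surface is created or recreated. This callback is the place for initialization work. Note that it runs on the OpenGL thread and has an OpenGL context, so OpenGL calls can be made here.
onSurfaceChanged is called when the drawable surface changes. The size of the view may have been changed from outside, so the viewport information should be updated in this call so that drawing lands correctly on the screen.
onDrawFrame is the core method. It is called once for every draw, that is, once per frame, and holds the main drawing logic.
Reference: https://juejin.im/post/59ac1f2bf265da249517ac72
See the sample for the concrete implementation.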
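
The order in which the three callbacks fire can be shown without an Android device. The Java sketch below is a desktop-only mock, not Android code: `SimpleRenderer` is a simplified stand-in for GLSurfaceView.Renderer with the GL10 and EGLConfig parameters dropped, and `run` is a hypothetical driver that imitates what the GL thread does.

```java
import java.util.ArrayList;
import java.util.List;

final class RendererLifecycleSketch {

    /** Simplified stand-in for GLSurfaceView.Renderer (GL parameters omitted on purpose). */
    interface SimpleRenderer {
        void onSurfaceCreated();
        void onSurfaceChanged(int width, int height);
        void onDrawFrame();
    }

    /** Records each callback so the call order can be inspected afterwards. */
    static final class LoggingRenderer implements SimpleRenderer {
        final List<String> log = new ArrayList<>();
        int width;
        int height;

        @Override public void onSurfaceCreated() {
            log.add("created"); // one-time setup (shaders, textures) would go here
        }

        @Override public void onSurfaceChanged(int w, int h) {
            width = w;          // remember the viewport size for later draws
            height = h;
            log.add("changed " + w + "x" + h);
        }

        @Override public void onDrawFrame() {
            log.add("draw");    // per-frame drawing would go here
        }
    }

    /** Hypothetical driver: created once, resized once, then one draw call per frame. */
    static void run(SimpleRenderer renderer, int width, int height, int frames) {
        renderer.onSurfaceCreated();
        renderer.onSurfaceChanged(width, height);
        for (int i = 0; i < frames; i++) renderer.onDrawFrame();
    }

    public static void main(String[] args) {
        LoggingRenderer renderer = new LoggingRenderer();
        run(renderer, 1080, 1920, 2);
        System.out.println(renderer.log);
    }
}
```

On a real device the surface can be recreated or resized more than once, so initialization in onSurfaceCreated should be safe to repeat.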

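The concrete interfaces for obtaining the necessary information (the virtual object's pose, the view matrix, the projection matrix, and lighting):
We start with the Java platform: https://developers.google.com/ar/reference/java/com/google/ar/core/package-summary

Pose of the virtual object:

In ARCore, when you want to place a virtual object you need to define an Anchor, which ensures that ARCore tracks the object's position over time.
The Anchor class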
getPose returns the Anchor's current pose. The sample calls anchor.getPose().toMatrix(mAnchorMatrix, 0); to obtain the Anchor's pose matrix, a 4x4 model-to-world transformation matrix stored in column-major order.
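
"Column-major" determines where each number ends up in the 16-element array. The Java sketch below builds such a matrix from a translation and a unit quaternion to show the layout. It is an independent illustration written for this post, not ARCore's implementation of toMatrix, and the class and method names are assumptions.

```java
/** Builds and applies a 4x4 column-major pose matrix: element (row, col) is m[col * 4 + row]. */
final class ColumnMajorPoseSketch {

    /** Converts a translation (tx, ty, tz) and a unit quaternion (qx, qy, qz, qw) into a matrix. */
    static float[] toMatrix(float tx, float ty, float tz, float qx, float qy, float qz, float qw) {
        float[] m = new float[16];
        // First column: where the model's x axis points in world space.
        m[0] = 1 - 2 * (qy * qy + qz * qz);
        m[1] = 2 * (qx * qy + qz * qw);
        m[2] = 2 * (qx * qz - qy * qw);
        // Second column: the model's y axis.
        m[4] = 2 * (qx * qy - qz * qw);
        m[5] = 1 - 2 * (qx * qx + qz * qz);
        m[6] = 2 * (qy * qz + qx * qw);
        // Third column: the model's z axis.
        m[8] = 2 * (qx * qz + qy * qw);
        m[9] = 2 * (qy * qz - qx * qw);
        m[10] = 1 - 2 * (qx * qx + qy * qy);
        // Fourth column: the translation, stored at indices 12 to 14.
        m[12] = tx;
        m[13] = ty;
        m[14] = tz;
        m[15] = 1;
        return m;
    }

    /** Transforms a model-space point into world space. */
    static float[] transformPoint(float[] m, float x, float y, float z) {
        float[] out = new float[3];
        for (int r = 0; r < 3; r++) {
            out[r] = m[r] * x + m[4 + r] * y + m[8 + r] * z + m[12 + r];
        }
        return out;
    }

    public static void main(String[] args) {
        float s = (float) Math.sqrt(0.5); // quaternion for a 90 degree turn about the y axis
        float[] m = toMatrix(1f, 2f, 3f, 0f, s, 0f, s);
        float[] p = transformPoint(m, 1f, 0f, 0f);
        System.out.println(p[0] + ", " + p[1] + ", " + p[2]);
    }
}
```

This is the same layout that OpenGL ES expects for uniform matrices, which is presumably why the sample can pass the array on to its renderers directly.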

View matrix and projection matrix:

The Camera class stores this information:
The Camera's properties are updated each time Session.update() is called, so next we look at how the Session class updates the Camera.

The Session class manages the state of the AR system; Session is the main entry point to the ARCore API.
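Finally, update() refreshes the state of the ARCore system: it obtains a new camera frame and updates the device pose, the anchor poses, and the detected planes. The new camera properties are available through the frame's getCamera().
The Frame class

hitTest is Google's hit-testing interface. It checks whether the user's tap lands on a plane and therefore whether a virtual object should be placed.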
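
Geometrically, a hit test of this kind amounts to casting a ray from the camera through the tapped pixel and intersecting it with a detected plane. The Java sketch below shows only that geometric core for a horizontal plane. It is not ARCore's hitTest implementation, and the class name, method names, and the rectangular extent check are assumptions made for the example.

```java
/** Ray versus horizontal plane intersection, the geometric idea behind a plane hit test. */
final class RayPlaneHitSketch {

    /**
     * Intersects the ray origin + t * dir (t >= 0) with the plane y = planeY.
     * Returns the hit point, or null if the ray is parallel to the plane or points away from it.
     */
    static double[] hitHorizontalPlane(double[] origin, double[] dir, double planeY) {
        if (Math.abs(dir[1]) < 1e-9) return null;  // ray runs parallel to the plane
        double t = (planeY - origin[1]) / dir[1];
        if (t < 0) return null;                    // plane lies behind the ray origin
        return new double[] {origin[0] + t * dir[0], planeY, origin[2] + t * dir[2]};
    }

    /** Checks whether a hit lies inside a rectangle centered at (cx, cz) on the plane. */
    static boolean insideExtent(double[] hit, double cx, double cz, double extentX, double extentZ) {
        return Math.abs(hit[0] - cx) <= extentX / 2 && Math.abs(hit[2] - cz) <= extentZ / 2;
    }

    public static void main(String[] args) {
        double[] camera = {0.0, 1.5, 0.0};       // camera held 1.5 m above the floor
        double[] tapRay = {0.0, -1.0, -1.0};     // looking forward and down (not normalized)
        double[] hit = hitHorizontalPlane(camera, tapRay, 0.0);
        if (hit != null && insideExtent(hit, 0.0, -1.0, 2.0, 2.0)) {
            System.out.println("place object at " + hit[0] + ", " + hit[1] + ", " + hit[2]);
        } else {
            System.out.println("tap missed the plane");
        }
    }
}
```

In a real application the ray direction would be derived from the tap position together with the view and projection matrices discussed above.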

Lighting is not covered for now.

The C platform is much the same as Java.
The Session class:
The Frame class, which covers what Java splits into the Frame, Camera, HitResult, lighting, and other classes:

Anchor is included under the Trackable class:
