swift, augmented-reality, scenekit, arkit, realitykit

Camera Intrinsics Resolution vs Real Screen Resolution


I am writing an ARKit app where I need to use camera poses and intrinsics for 3D reconstruction.

The camera intrinsics matrix returned by ARKit seems to use a different image resolution than the mobile screen resolution. Below is one example of this issue:

The intrinsics matrix returned by ARKit is:

[[1569.249512, 0, 931.3638306],[0, 1569.249512, 723.3305664],[0, 0, 1]]

whereas the input image resolution is 750 (width) x 1182 (height). In this case, the principal point appears to lie outside the image, which cannot be right; it should ideally be close to the image center. So the intrinsics matrix above might be using an image resolution of 1920 (width) x 1440 (height), which is completely different from the original image resolution.

The questions are:

  • Do the returned camera intrinsics belong to the 1920x1440 image resolution?
  • If so, how can I get the intrinsics matrix representing the original image resolution, i.e. 750x1182?

Solution

  • Intrinsics 3x3 matrix

    The intrinsic camera matrix maps 3D points in the camera's coordinate space onto the 2D image plane. Here's a decomposition of an intrinsic matrix, where:

    [fx,  s, xO]
    [ 0, fy, yO]
    [ 0,  0,  1]

    • fx and fy are the focal length in pixels
    • xO and yO are the principal point offset in pixels
    • s is the axis skew

    According to Apple Documentation:

    The values fx and fy are the pixel focal length, and are identical for square pixels. The values ox and oy are the offsets of the principal point from the top-left corner of the image frame. All values are expressed in pixels.

    So let's examine your data:

    [1569,     0,    931]
    [   0,  1569,    723]
    [   0,     0,      1] 
    
    • fx=1569, fy=1569
    • xO=931, yO=723
    • s=0
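
    If you need those components in code, they can be read straight from the columns of the intrinsics matrix. Here's a minimal sketch, assuming camera is the ARCamera of the session's current frame; note that simd_float3x3 is stored column-major, so the principal point lives in the third column:

    let K: simd_float3x3 = camera.intrinsics

    let fx = K.columns.0.x    // focal length along x, in pixels
    let fy = K.columns.1.y    // focal length along y, in pixels
    let xO = K.columns.2.x    // principal point offset x, in pixels
    let yO = K.columns.2.y    // principal point offset y, in pixels
    let s  = K.columns.1.x    // axis skew (0 here)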

    To convert a known focal length in pixels to mm, use the following expression:

    F(mm) = F(pixels) * SensorWidth(mm) / ImageWidth(pixels)
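
    As a quick worked example of that formula in Swift (the sensor width below is an assumed placeholder, not a value ARKit provides; look up the real sensor width of your device):

    let focalLengthPixels: Float = 1569.0   // fx from the intrinsics above
    let sensorWidthMM: Float = 4.8          // assumption for illustration only
    let imageWidthPixels: Float = 1920.0    // width of the captured image

    let focalLengthMM = focalLengthPixels * sensorWidthMM / imageWidthPixels
    print(focalLengthMM)                    // ≈ 3.9 mm with these placeholder numbers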
    


    Points Resolution vs Pixels Resolution

    Look at this post to find out what a point resolution and a pixel resolution are.

    Let's explore what is what using iPhone X data.

    @IBOutlet var arView: ARSCNView!

    // Called e.g. from viewDidAppear(_:); the delay gives the session time to deliver its first frame.
    DispatchQueue.main.asyncAfter(deadline: .now() + 1.0) {

        guard let camera = self.arView.session.currentFrame?.camera else { return }

        let imageRez = camera.imageResolution          // captured image size in pixels
        let intrinsics = camera.intrinsics             // 3x3 intrinsics matrix for that image
        let viewportSize = self.arView.frame.size      // view size in points
        let screenSize = self.arView.snapshot().size   // snapshot size in pixels

        print(imageRez)
        print(intrinsics)
        print(viewportSize)
        print(screenSize)
    }
    

    Apple Documentation:

    The imageResolution instance property describes the image in the capturedImage buffer, which contains image data in the camera device's native sensor orientation. To convert image coordinates to match a specific display orientation of that image, use the viewMatrix(for:) or projectPoint(_:orientation:viewportSize:) method.
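
    For example, here's a sketch of projecting a world-space point into portrait screen coordinates with the projectPoint(_:orientation:viewportSize:) method mentioned above, assuming frame is the session's currentFrame and viewportSize is the view's size in points:

    let worldPoint = simd_float3(0.0, 0.0, -0.5)   // hypothetical point 0.5 m in front of the world origin

    let screenPoint = frame.camera.projectPoint(worldPoint,
                                                orientation: .portrait,
                                                viewportSize: viewportSize)
    print(screenPoint)                             // CGPoint in the view's point coordinate system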

    iPhone X imageRez (aspect ratio is 4:3). These are the dimensions of the capturedImage buffer in the camera sensor's native orientation:

    (1920.0, 1440.0)
    

    iPhone X intrinsics:

    simd_float3x3([[1665.0, 0.0, 0.0],     // first column
                   [0.0, 1665.0, 0.0],     // second column
                   [963.8, 718.3, 1.0]])   // third column
    

    iPhone X viewportSize in points (the display scale factor is 3x, so screenSize has 9x as many pixels):

    (375.0, 812.0)
    

    iPhone X screenSize in pixels (the resolution declared in the tech specs):

    (1125.0, 2436.0)
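
    The two sizes differ only by the display scale factor. A small check, using UIScreen.main.scale (3.0 on iPhone X):

    let pointSize = CGSize(width: 375.0, height: 812.0)
    let scale = UIScreen.main.scale                 // 3.0 on iPhone X

    let pixelSize = CGSize(width: pointSize.width * scale,
                           height: pointSize.height * scale)
    print(pixelSize)                                // (1125.0, 2436.0)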
    

    Note that RealityKit's ARView has no snapshot() method.
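
    Regarding the second question above: the intrinsics describe the captured image at imageResolution, not the view. If the image you reconstruct from is just a uniformly resized copy of the captured frame (no rotation, no cropping), the intrinsics scale linearly with the resize factor; here's a sketch of that assumption. A 750x1182 portrait view additionally involves rotating and cropping the captured frame to the view's aspect ratio, which this sketch does not handle.

    import simd
    import CoreGraphics

    // Sketch: rescale intrinsics for a uniformly resized image (an assumption, not an ARKit API).
    func scaledIntrinsics(_ K: simd_float3x3, from captured: CGSize, to target: CGSize) -> simd_float3x3 {
        let sx = Float(target.width / captured.width)
        let sy = Float(target.height / captured.height)
        var scaled = K
        scaled.columns.0.x *= sx    // fx
        scaled.columns.1.y *= sy    // fy
        scaled.columns.2.x *= sx    // xO
        scaled.columns.2.y *= sy    // yO
        return scaled
    }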

    For more info, read this SO post.