ios 3d camera arkit

ARKit - Why is the camera's viewMatrix position changing when the device is rotated?


When I rotate my test device about a certain axis, the x-axis position value of the camera's viewMatrix (column 3, row 1) changes significantly: on the order of a meter of translation along the x-axis when only the device's rotation changes by 180 degrees. If I rotate a full 360 degrees, the x position returns to its starting 0-degree value. I am not translating the device at all (ignoring minor human error).

Is this a bug or configuration setup issue? Can someone explain why the x-axis position would change when only rotating the device? Is anyone else seeing this?

Here is the basic code setup:

@property (nonatomic, strong) ARWorldTrackingConfiguration *arSessionConfiguration;
@property (nonatomic, strong) ARSession *arSession;

- (void)setup
{
  self.arSessionConfiguration = [ARWorldTrackingConfiguration new];
  self.arSessionConfiguration.worldAlignment = ARWorldAlignmentGravity;
  self.arSessionConfiguration.planeDetection = ARPlaneDetectionHorizontal;
  self.arSession = [ARSession new];
  self.arSession.delegate = self;
  [self.arSession runWithConfiguration:self.arSessionConfiguration];
}

- (void)session:(ARSession *)session didUpdateFrame:(ARFrame *)frame
{
  matrix_float4x4 viewMatrix = [frame.camera viewMatrixForOrientation:UIInterfaceOrientationPortrait];
  NSLog(@"%0.2f, %0.2f, %0.2f", viewMatrix.columns[3][0], viewMatrix.columns[3][1], viewMatrix.columns[3][2]);
}

I am testing on a 10.5" iPad Pro running the latest iOS 11 beta.


Solution

  • This is due to a misunderstanding: the 4th column of the view matrix is not the camera's position.

    This is because the view matrix is the inverse of the camera's transformation matrix, i.e. multiplying it by a world point transforms that point to the local basis of the camera.

    For a camera with rotation matrix R (3x3) and position c, multiplying its view matrix V (4x4) with a point p is equivalent to:

    $$V \begin{bmatrix} p \\ 1 \end{bmatrix} = \begin{bmatrix} R^{T}(p - c) \\ 1 \end{bmatrix}$$

    We can deduce that the view matrix has the following construction:

    $$V = \begin{bmatrix} R^{T} & -R^{T}c \\ 0 & 1 \end{bmatrix}$$

    Therefore, to obtain the actual position of the camera, we must multiply the 4th column by the negated transpose of the top-left 3x3 sub-matrix of the view matrix (for a rotation matrix, the transpose is also the inverse).
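
    To spell out that step, here is a short derivation. The symbols M (the top-left 3x3 block of V) and t (the upper three entries of its 4th column) are introduced here for illustration only, and R is assumed to be a pure rotation, so that its transpose is its inverse:

    ```latex
    \begin{aligned}
    M &= R^{T}, \qquad t = -R^{T}c \\
    R\,t &= -R\,R^{T}c = -c \\
    c &= -R\,t = -M^{T}t
    \end{aligned}
    ```

    Since row i of M^T is column i of M, each component of c is minus the dot product of one of the first three columns of V with t.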

    I.e., something like:

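    // Top-left 3x3 sub-matrix: columns 0..2, rows 0..2 (needs <simd/simd.h>).
    matrix_float3x3 topLeftSubMtx = simd_matrix(viewMatrix.columns[0].xyz,
                                                viewMatrix.columns[1].xyz,
                                                viewMatrix.columns[2].xyz);
    vector_float3 rightColumn = viewMatrix.columns[3].xyz;
    // Row 0 of the transpose is column 0 of the sub-matrix, hence the dot product.
    float positionX = -simd_dot(topLeftSubMtx.columns[0], rightColumn);
    // similarly for Y and Z, using columns[1] and columns[2]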
    

    (Apologies, as I don't know Objective-C or ARKit specifics; hopefully you get the gist of this pseudo-ish code.)