python · deep-learning · kitti · odometry · visual-odometry

How to evaluate Monocular Visual Odometry results using the KITTI odometry dataset


I am trying to use the KITTI open dataset to do deep monocular visual odometry, and I tried to use this repo.

It converts each pose to 6DoF using this code:

import math
import numpy as np

def get6DoFPose(self, p):
    # p is one KITTI pose line: the 12 row-major entries of the 3x4 matrix [R | t]
    pos = np.array([p[3], p[7], p[11]])
    R = np.array([[p[0], p[1], p[2]], [p[4], p[5], p[6]], [p[8], p[9], p[10]]])
    angles = self.rotationMatrixToEulerAngles(R)
    return np.concatenate((pos, angles))

def isRotationMatrix(self, R):
    # a valid rotation matrix satisfies R^T R = I
    Rt = np.transpose(R)
    shouldBeIdentity = np.dot(Rt, R)
    I = np.identity(3, dtype=R.dtype)
    n = np.linalg.norm(I - shouldBeIdentity)
    return n < 1e-6

def rotationMatrixToEulerAngles(self, R):
    # x-y-z (roll, pitch, yaw) Euler angles, with a guard for the gimbal-lock case
    assert self.isRotationMatrix(R)
    sy = math.sqrt(R[0, 0] * R[0, 0] + R[1, 0] * R[1, 0])
    singular = sy < 1e-6

    if not singular:
        x = math.atan2(R[2, 1], R[2, 2])
        y = math.atan2(-R[2, 0], sy)
        z = math.atan2(R[1, 0], R[0, 0])
    else:
        x = math.atan2(-R[1, 2], R[1, 1])
        y = math.atan2(-R[2, 0], sy)
        z = 0
    return np.array([x, y, z], dtype=np.float32)
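For a quick sanity check, here is a standalone sketch of the same Euler extraction (the function name is mine, not from the repo) applied to a pure 90° yaw, which should give a z angle of about π/2:

```python
import math
import numpy as np

# standalone version of the x-y-z Euler extraction above, for a sanity check
def rotation_matrix_to_euler(R):
    sy = math.sqrt(R[0, 0] ** 2 + R[1, 0] ** 2)
    if sy > 1e-6:
        return np.array([math.atan2(R[2, 1], R[2, 2]),
                         math.atan2(-R[2, 0], sy),
                         math.atan2(R[1, 0], R[0, 0])])
    return np.array([math.atan2(-R[1, 2], R[1, 1]),
                     math.atan2(-R[2, 0], sy),
                     0.0])

# a pure 90-degree yaw (rotation about the z axis)
Rz = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])
angles = rotation_matrix_to_euler(Rz)
print(angles)  # z component should be ~1.5708 (pi/2)
```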

The model output is also in the same format (6DoF).

The question is how to evaluate the 6DoF results, since this evaluation tool (kitti-odom-eval) supports only the two formats below:

# First format: skipping frames are allowed
99 T00 T01 T02 T03 T10 T11 T12 T13 T20 T21 T22 T23 

# Second format: all poses should be included in the file
T00 T01 T02 T03 T10 T11 T12 T13 T20 T21 T22 T23 

Solution

  • Your model output is a relative pose: Euler angles for the rotation concatenated with the translation vector.

    For evaluation you must:

    1. Convert the rotation and translation into a 4x4 homogeneous matrix -> convert the Euler angles back to a 3x3 rotation matrix, concatenate that matrix with the 3x1 translation vector, and append an extra row [0, 0, 0, 1] at the bottom of the matrix to get the homogeneous matrix.

    2. Convert each relative pose 4x4 into an absolute pose -> with KITTI's camera-to-world convention, the absolute pose Tk is the product of the previous absolute pose Tk-1 and the relative pose Trel, where T is a homogeneous matrix and k is the frame index:

    Tk = Tk-1 @ Trel

    The first absolute pose depends on your dataset and the task you want to do. By default the base absolute pose is a 4x4 array with ones on the diagonal and zeros elsewhere (np.eye(4) in NumPy), so to convert all the relative poses in a sequence you chain the multiplications starting from that base pose:

    Tk5 = Tk4 @ Trel # where Trel is the relative pose between frames 4 and 5

    3. When step 2 is finished you will have the absolute poses up to Tn. Take the top 3x4 block of each homogeneous 4x4 matrix (drop the [0, 0, 0, 1] row), flatten it into a 12-element vector, and write each flattened pose as one line of the result file!

    KITTI is the most popular dataset in visual odometry; I think reading these topics and the referenced links can help you: Visual Odometry, Kitti Dataset
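The steps above can be sketched end to end as follows. This is a minimal sketch, not the repo's implementation: the helper names are mine, and the camera-to-world convention is assumed, so adapt it to your setup:

```python
import math
import numpy as np

def euler_to_rotation_matrix(angles):
    # inverse of the x-y-z Euler extraction: R = Rz @ Ry @ Rx
    x, y, z = angles
    Rx = np.array([[1, 0, 0],
                   [0, math.cos(x), -math.sin(x)],
                   [0, math.sin(x),  math.cos(x)]])
    Ry = np.array([[ math.cos(y), 0, math.sin(y)],
                   [0, 1, 0],
                   [-math.sin(y), 0, math.cos(y)]])
    Rz = np.array([[math.cos(z), -math.sin(z), 0],
                   [math.sin(z),  math.cos(z), 0],
                   [0, 0, 1]])
    return Rz @ Ry @ Rx

def to_homogeneous(pose6dof):
    # step 1: pose6dof = [tx, ty, tz, rx, ry, rz], as produced by get6DoFPose
    T = np.eye(4)
    T[:3, :3] = euler_to_rotation_matrix(pose6dof[3:])
    T[:3, 3] = pose6dof[:3]
    return T

def accumulate(relative_poses):
    # step 2: chain relative poses into absolute ones, starting from the identity
    T = np.eye(4)
    absolute = [T]
    for rel in relative_poses:
        T = T @ to_homogeneous(rel)  # camera-to-world convention
        absolute.append(T)
    return absolute

def write_kitti_poses(path, poses):
    # step 3: one line per frame, the 12 entries of the top 3x4 block
    with open(path, "w") as f:
        for T in poses:
            f.write(" ".join(f"{v:.6e}" for v in T[:3, :4].reshape(-1)) + "\n")

# e.g. three relative poses, each moving 1 m forward along z with no rotation
rels = [np.array([0.0, 0.0, 1.0, 0.0, 0.0, 0.0])] * 3
write_kitti_poses("pred_poses.txt", accumulate(rels))
```

The resulting file matches the second kitti-odom-eval format (every frame present, 12 values per line).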