python · deep-learning · kitti · odometry · visual-odometry

How to evaluate Monocular Visual Odometry results using the KITTI odometry dataset


I am trying to use the KITTI open dataset to do deep monocular visual odometry, and I tried to use this repo.

It converts each pose to 6DoF using this code:

import math
import numpy as np

def get6DoFPose(self, p):
    # p is one KITTI pose line: the 12 row-major entries of the 3x4 matrix [R | t]
    pos = np.array([p[3], p[7], p[11]])
    R = np.array([[p[0], p[1], p[2]], [p[4], p[5], p[6]], [p[8], p[9], p[10]]])
    angles = self.rotationMatrixToEulerAngles(R)
    return np.concatenate((pos, angles))

def isRotationMatrix(self, R):
    # a valid rotation matrix satisfies R^T R = I
    Rt = np.transpose(R)
    shouldBeIdentity = np.dot(Rt, R)
    I = np.identity(3, dtype=R.dtype)
    n = np.linalg.norm(I - shouldBeIdentity)
    return n < 1e-6

def rotationMatrixToEulerAngles(self, R):
    # x-y-z (roll, pitch, yaw) Euler angles, with a guard for the gimbal-lock case
    assert self.isRotationMatrix(R)
    sy = math.sqrt(R[0, 0] * R[0, 0] + R[1, 0] * R[1, 0])
    singular = sy < 1e-6

    if not singular:
        x = math.atan2(R[2, 1], R[2, 2])
        y = math.atan2(-R[2, 0], sy)
        z = math.atan2(R[1, 0], R[0, 0])
    else:
        x = math.atan2(-R[1, 2], R[1, 1])
        y = math.atan2(-R[2, 0], sy)
        z = 0
    return np.array([x, y, z], dtype=np.float32)
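For a quick sanity check, here is a standalone sketch of the same Euler extraction (the function name is mine, not from the repo) applied to a pure 90° yaw, which should give a z angle of about π/2:

```python
import math
import numpy as np

# standalone version of the x-y-z Euler extraction above, for a sanity check
def rotation_matrix_to_euler(R):
    sy = math.sqrt(R[0, 0] ** 2 + R[1, 0] ** 2)
    if sy > 1e-6:
        return np.array([math.atan2(R[2, 1], R[2, 2]),
                         math.atan2(-R[2, 0], sy),
                         math.atan2(R[1, 0], R[0, 0])])
    return np.array([math.atan2(-R[1, 2], R[1, 1]),
                     math.atan2(-R[2, 0], sy),
                     0.0])

# a pure 90-degree yaw (rotation about the z axis)
Rz = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])
angles = rotation_matrix_to_euler(Rz)
print(angles)  # z component should be ~1.5708 (pi/2)
```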

The model output is also in the same format (6DoF).

The question is how to evaluate the 6DoF results, since this evaluation tool (kitti-odom-eval) supports only the two formats below:

# First format: skipping frames are allowed
99 T00 T01 T02 T03 T10 T11 T12 T13 T20 T21 T22 T23 

# Second format: all poses should be included in the file
T00 T01 T02 T03 T10 T11 T12 T13 T20 T21 T22 T23 

Solution

  • Your model output is a relative pose: Euler angles for the rotation concatenated with the translation vector.

    For evaluation you must:

    1. Convert the rotation and translation into a 4x4 homogeneous matrix -> convert the Euler angles back to a 3x3 rotation matrix, concatenate that matrix with the 3x1 translation vector, and append an extra row [0, 0, 0, 1] at the bottom of the matrix to get the homogeneous matrix.

    2. Convert each relative pose 4x4 into an absolute pose -> with KITTI's camera-to-world convention, the absolute pose Tk is the product of the previous absolute pose Tk-1 and the relative pose Trel, where T is a homogeneous matrix and k is the frame index:

    Tk = Tk-1 @ Trel

    The first absolute pose depends on your dataset and the task you want to do. By default the base absolute pose is a 4x4 array with ones on the diagonal and zeros elsewhere (np.eye(4) in NumPy), so to convert all the relative poses in a sequence you chain the multiplications starting from that base pose:

    Tk5 = Tk4 @ Trel # where Trel is the relative pose between frames 4 and 5

    3. When step 2 is finished you will have the absolute poses up to Tn. Take the top 3x4 block of each homogeneous 4x4 matrix (drop the [0, 0, 0, 1] row), flatten it into a 12-element vector, and write each flattened pose as one line of the result file!

    KITTI is the most popular dataset in visual odometry; I think reading these topics and the referenced links can help you: Visual Odometry, Kitti Dataset
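The steps above can be sketched end to end as follows. This is a minimal sketch, not the repo's implementation: the helper names are mine, and the camera-to-world convention is assumed, so adapt it to your setup:

```python
import math
import numpy as np

def euler_to_rotation_matrix(angles):
    # inverse of the x-y-z Euler extraction: R = Rz @ Ry @ Rx
    x, y, z = angles
    Rx = np.array([[1, 0, 0],
                   [0, math.cos(x), -math.sin(x)],
                   [0, math.sin(x),  math.cos(x)]])
    Ry = np.array([[ math.cos(y), 0, math.sin(y)],
                   [0, 1, 0],
                   [-math.sin(y), 0, math.cos(y)]])
    Rz = np.array([[math.cos(z), -math.sin(z), 0],
                   [math.sin(z),  math.cos(z), 0],
                   [0, 0, 1]])
    return Rz @ Ry @ Rx

def to_homogeneous(pose6dof):
    # step 1: pose6dof = [tx, ty, tz, rx, ry, rz], as produced by get6DoFPose
    T = np.eye(4)
    T[:3, :3] = euler_to_rotation_matrix(pose6dof[3:])
    T[:3, 3] = pose6dof[:3]
    return T

def accumulate(relative_poses):
    # step 2: chain relative poses into absolute ones, starting from the identity
    T = np.eye(4)
    absolute = [T]
    for rel in relative_poses:
        T = T @ to_homogeneous(rel)  # camera-to-world convention
        absolute.append(T)
    return absolute

def write_kitti_poses(path, poses):
    # step 3: one line per frame, the 12 entries of the top 3x4 block
    with open(path, "w") as f:
        for T in poses:
            f.write(" ".join(f"{v:.6e}" for v in T[:3, :4].reshape(-1)) + "\n")

# e.g. three relative poses, each moving 1 m forward along z with no rotation
rels = [np.array([0.0, 0.0, 1.0, 0.0, 0.0, 0.0])] * 3
write_kitti_poses("pred_poses.txt", accumulate(rels))
```

The resulting file matches the second kitti-odom-eval format (every frame present, 12 values per line).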