Tags: projection, camera-calibration, lidar

Kitti Velodyne point to pixel coordinate


Given a Velodyne point, how do I get the pixel coordinates in each camera image?

Using pykitti, `point_cam0 = data.calib.T_cam0_velo.dot(point_velo)` gives the point in the camera-0 frame. From there, we can get the projection onto the image with equation 7 of the KITTI dataset paper:

y = P_rect(i) * R_rect(0) * T_velo_cam * x

But from there, how do I get the actual pixel coordinates on each image?


Solution

  • Update: PyKitti version 0.2.1 exposes projection matrices for all cameras.

    I recently faced the same problem. For me, the problem was that pykitti didn't expose the Prect and Rrect matrices for all cameras.

    For pykitti >= 0.2.1, use the Prect and Rrect matrices from the calibration data.

    For previous versions, you have two options:

    1. Enter the matrices by hand (data is in the .xml calibration file for each sequence).
    2. Use this fork of pykitti: https://github.com/Mi-lo/pykitti/
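
    For option 1, entering the matrices by hand might look like the sketch below. The numbers are placeholders, not real calibration values; copy yours from the sequence's calibration file. One gotcha: Rrect is given as a 3x3 rotation, so it has to be padded to 4x4 before it can be chained with the 4x4 rigid transform in equation 7.

    ```python
    import numpy as np

    # Placeholder numbers only -- replace with the real entries from
    # your sequence's calibration file.
    Prect = np.array([[700.0,   0.0, 600.0, 0.0],
                      [  0.0, 700.0, 180.0, 0.0],
                      [  0.0,   0.0,   1.0, 0.0]])

    # Rrect comes as a 3x3 rotation; embed it in a 4x4 identity so it
    # can be multiplied with the 4x4 T_cam_velo.
    Rrect_3x3 = np.eye(3)  # placeholder rotation
    Rrect = np.eye(4)
    Rrect[:3, :3] = Rrect_3x3
    ```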

    Then, you can use equation 7 to project a Velodyne point into an image. Note that:

    • You will need the 3D points as a 4xN array in homogeneous coordinates. Points returned by pykitti are an Nx4 numpy array, with the reflectance in the 4th column. You can prepare the points with the prepare_velo_points function below, which keeps only points with reflectance > 0, then replaces the reflectance values with 1 to get homogeneous coordinates.

    • The Velodyne scans 360°. Equation 7 will give you a result even for points that are behind the camera (they get projected as if they were in front, but vertically mirrored). To avoid this, you should project only points that are in front of the camera. For this, you can use the function project_velo_points_in_img below. It returns 2D points in homogeneous coordinates, so you should discard the 3rd row (it is all ones after normalization).

    Here are the functions I used:

    def prepare_velo_points(pts3d_raw):
        '''Replaces the reflectance value by 1, and transposes the array,
           so points can be directly multiplied by the camera projection
           matrix.'''
    
        # Keep only points with reflectance > 0.
        pts3d = pts3d_raw[pts3d_raw[:, 3] > 0, :]
        # Replace reflectance with 1 to get homogeneous coordinates.
        pts3d[:, 3] = 1
        return pts3d.transpose()
    
    def project_velo_points_in_img(pts3d, T_cam_velo, Rrect, Prect):
        '''Project 3D points into a 2D image. Expects pts3d as a 4xN
           numpy array. Returns the 2D projection of only the points
           that are in front of the camera, and the corresponding
           3D points.'''
    
        # 3D points in the camera reference frame.
        pts3d_cam = Rrect.dot(T_cam_velo.dot(pts3d))
    
        # Before projecting, keep only points with z > 0
        # (points that are in front of the camera).
        idx = pts3d_cam[2, :] > 0
        pts2d_cam = Prect.dot(pts3d_cam[:, idx])
    
        return pts3d[:, idx], pts2d_cam / pts2d_cam[2, :]
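
    As a quick numeric sanity check of equation 7, here is a tiny synthetic example (identity transforms and a made-up pinhole projection matrix, not real KITTI calibration):

    ```python
    import numpy as np

    # Synthetic stand-ins for the calibration matrices (NOT real KITTI
    # values): identity transforms and a pinhole projection with focal
    # length 700 and principal point (600, 180).
    T_cam_velo = np.eye(4)
    Rrect = np.eye(4)
    Prect = np.array([[700.0,   0.0, 600.0, 0.0],
                      [  0.0, 700.0, 180.0, 0.0],
                      [  0.0,   0.0,   1.0, 0.0]])

    # One homogeneous point 10 m along the camera's optical axis.
    x = np.array([1.0, 2.0, 10.0, 1.0])

    # Equation 7: y = Prect . Rrect . T_cam_velo . x
    y = Prect.dot(Rrect).dot(T_cam_velo).dot(x)

    # Divide by the homogeneous coordinate to get pixel coordinates.
    u, v = y[0] / y[2], y[1] / y[2]
    print(u, v)  # 670.0 320.0
    ```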
    

    Hope this helps!