python opencv computer-vision camera-calibration homography

Which OpenCV function can be used to compute the BEV perspective transformation given point coordinates and the camera extrinsics/intrinsics?


I have the 3x3 intrinsics and 3x4 extrinsics matrices for my camera, obtained via cv2.calibrateCamera().

Now I want to use these parameters to compute the BEV (Bird's Eye View) transformation for any given coordinates in a frame obtained from the camera.

Which OpenCV function can be used to compute the BEV perspective transformation for given point coordinates and the camera extrinsics and/or intrinsics matrices?

I found something closely related in this post: https://deepnote.com/article/social-distancing-detector/ , which is based on https://www.pyimagesearch.com/2014/08/25/4-point-opencv-getperspective-transform-example/ .

They use cv2.getPerspectiveTransform() to get a 3x3 matrix, but I don't know whether this matrix represents the intrinsics, the extrinsics, or something else. They then transform the list of points with that matrix as follows:

#Assuming list_downoids is the list of points to be transformed and matrix is the one obtained above
list_points_to_detect = np.float32(list_downoids).reshape(-1, 1, 2)
transformed_points = cv2.perspectiveTransform(list_points_to_detect, matrix)
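
As a reference point for the snippet above: with a 3x3 matrix, cv2.perspectiveTransform() lifts each point to homogeneous coordinates, multiplies by the matrix, and divides by the third component. Here is a minimal NumPy sketch of that arithmetic. The helper name apply_homography and the example matrix are hypothetical, chosen only for illustration; they are not part of OpenCV or of the linked posts.

```python
import numpy as np

def apply_homography(points_xy, H):
    """Map N 2-D points through a 3x3 homography (hypothetical helper)."""
    pts = np.asarray(points_xy, dtype=np.float64).reshape(-1, 2)
    # Lift (x, y) to homogeneous coordinates (x, y, 1)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    # Apply the matrix to every point (row-vector form, hence the transpose)
    mapped = homogeneous @ np.asarray(H, dtype=np.float64).T
    # Divide by the third component to get back to 2-D
    return mapped[:, :2] / mapped[:, 2:3]

# Example matrix, made up for illustration: scale by 2, then shift by (10, 20)
H = np.array([[2.0, 0.0, 10.0],
              [0.0, 2.0, 20.0],
              [0.0, 0.0, 1.0]])
print(apply_homography([(1, 1), (3, 4)], H))
```

Such a matrix relates two planes (for example, the image and the ground), so on its own it is neither the intrinsics nor the extrinsics.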

I need to know whether I can use cv2.perspectiveTransform() to compute this transformation, or whether there is a better way to do it using the extrinsics, the intrinsics, or both, without having to reuse the frame, since I already have the detected/selected coordinates saved in an array.


Solution

  • After some investigation, I found a good solution:

    The projection matrix is the product of the intrinsic and the extrinsic camera matrices (intrinsics on the left).

    cv2.getPerspectiveTransform() estimates a 3x3 perspective transformation matrix from four point correspondences, which is useful when we don't have the camera parameters.

    cv2.warpPerspective() transforms the image itself.

    For the problem above we don't need these two functions, since we already have the extrinsics, the intrinsics, and the coordinates of the points in the image.

    Given the above, I wrote a function that transforms a point point_x_y into BEV, given the intrinsics and the extrinsics:

        import cv2
        import numpy as np

        def compute_point_perspective_transformation(intrinsics, extrinsics, point_x_y):
            """Auxiliary function to project a specific point to BEV

            Parameters
            ----------
            intrinsics (array)     : The 3x3 camera intrinsics matrix
            extrinsics (array)     : The 3x4 camera extrinsics matrix
            point_x_y (tuple[x,y]) : The coordinates of the point to be projected to BEV

            Returns
            ----------
            tuple[x,y] : the projection of the point
            """
            # Using the camera calibration for Bird Eye View
            # The intrinsics hold parameters such as the focal length and the principal point
            intrinsics_matrix = np.array(intrinsics, dtype='float32')

            # The extrinsics matrix stores the pose of the camera in global space:
            # the first 3 columns are the rotation matrix and the last is the translation vector
            extrinsics_matrix = np.array(extrinsics, dtype='float32')

            # Drop the 3rd column of the extrinsics: it multiplies the z coordinate,
            # which is 0 for points on the ground plane
            extrinsics_matrix = extrinsics_matrix[:, [0, 1, 3]]
            projection_matrix = np.matmul(intrinsics_matrix, extrinsics_matrix)

            # Compute the new coordinates of the point - cv2.perspectiveTransform
            # expects a 3-D array of shape (N, 1, 2)
            list_points_to_detect = np.array([[point_x_y]], dtype=np.float32)
            transformed_points = cv2.perspectiveTransform(list_points_to_detect, projection_matrix)
            return transformed_points[0][0][0], transformed_points[0][0][1]
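
One caveat on direction: the product of the intrinsics with columns 0, 1 and 3 of the extrinsics maps ground-plane coordinates (Z = 0) to pixels, so going from a detected pixel to the ground plane would use the inverse of that matrix. Below is a minimal NumPy sketch of that inverse mapping, not a definitive implementation. It assumes the extrinsics are a 3x4 [R | t] matrix in world-to-camera form and that the points are already undistorted. The function name image_to_ground and all calibration values are hypothetical.

```python
import numpy as np

def image_to_ground(intrinsics, extrinsics, points_xy):
    """Map pixel coordinates to ground-plane (Z = 0) coordinates.

    Assumes extrinsics is the 3x4 matrix [R | t] (world -> camera) and
    that lens distortion has already been removed from the points.
    """
    K = np.asarray(intrinsics, dtype=np.float64)
    Rt = np.asarray(extrinsics, dtype=np.float64)
    # Homography from the ground plane to the image: K @ [r1 r2 t]
    H = K @ Rt[:, [0, 1, 3]]
    # Its inverse maps image pixels back onto the ground plane
    H_inv = np.linalg.inv(H)
    pts = np.asarray(points_xy, dtype=np.float64).reshape(-1, 2)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    ground = homogeneous @ H_inv.T
    # Divide by the third component to get (X, Y) on the ground
    return ground[:, :2] / ground[:, 2:3]

# Made-up calibration values, for illustration only
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
# Camera 5 units above the ground, looking straight down
Rt = np.array([[1.0, 0.0, 0.0, 0.0],
               [0.0, -1.0, 0.0, 0.0],
               [0.0, 0.0, -1.0, 5.0]])

# Round trip: project a ground point to the image, then map it back
ground_point = np.array([1.0, -1.0, 0.0, 1.0])
u, v, w = K @ Rt @ ground_point
pixel = (u / w, v / w)
print(image_to_ground(K, Rt, [pixel]))
```

The round trip should recover the original ground point (1, -1) up to floating-point error, which is a quick sanity check for the direction of the mapping with real calibration data.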