Search code examples
pythonopencvcomputer-visionaugmented-realityhomography

Calculate camera world position with OpenCV Python


I want to calculate my camera's position in world coordinates. This should be fairly easy, but I don't get the results I expect. I believe I've read everything on this topic, but my code isn't working. Here's what I do:

I have a camera looking at an area.

1) I drew a map of the area.

2) I calculated the homography by matching 4 image points to 4 points on my map using cv2.getPerspectiveTransform

3) The H homography transforms every world coordinate to camera coordinate; this is working properly

4) To calculate the camera matrix I followed this:

translation = np.zeros((3,1)) 
translation[:,0] = homography[:,2]

rotation = np.zeros((3,3))
rotation[:,0] = homography[:,0]
rotation[:,1] = homography[:,1]
rotation[:,2] = np.cross(homography[0:3,0],homography[0:3,1])

cameraMatrix = np.zeros((3,4))
cameraMatrix[:,0:3] = rotation
cameraMatrix[:,3] = homography[:,2]

cameraMatrix = cameraMatrix/cameraMatrix[2][3] #normalize the matrix

5) According to this, the camera's position should be calculated like this:

x,y,z = np.dot(-np.transpose(rotation),translation)

The coordinates I'm getting are totally wrong. The problem should be somewhere in step 4 or 5 I guess. What's wrong with my method?


Solution

  • I think I've got it now. The problem was with the method described in step 4. The camera position cannot be calculated from the homography matrix alone. The camera intrinsics matrix is also necessary. So, the correct procedure is the following:

    1) draw a map of the area

    2) calibrate the camera using the chessboard image with cv2.findChessboardCorners this yields the camera matrix and the distortion coefficients

    3) solvePnP with the world coordinates (3D) and image coordinates (2D). The solvePnP returns the object's origo in the camera's coordinate system given the 4 corresponding points and the camera matrix.

    4) Now I need to calculate the camera's position in world coordinates. The rotation matrix is: rotM = cv2.Rodrigues(rvec)[0]

    5) The x,y,z position of the camera is: cameraPosition = -np.matrix(rotM).T * np.matrix(tvec)