python opencv computer-vision coordinates stereo-3d

Calculating real world co-ordinates using stereo images in Python and OpenCV


I'm trying to calculate the real-world coordinates of an object in a scene from a pair of stereo images. The images are simulations of perfect pinhole cameras, so there is no distortion to correct and there is no rotation. I know OpenCV has a bunch of functions to calibrate stereo cameras and create disparity maps, but if all I want is the coordinates of a single point, is there a simple way to do that?


Solution

    1) Case of no rotation, only translation parallel to the horizontal axis of the image plane, cameras with equal focal lengths:

    Let "f" be the common focal length, expressed in pixels. Let "b" be the baseline of the stereo pair, namely the distance between the cameras' optical centers. Given a 3D point P, visible in both cameras at horizontal image coordinates x_left and x_right (in pixels), let "d" be their disparity, namely the difference d = x_left - x_right.

    By elementary geometry it then follows that the depth z_left of P in the left camera coordinates is:

    z_left = b * f / d.

    2) Any other case (unequal focal lengths, differences in other intrinsic parameters, non-linear lens distortion, inter-camera rotation, translation not parallel to the x axis, etc.):

    Don't bother, use OpenCV.
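
For case 1, the depth formula can be sketched in a few lines of plain Python. This is only an illustration under the assumptions stated above (no rotation, horizontal translation only, equal focal lengths). The function name `triangulate_parallel` is hypothetical, and the principal point (`cx`, `cy`) and the back-projection of X and Y are additions based on the standard pinhole model, not part of the formula above. It also assumes the point has the same vertical coordinate `y` in both images and that the right camera sits at +b along the left camera's x axis.

```python
def triangulate_parallel(x_left, x_right, y, f, b, cx=0.0, cy=0.0):
    """Return (X, Y, Z) of a point in left-camera coordinates.

    Assumes an ideal parallel stereo pair: no rotation, translation only
    along the image x axis, equal focal lengths.
    f, x_left, x_right, y, cx, cy are in pixels; b is in world units,
    and the result is in the same units as b.
    (cx, cy) is the principal point, an assumption of this sketch.
    """
    d = x_left - x_right  # disparity, in pixels
    if d <= 0:
        # Zero disparity means a point at infinity; negative disparity
        # means the correspondence or the left/right order is wrong.
        raise ValueError("disparity must be positive")
    z = b * f / d                  # depth: z_left = b * f / d
    x = (x_left - cx) * z / f      # pinhole back-projection (assumed model)
    y_world = (y - cy) * z / f
    return x, y_world, z


# Made-up example values: f = 800 px, b = 0.5 m, principal point (320, 240).
point = triangulate_parallel(420.0, 380.0, 260.0, f=800.0, b=0.5, cx=320.0, cy=240.0)
print(point)  # (1.25, 0.25, 10.0)
```

If image coordinates are already measured from the principal point, leave `cx` and `cy` at their defaults.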