Tags: computer-vision, triangulation, depth, 3d-reconstruction, disparity-mapping

How to compute the true depth given a disparity map of rectified images?


I have calculated a disparity map for a given rectified stereo pair. I can calculate the depth using the formula

z = (baseline * focal) / (disparity * p)

Let's assume that the baseline, the focal length and the pixel constant p are known, and that I used the same camera for both images. My disparity may lie in the range -32..128 [pixels]. With the above formula I get a division by zero (infinite depth) wherever the disparity is 0. If I shift my disparity values to, let's say, 1..161, I have chosen the range arbitrarily, and that is a problem: the function 1/disparity gives a completely different, non-linear value spacing for 1..161 than for 100..260. So I wouldn't even get a reconstruction up to a (linear) scale, because the scale change is non-linear.
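To make the concern concrete, here is a minimal sketch that applies the formula to the same three disparity measurements after two different arbitrary shifts. The helper names `depth` and `depth_ratios` and all numeric values (baseline, focal length, p, the sample disparities) are assumptions chosen purely for illustration.

```python
def depth(disparity, baseline=0.1, focal=0.004, p=0.000005):
    """z = (baseline * focal) / (disparity * p). Default values are made up."""
    return (baseline * focal) / (disparity * p)


def depth_ratios(disparities, offset):
    """Shift the disparities by `offset`, compute depths, and normalise
    them by the first depth so that any global scale factor cancels."""
    depths = [depth(d + offset) for d in disparities]
    return [z / depths[0] for z in depths]


raw = [0, 80, 160]                        # the same three measurements
ratios_a = depth_ratios(raw, offset=1)    # shifted range starts at 1
ratios_b = depth_ratios(raw, offset=100)  # shifted range starts at 100

print(ratios_a)
print(ratios_b)
```

If the two shifts differed only by a global scale, the normalised ratios would be identical. They are not: the second ratio is 1/81 for the first shift and 100/180 for the second, so the choice of offset changes the shape of the reconstruction and not just its size.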

How can I determine in which range my disparity has to lie to get a metric reconstruction with the above formula? Or is it simply not possible to reconstruct something metrically with the above formula and rectified images? And if that's the case, why?

(I know I can reproject to my non-rectified images and do a triangulation but I want to know especially WHY or IF it is not possible with the above formula. Thanks to anyone who can help me!)


Solution

  • I did some more research and think I can now answer my own question. I think we talked past each other a bit in the comments. Maybe it is now clearer what exactly I meant.

    Parallel Setup: The formula z = (baseline * focal) / (disparity * p) can only be used if the images were captured by a parallel camera setup. If the cameras are truly parallel, negative and positive disparities cannot both occur: all disparities have the same sign, and a disparity of 0 corresponds only to a point at infinity. With a truly parallel setup, the formula can be used for a metric reconstruction.

    Converged Setup: In practice, images are mostly captured by a converged camera setup. That means the stereo pair has a point of convergence with a disparity value of 0. Disparities in front of that point and behind it have opposite signs, so your disparity map contains negative values, positive values, and zero at the point of convergence. Although your images are rectified, you cannot use the above formula, because the images were captured by a converged stereo camera setup. Shifting the disparities so that they are all positive does not make the formula valid. The result computed from shifted values will be somewhat similar to the correct 3-D reconstruction, but scaled and distorted by an unknown transformation.
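As a rough illustration of the distinction above, the following sketch applies the formula only when all non-zero disparities share one sign, as the answer expects for a parallel setup, and refuses mixed-sign input instead of shifting it. The function name `depth_from_disparity`, the error handling, the use of the disparity magnitude, and the numeric values are assumptions made for this example. p is assumed to be the pixel size in the same length unit as the focal length.

```python
import math


def depth_from_disparity(disparities, baseline, focal, p):
    """Apply z = (baseline * focal) / (disparity * p) to a list of disparities.

    Assumes a parallel setup: all non-zero disparities must have the same sign.
    A disparity of 0 is mapped to infinite depth (a point at infinity).
    """
    has_negative = any(d < 0 for d in disparities)
    has_positive = any(d > 0 for d in disparities)
    if has_negative and has_positive:
        # Following the reasoning above, mixed signs suggest a converged setup,
        # for which this formula does not give a metric result.
        raise ValueError("mixed-sign disparities: formula not applicable")
    # The magnitude is used so that either sign convention works (assumption).
    return [
        math.inf if d == 0 else (baseline * focal) / (abs(d) * p)
        for d in disparities
    ]


# Made-up values: 0.12 m baseline, 4 mm focal length, 4 micrometre pixels.
depths = depth_from_disparity([0, 10, 40], baseline=0.12, focal=0.004, p=0.000004)
print(depths)
```

Calling the same function with a range such as -32..128 raises the error rather than returning depths, which mirrors the point that an arbitrary shift cannot repair the input.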