Tags: python, opencv, stereoscopy

How to improve depth map and what are my stereo images lacking?


I've been trying to convert stereo images into a depth map using OpenCV, but no matter what I do, the result comes out unreadable.

I was able to get an accurate depth image from the example images provided in the OpenCV tutorial, but not from any other image. Even when I download other premade, calibrated stereo images from online, I get terrible results that are neither accurate nor anywhere close to the quality I get with the example images.

Here is the main Python script that I use to make the depth map:

import numpy as np
import cv2
from matplotlib import pyplot as plt
imgL = cv2.imread('calimg_L.png',0)
imgR = cv2.imread('calimg_R.png',0)
# imgL = cv2.imread('./images/example_L.png',0)
# imgR = cv2.imread('./images/example_R.png',0)
stereo = cv2.StereoSGBM_create(numDisparities=16, blockSize=15)
disparity = stereo.compute(imgR,imgL)
norm_image = cv2.normalize(disparity, None, alpha = 0, beta = 1, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_32F)
cv2.imwrite("disparityImage.jpg", norm_image)
plt.imshow(norm_image)
plt.show()

where calimg_L.png is a calibrated version of the original image.

Here is the code I use to calibrate my images:

import numpy as np
import cv2
import glob

from matplotlib import pyplot as plt
def createCalibratedImage(inputImage, outputName):
    # termination criteria
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)

    # prepare object points, like (0,0,0), (1,0,0), (2,0,0) ....,(6,5,0)
    objp = np.zeros((3*3,3), np.float32)
    objp[:,:2] = np.mgrid[0:3,0:3].T.reshape(-1,2)

    # Arrays to store object points and image points from all the images.
    objpoints = [] # 3d point in real world space
    imgpoints = [] # 2d points in image plane.
    # org = cv2.imread('./chess.jpg')
    # orig_cal_img = cv2.resize(org, (384, 288))
    # cv2.imwrite("cal_chess.jpg", orig_cal_img)

    images = glob.glob('./chess_webcam/*.jpg')

    for fname in images:
        print('file in use: ' + fname)
        img = cv2.imread(fname)
        gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

        # Find the chess board corners
        ret, corners = cv2.findChessboardCorners(gray, (3,3),None)

        # print("doing the thing");
        print('status: ' + str(ret));
        # If found, add object points, image points (after refining them)
        if ret == True:
            # print("found something");
            objpoints.append(objp)

            cv2.cornerSubPix(gray,corners,(11,11),(-1,-1),criteria)
            imgpoints.append(corners)

            # Draw and display the corners
            cv2.drawChessboardCorners(img, (3,3), corners,ret)
            cv2.imshow('img',img)
            cv2.waitKey(500)
            ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, gray.shape[::-1],None,None)
            img = inputImage
            h,  w = img.shape[:2]
            newcameramtx, roi=cv2.getOptimalNewCameraMatrix(mtx,dist,(w,h),1,(w,h))
    # undistort
    print('undistorting...')
    mapx,mapy = cv2.initUndistortRectifyMap(mtx,dist,None,newcameramtx,(w,h),5)
    dst = cv2.remap(inputImage ,mapx,mapy,cv2.INTER_LINEAR)
    # crop the image
    x,y,w,h = roi
    dst = dst[y:y+h, x:x+w]
    # cv2.imwrite('calibresult.png',dst)
    cv2.imwrite(outputName + '.png',dst)
    cv2.destroyAllWindows()


original_L = cv2.imread('capture_L.jpg')
original_R = cv2.imread('capture_R.jpg')
createCalibratedImage(original_R, "calimg_R")
createCalibratedImage(original_L, "calimg_L")
print("images calibrated and outputed")

This code was taken from the OpenCV tutorial on how to calibrate images. I gave it at least 16 images of the chessboard, but it was only able to identify the chessboard in about 4 or 5 of them. The reason I used such a relatively small 3x3 grid is that anything larger left me without any images to use for calibration, because the chessboard could not be found.

Here is what I get from an example image (sorry for the weird link, I couldn't find how to upload): https://ibb.co/DYMcdZc

Here are the originals: https://ibb.co/gMkqyXD https://ibb.co/YQZY40C

This behaves as it should, but when I use it with any other image it gives me a mess, for example:

output: https://ibb.co/kXwgDVn

It looks like just a mess of pixels. To be fair, when you use the 'gray' colormap in imshow it looks more readable, but it is still not very representative of the image's depth. Here are the originals: https://ibb.co/vqDKGS0 https://ibb.co/f0X1gMB

Even worse, when I take images myself and calibrate them with the chessboard code, the result is just a random mess of white and black pixels; some values are negative and some pixels have impossibly high values.

tl;dr: I can't get any stereo images turned into a depth map even though the example image works just fine. Why is that?


Solution

  • First, obtaining a good depth map is not a simple task, and basic stereo matching won't always give good results. Nevertheless, something better can be achieved.

    In order:

    • Calibration: you should be able to find the checkerboard in more images. Four or five is very few for calibration, and it is very hard to estimate the camera parameters correctly from so few views. What do the images look like? Did you read them as grayscale images? It usually also helps to use a different number of rows and columns (for example a 4x3 grid instead of 3x3) so that the checkerboard orientation is unambiguous. With a square grid it can be unclear which side is up or right; for example, a 90-degree rotation would be indistinguishable from no rotation.
    • Rectification: this can be checked easily by looking at the images. Open the two images on two different layers (using GIMP or similar) and look for corresponding points. After rectification, corresponding points should lie on the same image row. Do they? If yes, rectification works; otherwise, you need a better calibration. Stereo matching won't work without this step.
    • Stereo matching: if all the steps above are correct, the problem may be in the stereo matching parameters. The first thing to check is the disparity range (since your images appear to have a different resolution from the example images, you should adapt that value). Setting the minimum disparity can also help (a narrower disparity range leaves less room for error), as can the block size (15 is quite large; a smaller value is usually enough).

    From what you say, my guess is that the problem is in the calibration. Try checking the rectified images, and if the problem is there, acquire a new dataset (or find a better one online) and calibrate with that. Once you can calibrate and rectify your images correctly, you should get better results. The code looks similar to the OpenCV tutorial, so I guess it is correct and the main problem is the images. Hope this helps; I can help more if you test and see where the problem is!