android opencv 3d-reconstruction

PointCloud from two undistorted images


I want to do some Structure from Motion using OpenCV on Android. I currently have the cameraMatrix (intrinsic parameters) and the distortion coefficients from the camera calibration.

The user should take two images of a building, and the app should generate a point cloud from them. Note: the user may also rotate the smartphone's camera slightly while moving along one side of the building.

At this point, I have the following information:

  • the undistorted left image
  • the undistorted right image
  • a list of good matches using SIFT
  • the homography matrix
  • the fundamental matrix

I've searched the internet and am now very confused about how to proceed. Some say I need to use stereoRectify to get Q, and then use Q with reprojectImageTo3D() to get the point cloud.

Others say that I need to use stereoRectifyUncalibrated and use H1 and H2 from this method to fill in the parameters of triangulatePoints. However, triangulatePoints needs the projection matrix of each camera/image, so from my understanding this seems definitely wrong.

So I have the following questions:

  • How do I get R and T (rotation and translation) from all the information I already have?
  • If I use stereoRectify, the first four parameters are cameraMatrix1, distortionCoeff1, cameraMatrix2 and distortionCoeff2. If I do not have a stereo camera like the Kinect, are cameraMatrix1 and cameraMatrix2 equal for my setup (a mono camera on a smartphone)?
  • How can I obtain Q? (I guess that if I have R and T, I can get it from stereoRectify.)
  • Is there another way of getting the projection matrices for each camera so that I can use the triangulation method provided by OpenCV?

I know these are a lot of questions, but googling confused me, so I need to get this straight. I hope someone can help me with my problems.

Thanks

PS: As these are more theoretical questions, I did not post any code. If you want or need to see code or the values of my camera calibration, just ask and I will add them to my post.


Solution

  • I wrote something about using Farneback's optical flow for Structure from Motion before. You can read the details here.

    But here's the code snippet. It's a somewhat working, but not great, implementation. I hope you can use it as a reference.

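    /* Estimate the fundamental matrix from the matched points (pixel coordinates),
       then convert it to the essential matrix with the intrinsics: E = K^T * F * K.
       cam_matrix must be CV_64F so that it matches the fundamental matrix type. */
    Mat fundamental = findFundamentalMat( left_points, right_points, FM_RANSAC, 0.2, 0.99 );
    Mat essential   = cam_matrix.t() * fundamental * cam_matrix;
    
    /* Decompose the essential matrix to get the relative pose of the second camera */
    SVD svd( essential );
    static const Mat W = (Mat_<double>(3, 3) <<
                         0, -1, 0,
                         1, 0, 0,
                         0, 0, 1);
    
    static const Mat W_inv = W.inv();
    
    /* Two candidate rotations and two candidate translations (known only up to scale) */
    Mat_<double> R1 = svd.u * W * svd.vt;
    Mat_<double> T1 = svd.u.col( 2 );
    
    Mat_<double> R2 = svd.u * W_inv * svd.vt;
    Mat_<double> T2 = -svd.u.col( 2 );
    
    /* E is only defined up to sign, so flip a candidate that came out as a reflection */
    if ( determinant( R1 ) < 0 ) R1 = -R1;
    if ( determinant( R2 ) < 0 ) R2 = -R2;
    
    /* Projection matrices in pixel units: P1 = K [I | 0], P2 = K [R | T].
       The intrinsics are needed here because the matched points are in pixels.
       NOTE: only the (R1, T1) candidate is used. The valid pose is one of
       (R1, T1), (R1, T2), (R2, T1), (R2, T2): the one that places the
       triangulated points in front of both cameras. */
    Mat Rt = ( Mat_<double>(3, 4) <<
              R1(0, 0), R1(0, 1), R1(0, 2), T1(0),
              R1(1, 0), R1(1, 1), R1(1, 2), T1(1),
              R1(2, 0), R1(2, 1), R1(2, 2), T1(2));
    Mat P1 = cam_matrix * Mat::eye(3, 4, CV_64FC1 );
    Mat P2 = cam_matrix * Rt;
    
    /*  Triangulate the points to find the 3D homogeneous points in the frame of the first camera.
        Note that each column of the 'out' matrix corresponds to one 3D homogeneous point.
     */
    Mat out;
    triangulatePoints( P1, P2, left_points, right_points, out );
    
    /* Since these are homogeneous (x, y, z, w) coordinates, divide by w to get (x, y, z, 1) */
    vector<Mat> splitted = {
        out.row(0) / out.row(3),
        out.row(1) / out.row(3),
        out.row(2) / out.row(3)
    };
    
    /* Merge the three rows into a single 1 x N matrix with three channels (x, y, z) */
    merge( splitted, out );
    
    return out;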
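
The snippet above computes two rotation candidates (R1, R2) and two translation candidates (T1, T2), but builds P2 from only one combination. A common way to choose among the four combinations is a depth test: triangulate the matches with each candidate and keep the pose for which the points lie in front of both cameras. The following is a minimal sketch of that test in plain C++, without OpenCV types so that it stays self-contained. The names `Vec3`, `Mat3`, `inFrontOfBothCameras` and `countInFront` are hypothetical and introduced only for illustration. It assumes the first camera sits at the origin (P1 = K [I | 0]) and that the points were triangulated with the candidate being tested. As far as I know, OpenCV 3 and later also provide `recoverPose`, which performs this selection for you.

```cpp
#include <array>
#include <vector>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;   // row-major 3x3 matrix

// Depth (z coordinate) of X in the second camera's frame, i.e. (R * X + t).z.
// Only the third row of R and the third entry of t are needed for that.
double depthInSecondCamera(const Mat3& R, const Vec3& t, const Vec3& X) {
    return R[2][0] * X[0] + R[2][1] * X[1] + R[2][2] * X[2] + t[2];
}

// True if X, given in the first camera's frame, has positive depth in both cameras.
bool inFrontOfBothCameras(const Mat3& R, const Vec3& t, const Vec3& X) {
    return X[2] > 0.0 && depthInSecondCamera(R, t, X) > 0.0;
}

// Number of points that pass the depth test for one candidate pose (R, t).
// The points must have been triangulated with this same candidate.
int countInFront(const Mat3& R, const Vec3& t, const std::vector<Vec3>& points) {
    int count = 0;
    for (const Vec3& X : points) {
        if (inFrontOfBothCameras(R, t, X)) {
            ++count;
        }
    }
    return count;
}
```

To use it, triangulate the matches once per candidate, (R1, T1), (R1, T2), (R2, T1) and (R2, T2), call `countInFront` on each result, and keep the candidate with the highest count. With noisy matches a few points may fail the test even for the correct pose, which is why the sketch counts points instead of requiring all of them to pass.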