
Object points and Image points in OpenCV calibrateCamera


I would like some clarification on the parameters for OpenCV's calibrateCamera function. The function is cv.CalibrateCamera2(objectPoints, imagePoints, pointCounts, imageSize, cameraMatrix, distCoeffs, rvecs, tvecs, flags=0)

As I understand it, the imagePoints are the detected corners of the planar calibration pattern. But I don't understand the role of the objectPoints in recovering the cameraMatrix, or how their values are chosen.


Solution

  • In summary

    • objectPoints are reference points in "world coordinates" that never "move" (because the reference frame is the pattern plane itself)
    • objectPoints coordinates are expressed relative to the pattern plane's reference point (0,0,0) and a reference length
      • Top left corner of the chessboard is taken as reference in most tutorials
      • Reference length is "a square's side length" for the chessboard pattern
      • Z-coordinates are always 0 because the pattern is planar and the points' depth does not change within the chessboard plane
    • objectPoints coordinates are matched with their imagePoints pixel-coordinate counterparts during camera calibration
    • calibrateCamera estimates the extrinsic parameters (the [R|t] matrix) describing the camera motion around the static scene
    • calibrateCamera estimates the intrinsic parameters: focal length (fx, fy) and image center (cx, cy)
    • calibrateCamera runs an optimization that minimizes the re-projection error, that is, the total sum of squared distances between the observed feature/image points and the projected object points
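As a minimal sketch of the summary above (the 9x6 inner-corner chessboard and 25 mm square size are illustrative assumptions, not values from the question), the object points are just the same planar grid repeated once per detected view, with Z fixed at 0; the cv2 calls are indicated in comments:

```python
import numpy as np

# Hypothetical pattern: 9x6 inner corners, 25 mm squares.
pattern_size = (9, 6)
square_size = 25.0  # the "reference length": one square's side, in mm

# The same planar grid serves every view: X,Y on the board, Z always 0.
grid = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
grid[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
grid *= square_size

object_points = []  # one copy of `grid` per successfully detected view
image_points = []   # the matching detected corners per view

# For each calibration image:
#     ok, corners = cv2.findChessboardCorners(gray, pattern_size)
#     if ok:
#         object_points.append(grid)
#         image_points.append(corners)
#
# rms, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
#     object_points, image_points, image_size, None, None)
```

Note that only the grid construction runs here; the commented calls show where the detected corners and the calibration itself would plug in.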

    Also, it is worth noting that "Currently, initialization of intrinsic parameters (when CV_CALIB_USE_INTRINSIC_GUESS is not set) is only implemented for planar calibration patterns (where Z-coordinates of the object points must be all zeros). 3D calibration rigs can also be used as long as initial cameraMatrix is provided." So if you are not providing camera focal length (fx, fy) and image center (cx, cy) intrinsic parameters, you have to use a planar (Z=0) calibration pattern.
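For the non-planar (3D rig) case, a rough initial cameraMatrix can be built by hand; the image size and focal-length guess below are made-up values, and the commented call shows where CALIB_USE_INTRINSIC_GUESS would be passed:

```python
import numpy as np

# Assumed image size and rough focal-length guess (illustrative values only).
w, h = 1280, 720
f_guess = 1000.0  # pixels; e.g. from the lens spec or a prior calibration

# Initial intrinsic matrix: focal lengths (fx, fy) on the diagonal,
# principal point (cx, cy) placed at the image center.
camera_matrix_guess = np.array([[f_guess, 0.0,     w / 2],
                                [0.0,     f_guess, h / 2],
                                [0.0,     0.0,     1.0]])

# With a 3D calibration rig this guess must be provided:
# rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
#     object_points, image_points, (w, h),
#     camera_matrix_guess, None, flags=cv2.CALIB_USE_INTRINSIC_GUESS)
```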

    Looking at the objectPoints definition in detail

    • std::vector<std::vector<cv::Vec3f>> - Vector of vectors of calibration pattern points in the calibration pattern coordinate space ("world coordinates")
      • First vector std::vector

        • Contains as many elements as the number of the pattern views
        • If the same calibration pattern is shown in each view and it is fully visible, all the vectors will be the same [size]
        • [If using] partially occluded patterns, or even different patterns in different views, the vectors will be different [sizes]
      • Second (inner) vector std::vector<cv::Vec3f>

        • Contains as many elements as the number of points to detect in the calibration pattern
      • Third level, the cv::Vec3f elements

        • Contains (X,Y,Z) coordinates on the pattern to be mapped with the 2D coordinates that will be found on the projection of the calibration pattern
        • The points are 3D, but since they live in the pattern's own coordinate system, then, if the rig is planar, it makes sense to place the model on the XY coordinate plane so that the Z-coordinate of each input object point is 0.
        • Or, said in another way, "The world coordinate is attached to the checkerboard and since all the corner points lie on a plane, we can arbitrarily choose Zw for every point to be 0"
      • "In the process of calibration we calculate the camera parameters by a set of known 3D points (Xw, Yw, Zw) [in world coordinates] and their corresponding pixel location (u,v) in the image."

        • You have a pattern, and you assign its points their "world coordinates", i.e. the relative distance between points, expressed in this world's reference scale, which is "a square's side length"
        • You could have 10 different patterns if you want, and you process them to find their corresponding pixel location (u,v) in the image
        • You do not care about the orientation of the camera relative to the pattern (chessboard); calibrateCamera estimates this in its algorithm from the pixel distances between the calibration pattern points and those points' "world coordinates"
        • calibrateCamera will perform estimation of the [R|t] matrix of extrinsic parameters
          • "[R|t] translates coordinates of a point (X, Y, Z) to a coordinate system, fixed with respect to the camera"; it "describes the camera motion around the static scene"
        • calibrateCamera will estimate the intrinsic parameters: focal length (fx, fy) and image center (cx, cy)
        • "[calibrateCamera will] run the global Levenberg-Marquardt optimization algorithm to minimize the reprojection error, that is, the total sum of squared distances between the observed feature points imagePoints and the projected (using the current estimates for camera parameters and the poses) object points objectPoints"
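That minimized quantity can be illustrated with a small helper (a sketch only; in a real pipeline the projected points would come from cv2.projectPoints using the estimated intrinsics and per-view poses, and calibrateCamera already returns this RMS value):

```python
import numpy as np

def reprojection_rms(observed, projected):
    """RMS re-projection error over all views.

    observed/projected: lists of (N, 2) arrays of pixel coordinates,
    one array per calibration view.
    """
    sq_err = 0.0
    n_pts = 0
    for obs, proj in zip(observed, projected):
        d = np.asarray(obs, float) - np.asarray(proj, float)
        sq_err += float((d ** 2).sum())  # squared pixel distances
        n_pts += len(obs)
    return (sq_err / n_pts) ** 0.5

# Toy check: a uniform 1-pixel offset in x gives an RMS of exactly 1.
obs = [np.array([[0.0, 0.0], [10.0, 5.0]])]
proj = [np.array([[1.0, 0.0], [11.0, 5.0]])]
print(reprojection_rms(obs, proj))  # -> 1.0
```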
