How to solve Assertion Error when using solvePnP?

I'm new to visual odometry and am following a tutorial on solving VO with PnP. However, when I run the program, I get the following error:

terminate called after throwing an instance of 'cv::Exception'
  what():  OpenCV(4.3.0) /home/wctu/opencv-4.3.0/modules/calib3d/src/solvepnp.cpp:754: error: (-215:Assertion failed) ( (npoints >= 4) || (npoints == 3 && flags == SOLVEPNP_ITERATIVE && useExtrinsicGuess) ) && npoints == std::max(ipoints.checkVector(2, CV_32F), ipoints.checkVector(2, CV_64F)) in function 'solvePnPGeneric'

My code is below:

string datas[2266];
string str1;
std::getline(file, str1);
datas[0] = str1;
for(int i = 1; !file.eof(); i++)
   {
      string str;
      std::getline(file, str);
      datas[i] = str;
      if(str.empty()) break;
      if(str.at(0) == '#') continue; /* comment */
      cout << datas[i-1] << endl << datas[i] << endl;
      Mat image, depth, image1, depth1;
      string rgbFilename1 = datas[i-1].substr(timestampLength + 1, rgbPathLehgth );
      string timestap1 = datas[i-1].substr(0, timestampLength);
      string depthFilename1 = datas[i-1].substr(2*timestampLength + rgbPathLehgth + 3, depthPathLehgth );

      image1 = imread(dirname + rgbFilename1);
      depth1 = imread(dirname + depthFilename1, -1);
      string rgbFilename = str.substr(timestampLength + 1, rgbPathLehgth );
      string timestap = str.substr(0, timestampLength);
      string depthFilename = str.substr(2*timestampLength + rgbPathLehgth + 3, depthPathLehgth );

      image = imread(dirname + rgbFilename);
      depth = imread(dirname + depthFilename, -1);
      CV_Assert(!image.empty());
      CV_Assert(!depth.empty());
      CV_Assert(depth.type() == CV_16UC1);

      cout << i << " " << rgbFilename << " " << depthFilename << endl;

      std::vector<KeyPoint> keypoints_1, keypoints_2;
      vector<DMatch> matches;
      find_feature_matches(image1, image, keypoints_1, keypoints_2, matches);
      cout << "Found " << matches.size() << " matches in total" << endl;
   
//   // Build the 3D points
//Mat d1 = imread(depth1, IMREAD_UNCHANGED);       // The depth map is a 16-bit unsigned, single-channel image
      Mat K = (Mat_<double>(3, 3) << 525.0f, 0, 319.5f, 0, 525.0f, 239.5f, 0, 0, 1);
      vector<Point3f> pts_3d;
      vector<Point2f> pts_2d;
      for (DMatch m:matches) {
         ushort d = depth1.ptr<unsigned short>(int(keypoints_1[m.queryIdx].pt.y))[int(keypoints_1[m.queryIdx].pt.x)];
         if (d == 0)   // bad depth
            continue;
         float dd = d / 5000.0;
         Point2d p1 = pixel2cam(keypoints_1[m.queryIdx].pt, K);
         pts_3d.push_back(Point3f(p1.x * dd, p1.y * dd, dd));
         pts_2d.push_back(keypoints_2[m.trainIdx].pt);
      }
      cout << pts_3d[0] << " " << pts_2d[0] << endl;
      cout << "3d-2d pairs: " << pts_3d.size() << " " << pts_2d.size() <<  endl;
   
      chrono::steady_clock::time_point t1 = chrono::steady_clock::now();
      Mat r, t;
      solvePnP(pts_3d, pts_2d, K, Mat(), r, t, false); // Call OpenCV's PnP solver; EPNP, DLS and other methods can be selected
      Mat R;
      cv::Rodrigues(r, R); // r is a rotation vector; convert it to a matrix with the Rodrigues formula
      chrono::steady_clock::time_point t2 = chrono::steady_clock::now();
      chrono::duration<double> time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1);
      cout << "solve pnp in opencv cost time: " << time_used.count() << " seconds." << endl;

argv[1] is the text file that associates each RGB image with its depth image; each line has the following form:

1311877977.445420 rgb/1311877977.445420.png 1311877977.431871 depth/1311877977.431871.png

I've searched online for solutions and tried everything I found, but to no avail.

I'd really appreciate your help. Thanks in advance.

Update: the inputs that trigger the exception are shown below; there are only three pairs:

[0.94783, -1.70307, 7.3738] [383.4, 121.828]
[0.170393, -0.170453, 1.3256] [379.817, 186.325]
[0.610124, -0.161545, 3.4604] [403.108, 223.949]

Solution

The OpenCV function cv::solvePnP internally asserts that the input data you supply makes sense and matches the documentation. In your case that check fails, so it throws the following error:

terminate called after throwing an instance of 'cv::Exception'
what():  OpenCV(4.3.0) /home/wctu/opencv-4.3.0/modules/calib3d/src/solvepnp.cpp:754: error:
(-215:Assertion failed)
( (npoints >= 4) || (npoints == 3 && flags == SOLVEPNP_ITERATIVE && useExtrinsicGuess) ) && 
npoints == std::max(ipoints.checkVector(2, CV_32F), ipoints.checkVector(2, CV_64F)) in function 'solvePnPGeneric'

So either the dimensions of the inputs are wrong or the files you are using are not suitable. The error is expressed in terms of the function's input arguments, so you need to look at the documentation of cv::solvePnP:

bool cv::solvePnP(InputArray  objectPoints,
                  InputArray  imagePoints,
                  InputArray  cameraMatrix,
                  InputArray  distCoeffs,
                  OutputArray rvec,
                  OutputArray tvec,
                  bool        useExtrinsicGuess = false,
                  int         flags = SOLVEPNP_ITERATIVE 
)

Comparing your call with the signature above, you can see that you set useExtrinsicGuess to false and did not supply flags, which therefore defaults to SOLVEPNP_ITERATIVE. This already tells you that the alternative (npoints == 3 && flags == SOLVEPNP_ITERATIVE && useExtrinsicGuess) cannot hold, because useExtrinsicGuess is false. The assertion can therefore only pass if (npoints >= 4) holds.
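To make the condition easier to reason about, here is a minimal sketch that mirrors the point-count part of the assertion as a plain C++ predicate. The names PnPMethod and point_count_accepted are hypothetical and are not part of OpenCV; the logic is copied from the assertion text in the error message.

```cpp
// Hypothetical stand-in for the OpenCV solver flag; only the distinction
// "SOLVEPNP_ITERATIVE or not" matters for this check.
enum class PnPMethod { Iterative, Other };

// Mirrors the first half of the assertion in solvePnPGeneric:
//   (npoints >= 4) ||
//   (npoints == 3 && flags == SOLVEPNP_ITERATIVE && useExtrinsicGuess)
bool point_count_accepted(int npoints, PnPMethod method, bool useExtrinsicGuess) {
    const bool four_or_more = npoints >= 4;
    const bool three_with_guess =
        npoints == 3 && method == PnPMethod::Iterative && useExtrinsicGuess;
    return four_or_more || three_with_guess;
}
```

With the arguments from the question (default flags, useExtrinsicGuess = false), point_count_accepted(3, PnPMethod::Iterative, false) is false, while any count of 4 or more is accepted.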

If you open the corresponding source file on GitHub or in your source folder, you will see that npoints is defined as

int npoints = std::max(opoints.checkVector(3, CV_32F), opoints.checkVector(3, CV_64F));

Now we have to figure out what checkVector does. According to the documentation, it checks the channels and depth of the matrix and returns -1 if the requirement is not satisfied; otherwise, it returns the number of elements in the matrix. Note that an element may have multiple channels.

This means your code fails either because the format of one of the two inputs is incorrect or because npoints is smaller than 4.

If you look at the documentation again, it tells you that objectPoints expects an "Array of object points in the object coordinate space, Nx3 1-channel or 1xN/Nx1 3-channel, where N is the number of points. vector<Point3d> can be also passed here.", while imagePoints expects an "Array of corresponding image points, Nx2 1-channel or 1xN/Nx1 2-channel, where N is the number of points."

This is clearly fulfilled by the inputs pts_3d and pts_2d that you pass, as they are std::vector<Point3f> and std::vector<Point2f> respectively. The only remaining explanation is that pts_3d and pts_2d have fewer than 4 entries, which is too few for a unique solution. Your update confirms this: the failing frame pair yields only three 3D-2D pairs. In other words, too few feature matches with valid depth were found between the supplied images in the preceding step. Check your input files again and, if necessary, try different ones.
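One way to guard against this is to count the correspondences that survive the depth filter before calling the solver. The following is a minimal sketch under stated assumptions: Pixel, Point3, Intrinsics, Correspondences, build_correspondences and enough_for_pnp are hypothetical names used so the example compiles without OpenCV, and the back-projection assumes that pixel2cam from the question computes ((u - cx) / fx, (v - cy) / fy).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for cv::Point2f and cv::Point3f.
struct Pixel  { float x, y; };
struct Point3 { float x, y, z; };

struct Correspondences {
    std::vector<Point3> pts_3d;  // points in the first camera's frame
    std::vector<Pixel>  pts_2d;  // matching pixels in the second image
};

// Pinhole intrinsics; the defaults are the values of K in the question.
struct Intrinsics { float fx = 525.0f, fy = 525.0f, cx = 319.5f, cy = 239.5f; };

// kp1[i] and kp2[i] are the matched pixels of match i, and raw_depth[i] is the
// raw 16-bit depth at kp1[i] (0 means "no measurement").
// All three vectors are assumed to have the same length.
Correspondences build_correspondences(const std::vector<Pixel>& kp1,
                                      const std::vector<Pixel>& kp2,
                                      const std::vector<std::uint16_t>& raw_depth,
                                      const Intrinsics& K = Intrinsics{},
                                      float depth_scale = 5000.0f) {
    Correspondences out;
    for (std::size_t i = 0; i < kp1.size(); ++i) {
        if (raw_depth[i] == 0) continue;  // bad depth: drop this match
        const float z = raw_depth[i] / depth_scale;
        out.pts_3d.push_back({(kp1[i].x - K.cx) / K.fx * z,
                              (kp1[i].y - K.cy) / K.fy * z, z});
        out.pts_2d.push_back(kp2[i]);
    }
    return out;
}

// True only if there are at least min_points pairs and both lists line up,
// which is what the assertion in solvePnPGeneric requires.
bool enough_for_pnp(const Correspondences& c, std::size_t min_points = 4) {
    return c.pts_3d.size() >= min_points && c.pts_3d.size() == c.pts_2d.size();
}
```

In the loop from the question, the equivalent check would be to test pts_3d.size() < 4 right after the matching loop and continue to the next frame pair instead of calling solvePnP. The line that prints pts_3d[0] should also move after that check, because indexing an empty vector is undefined behaviour.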