c++ opencv yolo darknet

Error: Assertion failed (dims() <= 2) in cv::MatSize::operator (), file D:\vcpkg\installed\x64-windows\include\opencv2\core\mat.inl.hpp, line 1198


I get an assertion failure with this simple C++ example I found here:

#include <iostream>
#include <sstream>
#include <opencv2/opencv.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/video.hpp>
#include <opencv2/dnn.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/imgproc.hpp>
using namespace cv;
using namespace std;
using namespace dnn;

int main()
{
    VideoCapture cap("D:/video1.mp4");
    std::string model = "./Models/yolov4_tiny_train2_best.weights";  
    std::string config = "./Models/yolov4_tiny_train2.cfg";

    Net network = readNet(model, config, "Darknet");
    network.setPreferableBackend(DNN_BACKEND_DEFAULT);
    network.setPreferableTarget(DNN_TARGET_OPENCL);

    for (;;)
    {
        if (!cap.isOpened()) {
            cout << "Video Capture Fail" << endl;
            break;
        }
        Mat img;
        cap >> img;
        static Mat blobFromImg;
        bool swapRB = true;
        blobFromImage(img, blobFromImg, 1, Size(416, 416), Scalar(), swapRB, false);
        cout << blobFromImg.size() << endl; // exception here
        float scale = 1.0 / 255.0;
        Scalar mean = 0;
        network.setInput(blobFromImg, "", scale, mean);
        Mat outMat;
        network.forward(outMat);
        int rowsNoOfDetection = outMat.rows;
        int colsCoordinatesPlusClassScore = outMat.cols;
        for (int j = 0; j < rowsNoOfDetection; ++j)
        {
            Mat scores = outMat.row(j).colRange(5, colsCoordinatesPlusClassScore);

            Point PositionOfMax;
            double confidence;
            minMaxLoc(scores, 0, &confidence, 0, &PositionOfMax);

            if (confidence > 0.5)
            {
                int centerX = (int)(outMat.at<float>(j, 0) * img.cols);
                int centerY = (int)(outMat.at<float>(j, 1) * img.rows);
                int width = (int)(outMat.at<float>(j, 2) * img.cols + 20);
                int height = (int)(outMat.at<float>(j, 3) * img.rows + 100);

                int left = centerX - width / 2;
                int top = centerY - height / 2;


                stringstream ss;
                ss << PositionOfMax.x;
                string clas = ss.str();
                int color = PositionOfMax.x * 10;
                putText(img, clas, Point(left, top), 1, 2, Scalar(color, 255, 255), 2, false);
                stringstream ss2;
                ss2 << confidence;  // write to ss2, not ss, so conf holds only the confidence
                string conf = ss2.str();

                rectangle(img, Rect(left, top, width, height), Scalar(color, 0, 0), 2, 8, 0);
            }
        }

        namedWindow("Display window", WINDOW_AUTOSIZE);
        imshow("Display window", img);
        waitKey(25);
    }
    return 0;
}

I am confused about why this doesn't work, and the error message doesn't help at all. When debugging, I found that the exception is thrown on the line cout << blobFromImg.size() << endl;. Also the rows / cols of blobFromImg are -1 so that explains it.

Note: The video I load is 1280x720 and the darknet configuration is yolov4-tiny.


Solution

  • This can be reproduced in Debug mode, using the following short MCVE:

    #include <iostream>
    #include <opencv2/dnn.hpp>
    
    int main()
    {
        cv::Mat img(cv::Mat::zeros(720, 1280, CV_8UC3));
        cv::Mat blobFromImg;
        cv::dnn::blobFromImage(img, blobFromImg, 1, cv::Size(416, 416), cv::Scalar(), true, false);
        std::cout << blobFromImg.size() << std::endl;
    }
    

    to get

    OpenCV(4.6.0) Error: Assertion failed (dims() <= 2) in cv::MatSize::operator (), file .../opencv2/core/mat.inl.hpp, line 1198
    

    Since the check has been present in the OpenCV code base at least since v4.0, I'd wager that the code was broken from the moment it was published.


    By default, cv::Mat objects used by much of OpenCV are considered 2-dimensional. While colour planes/channels of images could be considered a third dimension, this is not so in OpenCV, where the channels are handled separately. For example, our BGR input image is a 2D Mat, with 720 rows and 1280 columns (and 3 channels).

    However, note the documentation of cv::dnn::blobFromImage:

    Creates 4-dimensional blob from image.

    with "blob" being

    4-dimensional Mat with NCHW dimensions order.
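
To make the NCHW order concrete, here is a minimal sketch, independent of OpenCV, of how a four-part index (n, c, h, w) maps to a flat element offset, assuming a contiguous buffer. The names Shape4, nchw_offset and element_count are hypothetical and exist only for illustration; they are not part of any library.

```cpp
#include <cstddef>

// Hypothetical helper type (not part of OpenCV): the four extents of a blob.
struct Shape4 {
    std::size_t n, c, h, w;
};

// Flat element offset of index (in, ic, ih, iw) in a contiguous NCHW buffer.
// The last index (w) varies fastest, the first (n) slowest.
constexpr std::size_t nchw_offset(Shape4 s, std::size_t in, std::size_t ic,
                                  std::size_t ih, std::size_t iw)
{
    return ((in * s.c + ic) * s.h + ih) * s.w + iw;
}

// Total number of elements described by the shape.
constexpr std::size_t element_count(Shape4 s)
{
    return s.n * s.c * s.h * s.w;
}
```

For a shape of 1 x 3 x 416 x 416, each channel plane holds 416 * 416 = 173056 elements, so nchw_offset(s, 0, 1, 0, 0) is 173056 and element_count(s) is 519168. None of these four extents can be expressed by a single width/height pair.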

    Note that this Mat is not a 2D one as usual, but a 4D one. Why does this matter? Let's have a look at what the expression blobFromImg.size() actually involves.

    First we access the member variable cv::Mat::size of blobFromImg. This is an instance of cv::MatSize. So far, so good, this can represent an arbitrary number of dimensions. However, next we invoke cv::MatSize::operator(), to convert it to cv::Size. Unfortunately, Size can only represent two dimensions (width and height), which is sufficient for images and other 2D Mats, but ours is 4D.

    Hence, blobFromImg.size() is not a valid way to obtain or print the full dimensions of this Mat. Since the check is only a CV_DbgAssert, it won't complain in Release mode; instead, it will silently return a Size built from only the first two dimensions (here 1 and 3), which is misleading.

    In order for it to work properly, and print all the dimensions, you can write a simple convenience function to convert the MatSize to a string, and print that:

    #include <iostream>
    #include <sstream>
    #include <string>
    #include <opencv2/dnn.hpp>
    
    std::string to_string(cv::MatSize const& sz)
    {
        std::ostringstream s;
        s << "[";
        for (int i(0); i < sz.dims(); ++i) {
            if (i != 0) {
                s << " x ";
            }
            s << sz[i];
        }
        s << "]";
        return s.str();
    }
    
    int main()
    {
        cv::Mat img(cv::Mat::zeros(720, 1280, CV_8UC3));
        cv::Mat blobFromImg;
        cv::dnn::blobFromImage(img, blobFromImg, 1, cv::Size(416, 416), cv::Scalar(), true, false);
        std::cout << to_string(blobFromImg.size) << std::endl;
    }
    

    Which outputs:

    [1 x 3 x 416 x 416]
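
If the extents are needed as values rather than as text, a variant can collect them into a std::vector<int>. The following is only a sketch: shape_of is a hypothetical name, and the template assumes nothing beyond the dims() and operator[] members already used above, so it compiles without OpenCV as well.

```cpp
#include <cstddef>
#include <vector>

// Collects every extent of a size-like object into a vector.
// SizeLike only needs dims() and operator[](int); cv::MatSize provides both.
template <typename SizeLike>
std::vector<int> shape_of(SizeLike const& sz)
{
    std::vector<int> shape;
    shape.reserve(static_cast<std::size_t>(sz.dims()));
    for (int i = 0; i < sz.dims(); ++i) {
        shape.push_back(sz[i]);
    }
    return shape;
}
```

With OpenCV available, shape_of(blobFromImg.size) should give a vector holding 1, 3, 416 and 416 for the blob above. Note that the argument is the member variable size, not the call size().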
    

    Of course, there are other ways you could implement that functionality, but that's out of the scope of this problem.


    Regarding your remark "Also the rows / cols of blobFromImg are -1 so that explains it.":

    That is documented behaviour of Mat. The documentation of cv::Mat::cols and cv::Mat::rows describes them as:

    the number of rows and columns or (-1, -1) when the matrix has more than 2 dimensions