Pybind11 cv::Mat from C++ to Python

I want to write a function that takes an image as a parameter and returns another image, and bind it to Python using pybind11.

The part about receiving the image as a parameter is nicely solved thanks to this question.

Returning the image, on the other hand, is a bit tricky.

Here is my code (as an example, I flip the image using a cv function):

py::array_t<uint8_t> flipcvMat(py::array_t<uint8_t>& img)
{
    auto rows = img.shape(0);
    auto cols = img.shape(1);
    auto channels = img.shape(2);
    std::cout << "rows: " << rows << " cols: " << cols << " channels: " << channels << std::endl;
    auto type = CV_8UC3;

    cv::Mat cvimg2(rows, cols, type, (unsigned char*)img.data());

    cv::imwrite("/source/test.png", cvimg2); // OK

    cv::Mat cvimg3(rows, cols, type);
    cv::flip(cvimg2, cvimg3, 0);

    cv::imwrite("/source/testout.png", cvimg3); // OK

    py::array_t<uint8_t> output(
        py::buffer_info(
            cvimg3.data,
            sizeof(uint8_t), // itemsize
            py::format_descriptor<uint8_t>::format(),
            3, // ndim
            std::vector<size_t> {rows, cols, 3}, // shape
            std::vector<size_t> {cols * sizeof(uint8_t), sizeof(uint8_t), 3} // strides
        )
    );
    return output;
}

I call it from Python as:

img = cv2.imread('/source/whatever/ubuntu-1.png')
img3= opencvtest.flipcvMat(img)

My input image is an RGB image.
Both images written with cv::imwrite are correct (the first is identical to the input and the second is correctly flipped).

The problem is on the Python side: the returned image seems to be misaligned (I can distinguish some shapes, but the pixels are not in the right place).

My guess is that I have a problem when creating the py::buffer_info, but I cannot find it. What could I be doing wrong?

Solution

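Yes, the problem is indeed in py::buffer_info, more precisely in the strides. Each stride is the number of bytes to step to reach the next element along that dimension, so for a rows x cols x 3 image of uint8_t, the row stride must cover cols * 3 bytes and the column stride 3 bytes.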

Instead of:

{ cols * sizeof(uint8_t), sizeof(uint8_t), 3 }

The strides should be:

{ sizeof(uint8_t) * cols * 3, sizeof(uint8_t) * 3, sizeof(uint8_t)}
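The corrected strides follow the general rule for a row-major (C-contiguous) buffer: the last dimension steps by one item, and every earlier dimension steps by the product of all later extents times the item size. The sketch below illustrates that rule using only the standard library; the helper names c_contiguous_strides and byte_offset are hypothetical and are not part of pybind11 or OpenCV.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: byte strides for a C-contiguous (row-major) array.
// The last dimension advances by one item; each earlier dimension advances
// by the product of all later extents times the item size.
std::vector<std::size_t> c_contiguous_strides(const std::vector<std::size_t>& shape,
                                              std::size_t itemsize)
{
    std::vector<std::size_t> strides(shape.size());
    std::size_t step = itemsize;
    for (std::size_t i = shape.size(); i-- > 0;) {
        strides[i] = step;
        step *= shape[i];
    }
    return strides;
}

// Hypothetical helper: byte offset of element (row, col, channel) in a
// 3-dimensional buffer described by the given strides.
std::size_t byte_offset(const std::vector<std::size_t>& strides,
                        std::size_t row, std::size_t col, std::size_t channel)
{
    assert(strides.size() == 3);
    return row * strides[0] + col * strides[1] + channel * strides[2];
}
```

For an assumed 480 x 640 image with three uint8_t channels, c_contiguous_strides({480, 640, 3}, sizeof(uint8_t)) yields {1920, 3, 1}, which is {cols * 3, 3, 1}. With the strides from the question, {cols, 1, 3}, moving one column to the right advances only one byte instead of three, which would explain why shapes stay recognizable while the pixels land in the wrong places. This assumes the cv::Mat is continuous (no row padding); if it is not, the row stride would need to come from the Mat's step value rather than cols * 3.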