c++ image opencv yuv libyuv

opencv can't open a yuv420 (nv12) image while rawpixels.net can display the image


I am trying to open a YUV format image. I can open and display it with rawpixels.net after setting the following:

width:1920
height:1080
predefined format: yuv420 (nv12)
pixel format yuv

But when I try to open it with OpenCV using the following code, it fails to open.

#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/opencv.hpp>

int main() {
    std::cout << "OpenCV version: " << CV_VERSION << std::endl;

    // imread() only understands container formats such as JPEG or PNG
    cv::Mat image = cv::imread("camera_capture_256_2020_10_07_11_11_02.yuv");
    if (image.empty()) {
        std::cout << "image empty" << std::endl;
        return 0;
    }

    cv::imshow("opencv_logo", image);
    cv::waitKey(0);

    return 0;
}

The program prints "image empty".

I am puzzled as to why I can't open the file with OpenCV.

The sample image is found here.

The YUV image opened with rawpixels.net looks like this.

[image]

Thanks,


Solution

  • The very first thing to do when dealing with raw images (RGB, BGR, YUV, NV12 and others) is to find out the dimensions of the image in pixels - you are quite lost without them, though you can use tricks such as looking for correlation between rows to find the row width, since each row is normally similar to the one above it.


    The next thing is to check that the file size is correct. If the image is 8-bit RGB at 1920x1080, your file must be 1920x1080x3 bytes in size - if not, there is a problem. Your image is 1920x1080 and NV12, which is 12 bits or 1.5 bytes per pixel, so I expect your file to be 1920x1080x1.5 = 3110400 bytes. It is not, so there is immediately a problem: there is either a header, multiple frames, trailing data or some other issue.
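
The size check described above can be sketched as a pair of small helpers. This is only an illustration under stated assumptions: the names `raw_frame_bytes` and `describe_size` are hypothetical, and the bits-per-pixel values (24 for 8-bit RGB, 12 for NV12) are the ones given in the text.

```cpp
#include <cstdint>
#include <string>

// Bytes needed for one raw frame at the given average bits per pixel
// (24 for 8-bit RGB/BGR, 12 for NV12). Hypothetical helper name.
constexpr std::uint64_t raw_frame_bytes(std::uint64_t width, std::uint64_t height,
                                        std::uint64_t bits_per_pixel) {
    return width * height * bits_per_pixel / 8;
}

// Classify a file size against the size of a single frame.
// Hypothetical helper name; the categories mirror the causes listed in the text.
std::string describe_size(std::uint64_t file_bytes, std::uint64_t frame_bytes) {
    if (file_bytes == frame_bytes) return "exactly one frame";
    if (file_bytes < frame_bytes) return "too small for one frame";
    if (file_bytes % frame_bytes == 0) return "whole number of frames";
    return "one frame plus extra data (header, padding or partial frames)";
}
```

For the file in the question, `raw_frame_bytes(1920, 1080, 12)` gives 3110400, while the file is 8355840 bytes, so `describe_size(8355840, 3110400)` falls into the last category.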

    So, where is the image data in the file? At the start? At the end? One way to find out is to view the whole file as though it were a single greyscale image and see whether there are large blocks of black, which are zero bytes or padding. As the image size is not known for this purpose, I generally take the file size in bytes, go to the Wolfram Alpha website and type in "factors of XXX", where XXX is the file size, then choose 2 factors near the square root of the file size so that I get a square-ish image. For yours, I chose 2720x3072 and treated your file as a single greyscale image of that size. Using ImageMagick in Terminal:

    magick -depth 8 -size 2720x3072 gray:camera_preview_250_2020_10_07_11_11_02.yuv image.jpg
    

    [image]

    I can see at a glance that the image data is at the start of the file and that the end of the file is zero padding, i.e. black. If the black had been at the start of the image, I would have taken the final H x W x 1.5 bytes instead.
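
The "two factors near the square root" step can also be done without a web lookup. The following is a minimal sketch, not the method used above: the function name `squarest_factors` is hypothetical, and it simply searches for the largest divisor that does not exceed the square root of the file size.

```cpp
#include <cstdint>
#include <utility>

// Return the factor pair (a, b) with a <= b and a * b == n whose members are
// closest to sqrt(n), i.e. the most square-ish trial geometry for n bytes.
// Hypothetical helper name, for illustration only.
std::pair<std::uint64_t, std::uint64_t> squarest_factors(std::uint64_t n) {
    std::uint64_t best = 1;
    for (std::uint64_t a = 1; a * a <= n; ++a) {
        if (n % a == 0) best = a;  // keeps the largest divisor <= sqrt(n)
    }
    return {best, n / best};
}
```

For the 8355840-byte file this yields 2720 and 3072, the same trial size passed to ImageMagick above.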

    An alternative for this step is to take the file size in bytes and divide it by the image width to get a number of lines, then see how that looks. Your file is 8355840 bytes, so that would be 8355840/1920 or 4,352 lines. Let's try that:

    magick -depth 8 -size 1920x4352 gray:camera_preview_250_2020_10_07_11_11_02.yuv image.jpg
    

    [image]

    That is very encouraging because we can see the Y (greyscale) image at the start of the file and some lower-resolution UV data following it. The fact that there are not 2 separate chroma planes following probably means the samples are interleaved, i.e. alternating U and V samples rather than planar U samples followed by planar V samples.
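
To make the interleaved layout concrete, here is a small sketch of where the samples for a given pixel sit inside one NV12 frame. It assumes the standard NV12 arrangement (a full-resolution Y plane followed by half-height rows of U,V pairs, U first); the struct and function names are hypothetical.

```cpp
#include <cstddef>

// Byte offsets of the samples belonging to pixel (x, y) in one NV12 frame.
// Hypothetical type name, for illustration only.
struct Nv12Offsets {
    std::size_t y;  // luma sample for this pixel
    std::size_t u;  // chroma U (Cb), shared by a 2x2 block of pixels
    std::size_t v;  // chroma V (Cr), stored immediately after U
};

// NV12 layout: width*height bytes of Y, then height/2 rows of interleaved
// U,V pairs; each chroma row is again width bytes long.
Nv12Offsets nv12_offsets(std::size_t width, std::size_t height,
                         std::size_t x, std::size_t y) {
    const std::size_t uv_plane = width * height;            // start of the UV plane
    const std::size_t uv_row = uv_plane + (y / 2) * width;  // chroma row for pixel row y
    const std::size_t u = uv_row + (x / 2) * 2;             // even byte of the pair is U
    return {y * width + x, u, u + 1};
}
```

Viewed as a 1920-wide greyscale image, a 1920x1080 NV12 frame therefore fills 1080 rows of Y followed by 540 rows of UV data, 1620 rows in total, and the last V sample is byte 3110399.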


    Ok, if your data is YUV or NV12, the best tool for that is ffmpeg. We already know that the data is at the start of the file and we know the dimensions and the format. We also know that there is padding after the image, so we need to just take the first frame like this:

    ffmpeg -s 1920x1080 -pix_fmt nv12 -i cam*yuv -frames:v 1 image.png
    

    [image]


    Now that we have confidence about the dimensions and format, we need OpenCV to read the data. The normal cv::imread() cannot read it because it is just raw data and, unlike JPEG, PNG or TIFF, there is no image height and width in a header - it is just pure sensor data.

    So, you need to use the regular C/C++ read() system call (or any other binary file read) to get the first 1920x1080x1.5 bytes. Then you need to call cv::cvtColor() on the received buffer to convert it to a regular BGR format Mat.
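
The reading step can be sketched as follows. This is a minimal sketch under stated assumptions, not the answer's own code: it uses `std::ifstream` instead of the `read()` system call, the function name `read_first_nv12_frame` is hypothetical, and it assumes the frame starts at byte 0 of the file, as found above.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Read the first NV12 frame (width * height * 3 / 2 bytes) from a raw file.
// Returns an empty vector if the file cannot be opened or is shorter than
// one frame. Hypothetical helper name, for illustration only.
std::vector<std::uint8_t> read_first_nv12_frame(const std::string& path,
                                                std::size_t width,
                                                std::size_t height) {
    const std::size_t frame_bytes = width * height * 3 / 2;
    std::vector<std::uint8_t> frame(frame_bytes);

    std::ifstream in(path, std::ios::binary);
    in.read(reinterpret_cast<char*>(frame.data()),
            static_cast<std::streamsize>(frame_bytes));

    // gcount() reports how many bytes the last read actually delivered
    if (static_cast<std::size_t>(in.gcount()) != frame_bytes) frame.clear();
    return frame;
}
```

The returned buffer can then be handed to OpenCV: wrap it as a single-channel Mat with `height * 3 / 2` rows, for example `cv::Mat yuv(1080 * 3 / 2, 1920, CV_8UC1, frame.data())`, and convert it with `cv::cvtColor(yuv, bgr, cv::COLOR_YUV2BGR_NV12)`. The Mat does not copy the data, so the vector has to stay alive for as long as `yuv` is in use.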