
How to get the format of an nvdec video frame from ffmpeg decoding? How to use it in CUDA structures?


I am using code that is almost exactly https://github.com/FFmpeg/FFmpeg/blob/master/doc/examples/hw_decode.c, except that I do not transfer the decoded data to the host. I keep it on the device for later use with CUDA code (my goal is to work with RGB data at 8 to 12 bits per channel).

The interesting part is:

//after decoding/encoding
status = avcodec_receive_frame(avctx, frame);
if (status != AVERROR(EAGAIN) && status != AVERROR_EOF)
    LogMessage(L"frame timestamp: %i\n", (int)frame->best_effort_timestamp);
if (status == AVERROR(EAGAIN) || status == AVERROR_EOF)
{
    av_frame_free(&frame);
    return 0;
}
if (status < 0)
{
    LogMessage(L"Error while decoding\n");
    goto finish;
}
{
    auto descr = AVPixelFormatMap[(AVPixelFormat)frame->format];
    std::wstring dsdescr(descr.begin(), descr.end());
    LogMessage(L"Pixelformat: %s", dsdescr.c_str());

    cudaPitchedPtr CUdeviceptr0{}; // !! hypothetical usage of cudaPitchedPtr !!
    CUdeviceptr0.ptr = frame->data[0];
    CUdeviceptr0.pitch = frame->linesize[0]; //2048 == pitch ?
    CUdeviceptr0.ysize = frame->height; //1080
    CUdeviceptr0.xsize = frame->width; //1920
    cudaPitchedPtr CUdeviceptr1{};
    CUdeviceptr1.ptr = frame->data[1];
    CUdeviceptr1.pitch = frame->linesize[1];
    CUdeviceptr1.ysize = frame->height;
    CUdeviceptr1.xsize = frame->width;
}

I could not find a way to get the data format: the value of (AVPixelFormat)frame->format is AV_PIX_FMT_CUDA, which is documented as "HW acceleration through CUDA. data[i] contain CUdeviceptr pointers exactly as for system memory frames."

Unfortunately, this says nothing about the pixel format. I believe it is YUV, but I do not have more details.

I understand that I have two arrays: frame->data[0] and frame->data[1].

  1. What is each?
  • frame->data[0] contains data corresponding to Y or to one of the RGB channels, and is pitched. When I save it as a black-and-white image, I recognize the picture, except that it has only one channel.
  • frame->data[1] is set but appears to contain no data, so it does not seem to be part of the image content.
  • frame->data[2] is NULL.

It seems that U and V are somewhere else.

I would like to translate the data into a cuda native structure like:

cudaPitchedPtr CUdeviceptr{};
CUdeviceptr.ptr = frame->data[0];
CUdeviceptr.ysize = frame->height;
CUdeviceptr.xsize = frame->width;

This works but is incomplete: with width == 1920 and height == 1080, I think I only have the Y of YUV.

  2. If this is Y of YUV, where are the U and V data?

  3. How can I find out more about the data format (RGB, BGR, YUV, 8/10/12 bits per channel, 444, 422, 420, etc.)?


Solution

  • For a given frame of a video that can be decoded with nvdec in ffmpeg,

    (AVPixelFormat)frame->format will give:

    AV_PIX_FMT_CUDA: HW acceleration through CUDA. data[i] contain CUdeviceptr pointers exactly as for system memory frames.

    which says nothing about the pixel layout. The layout is stored in the hardware frames context: ((AVHWFramesContext*)frame->hw_frames_ctx->data)->sw_format gives, in my case:

    AV_PIX_FMT_NV12: planar YUV 4:2:0, 12bpp, 1 plane for Y and 1 plane for the UV components, which are interleaved (first byte U and the following byte V)

    which is everything that is needed.

    frame->data[0] is the luma (Y) plane at full resolution. frame->data[1] is the chroma plane: it has half as many rows, and each row holds the U and V samples interleaved.

    Both are pitched arrays, which can be used directly as cudaPitchedPtr.ptr. Their pitches are frame->linesize[0] and frame->linesize[1].

    So in a nutshell, I get a cudaPitchedPtr from a frame with:

    cudaPitchedPtr MyClass::Frame2CuPitched(AVFrame* frame, int dataIndex, bool halfHeight)
    {
        cudaPitchedPtr cup{};
        cup.ptr = frame->data[dataIndex];       // device pointer to the plane
        cup.pitch = frame->linesize[dataIndex]; // row stride in bytes (2048 here)
        // Rows in the plane: 1080 for Y; pass halfHeight = true for the NV12 UV plane (540)
        cup.ysize = halfHeight ? frame->height / 2 : frame->height;
        cup.xsize = frame->width;               // 1920
        return cup;
    }
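
To make the NV12 layout described above concrete, here is a minimal sketch of the index arithmetic needed to find the Y, U and V samples for a given pixel. It assumes 8-bit NV12 (one byte per sample) and is not taken from ffmpeg or CUDA: Nv12View, Yuv and sampleNv12 are hypothetical names introduced only for illustration. It is written as plain host C++ so it can be tried on a small test buffer. With an AV_PIX_FMT_CUDA frame the data[] pointers are device pointers, so the same arithmetic would have to run inside a kernel rather than on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical view over the two planes of an 8-bit NV12 image.
// With an AVFrame, the fields would come from frame->data[0/1],
// frame->linesize[0/1], frame->width and frame->height.
struct Nv12View {
    const std::uint8_t* y;   // plane 0: luma, height rows
    std::size_t yPitch;      // bytes between consecutive luma rows
    const std::uint8_t* uv;  // plane 1: interleaved U,V, height / 2 rows
    std::size_t uvPitch;     // bytes between consecutive chroma rows
    int width;               // image width in pixels
    int height;              // image height in pixels
};

struct Yuv {
    std::uint8_t y, u, v;
};

// Return the Y, U and V samples that cover pixel (px, py).
// In 4:2:0, each 2x2 block of pixels shares one U,V pair.
inline Yuv sampleNv12(const Nv12View& f, int px, int py)
{
    assert(px >= 0 && px < f.width && py >= 0 && py < f.height);

    // Luma: one byte per pixel, rows separated by yPitch bytes.
    const std::uint8_t* yRow = f.y + static_cast<std::size_t>(py) * f.yPitch;

    // Chroma: row py / 2, and the pair for column px starts at byte (px / 2) * 2.
    const std::uint8_t* uvRow = f.uv + static_cast<std::size_t>(py / 2) * f.uvPitch;
    const std::size_t uvCol = static_cast<std::size_t>(px / 2) * 2;

    return { yRow[px], uvRow[uvCol], uvRow[uvCol + 1] }; // U first, then V
}
```

The pitch is used only to step between rows; padding bytes past the image width are never read. For other values of sw_format (for example a format with 16-bit samples) the bytes per sample and the plane heights differ, so this sketch would need to be adapted.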