c++ ffmpeg hardware-acceleration dxva swscale

How to use DXVA2 with ffmpeg to decode and retrieve a frame

I have searched for a simple example of how to decode an H264 stream with hardware acceleration using ffmpeg on Windows, but I could not find one. I know that I need to use dxva2 with ffmpeg to get hardware acceleration.

I can decode H264 with ffmpeg on the CPU, then convert the NV12 format to RGBA and save the frames as bmp files, thanks to the example project provided in post.

For dxva2, I followed the approach described in the following post: post

I believe I can successfully decode with dxva2; however, when I try to take the decoded frame, convert it to RGBA and save it as a bmp file, I get an error about the source pointers.

I decode and retrieve the frame as follows:

int videoFrameBytes = avcodec_decode_video2(pCodecCtx_hwaccel, pFrameYuv, &got_picture_ptr, avpkt);

if (got_picture_ptr==1)
{
    if(dxva2_retrieve_data_call(pCodecCtx_hwaccel, pFrameYuv) == 0)
    {
        fprintf(stderr, "Got frame successfully\n");
        result = true;
    }
}

and feed the output frame to:

sws_scale(pImgConvertCtx, pFrameYuv->data, pFrameYuv->linesize, 0, height, frame->data, frame->linesize);

I get this error:

[swscaler @ 030c5c20] bad src image pointers

Obviously something is wrong with pFrameYuv->data, but I do not know what.

How can I convert an NV12 frame decoded with DXVA2 to RGBA with sws_scale?

Solution

Problem solved.

It was due to an incorrect source pixel format. When creating the sws context, I used the pixel format of the codec context, as shown below (which was fine for software decoding):

// initialize SWS context for software scaling
sws_ctx = sws_getContext(pCodecCtx->width,
    pCodecCtx->height,
    pCodecCtx->pix_fmt,
    pCodecCtx->width,
    pCodecCtx->height,
    AV_PIX_FMT_RGB24,
    SWS_BILINEAR,
    NULL,
    NULL,
    NULL
    );

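Here pCodecCtx->pix_fmt was AV_PIX_FMT_YUV420P, but a frame decoded with DXVA2 is in AV_PIX_FMT_NV12. YUV420P describes three planes (Y, U, V), while an NV12 frame fills only two (Y and interleaved UV), which is consistent with sws_scale rejecting the source pointers. After setting the correct source format, I could use sws_scale to convert the NV12 frame to RGB.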
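To make the mismatch concrete, here is a minimal sketch in plain C++ that models the kind of plane check that leads to "bad src image pointers". It is a simplified illustration, not libswscale's actual code: PixFmt, Picture, planesUsed, pointersMatchFormat and describeNv12 are hypothetical names introduced only for this example, and the buffer is assumed to be tightly packed (stride equal to width).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for illustration only; these are not FFmpeg types.
enum class PixFmt { YUV420P, NV12 };

struct Picture {
    const uint8_t* data[4];   // one pointer per plane, unused planes stay null
    int            linesize[4];
};

// Number of planes each layout stores its samples in.
int planesUsed(PixFmt fmt) {
    switch (fmt) {
        case PixFmt::YUV420P: return 3; // separate Y, U and V planes
        case PixFmt::NV12:    return 2; // Y plane + one interleaved UV plane
    }
    return 0;
}

// Every plane required by the declared format must have a pointer and a stride.
bool pointersMatchFormat(const Picture& pic, PixFmt declared) {
    const int n = planesUsed(declared);
    assert(n > 0 && n <= 4);
    for (int i = 0; i < n; ++i) {
        if (pic.data[i] == nullptr || pic.linesize[i] == 0) return false;
    }
    return true;
}

// Describe a tightly packed NV12 buffer: only planes 0 and 1 are filled in.
Picture describeNv12(const uint8_t* buf, int width, int height) {
    Picture p{};                                   // zero-initialises all planes
    p.data[0] = buf;
    p.linesize[0] = width;                         // Y: width bytes per row
    p.data[1] = buf + static_cast<std::size_t>(width) * height;
    p.linesize[1] = width;                         // UV: width/2 pairs per row
    return p;
}
```

With this model, a picture built by describeNv12 passes the check when declared as NV12 and fails when declared as YUV420P, because the third plane pointer is null.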

Correct parameters:

// initialize SWS context for software scaling
sws_ctx = sws_getContext(pCodecCtx->width,
    pCodecCtx->height,
    AV_PIX_FMT_NV12,
    pCodecCtx->width,
    pCodecCtx->height,
    AV_PIX_FMT_RGB24,
    SWS_BILINEAR,
    NULL,
    NULL,
    NULL
    );
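For readers who want to see what the NV12 input actually looks like in memory, the following is a rough reference conversion in plain C++. It is only a sketch and not a replacement for sws_scale: nv12ToRgb24 and clampToByte are hypothetical names, the integer coefficients assume BT.601 limited-range video, and the buffer is assumed to be tightly packed with even dimensions. Real decoder output often has a linesize larger than the width, which this sketch does not handle.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Clamp an intermediate value to the 0..255 range of an 8-bit channel.
static uint8_t clampToByte(int v) {
    return static_cast<uint8_t>(std::min(255, std::max(0, v)));
}

// Convert a tightly packed NV12 image to packed RGB24.
// Assumptions (not from the original post): BT.601 limited-range coefficients,
// even width and height, and no row padding (stride == width).
std::vector<uint8_t> nv12ToRgb24(const std::vector<uint8_t>& nv12,
                                 int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    const std::size_t lumaSize = static_cast<std::size_t>(width) * height;
    assert(nv12.size() >= lumaSize * 3 / 2);

    const uint8_t* yPlane  = nv12.data();            // width x height luma bytes
    const uint8_t* uvPlane = nv12.data() + lumaSize; // interleaved U,V pairs
    std::vector<uint8_t> rgb(lumaSize * 3);

    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            // Each 2x2 block of pixels shares one U,V pair.
            const uint8_t* uv = uvPlane
                + static_cast<std::size_t>(row / 2) * width + (col / 2) * 2;
            const int c = yPlane[static_cast<std::size_t>(row) * width + col] - 16;
            const int d = uv[0] - 128; // U
            const int e = uv[1] - 128; // V

            uint8_t* out = &rgb[(static_cast<std::size_t>(row) * width + col) * 3];
            out[0] = clampToByte((298 * c + 409 * e + 128) / 256);           // R
            out[1] = clampToByte((298 * c - 100 * d - 208 * e + 128) / 256); // G
            out[2] = clampToByte((298 * c + 516 * d + 128) / 256);           // B
        }
    }
    return rgb;
}
```

The point of the sketch is the addressing: the chroma samples live in a single second plane as U,V pairs, which is why a scaler configured for three separate planes cannot read this frame.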

Also, remember to unref the retrieved output frame once you are done with it:

av_frame_unref(pFrameYuv);

Otherwise you will leak memory.

In addition, as stated in post, dxva2_retrieve_data_call is very inefficient, so you should look for another way to get the data from the GPU.