I am using code that is almost exactly https://github.com/FFmpeg/FFmpeg/blob/master/doc/examples/hw_decode.c, except that I do not transfer the decoded data to the host: I keep it on the device for later use with CUDA code (my goal is to work with RGB data at 8 to 12 bits per channel).
The interesting part is:
//after decoding/encoding
status = avcodec_receive_frame(avctx, frame);
if (status != AVERROR(EAGAIN) && status != AVERROR_EOF)
    LogMessage(L"frame timestamp: %i\n", (int)frame->best_effort_timestamp);
if (status == AVERROR(EAGAIN) || status == AVERROR_EOF)
{
    av_frame_free(&frame);
    return 0;
}
if (status < 0)
{
    LogMessage(L"Error while decoding\n");
    goto finish;
}
{
    auto descr = AVPixelFormatMap[(AVPixelFormat)frame->format];
    std::wstring dsdescr(descr.begin(), descr.end());
    LogMessage(L"Pixelformat: %s", dsdescr.c_str());
    cudaPitchedPtr CUdeviceptr0{}; // !! hypothetical usage of cudaPitchedPtr !!
    CUdeviceptr0.ptr = frame->data[0];
    CUdeviceptr0.pitch = frame->linesize[0]; //2048 == pitch ?
    CUdeviceptr0.ysize = frame->height; //1080
    CUdeviceptr0.xsize = frame->width; //1920
    cudaPitchedPtr CUdeviceptr1{};
    CUdeviceptr1.ptr = frame->data[1];
    CUdeviceptr1.pitch = frame->linesize[1];
    CUdeviceptr1.ysize = frame->height;
    CUdeviceptr1.xsize = frame->width;
}
I could not find a way to get the data format: (AVPixelFormat)frame->format is AV_PIX_FMT_CUDA, which is documented as "HW acceleration through CUDA. data[i] contain CUdeviceptr pointers exactly as for system memory frames."
Unfortunately, this says nothing about the pixel format. I believe it is YUV, but I do not have more details.
I understand that I have two arrays, frame->data[0] and frame->data[1].
frame->data[0] contains data corresponding to Y or to one of the RGB channels, and it is pitched. When I save it as a black-and-white image, I recognize the picture, except that it has only one channel.
frame->data[1] is set but appears to hold no data, so it does not seem to be part of the image content.
frame->data[2] is NULL. It seems that U and V are somewhere else.
I would like to translate the data into a cuda native structure like:
cudaPitchedPtr CUdeviceptr{};
CUdeviceptr.ptr = frame->data[0];
CUdeviceptr.ysize = frame->height;
CUdeviceptr.xsize = frame->width;
This works but is incomplete: with width == 1920 and height == 1080, I think I only have the Y plane of YUV.
If this is the Y of YUV, where are the U and V data?
How can I find out more about the data format (RGB, BGR, YUV; 8/10/12 bits per channel; 4:4:4, 4:2:2, 4:2:0; etc.)?
For a given frame of a video that can be decoded with nvdec in ffmpeg, (AVPixelFormat)frame->format will give:
AV_PIX_FMT_CUDA: HW acceleration through CUDA. data[i] contain CUdeviceptr pointers exactly as for system memory frames.
This is not helpful, but ((AVHWFramesContext*)frame->hw_frames_ctx->data)->sw_format in my case gives:
AV_PIX_FMT_NV12: planar YUV 4:2:0, 12bpp, 1 plane for Y and 1 plane for the UV components, which are interleaved (first byte U and the following byte V)
which is everything that is needed.
frame->data[0] is the luminance (Y) plane at full resolution, and frame->data[1] holds the chroma: U and V bytes interleaved, subsampled by two in each direction, so this plane has half as many rows as the Y plane.
Both are pitched arrays and can be used directly as cudaPitchedPtr.ptr.
Their pitches are frame->linesize[0] and frame->linesize[1].
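To make the NV12 layout described above concrete, here is a minimal host-side sketch of the plane geometry and of where the samples for a given pixel sit in pitched memory. PlaneLayout, nv12PlaneLayout, nv12LumaOffset and nv12ChromaOffset are hypothetical names made up for illustration; they are not part of FFmpeg or CUDA, and the sketch assumes 8-bit NV12 only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type (not an FFmpeg or CUDA type): the useful extent
// of one NV12 plane, ignoring the padding added by the pitch.
struct PlaneLayout {
    std::size_t rowBytes;  // meaningful bytes per row
    std::size_t rows;      // number of rows in the plane
};

// Plane 0 is Y: one byte per pixel at full resolution.
// Plane 1 is interleaved UV: one (U, V) byte pair per 2x2 pixel block,
// i.e. ceil(width / 2) pairs per row and ceil(height / 2) rows.
inline PlaneLayout nv12PlaneLayout(int width, int height, int plane) {
    assert(plane == 0 || plane == 1);
    assert(width > 0 && height > 0);
    const std::size_t w = static_cast<std::size_t>(width);
    const std::size_t h = static_cast<std::size_t>(height);
    if (plane == 0) {
        return {w, h};
    }
    return {2 * ((w + 1) / 2), (h + 1) / 2};
}

// Byte offset of the Y sample of pixel (x, y) inside plane 0.
inline std::size_t nv12LumaOffset(int x, int y, std::size_t pitchY) {
    assert(x >= 0 && y >= 0);
    return static_cast<std::size_t>(y) * pitchY + static_cast<std::size_t>(x);
}

// Byte offset of the U sample shared by pixel (x, y) inside plane 1.
// The matching V sample is the following byte (offset + 1).
inline std::size_t nv12ChromaOffset(int x, int y, std::size_t pitchUV) {
    assert(x >= 0 && y >= 0);
    return static_cast<std::size_t>(y / 2) * pitchUV
         + 2 * static_cast<std::size_t>(x / 2);
}
```

For a 1920x1080 frame this gives 1920 bytes by 1080 rows for Y and 1920 bytes by 540 rows for UV, which is why the chroma plane needs half the height but the same row width in bytes. The same index arithmetic should carry over to a CUDA kernel, with the pitches taken from frame->linesize[0] and frame->linesize[1].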
So in a nutshell, I get a cudaPitchedPtr from a frame with:
cudaPitchedPtr MyClass::Frame2CuPitched(AVFrame* frame, int dataIndex, bool halfHeight)
{
    cudaPitchedPtr cup{};
    cup.ptr = frame->data[dataIndex];
    cup.pitch = frame->linesize[dataIndex]; //2048 == pitch
    cup.ysize = halfHeight ? frame->height / 2 : frame->height; //1080 for Y, 540 for UV
    cup.xsize = frame->width; //1920
    return cup;
}
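Since the stated goal is RGB, here is a rough CPU reference for converting one NV12 pixel to 8-bit RGB, which may be handy for checking a CUDA kernel against. It is only a sketch under assumptions: it uses the commonly published integer approximation of the BT.601 limited-range matrix, whereas the stream's actual matrix and range are signalled in frame->colorspace and frame->color_range (HD content is often BT.709). It also reads from host copies of the two planes, because device pointers cannot be dereferenced on the host. Rgb8, clampToByte, yuvToRgbBt601 and nv12PixelToRgb are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 8-bit RGB triple used only by this sketch.
struct Rgb8 {
    std::uint8_t r, g, b;
};

inline std::uint8_t clampToByte(int v) {
    return static_cast<std::uint8_t>(std::min(255, std::max(0, v)));
}

// Assumption: BT.601 matrix, limited range (Y in 16..235, U/V in 16..240).
// Fixed-point coefficients are scaled by 256; +128 rounds before dividing.
inline Rgb8 yuvToRgbBt601(std::uint8_t y, std::uint8_t u, std::uint8_t v) {
    const int c = static_cast<int>(y) - 16;
    const int d = static_cast<int>(u) - 128;
    const int e = static_cast<int>(v) - 128;
    return {
        clampToByte((298 * c + 409 * e + 128) / 256),
        clampToByte((298 * c - 100 * d - 208 * e + 128) / 256),
        clampToByte((298 * c + 516 * d + 128) / 256),
    };
}

// Reads pixel (x, y) from host copies of the two NV12 planes.
// pitchY and pitchUV play the role of frame->linesize[0] and [1].
inline Rgb8 nv12PixelToRgb(const std::uint8_t* yPlane, std::size_t pitchY,
                           const std::uint8_t* uvPlane, std::size_t pitchUV,
                           int x, int y) {
    assert(x >= 0 && y >= 0);
    const std::size_t sx = static_cast<std::size_t>(x);
    const std::size_t sy = static_cast<std::size_t>(y);
    const std::uint8_t luma = yPlane[sy * pitchY + sx];
    // One (U, V) pair is shared by each 2x2 block of pixels.
    const std::uint8_t* uv = uvPlane + (sy / 2) * pitchUV + 2 * (sx / 2);
    return yuvToRgbBt601(luma, uv[0], uv[1]);
}
```

The per-pixel arithmetic is independent for every (x, y), so it should map naturally onto one CUDA thread per pixel reading from the two cudaPitchedPtr planes. For 10 or 12 bit sources the sample size and offsets differ, so this 8-bit version would not apply unchanged.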