
FFmpeg library does not detect the pixel format of a VP9 video stream correctly


I am using C code to detect the pixel format of a VP9 video stream in a WebM container. The FFmpeg version is 6.0, the full shared-library build downloaded from the official website. The operating system is Windows 10. I feed the library a VP9 video encoded with an alpha channel, whose pixel format is YUVA420p, but the library detects the pixel format as YUV420p.

I have found a similar question on StackOverflow.com, "Is there a way to force FFMPEG to decode a video stream with alpha from a WebM video encoded with libvpx-vp9?", but it does not actually help.

When I override the decoder with libvpx, the pixel format is still detected as YUV420p instead of YUVA420p.

The C code follows. Note that error handling is omitted here to keep the question short.

AVFormatContext *fmt_ctx = NULL;
int err = avformat_open_input(&fmt_ctx, infp, NULL, NULL);
err = avformat_find_stream_info(fmt_ctx, NULL);
int stream = av_find_best_stream(fmt_ctx, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0);
AVCodecParameters *codecpar = fmt_ctx->streams[stream]->codecpar;

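// CODEC_LIBVPX_VP9 is a macro defined elsewhere in my code; it holds the decoder name passed to -c:v below.
const AVCodec *codec = NULL;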
if (codecpar->codec_id == AV_CODEC_ID_VP9) {
    codec = avcodec_find_decoder_by_name(CODEC_LIBVPX_VP9);
} else {
    codec = avcodec_find_decoder(codecpar->codec_id);
}

AVCodecContext *ctx = avcodec_alloc_context3(codec);
err = avcodec_parameters_to_context(ctx, codecpar);
av_log(NULL, AV_LOG_DEBUG, "Pixel format: %d.\n", ctx->pix_fmt); //TODO:DEBUG.
err = avcodec_open2(ctx, codec, NULL);

The program prints "Pixel format: 0.", which means AV_PIX_FMT_YUV420P, not AV_PIX_FMT_YUVA420P!

If I override the pixel format manually, I am able to decode the video with its alpha channel and see the transparent background. However, this breaks the logic: when a genuine YUV420p stream comes in, it also gets overridden to YUVA420p, which is a problem.

if (codecpar->codec_id == AV_CODEC_ID_VP9) {
    if (strcmp(codec->name, CODEC_LIBVPX_VP9) == 0) {
        if (ctx->pix_fmt == AV_PIX_FMT_YUV420P) {
            ctx->pix_fmt = AV_PIX_FMT_YUVA420P;
        }
    }
}
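
One way to make this override less blunt might be to apply it only when the container itself signals alpha. The console output further down shows an alpha_mode tag with value 1 in the stream metadata, so a guard could be keyed on that tag. The following is only a minimal sketch under that assumption: should_override_to_yuva420p is a hypothetical helper, not an FFmpeg function, and treating "1" as the only alpha-signalling value is also an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of FFmpeg): decides whether the manual
 * YUV420P -> YUVA420P override is justified.
 *
 * is_vp9           nonzero when codecpar->codec_id == AV_CODEC_ID_VP9
 * detected_yuv420p nonzero when ctx->pix_fmt == AV_PIX_FMT_YUV420P
 * alpha_mode       value of the stream's "alpha_mode" metadata tag,
 *                  or NULL when the tag is absent
 */
bool should_override_to_yuva420p(int is_vp9, int detected_yuv420p,
                                 const char *alpha_mode)
{
    /* Only VP9 streams reported as plain YUV420P are candidates. */
    if (!is_vp9 || !detected_yuv420p)
        return false;

    /* Assumption: only the value "1" signals an alpha channel. */
    return alpha_mode != NULL && strcmp(alpha_mode, "1") == 0;
}
```

In the snippet above, the tag could be looked up with av_dict_get(fmt_ctx->streams[stream]->metadata, "alpha_mode", NULL, 0), passing the value field of the returned entry (or NULL when no entry is found). A genuine YUV420p VP9 file without the tag would then be left untouched.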

At the same time, the ffmpeg tool, started from the command line with the libvpx decoder, reports that my video has the YUVA420p pixel format. The output follows.

D:\Temp\4>ffmpeg -c:v libvpx-vp9 -i yuva.webm
ffmpeg version 6.0-full_build-www.gyan.dev Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 12.2.0 (Rev10, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-shared --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libaom --enable-libjxl --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libvpl --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
  libavutil      58.  2.100 / 58.  2.100
  libavcodec     60.  3.100 / 60.  3.100
  libavformat    60.  3.100 / 60.  3.100
  libavdevice    60.  1.100 / 60.  1.100
  libavfilter     9.  3.100 /  9.  3.100
  libswscale      7.  1.100 /  7.  1.100
  libswresample   4. 10.100 /  4. 10.100
  libpostproc    57.  1.100 / 57.  1.100
[libvpx-vp9 @ 000001ecdf6002c0] v1.13.0-71-g45dc0d34d
    Last message repeated 1 times
Input #0, matroska,webm, from 'yuva.webm':
  Metadata:
    ENCODER         : Lavf60.3.100
  Duration: 00:00:05.55, start: 0.000000, bitrate: 227 kb/s
  Stream #0:0: Video: vp9 (Profile 0), yuva420p(tv, progressive), 1920x1080, SAR 1:1 DAR 16:9, 60 fps, 60 tbr, 1k tbn
    Metadata:
      alpha_mode      : 1
      ENCODER         : Lavc60.3.100 libvpx-vp9
      DURATION        : 00:00:05.550000000
At least one output file must be specified

Here is my YUVA420p, reported for the first video stream near the end of the console output:

Stream #0:0: Video: vp9 (Profile 0), yuva420p(tv, progressive), 1920x1080, SAR 1:1 DAR 16:9, 60 fps, 60 tbr, 1k tbn

The questions are as follows.

  1. How can I reliably detect the real pixel format of a VP9 video with the FFmpeg library in C code?
  2. Why does the C code not detect the actual pixel format even with the codec overridden to libvpx?

Thank you.


Solution

  • Thanks to Gyan, I have ended up with the following solution, consisting of two functions. Error handling is omitted.

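    // Forward declaration: prepare_decoder_normal() calls this function
    // before its definition appears.
    errno_t prepare_decoder_vp9(AVFormatContext **fmt_ctx, AVCodec **codec,
                                AVCodecContext **ctx, const char *infp,
                                int pix_fmt_id, int stream);

    errno_t prepare_decoder_normal(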
                                   AVFormatContext **fmt_ctx,  // Output parameter.
                                   AVCodec **codec,            // Output parameter.
                                   AVCodecContext **ctx,       // Output parameter.
                                   int *stream,                // Output parameter.
                                   const char *infp,           // Input parameter.
                                   int si,                     // Input parameter.
                                   int pix_fmt_id)             // Input parameter.
    {
        *fmt_ctx = avformat_alloc_context(); // [!] -> avformat_free_context().
        
        errno_t err = avformat_open_input(fmt_ctx, infp, NULL, NULL);
        
        err = avformat_find_stream_info(*fmt_ctx, NULL);
        
    
        if (si >= 0) { // Stream index override.
            *stream = si;
        } else {
            *stream = av_find_best_stream(*fmt_ctx, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0); // Video stream index.
        }
    
        AVCodecParameters *codecpar = (*fmt_ctx)->streams[*stream]->codecpar;
        
        if (codecpar->codec_id == AV_CODEC_ID_VP9) {
            // VP9 requires a non-standard approach for decoding.
            avformat_close_input(fmt_ctx);
            return prepare_decoder_vp9(fmt_ctx, codec, ctx, infp, pix_fmt_id, *stream);
        } else {
            *codec = (AVCodec *) avcodec_find_decoder(codecpar->codec_id);
        }
    
        *ctx = avcodec_alloc_context3(*codec); // [!] The resulting struct should be freed with avcodec_free_context().
        
        err = avcodec_parameters_to_context(*ctx, codecpar);
        
        // Unfortunately, FFmpeg library may see a pixel format incorrectly.
        // Here we provide a method to override the automatically selected pixel
        // format.
        if (pix_fmt_id >= 0) {
            (*ctx)->pix_fmt = pix_fmt_id;
            av_log(NULL, AV_LOG_INFO, "Overriding pixel format ID with: %d.\n", pix_fmt_id);
        }
    
        err = avcodec_open2(*ctx, *codec, NULL);
    
        return SUCCESS;
    }
    
    errno_t prepare_decoder_vp9(
                                AVFormatContext **fmt_ctx,  // Output parameter.
                                AVCodec **codec,            // Output parameter.
                                AVCodecContext **ctx,       // Output parameter.
                                const char *infp,           // Input parameter.
                                int pix_fmt_id,             // Input parameter.
                                int stream)                 // Input parameter.
    {
        *fmt_ctx = avformat_alloc_context(); // [!] -> avformat_free_context().
    
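        // Look up the libvpx-based VP9 decoder by name.
        *codec = (AVCodec *) avcodec_find_decoder_by_name(CODEC_LIBVPX_VP9);
        
        // Force this decoder for the video stream. It is set before the input
        // is opened so that stream probing is done with libvpx as well.
        (*fmt_ctx)->video_codec = *codec;
    
        int err = avformat_open_input(fmt_ctx, infp, NULL, NULL); // [!] The stream must be closed with avformat_close_input().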
    
        err = avformat_find_stream_info(*fmt_ctx, NULL);
    
        AVCodecParameters *codecpar = (*fmt_ctx)->streams[stream]->codecpar;
    
        *ctx = avcodec_alloc_context3(*codec); // [!] The resulting struct should be freed with avcodec_free_context().
    
        err = avcodec_parameters_to_context(*ctx, codecpar);
    
        // Unfortunately, FFmpeg library may see a pixel format incorrectly.
        // Here we provide a method to override the automatically selected pixel
        // format.
        if (pix_fmt_id >= 0) {
            (*ctx)->pix_fmt = pix_fmt_id;
            av_log(NULL, AV_LOG_INFO, "Overriding pixel format ID with: %d.\n", pix_fmt_id);
        }
    
        err = avcodec_open2(*ctx, *codec, NULL);
        
        return SUCCESS;
    }