ffmpeg: programmatically use libavcodec to encode and decode a raw bitmap in just a few milliseconds, with a small compressed size, on Raspberry Pi 4

We need to compress the 1024x2048 image we produce from raw 32-bit RGBA (8 MB) down to roughly the size of a JPEG (200-500 KB) on a Raspberry Pi 4, all from a C/C++ program.

The compression has to take just a few milliseconds; otherwise it is pointless for us.

We decided to try the encoders supported by the ffmpeg development libraries, called from C/C++ code.

The problem we are facing is that after we edited the encoding example provided by the ffmpeg developers, the encoding times we measured are unacceptable.

Here you can see the edited code where the frames are created:

for (i = 0; i < 25; i++) {
    auto start_time = std::chrono::high_resolution_clock::now();
    std::cout << "START Encoding frame...\n";

    ret = av_frame_make_writable(frame);
    if (ret < 0)
        exit(1);

    // Here I try to convert our 32-bit RGBA image to the YUV pixel format:

    for (y = 0; y < c->height; y++) {
        for (x = 0; x < c->width; x++) {
            // The source image is assumed to be tightly packed, while the
            // frame planes are addressed with their own linesize stride.
            int srcIndex = y * c->width + x;
            int imageIndexY = y * frame->linesize[0] + x;

            uint32_t rgbPixel = ((uint32_t*)OutputDataImage)[srcIndex];

            double Y, U, V;
            // Shift right (not left) to extract each channel. This assumes a
            // 0xRRGGBBAA packing; adjust the shifts for a different byte order.
            uint8_t R = (rgbPixel >> 24) & 0xFF;
            uint8_t G = (rgbPixel >> 16) & 0xFF;
            uint8_t B = (rgbPixel >> 8) & 0xFF;

            YUVfromRGB(Y, U, V, (double)R, (double)G, (double)B);
            frame->data[0][imageIndexY] = (uint8_t)Y;

            // One chroma sample per 2x2 block of pixels (4:2:0 subsampling)
            if (y % 2 == 0 && x % 2 == 0) {
                int imageIndexU = (y / 2) * frame->linesize[1] + (x / 2);
                int imageIndexV = (y / 2) * frame->linesize[2] + (x / 2);

                frame->data[1][imageIndexU] = (uint8_t)U;
                frame->data[2][imageIndexV] = (uint8_t)V;  // was Y, which corrupts the V plane
            }
        }
    }

    frame->pts = i;

    /* encode the image */
    encode(c, frame, pkt, f);

    auto end_time = std::chrono::high_resolution_clock::now();
    auto time = end_time - start_time;
    std::cout << "FINISHED Encoding frame in: " << time / std::chrono::milliseconds(1) << "ms.\n";
}


Here are the relevant parts from earlier in that function:

codec_name = "mpeg4";

codec = avcodec_find_encoder_by_name(codec_name);

c = avcodec_alloc_context3(codec);
c->bit_rate = 1000000;  
c->width = IMAGE_WIDTH;
c->height = IMAGE_HEIGHT;
c->gop_size = 1;
c->max_b_frames = 1;
c->pix_fmt = AV_PIX_FMT_YUV420P;   

IMAGE_WIDTH and IMAGE_HEIGHT are 1024 and 2048, respectively.
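
As a rough sanity check on these settings, the buffer sizes and the per-frame byte budget can be worked out directly. This is only a sketch: the helper names are hypothetical, and the frame rate of 25 fps is an assumption (the time base is not shown in the code above).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes in a tightly packed 32-bit RGBA image.
std::size_t rgbaBytes(std::size_t width, std::size_t height)
{
    return width * height * 4;
}

// Bytes in a tightly packed YUV420P image: a full-resolution Y plane plus
// U and V planes subsampled by two in each direction.
std::size_t yuv420pBytes(std::size_t width, std::size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    return width * height + 2 * ((width / 2) * (height / 2));
}

// Average number of bytes per frame that a given bit rate allows.
// The frame rate is an assumption supplied by the caller.
std::size_t bytesPerFrame(std::int64_t bitRate, int framesPerSecond)
{
    assert(bitRate >= 0 && framesPerSecond > 0);
    return static_cast<std::size_t>(bitRate / framesPerSecond / 8);
}
```

For 1024x2048 this gives 8,388,608 bytes of RGBA and 3,145,728 bytes of YUV420P. At bit_rate = 1000000 and an assumed 25 fps, the budget is about 5,000 bytes per frame, far below the 200-500 KB target, so the bit rate setting alone would keep the output much smaller than a typical JPEG.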

The results I got when running this on the Raspberry Pi 4 look like this:

START Encoding frame...
Send frame   0
FINISHED Encoding frame in: 40ms.
START Encoding frame...
Send frame   1
Write packet   0 (size=11329)
FINISHED Encoding frame in: 60ms.
START Encoding frame...
Send frame   2
Write packet   1 (size=11329)
FINISHED Encoding frame in: 58ms.

Since I am completely new to encoding and codecs, my question is how to do this the best and correct way, meaning in a way that reduces the time to a few milliseconds. I am also not sure whether the codec or the pixel format I chose is the best for the job.

The rest of the relevant code is below (the encode() function is the one from the ffmpeg developer example linked above):

void RGBfromYUV(double& R, double& G, double& B, double Y, double U, double V)
{
    Y -= 16;
    U -= 128;
    V -= 128;
    R = 1.164 * Y + 1.596 * V;
    G = 1.164 * Y - 0.392 * U - 0.813 * V;
    B = 1.164 * Y + 2.017 * U;
}
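
The encoding loop calls YUVfromRGB, which is not shown here. A minimal sketch of what it might look like, assuming it is the limited-range BT.601 counterpart of the RGBfromYUV coefficients above (the body is a guess, not the original function):

```cpp
#include <cassert>

// Hypothetical forward conversion: 8-bit RGB (0..255) to limited-range
// BT.601 YUV, the approximate inverse of RGBfromYUV.
void YUVfromRGB(double& Y, double& U, double& V, double R, double G, double B)
{
    // Inputs are expected to be 8-bit channel values.
    assert(R >= 0.0 && R <= 255.0);
    assert(G >= 0.0 && G <= 255.0);
    assert(B >= 0.0 && B <= 255.0);

    Y =  0.257 * R + 0.504 * G + 0.098 * B + 16.0;
    U = -0.148 * R - 0.291 * G + 0.439 * B + 128.0;
    V =  0.439 * R - 0.368 * G - 0.071 * B + 128.0;
}
```

With these coefficients white (255, 255, 255) maps to roughly Y = 235, U = V = 128, and black maps to Y = 16, U = V = 128, so the results fit in a uint8_t without clamping.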


  • I used this:

    int RgbToYuv(uint8_t *rgb, AVCodecContext *pCodecCtx, AVFrame *pFrame) {

        enum AVPixelFormat yuv420p_pix_fmt = AV_PIX_FMT_YUV420P, bgra_pix_fmt = AV_PIX_FMT_BGRA;

        // Allocate video frame
        // pFrame = av_frame_alloc();

        // Allocate an AVFrame structure
        AVFrame *pFrameRGB = av_frame_alloc();
        if (pFrameRGB == NULL)
            return 0;

        // Assign appropriate parts of buffer to image planes in pFrameRGB.
        // Note that pFrameRGB is an AVFrame, but AVFrame is a superset of AVPicture.
        // (avpicture_fill is deprecated in newer FFmpeg; av_image_fill_arrays replaces it.)
        avpicture_fill((AVPicture *)pFrameRGB, rgb, bgra_pix_fmt, pCodecCtx->width, pCodecCtx->height);

        // Initialize SWS context for software scaling.
        // Creating it once and reusing it across frames avoids repeated setup.
        struct SwsContext *sws_ctx = sws_getContext(pCodecCtx->width, pCodecCtx->height, bgra_pix_fmt,
                                                    pCodecCtx->width, pCodecCtx->height, yuv420p_pix_fmt,
                                                    SWS_BILINEAR, NULL, NULL, NULL);
        if (sws_ctx == NULL) {
            av_frame_free(&pFrameRGB);
            return 0;
        }

        // Convert the image from BGRA to YUV420P
        sws_scale(sws_ctx, (uint8_t const * const *)pFrameRGB->data, pFrameRGB->linesize, 0, pCodecCtx->height, pFrame->data, pFrame->linesize);
        // the conversion goes toward the right (source arguments first, then destination)

        // Release the temporary objects; pFrameRGB does not own the rgb buffer
        sws_freeContext(sws_ctx);
        av_frame_free(&pFrameRGB);
        return 1;
    }


    Not super fast, but it worked.
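
    If sws_scale turns out to be too slow, one alternative worth trying is a hand-written conversion that uses integer arithmetic only. The following is a sketch under several assumptions, not a drop-in replacement: the Yuv420 struct and rgbaToYuv420 are hypothetical names, the input is assumed to be tightly packed with bytes in R, G, B, A order, the coefficients are the common fixed-point form of limited-range BT.601, and chroma is taken from the top-left pixel of each 2x2 block instead of being averaged. Whether it actually beats sws_scale on a Raspberry Pi 4 would have to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Planar YUV 4:2:0 image with tightly packed planes (stride == width).
struct Yuv420 {
    int width = 0;
    int height = 0;
    std::vector<uint8_t> y, u, v;
};

// Convert packed RGBA to limited-range BT.601 YUV420P without floating point.
// Relies on arithmetic right shift of negative ints, which mainstream
// compilers provide (guaranteed by the standard only since C++20).
Yuv420 rgbaToYuv420(const uint8_t* rgba, int width, int height)
{
    assert(rgba != nullptr);
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);

    Yuv420 out;
    out.width = width;
    out.height = height;
    out.y.resize(static_cast<std::size_t>(width) * height);
    out.u.resize(out.y.size() / 4);
    out.v.resize(out.y.size() / 4);

    for (int row = 0; row < height; ++row) {
        const uint8_t* src = rgba + static_cast<std::size_t>(row) * width * 4;
        uint8_t* dstY = out.y.data() + static_cast<std::size_t>(row) * width;

        for (int col = 0; col < width; ++col, src += 4) {
            const int r = src[0], g = src[1], b = src[2];

            // Luma for every pixel.
            dstY[col] = static_cast<uint8_t>(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);

            // Chroma once per 2x2 block.
            if ((row & 1) == 0 && (col & 1) == 0) {
                const std::size_t c =
                    static_cast<std::size_t>(row / 2) * (width / 2) + col / 2;
                out.u[c] = static_cast<uint8_t>(((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128);
                out.v[c] = static_cast<uint8_t>(((112 * r - 94 * g - 18 * b + 128) >> 8) + 128);
            }
        }
    }
    return out;
}
```

    To feed an encoder, the same loop could write directly into pFrame->data[0..2], using pFrame->linesize[0..2] as the row strides instead of the tightly packed vectors.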