Search code examples
cvideoffmpegyuvlibav

RGB to YUV conversion with libav (ffmpeg) triplicates image


I'm building a small program to capture the screen (using X11 MIT-SHM extension) on video. It works well if I create individual PNG files of the captured frames, but now I'm trying to integrate libav (ffmpeg) to create the video and I'm getting... funny results.

The furthest I've been able to reach is this. The expected result (which is a PNG created directly from the RGB data of the XImage file) is this:

Expected result

However, the result I'm getting is this:

Obtained result

As you can see the colors are funky and the image appears cropped three times. I have a loop where I capture the screen, and first I generate the individual PNG files (currently commented in the code below) and then I try to use libswscale to convert from RGB24 to YUV420:

while (gRunning) {
        printf("Processing frame framecnt=%i \n", framecnt);

        if (!XShmGetImage(display, RootWindow(display, DefaultScreen(display)), img, 0, 0, AllPlanes)) {
            printf("\n Ooops.. Something is wrong.");
            break;
        }

        // PNG generation
        // snprintf(imageName, sizeof(imageName), "salida_%i.png", framecnt);
        // writePngForImage(img, width, height, imageName);

        unsigned long red_mask = img->red_mask;
        unsigned long green_mask = img->green_mask;
        unsigned long blue_mask = img->blue_mask;

        // Write image data
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                unsigned long pixel = XGetPixel(img, x, y);

                unsigned char blue = pixel & blue_mask;
                unsigned char green = (pixel & green_mask) >> 8;
                unsigned char red = (pixel & red_mask) >> 16;

                pixel_rgb_data[y * width + x * 3] = red;
                pixel_rgb_data[y * width + x * 3 + 1] = green;
                pixel_rgb_data[y * width + x * 3 + 2] = blue;
            }
        }

        uint8_t* inData[1] = { pixel_rgb_data };
        int inLinesize[1] = { in_w };

        printf("Scaling frame... \n");
        int sliceHeight = sws_scale(sws_context, inData, inLinesize, 0, height, pFrame->data, pFrame->linesize);

        printf("Obtained slice height: %i \n", sliceHeight);
        pFrame->pts = framecnt * (pVideoStream->time_base.den) / ((pVideoStream->time_base.num) * 25);

        printf("Frame pts: %li \n", pFrame->pts);
        int got_picture = 0;

        printf("Encoding frame... \n");
        int ret = avcodec_encode_video2(pCodecCtx, &pkt, pFrame, &got_picture);

//                int ret = avcodec_send_frame(pCodecCtx, pFrame);

        if (ret != 0) {
            printf("Failed to encode! Error: %i\n", ret);
            return -1;
        }

        printf("Succeed to encode frame: %5d - size: %5d\n", framecnt, pkt.size);

        framecnt++;

        pkt.stream_index = pVideoStream->index;
        ret = av_write_frame(pFormatCtx, &pkt);

        if (ret != 0) {
            printf("Error writing frame! Error: %framecnt \n", ret);
            return -1;
        }

        av_packet_unref(&pkt);
    }

I've placed the entire code at this gist. This question right here looks pretty similar to mine, but not quite, and the solution did not work for me, although I think this has something to do with the way the line stride is calculated.


Solution

  • In the end, the error was not in the usage of libav but on the code that fills the pixel data from XImage to the rgb vector. Instead of using:

                    pixel_rgb_data[y * width + x * 3    ] = red;
                    pixel_rgb_data[y * width + x * 3 + 1] = green;
                    pixel_rgb_data[y * width + x * 3 + 2] = blue;
    

    I should have used this:

                    pixel_rgb_data[3 * (y * width + x)    ] = red;
                    pixel_rgb_data[3 * (y * width + x) + 1] = green;
                    pixel_rgb_data[3 * (y * width + x) + 2] = blue;
    

    Somehow I was multiplying only the the horizontal displacement within the matrix, not the vertical displacement. The moment I changed it, it worked perfectly.