I'm trying to compute an RGB buffer directly from an AVFrame. Something is going wrong, since the resulting image is incorrect. Extracting a grey image from AVFrame->data[0] works fine, but I'm not able to extract a colored image.
inline void YCrCb_to_RGB8(int Y, int Cr, int Cb, int& R, int& G, int& B){
R = (int)(Y + 1.402 *(Cr - 128));
G = (int)(Y - 0.344136*(Cb-128) -0.71414*(Cr-128));
B = (int)(Y + 1.772 *(Cb-128));
if (R < 0) R = 0; else if (R > 255) R = 255;
if (G < 0) G = 0; else if (G > 255) G = 255;
if (B < 0) B = 0; else if (B > 255) B = 255;
}
int getRGB8buffer(AVFrame* pFrame, byte* buffer){
const int width = pFrame->width, height = pFrame->height;
int Y, Cr, Cb;
int R, G, B;
int pixel = 0;
for (int y = 0; y < height; y++)
{
for (int x = 0; x < width; x++)
{
Y = pFrame->data[0][x + y * width];
Cr = pFrame->data[1][x / 2 + ((int)(y / 2)) * pFrame->linesize[1]];
Cb = pFrame->data[2][x / 2 + ((int)(y / 2)) * pFrame->linesize[2]];
YCrCb_to_RGB8(Y, Cr, Cb, R, G, B);
buffer[pixel * 3 + 0] = R;
buffer[pixel * 3 + 1] = G;
buffer[pixel * 3 + 2] = B;
pixel++;
}
}
return 0;
}
When I save the resulting image as PPM using:
int save_RGB_frame(unsigned char* buf, int wrap, int xsize,int ysize, const char* filename){
FILE* f;
int i;
f = fopen(filename, "w");
// portable ppm format -> https://en.wikipedia.org/wiki/Netpbm#PPM_example
fprintf(f, "P6\n%d %d\n%d\n", xsize, ysize, 255);
// writing line by line
for (i = 0; i < ysize; i++)
fwrite(buf + i * wrap, 1, xsize*3, f);
fclose(f);
return 0;
}
The resulting image is wrong. Link to the resulting image: https://github.com/hacenesh/ffmpeg_question/blob/main/img_2028144.ppm
The main issue is using fopen(filename, "w") instead of fopen(filename, "wb").
On Windows, there is an important distinction between binary files and text files.
The default "w" option opens the file as a text file.
When writing to a text file, each newline character \n is converted to two characters, \r\n.
The additional characters mess up the entire structure of the image.
Note: On Linux, "wb" and "w" are supposed to behave the same.
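To make the effect of the translation concrete, here is a minimal sketch that imitates it on an in-memory byte string. Both helpers (simulate_text_mode_write and expected_ppm_size) are hypothetical names introduced for illustration; the first only mimics the \n to \r\n expansion described above and is not the actual C runtime code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: imitates a Windows text-mode write by expanding
// every '\n' byte (0x0A) into the two bytes "\r\n" (0x0D 0x0A).
std::string simulate_text_mode_write(const std::string& bytes) {
    std::string out;
    out.reserve(bytes.size());
    for (char c : bytes) {
        if (c == '\n') out.push_back('\r');
        out.push_back(c);
    }
    return out;
}

// Hypothetical helper: number of bytes a binary P6 file should contain,
// given its exact header text and the image dimensions (3 bytes per pixel).
std::size_t expected_ppm_size(const std::string& header, int width, int height) {
    assert(width > 0 && height > 0);
    return header.size() + static_cast<std::size_t>(width) * height * 3;
}
```

Any pixel component that happens to equal 10 (0x0A) gets an extra byte inserted in front of it, so every following pixel is shifted. Comparing the size of the file on disk with expected_ppm_size is a quick way to spot this kind of corruption.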
Corrected code of save_RGB_frame:
int save_RGB_frame(unsigned char* buf, int wrap, int xsize, int ysize, const char* filename)
{
FILE* f;
int i;
//f = fopen(filename, "w");
f = fopen(filename, "wb"); //On Windows, "wb" is required for writing a binary file (plain "w" opens the file in text mode).
if (f == NULL) return -1; //fopen failed (e.g. invalid path), nothing to write.
// portable ppm format -> https://en.wikipedia.org/wiki/Netpbm#PPM_example
fprintf(f, "P6\n%d %d\n%d\n", xsize, ysize, 255);
// writing line by line
for (i = 0; i < ysize; i++)
fwrite(buf + i * wrap, 1, xsize*3, f);
fclose(f);
return 0;
}
Saving to PPM image file:
save_RGB_frame(buffer, width*3, width, height, "img.ppm");
Issues in getRGB8buffer:
The order of the planes in the YUV420p format is Y, then U, then V.
U corresponds to Cb and V corresponds to Cr, so the order is: Y, Cb, Cr.
Corrected code of getRGB8buffer:
int getRGB8buffer(const AVFrame* pFrame, byte* buffer)
{
const int width = pFrame->width, height = pFrame->height;
int Y, Cr, Cb;
int R, G, B;
int pixel = 0;
for (int y = 0; y < height; y++)
{
for (int x = 0; x < width; x++)
{
//YUV420p planes ordering is Y, Cb, Cr (not Y, Cr, Cb).
//Y = pFrame->data[0][x + y * width];
//Cr = pFrame->data[1][x / 2 + ((int)(y / 2)) * pFrame->linesize[1]];
//Cb = pFrame->data[2][x / 2 + ((int)(y / 2)) * pFrame->linesize[2]];
Y = pFrame->data[0][x + y * pFrame->linesize[0]]; //Using pFrame->linesize[0] is preferred (a row may be padded beyond width).
Cb = pFrame->data[1][x / 2 + ((int)(y / 2)) * pFrame->linesize[1]];
Cr = pFrame->data[2][x / 2 + ((int)(y / 2)) * pFrame->linesize[2]];
YCrCb_to_RGB8(Y, Cr, Cb, R, G, B);
buffer[pixel * 3 + 0] = R;
buffer[pixel * 3 + 1] = G;
buffer[pixel * 3 + 2] = B;
pixel++;
}
}
return 0;
}
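The plane order and the linesize-based indexing can be checked in isolation on a tiny frame, without FFmpeg. The sketch below assumes a hypothetical Planes420 struct that stands in for the data and linesize fields of AVFrame; sample_yuv420p and YCbCr are likewise illustrative names, not FFmpeg API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the AVFrame fields used in getRGB8buffer.
struct Planes420 {
    const std::uint8_t* data[3];  // plane 0 = Y, plane 1 = Cb (U), plane 2 = Cr (V)
    int linesize[3];              // bytes per row; may be larger than the visible width
};

struct YCbCr { int Y, Cb, Cr; };

// Fetch the three samples that belong to pixel (x, y) of a YUV420p image.
// The chroma planes are subsampled by 2 horizontally and vertically,
// so each Cb/Cr sample is shared by a 2x2 block of luma samples.
YCbCr sample_yuv420p(const Planes420& f, int x, int y) {
    assert(x >= 0 && y >= 0);
    YCbCr s;
    s.Y  = f.data[0][x + y * f.linesize[0]];
    s.Cb = f.data[1][x / 2 + (y / 2) * f.linesize[1]];
    s.Cr = f.data[2][x / 2 + (y / 2) * f.linesize[2]];
    return s;
}
```

Indexing the luma plane with width instead of linesize[0] happens to work only when the rows are not padded, which is why using linesize for every plane is the safer habit.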
Issues in YCrCb_to_RGB8:
The conversion formula in your question is the JPEG ("full range") conversion formula.
The default conversion used by FFmpeg is the BT.601 "limited range" conversion formula.
In "limited range", the Y range is [16, 235], as opposed to [0, 255] in "full range".
"Limited range" ("TV range") is much more common than "full range" (PC range / JPEG range).
BT.601 may be less common than BT.709 for HD videos, but BT.601 is the FFmpeg default conversion (we are going to stick with BT.601).
Note: The conversion formula we are using here is the same as the one used by the MATLAB function ycbcr2rgb.
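As a rough derivation, the limited-range coefficients follow from the JPEG (full-range) ones by rescaling. This assumes the usual BT.601 limited-range convention that luma spans 219 steps (16 to 235) and chroma spans 224 steps (16 to 240); the chroma range is not stated above and is an assumption here.

```latex
\begin{aligned}
\frac{255}{219} &\approx 1.1644 && \text{(luma gain)} \\
1.402 \cdot \frac{255}{224} &\approx 1.5960 && \text{(Cr to R)} \\
0.344136 \cdot \frac{255}{224} &\approx 0.3918 && \text{(Cb to G)} \\
0.714136 \cdot \frac{255}{224} &\approx 0.8130 && \text{(Cr to G)} \\
1.772 \cdot \frac{255}{224} &\approx 2.0172 && \text{(Cb to B)}
\end{aligned}
```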
#include <algorithm> //std::min, std::max
#include <cmath> //std::round
inline void YCrCb_to_RGB8(int Y, int Cr, int Cb, int& R, int& G, int& B)
{
//Subtract offsets and cast to double.
//Subtracting 16 from Y assumes "limited range" YCbCr format, where the Y range is [16, 235] (as opposed to "full range", where the Y range is [0, 255]).
double y = (double)(Y - 16);
double u = (double)(Cb - 128);
double v = (double)(Cr - 128);
//The following conversion applies the BT.601 "limited range" conversion formula.
//It gives the same results as the MATLAB function ycbcr2rgb.
//BT.601 "limited range" is also the default conversion used by FFmpeg.
R = (int)std::round(1.1644*y + 1.5960*v);
G = (int)std::round(1.1644*y - 0.3918*u - 0.8130*v);
B = (int)std::round(1.1644*y + 2.0172*u);
R = std::max(std::min(R, 255), 0);
G = std::max(std::min(G, 255), 0);
B = std::max(std::min(B, 255), 0);
}
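To see the practical difference between the two formulas, the following sketch converts the same samples with both. The names RGB, clamp8, bt601_limited_to_rgb and jpeg_full_to_rgb are hypothetical, introduced only for this comparison; note also that the full-range variant here rounds to nearest, whereas the code in the question truncates.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct RGB { int R, G, B; };

// Round to nearest and clamp to the 8-bit range [0, 255].
static int clamp8(double v) {
    return std::max(0, std::min(255, static_cast<int>(std::lround(v))));
}

// BT.601 "limited range": Y in [16, 235], chroma centered at 128.
RGB bt601_limited_to_rgb(int Y, int Cb, int Cr) {
    assert(Y >= 0 && Y <= 255);
    const double y = Y - 16, u = Cb - 128, v = Cr - 128;
    return { clamp8(1.1644 * y + 1.5960 * v),
             clamp8(1.1644 * y - 0.3918 * u - 0.8130 * v),
             clamp8(1.1644 * y + 2.0172 * u) };
}

// JPEG "full range": Y in [0, 255], chroma centered at 128.
RGB jpeg_full_to_rgb(int Y, int Cb, int Cr) {
    assert(Y >= 0 && Y <= 255);
    const double u = Cb - 128, v = Cr - 128;
    return { clamp8(Y + 1.402 * v),
             clamp8(Y - 0.344136 * u - 0.714136 * v),
             clamp8(Y + 1.772 * u) };
}
```

For a limited-range black sample (Y = 16, Cb = Cr = 128), the BT.601 formula yields (0, 0, 0) while the JPEG formula yields (16, 16, 16). Likewise, limited-range white (Y = 235) maps to 255 with BT.601 but stays at 235 with the JPEG formula, so applying the JPEG formula to limited-range video gives a washed-out image with reduced contrast.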