opengl, shader, yuv, rgba, nv12-nv21

How to convert RGBA to NV12 using OpenGL?


I need to convert RGBA to NV12 using an OpenGL shader, as input to an encoder.

I already render with two different fragment shaders; both textures come from one camera. I use v4l2 to get the camera image (YUV), convert YUV to RGB, and let OpenGL render it. As the next step, I need to convert the RGB back to NV12 as encoder input, because the encoder only accepts the NV12 format.


Solution

  • Use a compute shader to write a Y plane and a full-resolution interleaved UV plane, then downsample the UV plane by a factor of two to get NV12's half-resolution chroma.

    Here's the compute shader:

    #version 450 core
    layout(local_size_x = 32, local_size_y = 32) in;
    layout(binding = 0) uniform sampler2D src;
    layout(binding = 0) uniform writeonly image2D dst_y;
    layout(binding = 1) uniform writeonly image2D dst_uv;
    void main() {
        ivec2 id = ivec2(gl_GlobalInvocationID.xy);
        // rgb_to_yuv() is not defined here; supply the conversion your encoder
        // expects (see the note and example below)
        vec3 yuv = rgb_to_yuv(texelFetch(src, id, 0).rgb);
        imageStore(dst_y, id, vec4(yuv.x,0,0,0));
        imageStore(dst_uv, id, vec4(yuv.yz,0,0));
    }
    

    There are many different YUV conventions, and I don't know which one your encoder expects. So replace rgb_to_yuv above with the inverse of your YUV -> RGB conversion.
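
    For illustration, here is a minimal sketch of what rgb_to_yuv could look like, assuming BT.601 limited-range ("studio swing") levels -- that choice is an assumption, so substitute the matrix and range your encoder actually expects:

    // EXAMPLE ONLY: BT.601, limited range (8-bit Y in [16,235], U/V in [16,240])
    vec3 rgb_to_yuv(vec3 rgb) {
        float y = dot(rgb, vec3( 0.299,  0.587,  0.114));
        float u = dot(rgb, vec3(-0.169, -0.331,  0.500)); // Cb
        float v = dot(rgb, vec3( 0.500, -0.419, -0.081)); // Cr
        return vec3(y * (219.0/255.0) + ( 16.0/255.0),
                    u * (224.0/255.0) + (128.0/255.0),
                    v * (224.0/255.0) + (128.0/255.0));
    }

    For full-range (JPEG-style) YUV you would drop the 219/224 scaling and simply add 0.5 to U and V.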

    Then proceed as follows:

    GLuint in_rgb = ...; // rgb(a) input texture
    int width = ..., height = ...; // the size of in_rgb
    
    GLuint tex[2]; // output textures (Y plane, UV plane)
    
    glCreateTextures(GL_TEXTURE_2D, 2, tex);
    glTextureStorage2D(tex[0], 1, GL_R8, width, height); // Y plane
    
    // UV plane -- TWO mipmap levels
    glTextureStorage2D(tex[1], 2, GL_RG8, width, height);
    
    // use this instead if you need signed UV planes:
    //glTextureStorage2D(tex[1], 2, GL_RG8_SNORM, width, height);
    
    glBindTextures(0, 1, &in_rgb);  // texture unit 0 -> sampler "src"
    glBindImageTextures(0, 2, tex); // image units 0 and 1 -> dst_y, dst_uv
    glUseProgram(compute); // the above compute shader
    
    int wgs[3];
    glGetProgramiv(compute, GL_COMPUTE_WORK_GROUP_SIZE, wgs);
    glDispatchCompute(width/wgs[0], height/wgs[1], 1);
    
    glUseProgram(0);
    
    // make the imageStore() writes visible to the mipmap generation and the
    // readback below (GL_ALL_BARRIER_BITS is conservative but safe)
    glMemoryBarrier(GL_ALL_BARRIER_BITS);
    
    glGenerateTextureMipmap(tex[1]); // downsamples level 0 of the UV plane into level 1
    
    // copy the data to CPU memory (NV12 layout: Y plane, then interleaved UV plane):
    uint8_t *data = (uint8_t*)malloc(width*height*3/2);
    glGetTextureImage(tex[0], 0, GL_RED, GL_UNSIGNED_BYTE, width*height, data); // Y
    glGetTextureImage(tex[1], 1, GL_RG, GL_UNSIGNED_BYTE, width*height/2,
        data + width*height); // UV, read from mip level 1
    

    DISCLAIMER:

    • This code is untested.
    • It assumes that width and height are divisible by 32.
    • The glMemoryBarrier above uses GL_ALL_BARRIER_BITS, which is conservative; a narrower set of barrier bits may be sufficient.
    • It's not the most efficient way to read data out of the GPU -- you might need to read at least one frame behind while the next one is computed (see the sketch below).
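
    To avoid stalling on that readback, one option is a pair of pixel-pack buffers (PBOs) so the CPU maps the previous frame's data while the GPU produces the current frame. A rough, untested sketch continuing the code above (pbo and frame_index are illustrative names):

    // one-time setup: two PBOs, each big enough for one NV12 frame
    GLuint pbo[2];
    glCreateBuffers(2, pbo);
    for (int i = 0; i < 2; ++i)
        glNamedBufferStorage(pbo[i], width*height*3/2, NULL, GL_MAP_READ_BIT);
    
    // per frame, after glGenerateTextureMipmap(tex[1]):
    int cur = frame_index & 1, prev = cur ^ 1;
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[cur]);
    // with a pack buffer bound, the last argument of glGetTextureImage is an
    // offset into the buffer rather than a client pointer
    glGetTextureImage(tex[0], 0, GL_RED, GL_UNSIGNED_BYTE, width*height, (void*)0);
    glGetTextureImage(tex[1], 1, GL_RG, GL_UNSIGNED_BYTE, width*height/2,
        (void*)(intptr_t)(width*height));
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
    
    if (frame_index > 0) {
        // map the PBO filled on the previous frame; its transfer has had a
        // whole frame to complete, so this should not block for long
        uint8_t *nv12 = (uint8_t*)glMapNamedBufferRange(pbo[prev], 0,
            width*height*3/2, GL_MAP_READ_BIT);
        // ... hand nv12 (Y plane followed by the UV plane) to the encoder ...
        glUnmapNamedBuffer(pbo[prev]);
    }
    

    Depending on the driver you may still need a fence or an extra frame of latency before the transfer becomes fully asynchronous.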