
GLES compute shader atomic operations for float


I am using a compute shader to compute a sum (of type float) like this:

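#version 320 es
// A single 640x480 work group exceeds implementation limits, so use
// 16x16 groups and dispatch 40x30 of them to cover the image.
layout(local_size_x = 16, local_size_y = 16, local_size_z = 1) in;
// "output" is a reserved word in GLSL, so the instance is named outputData.
layout(binding = 0) buffer OutputData {
    float sum[];
} outputData;
uniform sampler2D texture_1;
void main()
{
    // Sample at the center of the texel owned by this invocation.
    vec2 texcoord = vec2((float(gl_GlobalInvocationID.x) + 0.5) / 640.0,
                         (float(gl_GlobalInvocationID.y) + 0.5) / 480.0);
    float val = textureLod(texture_1, texcoord, 0.0).r;
    // Synchronization is needed here: this read-modify-write races
    // across invocations.
    outputData.sum[0] = outputData.sum[0] + val;
    // Here I want the sum of all val in the first channel of texture_1.
}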

I know there are atomic operations like atomicAdd(), but they do not support float parameters, and barrier() does not seem to solve my problem. Maybe I could encode the float as an int, or is there a simpler way to solve my problem?


Solution

  • Atomics are generally very poor in terms of performance, especially if heavily contended by parallel access from lots of threads, so I wouldn't recommend them for this use case.

    To keep parallelism here you really need some kind of multi-pass reduction strategy. Pseudo code, something like this:

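    array_size = N
    data = input_array
    
    while array_size > 1:
       M = ceil(array_size / 2)
       spawn pass with M threads
       thread i (0 <= i < M): out[i] = data[2*i] + data[2*i+1]   (treat data[2*i+1] as 0 if 2*i+1 >= array_size)
       array_size = M
       data = out   (swap buffers, so the next pass writes to the other one)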
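
    As a rough illustration, one pass of the loop above could look like the following compute shader. This is a minimal sketch, not a tested implementation: it assumes the values have already been copied from the texture into a shader storage buffer, and the names InputData, OutputData, data_in, data_out and u_count are made up for this example.

    ```glsl
    #version 320 es
    // One pass of a 2:1 reduction: invocation i writes
    // data_out[i] = data_in[2*i] + data_in[2*i+1].
    layout(local_size_x = 64) in;

    // Hypothetical buffer names; the two bindings are swapped between passes.
    layout(std430, binding = 0) readonly buffer InputData   { float data_in[];  };
    layout(std430, binding = 1) writeonly buffer OutputData { float data_out[]; };

    // Number of valid elements in data_in for this pass (hypothetical uniform).
    uniform uint u_count;

    void main()
    {
        uint i = gl_GlobalInvocationID.x;
        uint a = 2u * i;
        if (a >= u_count) {
            return;  // this invocation has no input pair
        }
        float s = data_in[a];
        if (a + 1u < u_count) {
            s += data_in[a + 1u];  // second element exists unless u_count is odd
        }
        data_out[i] = s;
    }
    ```

    On the host side, each pass would dispatch enough work groups to cover M = ceil(u_count / 2) invocations, for example glDispatchCompute((M + 63) / 64, 1, 1), then call glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT) and swap the two buffer bindings before the next pass. After the last pass the total is in element 0 of the most recently written buffer.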
    

    The pseudocode above is a simple 2:1 reduction, so it needs O(log2(N)) passes, but you could reduce more elements per pass to cut the memory bandwidth spent on intermediate storage. For a GPU using textures as input, 4:1 works well: textureGather, or even a single linear-filtered sample taken at the corner shared by four texels (which returns their average), loads multiple samples in one texturing operation.
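
    A first 4:1 pass using textureGather might be sketched as below. It is only an illustration under stated assumptions: the source texture has even width and height, the destination is a half-resolution r32f image, and u_src and u_dst are hypothetical names. Handling of odd sizes in later passes is left out.

    ```glsl
    #version 320 es
    // 4:1 reduction pass: each invocation sums the red channel of one
    // 2x2 block of u_src and stores it in a half-resolution image.
    // Assumes u_src has even width and height.
    layout(local_size_x = 8, local_size_y = 8) in;

    uniform highp sampler2D u_src;                                    // hypothetical name
    layout(r32f, binding = 0) writeonly uniform highp image2D u_dst;  // hypothetical name

    void main()
    {
        ivec2 dstSize = imageSize(u_dst);
        ivec2 p = ivec2(gl_GlobalInvocationID.xy);
        if (p.x >= dstSize.x || p.y >= dstSize.y) {
            return;  // outside the output image
        }
        // Sample at the corner shared by the four source texels of this
        // block, so textureGather returns exactly those four red values.
        vec2 srcSize = vec2(textureSize(u_src, 0));
        vec2 uv = (vec2(2 * p) + vec2(1.0)) / srcSize;
        vec4 r = textureGather(u_src, uv, 0);
        imageStore(u_dst, p, vec4(r.x + r.y + r.z + r.w, 0.0, 0.0, 0.0));
    }
    ```

    For the 640x480 input from the question this would be dispatched as 40x30 work groups and produce a 320x240 image of partial sums, which further passes then reduce in the same way.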