What is the correct way to store per-pixel persistence data in the Metal compute kernel?

I am trying to implement the MoG background subtraction algorithm based on the opencv cuda implementation

What I need is to maintain a set of gaussian parameter independently for each pixel location across multiple frame. Currently I am just allocating a single big MTLBuffer to do the job and on every frame, I have to invoke the commandEncoder.setBuffer API. Is there a better way? I read about imageblock but I am not sure if it is relevant.

Also, I would be really happy if you can spot any things that shouldn't be directly translated from cuda to metal.

Solution

Allocate an 8 bit texture and store intermediate values into the texture in your compute shader. Then after this texture is rendered, you can rebind it as an input texture to whatever other methods need to read from it in the rest of the renders. You can find a very detailed example of this sort of thing at this github example project of a parallel prefix sum on top of Metal. This example also shows how to write XCTest regression tests for your Metal shaders. Github MetalPrefixSum