Let's say I want to do an operation (e.g. addition) between two images where each pixel in image Img1 has a corresponding pixel in image Img2. The correspondence vector is stored in a tuple Delta. Basically, something like this:
Img(x, y) = Img1(x, y) + Img2(x + Delta[0](x, y), y + Delta[1](x, y));
This is a memory gather operation. What would be the best way to describe such a pattern in Halide? How should it be scheduled?
There isn't really a great way to schedule that. Gathers are slow, even on targets where gather instructions exist. You probably still want to vectorize it over x, though, so that the addressing math and the dense loads from Img1 and Delta are done using vectors. I'd just use the obvious thing:
Img.vectorize(x, 8).parallel(y, 4);
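
To make that concrete, here is a rough plain C++ sketch (not Halide, and not what Halide literally generates) of the loop nest this schedule describes. The `Image` struct and `gather_add` are hypothetical names used only for illustration, and clamping the source coordinates to the image is my assumption, since the question does not say what happens at the borders. In Halide itself the definition would be along the lines of `Img(x, y) = Img1(x, y) + Img2(clamp(x + Delta(x, y)[0], 0, w - 1), clamp(y + Delta(x, y)[1], 0, h - 1))`, with `w` and `h` standing in for the extents of Img2.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Minimal row-major image, standing in for a Halide buffer (hypothetical helper).
template <typename T>
struct Image {
    int width, height;
    std::vector<T> data;
    Image(int w, int h, T fill = T())
        : width(w), height(h), data(static_cast<std::size_t>(w) * h, fill) {}
    T &operator()(int x, int y) {
        return data[static_cast<std::size_t>(y) * width + x];
    }
    const T &operator()(int x, int y) const {
        return data[static_cast<std::size_t>(y) * width + x];
    }
};

// Clamp v to [lo, hi]; plays the role of Halide's clamp() in this sketch.
inline int clamp_int(int v, int lo, int hi) {
    return std::min(std::max(v, lo), hi);
}

// out(x, y) = img1(x, y) + img2(x + dx(x, y), y + dy(x, y)),
// with the source coordinates clamped to img2 (an assumption, see above).
inline Image<float> gather_add(const Image<float> &img1, const Image<float> &img2,
                               const Image<int> &dx, const Image<int> &dy) {
    constexpr int kLanes = 8;  // mirrors vectorize(x, 8)
    const int W = img1.width, H = img1.height;
    Image<float> out(W, H);
    // parallel(y, 4) would hand out these rows to threads in tasks of 4 rows.
    for (int y = 0; y < H; ++y) {
        for (int x0 = 0; x0 < W; x0 += kLanes) {
            const int n = std::min(kLanes, W - x0);  // handle the tail
            int sx[kLanes], sy[kLanes];
            // Addressing math: dense loads from dx/dy, easy to vectorize.
            for (int i = 0; i < n; ++i) {
                const int x = x0 + i;
                sx[i] = clamp_int(x + dx(x, y), 0, img2.width - 1);
                sy[i] = clamp_int(y + dy(x, y), 0, img2.height - 1);
            }
            // The gather: one data-dependent load from img2 per lane.
            for (int i = 0; i < n; ++i) {
                out(x0 + i, y) = img1(x0 + i, y) + img2(sx[i], sy[i]);
            }
        }
    }
    return out;
}
```

Calling `gather_add(img1, img2, dx, dy)` returns the summed image. The first inner loop is the part that benefits from vectorizing over x; the second is the gather, which stays roughly one load per lane however you schedule it.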