
How to perform Outer product of 2 vectors in Metal shaders?


I'm developing a neural network to run on the GPU in iOS. To backpropagate the errors using matrix notation, I need to be able to compute the outer product of 2 vectors.


// Outer product of vector A and Vector B
kernel void outerProduct(const device float *inVectorA [[ buffer(0) ]],
                         const device float *inVectorB [[ buffer(1) ]],
                         device float *outVector [[ buffer(2) ]],
                         uint id [[ thread_position_in_grid ]]) {
    
    outVector[id] = inVectorA[id] * inVectorB[***?***]; // How to find this position on the thread group (or grid)?
}

Solution

  • You are using thread_position_in_grid incorrectly. If you are dispatching a 2D grid, it should be declared as uint2 or ushort2; otherwise it only gets the x coordinate. Refer to table 5.7 in the Metal Shading Language specification.

    I'm not sure which outer product we are talking about, but I think the output should be a matrix. If you are storing it linearly, then your code to calculate outVector should look something like this:

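    // One thread per output element: id.x indexes inVectorA, id.y indexes inVectorB.
    kernel void outerProduct(const device float *inVectorA [[ buffer(0) ]],
                             const device float *inVectorB [[ buffer(1) ]],
                             uint2 gridSize [[ threads_per_grid ]],
                             device float *outVector [[ buffer(2) ]],
                             uint2 id [[ thread_position_in_grid ]]) {
        
        // Linear storage: each row holds gridSize.x elements, one row per element of inVectorB.
        outVector[id.y * gridSize.x + id.x] = inVectorA[id.x] * inVectorB[id.y];
    }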
    

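    Also, if you are dispatching a grid whose size is exactly (length of inVectorA) x (length of inVectorB), you can use the threads_per_grid attribute on a kernel argument to find out how big the grid is. This only holds when the grid is not padded beyond the vector lengths.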

    Alternatively, you can just pass the sizes of the vectors alongside the vectors themselves.
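
To sanity-check the index arithmetic, a CPU-side reference can help. The following is only a minimal sketch in plain C++, not Metal code: the function name outerProductReference is made up for illustration, and it assumes the same layout as the kernel above, with the length of vector A playing the role of gridSize.x and the length of vector B playing the role of gridSize.y.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not part of Metal): mirrors the kernel's indexing.
// Assumes a has `width` elements and b has `height` elements, and that the
// result is stored linearly as out[y * width + x] = a[x] * b[y].
std::vector<float> outerProductReference(const std::vector<float>& a,
                                         const std::vector<float>& b) {
    const std::size_t width = a.size();   // plays the role of gridSize.x
    const std::size_t height = b.size();  // plays the role of gridSize.y
    std::vector<float> out(width * height);
    for (std::size_t y = 0; y < height; ++y) {     // corresponds to id.y
        for (std::size_t x = 0; x < width; ++x) {  // corresponds to id.x
            out[y * width + x] = a[x] * b[y];
        }
    }
    return out;
}
```

Comparing the contents of the output buffer against this function for a few small inputs should show quickly whether the rows and columns ended up the way you expect for your backpropagation step.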