c# .net gpu aleagpu

Passing more than 16 kernel arguments in Alea.GPU


I am writing a fairly complex kernel. As it turns out, I need to pass more than 16 arguments, and Alea GPU apparently has a limit of 16 arguments (http://quantalea.com/static/app/manual/reference/alea_cuda_il/alea-cuda-il-ilgpumodule.html).

I know that more than 16 arguments sounds like a bad idea to begin with... What other options are there? In normal code I would of course wrap the values in their own class, but what can I do in GPU code?


Solution

  • In this case, you can retrieve an untyped kernel object through the GPUModule.GPUEntities property, put the arguments into a list of objects, and launch the kernel with that list.

    You can also write extension methods for this purpose to keep the launch type-safe. Here is an example that uses only 3 arguments for simplicity:

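    // Namespaces below are assumed for Alea GPU 2.x with NUnit; adjust to your versions.
    using System;
    using Alea.CUDA;
    using Alea.CUDA.IL;
    using Microsoft.FSharp.Collections;
    using NUnit.Framework;

    public static class GPUModuleExtensions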
    {
        public static void MyGPULaunch<T1, T2, T3>(
            this ILGPUModule module,
            Action<T1, T2, T3> kernelD, LaunchParam lp,
            T1 arg1, T2 arg2, T3 arg3)
        {
            // get the kernel object by method name
            var kernel = module.GPUEntities.GetKernel(kernelD.Method.Name).Kernel;
            // create parameter list (which is FSharpList)
            var parameterArray = new object[] {arg1, arg2, arg3};
            var parameterList = ListModule.OfArray(parameterArray);
            // use untyped LaunchRaw to launch the kernel
            kernel.LaunchRaw(lp, parameterList);
        }
    }
    
    public class GPUModule : ILGPUModule
    {
        public GPUModule() : base(GPUModuleTarget.DefaultWorker)
        {
        }
    
        [Kernel]
        public void Kernel(deviceptr<int> outputs, int arg1, int arg2)
        {
            var tid = threadIdx.x;
            outputs[tid] = arg1 + arg2;
        }
    
        [Test]
        public void Test()
        {
            const int n = 32;
            var lp = new LaunchParam(1, n);
            using (var outputs = GPUWorker.Malloc<int>(n))
            {
                this.MyGPULaunch(Kernel, lp, outputs.Ptr, 1, 3);
                Console.WriteLine("{0}", (outputs.Gather())[4]);
            }
        }
    }
    

    Note that this example uses Action<T1, T2, T3>, but the built-in Action delegates take at most 16 type parameters, so to pass more than 16 arguments you need to define your own delegate type.
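To make that last note concrete, here is a minimal sketch of such a delegate with 17 parameters. The names KernelAction17, KernelArgs.Pack and BigKernel are hypothetical and not part of Alea GPU. The sketch leaves out the GPU launch so that it runs without the library: it only shows that a method group with 17 parameters binds to the custom delegate, that the kernel name can be read from it, and that the arguments can be boxed in declaration order.

```csharp
using System;

// Hypothetical delegate type: the built-in Action<...> family stops at 16
// type parameters, so a 17-argument kernel needs its own delegate.
public delegate void KernelAction17<T1, T2, T3, T4, T5, T6, T7, T8, T9,
                                    T10, T11, T12, T13, T14, T15, T16, T17>(
    T1 a1, T2 a2, T3 a3, T4 a4, T5 a5, T6 a6, T7 a7, T8 a8, T9 a9,
    T10 a10, T11 a11, T12 a12, T13 a13, T14 a14, T15 a15, T16 a16, T17 a17);

public static class KernelArgs
{
    // Hypothetical helper: reads the kernel method name from the delegate and
    // returns the 17 typed arguments boxed into an object[] in declaration order.
    public static object[] Pack<T1, T2, T3, T4, T5, T6, T7, T8, T9,
                                T10, T11, T12, T13, T14, T15, T16, T17>(
        KernelAction17<T1, T2, T3, T4, T5, T6, T7, T8, T9,
                       T10, T11, T12, T13, T14, T15, T16, T17> kernelD,
        out string kernelName,
        T1 a1, T2 a2, T3 a3, T4 a4, T5 a5, T6 a6, T7 a7, T8 a8, T9 a9,
        T10 a10, T11 a11, T12 a12, T13 a13, T14 a14, T15 a15, T16 a16, T17 a17)
    {
        kernelName = kernelD.Method.Name;
        return new object[]
        {
            a1, a2, a3, a4, a5, a6, a7, a8, a9,
            a10, a11, a12, a13, a14, a15, a16, a17
        };
    }
}

public static class Demo
{
    // Stand-in for a [Kernel] method with 17 parameters (empty body, CPU only).
    public static void BigKernel(
        int[] outputs, int a2, int a3, int a4, int a5, int a6, int a7, int a8,
        int a9, int a10, int a11, int a12, int a13, int a14, int a15, int a16,
        int a17)
    {
    }

    public static void Main()
    {
        string name;
        // The type arguments are inferred from the 17 values, as in MyGPULaunch.
        object[] args = KernelArgs.Pack(
            BigKernel, out name,
            new int[4], 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17);
        Console.WriteLine("{0}: {1} arguments", name, args.Length);
    }
}
```

In a real extension method, the name and array would then be used exactly as in MyGPULaunch above: look up the kernel with module.GPUEntities.GetKernel(kernelName).Kernel, convert the array with ListModule.OfArray, and pass it to LaunchRaw.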