I am trying to write a fairly complex kernel. As it turns out, I need to pass more than 16 arguments, and Alea GPU apparently has a limit of 16 arguments (http://quantalea.com/static/app/manual/reference/alea_cuda_il/alea-cuda-il-ilgpumodule.html).
I know that 16 arguments sounds like a bad idea to begin with... What other options are there? In normal code I would of course wrap the arguments in their own class, but what can I do in GPU code?
In this case, you can retrieve an untyped kernel object through the GPUModule.GPUEntities property, put the arguments into a list of Object type, and then launch the kernel with that list.
You can also write extension methods for this purpose and make them type safe. Here is an example that uses only 3 arguments for simplicity:
// Namespaces assumed for Alea GPU 2.x, FSharp.Core (ListModule) and NUnit ([Test])
using System;
using Alea.CUDA;
using Alea.CUDA.IL;
using Microsoft.FSharp.Collections;
using NUnit.Framework;

public static class GPUModuleExtensions
{
    public static void MyGPULaunch<T1, T2, T3>(
        this ILGPUModule module,
        Action<T1, T2, T3> kernelD, LaunchParam lp,
        T1 arg1, T2 arg2, T3 arg3)
    {
        // get the kernel object by method name
        var kernel = module.GPUEntities.GetKernel(kernelD.Method.Name).Kernel;
        // create parameter list (which is FSharpList)
        var parameterArray = new object[] {arg1, arg2, arg3};
        var parameterList = ListModule.OfArray(parameterArray);
        // use untyped LaunchRaw to launch the kernel
        kernel.LaunchRaw(lp, parameterList);
    }
}
public class GPUModule : ILGPUModule
{
    public GPUModule() : base(GPUModuleTarget.DefaultWorker)
    {
    }
    [Kernel]
    public void Kernel(deviceptr<int> outputs, int arg1, int arg2)
    {
        var tid = threadIdx.x;
        outputs[tid] = arg1 + arg2;
    }
    [Test]
    public void Test()
    {
        const int n = 32;
        var lp = new LaunchParam(1, n);
        using (var outputs = GPUWorker.Malloc<int>(n))
        {
            this.MyGPULaunch(Kernel, lp, outputs.Ptr, 1, 3);
            Console.WriteLine("{0}", (outputs.Gather())[4]);
        }
    }
}
Note that this example uses Action<T1,T2,T3>, but the built-in Action delegates take at most 16 type parameters, so you will need to define your own delegate type to pass more than 16 arguments.
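To make that last point concrete, here is a minimal sketch of what such a delegate could look like. It is deliberately independent of Alea GPU so it can be compiled on its own: Action17, RawLaunchArgs.Pack and BigKernel are names made up for this illustration, and BigKernel is just a stand-in for a real [Kernel] method. In an actual extension method, the method name and the object array would presumably be passed to GetKernel and ListModule.OfArray / LaunchRaw exactly as in MyGPULaunch above.

```csharp
using System;

// Hypothetical delegate with 17 parameters, one more than the largest built-in Action.
// The name Action17 is made up for this sketch.
public delegate void Action17<T1, T2, T3, T4, T5, T6, T7, T8, T9,
                              T10, T11, T12, T13, T14, T15, T16, T17>(
    T1 a1, T2 a2, T3 a3, T4 a4, T5 a5, T6 a6, T7 a7, T8 a8, T9 a9,
    T10 a10, T11 a11, T12 a12, T13 a13, T14 a14, T15 a15, T16 a16, T17 a17);

public static class RawLaunchArgs
{
    // Returns the kernel's method name plus the typed arguments boxed into object[].
    // The delegate parameter exists only so the compiler checks the argument types.
    public static Tuple<string, object[]> Pack<T1, T2, T3, T4, T5, T6, T7, T8, T9,
                                               T10, T11, T12, T13, T14, T15, T16, T17>(
        Action17<T1, T2, T3, T4, T5, T6, T7, T8, T9,
                 T10, T11, T12, T13, T14, T15, T16, T17> kernelD,
        T1 a1, T2 a2, T3 a3, T4 a4, T5 a5, T6 a6, T7 a7, T8 a8, T9 a9,
        T10 a10, T11 a11, T12 a12, T13 a13, T14 a14, T15 a15, T16 a16, T17 a17)
    {
        var args = new object[] { a1, a2, a3, a4, a5, a6, a7, a8, a9,
                                  a10, a11, a12, a13, a14, a15, a16, a17 };
        return Tuple.Create(kernelD.Method.Name, args);
    }
}

public static class Demo
{
    // Stand-in for a [Kernel] method with 17 parameters (hypothetical).
    public static void BigKernel(int[] outputs,
        int p1, int p2, int p3, int p4, int p5, int p6, int p7, int p8,
        int p9, int p10, int p11, int p12, int p13, int p14, int p15, int p16)
    {
    }

    public static void Main()
    {
        // The type arguments are inferred from the 17 values, as in MyGPULaunch.
        var packed = RawLaunchArgs.Pack(BigKernel, new int[32],
            1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16);
        // prints "BigKernel: 17 arguments"
        Console.WriteLine("{0}: {1} arguments", packed.Item1, packed.Item2.Length);
    }
}
```

Passing an argument of the wrong type, or the wrong number of arguments, fails at compile time because the method group no longer converts to the delegate. That is the only type safety this approach gives you, since the launch itself stays untyped.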