I have a few scripts that throw a CUDA out-of-memory exception after running for a while. Inside them I am using preallocated arrays, so I did not expect memory to be a problem. Nevertheless, after I turned the scripts into .fs files and compiled them, the profiler was not particularly useful for this task, and the cuda-memcheck tool 6.5 (36) threw a CudaInterOp exception when I used it. cuda-memcheck 7.0 (40) actually forced me to reset the PC, as the GPU went out.
I am a bit unsure of what to do at the moment. How would one go about fixing the leaks with Alea?
A complete usage of device reduce, with resource management, briefly looks like this:
// first, create module, which has the compilation stuff
use reduceModule = new DeviceReduceModule<float>(Target, <@ (+) @>)
// second, create a reduce object, which contains temp memory for that maxItems
use reduce = reduceModule.Create(maxItems)
// third, allocate your device memory for input
use numbers = reduce.AllocateStreamBuffer(xxx)
// now you can scatter data and do reduce
numbers.Scatter(yourData) // yourData is a placeholder for your host data
reduce.Reduce(numbers)
// because each value above is bound with the "use" keyword, its Dispose() is called implicitly when it goes out of scope
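
To see why the `use` keyword matters here, the following minimal sketch compares `use` and `let` bindings for a disposable resource. It does not touch Alea or the GPU: `FakeDeviceBuffer`, `runOnce`, and `leakOnce` are hypothetical names invented for illustration, and the buffer only records when it is "allocated" and "freed". The assumption is that Alea's modules and buffers follow the same `IDisposable` pattern, as the answer above suggests.

```fsharp
open System

// Hypothetical stand-in for a device allocation; NOT part of Alea.
// It only logs when it is created and when it is disposed.
type FakeDeviceBuffer(name: string, log: ResizeArray<string>) =
    do log.Add(sprintf "alloc %s" name)
    interface IDisposable with
        member this.Dispose() = log.Add(sprintf "free %s" name)

// Bound with "use": Dispose() runs when the function scope ends,
// in reverse order of declaration (b first, then a).
let runOnce (log: ResizeArray<string>) =
    use a = new FakeDeviceBuffer("a", log)
    use b = new FakeDeviceBuffer("b", log)
    log.Add "work"

// Bound with "let": nothing calls Dispose(), so the resource stays
// alive until (and unless) something else releases it.
let leakOnce (log: ResizeArray<string>) =
    let c = new FakeDeviceBuffer("c", log)
    log.Add "work"

let log = ResizeArray<string>()
runOnce log
leakOnce log
log |> Seq.iter (printfn "%s")
```

In the log, "a" and "b" are freed as soon as `runOnce` returns, while "c" is never freed. If a script creates modules, reduce objects, or buffers with `let` inside a loop, the same thing could plausibly happen on the device on every iteration, which would be consistent with an out-of-memory exception that only appears after a while.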