I wrote an OpenCL kernel that generates random numbers inside a while loop on the device. Once an acceptable random number is obtained, the kernel should exit the loop and return the result to the host. Typically, the number of iterations per work-item is ~100-1000.
The problem is that the kernel hangs when I enable the while loop and never returns a result. If I disable the while loop (i.e. generate only one random number instead of hundreds), the kernel works fine.
Does anybody have any idea what might be going on? The kernel code is below and also available at this github repo. One possibility is that the system (macOS in my case) prevents the GPU from spending a long time executing a single task, as described here, but I am not sure.
#include <clRNG/mrg31k3p.clh> // for random number generation
#include "exposure.clh" // defines function exposure

__kernel void cr(__global clrngMrg31k3pHostStream* streams, __global float* xa, __global float* ya, const int n) {
    int i = get_global_id(0);
    float x,y,sampling;

    if (i<n) {
        // Loop that produces individual CRs
        while (1) {
            clrngMrg31k3pStream private_stream_d; // This is not a pointer!
            clrngMrg31k3pCopyOverStreamsFromGlobal(1, &private_stream_d, &streams[i]);

            // random number between 0 and 360
            x=360.*clrngMrg31k3pRandomU01(&private_stream_d);
            // random number between 0 and 1
            y=clrngMrg31k3pRandomU01(&private_stream_d);
            // To avoid concentrations towards the poles, generates sin(delta)
            // between -1 and +1, then converts to delta
            y = asin((float)(2.*y-1.))*180./M_PI_F; // dec

            // If sampling<exposure for a given CR, it is accepted
            sampling=clrngMrg31k3pRandomU01(&private_stream_d);
            if (sampling <= exposure(y)) {
                xa[i]=x;
                ya[i]=y;
                break;
            }
        }
    }
}
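You are re-creating the random stream on every iteration. The call to clrngMrg31k3pCopyOverStreamsFromGlobal sits inside the while loop and copies from streams[i], which the kernel never updates, so each pass most likely starts from the same state and draws the same x, y and sampling. If that first triple is rejected, every later one is rejected too, which would explain why your while loop never terminates. Try moving the declaration of private_stream_d and the copy call above the loop, so that the loop only pulls new numbers from the stream.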
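The effect of where the stream copy happens can be reproduced without a GPU. The following is a minimal plain-C sketch, not the clRNG API: toy_stream, toy_random_u01, sample_copy_inside and sample_copy_once are hypothetical names, and a small linear congruential generator stands in for the MRG31k3p stream. It only illustrates that resetting the state inside a rejection loop repeats the same draw, while copying once before the loop lets the state advance.

```c
#include <stdint.h>

/* Hypothetical stand-in for a clRNG stream: a single 32-bit LCG state.
   This is NOT the MRG31k3p generator, only a toy with the same "stateful" behaviour. */
typedef struct {
    uint32_t state;
} toy_stream;

/* Advance the stream and return a uniform value in [0, 1). */
static float toy_random_u01(toy_stream *s) {
    s->state = s->state * 1664525u + 1013904223u;   /* wraps modulo 2^32 */
    return (float)(s->state >> 8) / 16777216.0f;    /* top 24 bits / 2^24 */
}

/* Pattern from the question: the private stream is re-copied from the
   unchanged "global" stream on every iteration, so every draw is identical.
   Returns the iteration at which a value was accepted, or -1 if max_iter
   is reached (the cap exists only so this demo cannot hang). */
static int sample_copy_inside(const toy_stream *global_stream, float threshold,
                              int max_iter, float *out) {
    for (int it = 1; it <= max_iter; ++it) {
        toy_stream private_stream = *global_stream;  /* state reset here */
        float u = toy_random_u01(&private_stream);
        if (u <= threshold) {
            *out = u;
            return it;
        }
    }
    return -1;
}

/* Pattern suggested in the answer: copy the stream once, above the loop,
   so each iteration continues from the state left by the previous one. */
static int sample_copy_once(const toy_stream *global_stream, float threshold,
                            int max_iter, float *out) {
    toy_stream private_stream = *global_stream;      /* copied once */
    for (int it = 1; it <= max_iter; ++it) {
        float u = toy_random_u01(&private_stream);
        if (u <= threshold) {
            *out = u;
            return it;
        }
    }
    return -1;
}
```

With a stream seeded at state 0 and a threshold of 0.2, the first draw is about 0.236 and is rejected. sample_copy_inside then rejects the same value on every pass and gives up at the cap, whereas sample_copy_once keeps drawing new values until one is accepted. In the kernel, the analogous change is moving the private_stream_d declaration and the clrngMrg31k3pCopyOverStreamsFromGlobal call above the while loop.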