Below I've posted some code that I'm using to get a feel for the CUDA Thrust library. Before anyone says anything: I know this is an extremely inefficient way to find prime numbers; I just want something to test parallelism. Unfortunately, when I run it I get an error. Here is what pops up:
Unhandled exception at at 0x76FCC41F in Thrust_2.exe: Microsoft C++ exception: thrust::system::system_error at memory location 0x0022F500.
If I switch the device_vector to a host_vector in the doTest function, I no longer get the error and the program works flawlessly. Why does this happen, and how can I get it to use the device_vector without crashing? I would like to do as much in parallel as possible.
Also the entire program works as intended with a host_vector.
PS:
I'm using VS2012
CUDA: V5.5
GPU: GeForce GT 540M
Thrust: the version that ships with CUDA.
Thanks in advance!
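#include <cmath>
#include <thrust/device_vector.h>
#include <thrust/sequence.h>

// Naive trial division: overwrites x with -1 when it is not prime.
struct prime{
    __host__ __device__
    void operator()(long& x){
        bool result = true;
        long stop = ceil(sqrt((float)x));
        if(x%2!=0){
            // <= so that odd perfect squares such as 9 and 25 are tested against their root
            for(int i = 3;i<=stop;i+=2){
                if(x%i==0){
                    result = false;
                    break;
                }
            }
        }else{
            result = false;
        }
        if(!result)
            x = -1;
    }
};

void doTest(long gen){
    using namespace thrust;
    device_vector<long> tNum(gen);
    sequence(tNum.begin(),tNum.end()); // fails here when using a device_vector
}

int main(){
    doTest(1000);
    return 0;
}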
The issue was that I had the wrong compiler arguments; I feel really stupid now...
I was compiling for compute capability 1.0. I switched it to 2.0 and now it's working.
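For anyone hitting the same exception, here is a minimal sketch (not the code from the question) of how this kind of mismatch could be tracked down. It catches thrust::system_error so the underlying CUDA error text is printed instead of an unhandled exception, and it prints the device's compute capability so it can be compared with the architecture the project is built for. The helper name printComputeCapability is made up for illustration, and device index 0 is an assumption. The target architecture itself is set at build time, for example with -arch=sm_20 on the nvcc command line.

```cuda
#include <thrust/device_vector.h>
#include <thrust/sequence.h>
#include <thrust/system_error.h>
#include <cuda_runtime.h>
#include <iostream>
#include <new>

// Hypothetical helper: report the compute capability of device 0
// (assumed to be the GPU in use) for comparison with the build target.
void printComputeCapability() {
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        std::cerr << "cudaGetDeviceProperties failed: "
                  << cudaGetErrorString(err) << std::endl;
        return;
    }
    std::cout << prop.name << ": compute capability "
              << prop.major << "." << prop.minor << std::endl;
}

void doTest(long gen) {
    try {
        thrust::device_vector<long> tNum(gen);
        thrust::sequence(tNum.begin(), tNum.end());
    } catch (const thrust::system_error& e) {
        // what() includes the CUDA error description
        std::cerr << "Thrust error: " << e.what() << std::endl;
    } catch (const std::bad_alloc& e) {
        // device memory allocation failures surface as bad_alloc
        std::cerr << "Allocation failed: " << e.what() << std::endl;
    }
}

int main() {
    printComputeCapability();
    doTest(1000);
    return 0;
}
```

If the printed compute capability is higher than the one the project targets, that is worth fixing first; in my case the build targeted 1.0 while the GT 540M is a 2.x device.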