I have a Dell Precision Rack running Ubuntu Precise and featuring two Tesla C2075 plus a Quadro 600 which is the display device. I have recently finished some tests on my desktop-computer and now tried to port stuff to the workstation.
Since CUDA was not present I installed it according to this guide and adapted the SDK Makefiles according to this suggestions.
What I am now facing is that not a single sample (I did test like 10 different ones) is running. Those are the errors I am getting:
[deviceQuery] starting...
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 10
-> invalid device ordinal
[deviceQuery] test results...
FAILED
> exiting in 3 seconds: 3...2...1...done!
[MonteCarloMultiGPU] starting...
CUDA error at MonteCarloMultiGPU.cpp:235 code=23510 (cudaErrorInvalidDevice) "cudaGetDeviceCount(&GPU_N)"MonteCarloMultiGPU
==================
Parallelization method = threaded
Problem scaling = weak
Number of GPUs = 0
Total number of options = 0
Number of paths = 262144
main(): generating input data...
main(): starting 0 host threads...
Floating point exception (core dumped)
[reduction] starting...
reduction.cpp(124) : cudaSafeCallNoSync() Runtime API error 10 : invalid device ordinal.
[simplePrintf] starting...
simplePrintf.cu(193) : CUDA Runtime API error 10: invalid device ordinal.
As you can see most of the errors are pointing towards a problem with the cudaGetDeviceCount call which return error code 10. According to the manual the problem is:
cudaErrorInvalidDevice: This indicates that the device ordinal supplied by the user does not correspond to a valid CUDA device.
Unfortunately, the only solution I was able to find suggested to check the devices power plugs. I did that and there was nothing wrong with it. Restarting the workstation does not help either.
I'd be happy to supply more details on my configuration. Just leave a comment!
Due to the comments to my original question I was able to find a solution. I followed this guide to learn how to set up the rc.local
correctly (don't forget to chmod
your script).