Theano and pygpu: errors


I'm using Theano with pygpu.

It generally works well, but every so often, for reasons I haven't been able to work out, importing theano fails with the following error:

ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "/home/poko/Software/anaconda2/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 220, in <module>
    use(config.device)
  File "/home/poko/Software/anaconda2/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 207, in use
    init_dev(device, preallocate=preallocate)
  File "/home/poko/Software/anaconda2/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 94, in init_dev
    **args)
  File "pygpu/gpuarray.pyx", line 651, in pygpu.gpuarray.init
  File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
GpuArrayException: cuInit: CUDA_ERROR_UNKNOWN: unknown error

If I reboot my computer, it works again for a while (sometimes for days).
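
Since the traceback ends in cuInit, one way to tell whether the failure comes from the CUDA driver rather than from Theano or pygpu is to call cuInit directly. The following is only a minimal sketch, assuming a Linux system where the driver library is installed as libcuda.so.1; the helper names try_cuinit and describe are made up for illustration and are not part of Theano or pygpu.

```python
import ctypes

CUDA_SUCCESS = 0
CUDA_ERROR_UNKNOWN = 999  # value of CUDA_ERROR_UNKNOWN in the driver API's CUresult enum


def try_cuinit(libname="libcuda.so.1"):
    """Call cuInit(0) directly (hypothetical helper).

    Returns the integer result code, or None if the driver
    library cannot be loaded at all.
    """
    try:
        libcuda = ctypes.CDLL(libname)
    except OSError:
        return None
    libcuda.cuInit.argtypes = [ctypes.c_uint]
    libcuda.cuInit.restype = ctypes.c_int
    return libcuda.cuInit(0)


def describe(code):
    """Turn the result of try_cuinit into a short message."""
    if code is None:
        return "driver library not found"
    if code == CUDA_SUCCESS:
        return "cuInit succeeded"
    if code == CUDA_ERROR_UNKNOWN:
        return "cuInit failed with CUDA_ERROR_UNKNOWN"
    return "cuInit failed with code %d" % code


if __name__ == "__main__":
    print(describe(try_cuinit()))
```

If this reports CUDA_ERROR_UNKNOWN without Theano being imported at all, the problem most likely sits at the driver or kernel-module level rather than in the Python stack.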

This is strange, since such things usually either work or don't. I have no idea what is causing the error. The only thing I have noticed, from nvidia-smi, is that Xorg and Chrome use quite a lot of GPU memory:

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1332      G   /usr/lib/xorg/Xorg                           392MiB |
|    0      2243      G   cinnamon                                     110MiB |
|    0      4927      G   ...-token=39C210A3DFA14C5D81FA629C813B843D   154MiB |
+-----------------------------------------------------------------------------+

Solution

  • It turned out that I can get rid of the error just by unloading the nvidia_uvm kernel module:

    sudo rmmod nvidia_uvm

    The module is then reloaded automatically.

    I hope this helps anyone else who runs into this problem.
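
Before running rmmod, it can be useful to check whether nvidia_uvm is loaded and whether anything is still using it, because rmmod refuses to unload a module with a non-zero use count. The following is a rough sketch, assuming the usual Linux /proc/modules layout (name, size, use count, ...); the function name module_use_count is hypothetical.

```python
def module_use_count(name, proc_modules_text):
    """Return the use count of a kernel module, or None if it is not loaded.

    proc_modules_text is the content of /proc/modules, where each line
    starts with: <name> <size> <use count> ...
    """
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == name:
            return int(fields[2])
    return None


def read_proc_modules(path="/proc/modules"):
    """Read the module list; returns an empty string if the file is unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except (IOError, OSError):
        return ""


if __name__ == "__main__":
    count = module_use_count("nvidia_uvm", read_proc_modules())
    if count is None:
        print("nvidia_uvm is not loaded")
    elif count == 0:
        print("nvidia_uvm is loaded and unused; rmmod should succeed")
    else:
        print("nvidia_uvm is in use (use count %d); stop CUDA programs first" % count)
```

The parsing is kept separate from the file read so that it can be tried on any captured /proc/modules text, not only on the machine that shows the error.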