Errors after Theano upgrade from 0.7 to bleeding edge


I installed Theano 0.7 and everything worked perfectly. For upcoming work I now need the bleeding-edge version, and its installation went fine.

But when I run this small test (taken from the Theano documentation), it generates many errors (see here for the full list).

  • The GPU is detected and used, but cuDNN is no longer found:

Using gpu device 0: GeForce GT 650M (CNMeM is enabled with initial size: 65.0% of memory, CuDNN not available)

  • Then I get an import error, which I think is also related to cuDNN:

ImportError: ('The following error happened while compiling the node', <theano.sandbox.cuda.DnnVersion object at 0x114d32710>(), '\n', 'dlopen(/Users/FiReTiTi/.theano/compiledir_Darwin-13.4.0-x86_64-i386-64bit-i386-2.7.11-64/tmpwmA_hw/265abc51f7c376c224983485238ff1a5.so, 2): Library not loaded: @rpath/libcudnn.4.dylib\n Referenced from: /Users/FiReTiTi/.theano/compiledir_Darwin-13.4.0-x86_64-i386-64bit-i386-2.7.11-64/tmpwmA_hw/265abc51f7c376c224983485238ff1a5.so\n Reason: image not found', '[<theano.sandbox.cuda.DnnVersion object at 0x114d32710>()]')

I've checked: cudnn.h is still in /Developer/NVIDIA/CUDA-7.5/include/; in /Developer/NVIDIA/CUDA-7.5/lib/, libcudnn.dylib is still present as a symbolic link to libcudnn.4.dylib; and everything in /usr/local/cuda points to /Developer/NVIDIA/CUDA-7.5/.

Any idea?

[EDIT] My .profile contains:

export DYLD_LIBRARY_PATH=/Developer/NVIDIA/CUDA-7.5/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:$DYLD_LIBRARY_PATH

In /usr/local/cuda/lib there is a symbolic link to the cudnn library that is actually in /Developer/NVIDIA/CUDA-7.5/lib.

Here is the output of the command otool -L libcudnn.4.dylib:

libcudnn.4.dylib:
@rpath/libcudnn.4.dylib (compatibility version 0.0.0, current version 4.0.7)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 855.14.0)
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 120.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1197.1.1)

And here are the symbolic links: /usr/local/cuda/lib/libcudnn.dylib -> /Developer/NVIDIA/CUDA-7.5/lib/libcudnn.dylib, and in /Developer/NVIDIA/CUDA-7.5/lib, libcudnn.dylib -> libcudnn.4.dylib.

[EDIT 2]

$ echo $DYLD_LIBRARY_PATH
/usr/local/xuggler/lib:/usr/local/cuda/lib:/Applications/IMOD/lib:

$ echo $LD_LIBRARY_PATH
/usr/local/cuda/lib:
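
One way to check whether any directory listed in these variables actually contains libcudnn.4.dylib is sketched below in Python. This is only a minimal sketch: the helper name dirs_containing is made up for illustration, and it only tests whether the file exists in each listed directory, which is not the full lookup that the dynamic loader performs.

```python
import os


def dirs_containing(library_name, path_value):
    """Return the directories listed in path_value that contain library_name.

    dirs_containing is a hypothetical helper, not part of Theano or CUDA.
    """
    hits = []
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue  # skip empty entries, e.g. the one left by a trailing ':'
        # os.path.exists follows symbolic links, so a dangling link counts as missing
        if os.path.exists(os.path.join(directory, library_name)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    for variable in ("DYLD_LIBRARY_PATH", "LD_LIBRARY_PATH"):
        value = os.environ.get(variable, "")
        print("%s -> %s" % (variable, dirs_containing("libcudnn.4.dylib", value)))
```

An empty list for both variables would mean that none of the listed directories holds a file named libcudnn.4.dylib, even if a libcudnn.dylib link is present there.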

[EDIT 3] Here is the last error displayed, or at least part of it, since the same error appears at every epoch.

Output of ls -la /usr/local/cuda/lib:

lrwxr-xr-x   1 root  wheel    45B 22 fév 11:42 libcudnn.dylib -> /Developer/NVIDIA/CUDA-7.5/lib/libcudnn.dylib
lrwxr-xr-x   1 root  wheel    48B 26 fév 01:01 libcudnn_static.a -> /Developer/NVIDIA/CUDA-7.5/lib/libcudnn_static.a

Solution

  • This looks like a bug in Theano. It would probably work if they added ["-Wl,-rpath,%s" % l for l in c_lib_dirs()] to the compiler arguments. You should report that upstream here.

    As a workaround, it might help to add the directory that contains libcudnn.4.dylib to your LD_LIBRARY_PATH (or, more likely on OS X, DYLD_LIBRARY_PATH) environment variable. The dynamic loader also searches the directories listed there, so the path @rpath/libcudnn.4.dylib could then be resolved.
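
To make the suggested fix concrete, here is a minimal sketch of the arguments that list comprehension would produce. It is not Theano's actual code: rpath_flags is a hypothetical helper, and example_dirs is an assumed stand-in for whatever c_lib_dirs() returns, using the directory from the question.

```python
def rpath_flags(lib_dirs):
    """Build one -Wl,-rpath,<dir> linker argument per library directory."""
    return ["-Wl,-rpath,%s" % d for d in lib_dirs]


# Assumed value standing in for the result of c_lib_dirs()
example_dirs = ["/Developer/NVIDIA/CUDA-7.5/lib"]

print(rpath_flags(example_dirs))
# prints ['-Wl,-rpath,/Developer/NVIDIA/CUDA-7.5/lib']
```

Each such flag asks the linker to record that directory as a run-time search path in the compiled module, which is what lets an install name beginning with @rpath be resolved without relying on environment variables.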