python tensorflow centos cudnn

tensorflow-gpu 2.2 works with CUDA 10.2 but requires cuDNN 7.6.4 which doesn't have a download file in NVIDIA archive for CUDA 10.2


The error is shown below; the full log can be found here: https://pastebin.com/raw/0WQw8ktB

2021-06-10 22:03:04.201770: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-06-10 22:03:04.420481: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-06-10 22:03:05.034154: E tensorflow/stream_executor/cuda/cuda_dnn.cc:319] Loaded runtime CuDNN library: 7.4.2 but source was compiled with: 7.6.4.  CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library.  If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2021-06-10 22:03:05.038684: E tensorflow/stream_executor/cuda/cuda_dnn.cc:319] Loaded runtime CuDNN library: 7.4.2 but source was compiled with: 7.6.4.  CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library.  If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
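The rule stated in that message can be written down as a small check. This is only a sketch of my reading of the log text, not TensorFlow's own code; the function names are made up for illustration, and the patch level is ignored because the message does not mention it.

```python
def parse_version(text):
    """Turn a string like '7.6.4' into a tuple of ints (7, 6, 4)."""
    return tuple(int(part) for part in text.strip().split("."))


def runtime_is_compatible(loaded, compiled):
    """Return True if the loaded cuDNN satisfies the rule quoted in the log.

    Major versions must match. The minor version must match, or may be
    higher when the compiled-against major version is 7 or later.
    The patch level is not compared (assumption, see lead-in).
    """
    loaded_major, loaded_minor = parse_version(loaded)[:2]
    compiled_major, compiled_minor = parse_version(compiled)[:2]
    if loaded_major != compiled_major:
        return False
    if loaded_minor == compiled_minor:
        return True
    return compiled_major >= 7 and loaded_minor > compiled_minor


print(runtime_is_compatible("7.4.2", "7.6.4"))  # False: minor 4 is lower than 6
print(runtime_is_compatible("7.6.5", "7.6.4"))  # True: same major and minor
```

Under this reading, any 7.6.x runtime should satisfy a build compiled against 7.6.4, while 7.4.2 does not.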

This is what I see in the NVIDIA cuDNN archive:

https://developer.nvidia.com/rdp/cudnn-archive

Download cuDNN v7.6.4 (September 27, 2019), for CUDA 10.1
Download cuDNN v7.6.4 (September 27, 2019), for CUDA 10.0
Download cuDNN v7.6.4 (September 27, 2019), for CUDA 9.2
Download cuDNN v7.6.4 (September 27, 2019), for CUDA 9.0

As you can see, the archive lists no cuDNN 7.6.4 build for CUDA 10.2. However, I need CUDA 10.2 for the rest of my framework. tensorflow-gpu 2.2 works with CUDA 10.2, but I get the error above, which implies I need cuDNN 7.6.4 instead of the 7.4.2 that is currently loaded.

python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"
v2.2.0-rc4-8-g2b96f3662b 2.2.0
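The cuDNN version that the dynamic linker actually picks up can be queried in a similar way. The sketch below assumes the soname `libcudnn.so.7` seen in the log and assumes the cuDNN 7.x packing of the version number (major * 1000 + minor * 100 + patch); the helper names are hypothetical.

```python
import ctypes


def decode_cudnn_version(number):
    """Split cuDNN 7.x's packed version integer, e.g. 7605 -> (7, 6, 5)."""
    major, rest = divmod(number, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def loaded_cudnn_version(soname="libcudnn.so.7"):
    """Load cuDNN through the dynamic linker and ask it for its version.

    Returns None if the library (or one of its dependencies) cannot be loaded.
    """
    try:
        lib = ctypes.CDLL(soname)
    except OSError:
        return None
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    return decode_cudnn_version(lib.cudnnGetVersion())


# Prints a (major, minor, patch) tuple, or None when cuDNN is not found.
print(loaded_cudnn_version())
```

Because this goes through the same library search path as TensorFlow, it should reflect changes to LD_LIBRARY_PATH.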

GPU model and memory:

GeForce 1080 Ti (2x) each 12GB memory

$ stat /usr/local/cuda
  File: ‘/usr/local/cuda’ -> ‘/usr/local/cuda-10.2’
  Size: 20          Blocks: 0          IO Block: 4096   symbolic link
Device: fd00h/64768d    Inode: 67157410    Links: 1
Access: (0777/lrwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Context: unconfined_u:object_r:usr_t:s0
Access: 2021-06-10 22:12:20.673080083 -0400
Modify: 2020-09-21 09:39:18.559883390 -0400
Change: 2020-09-21 09:39:18.559883390 -0400
 Birth: -
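Since /usr/local/cuda is a symlink, it can help to resolve it and list which cuDNN builds sit in that toolkit's lib64 directory. This is a rough sketch that assumes the cuDNN 7 naming scheme libcudnn.so.MAJOR.MINOR.PATCH for the real library file; the function name is made up.

```python
import os
import re

# Matches fully versioned files such as libcudnn.so.7.6.5 (not the symlinks).
_CUDNN_FILE = re.compile(r"^libcudnn\.so\.(\d+)\.(\d+)\.(\d+)$")


def cudnn_versions_in(lib_dir):
    """Return sorted (major, minor, patch) tuples found in libcudnn file names."""
    found = []
    if not os.path.isdir(lib_dir):
        return found
    for name in os.listdir(lib_dir):
        match = _CUDNN_FILE.match(name)
        if match:
            found.append(tuple(int(group) for group in match.groups()))
    return sorted(found)


toolkit = os.path.realpath("/usr/local/cuda")  # follows the symlink shown above
print(toolkit, cudnn_versions_in(os.path.join(toolkit, "lib64")))
```

An empty list here would suggest that the cuDNN being loaded lives somewhere else on the library search path.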

and

[GCC 7.3.1 20180303 (Red Hat 7.3.1-5)] on linux

and

Python 3.8.5 (default, Mar 31 2021, 02:37:07)

tensorflow-gpu 2.2 was installed using pip, and

$ lsb_release -a
LSB Version:    :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description:    CentOS Linux release 7.9.2009 (Core)
Release:    7.9.2009
Codename:   Core

I also see this here (image not preserved), but I can't find the download file.


Solution

  • I installed cuDNN 7.6.5 for CUDA 10.2. It has the same major and minor version (7.6) as the 7.6.4 that TensorFlow was compiled against, which is what the error message asks for. After downloading cudnn-10.2-linux-x64-v7.6.5.32.tgz from the official NVIDIA website and extracting it, I ran:

    $ sudo cp cuda/include/cudnn*.h /usr/local/cuda/include

    $ sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64

    $ sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*

    and then:

    $ export LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64:$LD_LIBRARY_PATH
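To confirm that the copied header is the expected release, the version macros in cudnn.h can be read back. The sketch below assumes the cuDNN 7.x header layout, where CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL are defined in cudnn.h itself; the function name and the sample text are illustrative only.

```python
import re


def cudnn_header_version(header_text):
    """Extract (major, minor, patch) from the #define lines of a cuDNN header.

    Returns None if any of the three macros is missing.
    """
    values = {}
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+" + key + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        values[key] = int(match.group(1))
    return values["CUDNN_MAJOR"], values["CUDNN_MINOR"], values["CUDNN_PATCHLEVEL"]


# Stand-in for the relevant lines of a 7.6.5 header.
sample = """
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
#define CUDNN_PATCHLEVEL 5
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
"""
print(cudnn_header_version(sample))  # (7, 6, 5)
```

On the machine itself, the same function could be given the contents of /usr/local/cuda/include/cudnn.h, for example via open(...).read().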