The error is shown below; the full log is available here: https://pastebin.com/raw/0WQw8ktB
2021-06-10 22:03:04.201770: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-06-10 22:03:04.420481: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-06-10 22:03:05.034154: E tensorflow/stream_executor/cuda/cuda_dnn.cc:319] Loaded runtime CuDNN library: 7.4.2 but source was compiled with: 7.6.4. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2021-06-10 22:03:05.038684: E tensorflow/stream_executor/cuda/cuda_dnn.cc:319] Loaded runtime CuDNN library: 7.4.2 but source was compiled with: 7.6.4. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
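To make the rule in that message concrete, here is a minimal Python sketch of the check as the log text describes it: the major versions must match, and for cuDNN 7.0 or later the runtime minor version must be equal or higher. The function names are made up for illustration, and this is only a reading of the message wording, not TensorFlow's actual implementation.

```python
def parse_version(text):
    """Turn a dotted version string such as '7.6.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def cudnn_runtime_ok(runtime, compiled):
    """Apply the rule quoted in the log message (hypothetical helper).

    Major versions must match. For cuDNN 7.0 or later the runtime minor
    version may be equal or higher; before 7.0 the minor must match too.
    The patch level is ignored, as the message does not mention it.
    """
    r_major, r_minor = parse_version(runtime)[:2]
    c_major, c_minor = parse_version(compiled)[:2]
    if r_major != c_major:
        return False
    if r_major >= 7:
        return r_minor >= c_minor
    return r_minor == c_minor


print(cudnn_runtime_ok("7.4.2", "7.6.4"))  # False: the situation in the log
print(cudnn_runtime_ok("7.6.5", "7.6.4"))  # True: a newer 7.6.x runtime passes
```

Under this reading, any 7.6.x or later 7.x runtime would satisfy a build compiled against 7.6.4, which is why a 7.6.5 package is a candidate.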
This is what I see in the NVIDIA cuDNN archive:
https://developer.nvidia.com/rdp/cudnn-archive
Download cuDNN v7.6.4 (September 27, 2019), for CUDA 10.1
Download cuDNN v7.6.4 (September 27, 2019), for CUDA 10.0
Download cuDNN v7.6.4 (September 27, 2019), for CUDA 9.2
Download cuDNN v7.6.4 (September 27, 2019), for CUDA 9.0
As you can see, there is no cuDNN 7.6.4 download for CUDA 10.2. However, I need to use CUDA 10.2 for the rest of my framework. tensorflow-gpu 2.2 works with CUDA 10.2, but I get the error above, which implies that I need cuDNN 7.6.4 instead of the 7.4.2 that is currently being loaded.
$ python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"
v2.2.0-rc4-8-g2b96f3662b 2.2.0
GPU model and memory:
2x GeForce 1080 Ti, 12GB memory each
$ stat /usr/local/cuda
File: ‘/usr/local/cuda’ -> ‘/usr/local/cuda-10.2’
Size: 20 Blocks: 0 IO Block: 4096 symbolic link
Device: fd00h/64768d Inode: 67157410 Links: 1
Access: (0777/lrwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:usr_t:s0
Access: 2021-06-10 22:12:20.673080083 -0400
Modify: 2020-09-21 09:39:18.559883390 -0400
Change: 2020-09-21 09:39:18.559883390 -0400
Birth: -
The Python interpreter reports:
Python 3.8.5 (default, Mar 31 2021, 02:37:07)
[GCC 7.3.1 20180303 (Red Hat 7.3.1-5)] on linux
tensorflow-gpu 2.2 was installed using pip. The OS is:
$ lsb_release -a
LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.9.2009 (Core)
Release: 7.9.2009
Codename: Core
I also saw the following report, but I can't find the download file it refers to:
Installed cuDNN 7.6.5 for CUDA 10.2 using these commands after downloading cudnn-10.2-linux-x64-v7.6.5.32.tgz from the official NVIDIA website:
$ sudo cp cuda/include/cudnn*.h /usr/local/cuda/include
$ sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
and then:
$ export LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64:$LD_LIBRARY_PATH
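After copying the files and updating LD_LIBRARY_PATH, one way to confirm which cuDNN the dynamic loader actually picks up is to ask the library itself. The sketch below assumes the soname libcudnn.so.7 (the one shown in the log) and that cuDNN 7.x encodes its version as major*1000 + minor*100 + patch; the helper names are hypothetical.

```python
import ctypes


def decode_cudnn_version(number):
    """Split the integer returned by cudnnGetVersion() into (major, minor, patch).

    Assumes the cuDNN 7.x encoding: major*1000 + minor*100 + patch.
    """
    major, rest = divmod(number, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def loaded_cudnn_version(soname="libcudnn.so.7"):
    """Load cuDNN through the normal library search path and query its version."""
    lib = ctypes.CDLL(soname)  # raises OSError if the library cannot be found
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    return decode_cudnn_version(lib.cudnnGetVersion())


if __name__ == "__main__":
    try:
        print("cuDNN loaded at runtime: %d.%d.%d" % loaded_cudnn_version())
    except OSError as err:
        print("could not load libcudnn.so.7:", err)
```

If this still prints 7.4.2 after the copy, an older libcudnn earlier in the search path is probably shadowing the new one. Run it from the same shell and environment that launches TensorFlow so the search path is the same.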