TensorFlow GPU: is cudnn optional? Couldn't open CUDA library

I installed the tensorflow-0.8.0 GPU version, tensorflow-0.8.0-cp27-none-linux_x86_64.whl. It says it requires CUDA toolkit 7.5 and CuDNN v4.

# Ubuntu/Linux 64-bit, GPU enabled. Requires CUDA toolkit 7.5 and CuDNN v4.  For
# other versions, see "Install from sources" below.

However, I accidentally forgot to install CuDNN v4. Apart from the error message "Couldn't open CUDA library", everything works, and the log still reports "Creating TensorFlow device (/gpu:0)".

Log output without CuDNN:

I tensorflow/stream_executor/] successfully opened CUDA library locally
I tensorflow/stream_executor/] Couldn't open CUDA library LD_LIBRARY_PATH: /usr/local/cuda/lib64:
I tensorflow/stream_executor/cuda/] Unable to load cuDNN DSO
I tensorflow/stream_executor/] successfully opened CUDA library locally
I tensorflow/stream_executor/] successfully opened CUDA library locally
I tensorflow/stream_executor/] successfully opened CUDA library locally
('Extracting', 'MNIST_data/train-images-idx3-ubyte.gz')
/usr/lib/python2.7/ VisibleDeprecationWarning: converting an array with ndim > 0 to an index will result in an error in the future
  chunk = self.extrabuf[offset: offset + size]
/home/ubuntu/TensorFlow-Tutorials/ VisibleDeprecationWarning: converting an array with ndim > 0 to an index will result in an error in the future
  data = data.reshape(num_images, rows, cols, 1)
('Extracting', 'MNIST_data/train-labels-idx1-ubyte.gz')
('Extracting', 'MNIST_data/t10k-images-idx3-ubyte.gz')
('Extracting', 'MNIST_data/t10k-labels-idx1-ubyte.gz')
I tensorflow/stream_executor/cuda/] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/] Found device 0 with properties:
name: GRID K520
major: 3 minor: 0 memoryClockRate (GHz) 0.797
pciBusID 0000:00:03.0
Total memory: 4.00GiB
Free memory: 3.95GiB
I tensorflow/core/common_runtime/gpu/] DMA: 0
I tensorflow/core/common_runtime/gpu/] 0:   Y
I tensorflow/core/common_runtime/gpu/] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GRID K520, pci bus id: 0000:00:03.0)
I tensorflow/core/common_runtime/gpu/] PoolAllocator: After 1704 get requests, put_count=1321 evicted_count=1000 eviction_rate=0.757002 and unsatisfied allocation rate=0.870305
I tensorflow/core/common_runtime/gpu/] Raising pool_size_limit_ from 100 to 110
I tensorflow/core/common_runtime/gpu/] PoolAllocator: After 1704 get requests, put_count=1812 evicted_count=1000 eviction_rate=0.551876 and unsatisfied allocation rate=0.536972
I tensorflow/core/common_runtime/gpu/] Raising pool_size_limit_ from 256 to 281

Later I installed CuDNN, but I don't see any difference.

Log output with CuDNN:

I tensorflow/stream_executor/] successfully opened CUDA library locally
I tensorflow/stream_executor/] successfully opened CUDA library locally
I tensorflow/stream_executor/] successfully opened CUDA library locally
I tensorflow/stream_executor/] successfully opened CUDA library locally
I tensorflow/stream_executor/] successfully opened CUDA library locally
('Extracting', 'MNIST_data/train-images-idx3-ubyte.gz')
/usr/lib/python2.7/ VisibleDeprecationWarning: converting an array with ndim > 0 to an index will result in an error in the future
  chunk = self.extrabuf[offset: offset + size]
/home/ubuntu/TensorFlow-Tutorials/ VisibleDeprecationWarning: converting an array with ndim > 0 to an index will result in an error in the future
  data = data.reshape(num_images, rows, cols, 1)
('Extracting', 'MNIST_data/train-labels-idx1-ubyte.gz')
('Extracting', 'MNIST_data/t10k-images-idx3-ubyte.gz')
('Extracting', 'MNIST_data/t10k-labels-idx1-ubyte.gz')
I tensorflow/stream_executor/cuda/] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/] Found device 0 with properties:
name: GRID K520
major: 3 minor: 0 memoryClockRate (GHz) 0.797
pciBusID 0000:00:03.0
Total memory: 4.00GiB
Free memory: 3.95GiB
I tensorflow/core/common_runtime/gpu/] DMA: 0
I tensorflow/core/common_runtime/gpu/] 0:   Y
I tensorflow/core/common_runtime/gpu/] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GRID K520, pci bus id: 0000:00:03.0)
I tensorflow/core/common_runtime/gpu/] PoolAllocator: After 1704 get requests, put_count=1321 evicted_count=1000 eviction_rate=0.757002 and unsatisfied allocation rate=0.870305
I tensorflow/core/common_runtime/gpu/] Raising pool_size_limit_ from 100 to 110
I tensorflow/core/common_runtime/gpu/] PoolAllocator: After 1704 get requests, put_count=1811 evicted_count=1000 eviction_rate=0.552181 and unsatisfied allocation rate=0.537559
I tensorflow/core/common_runtime/gpu/] Raising pool_size_limit_ from 256 to 281

So what is the difference between running with and without CuDNN?


  • cuDNN is used to speed up a few TensorFlow operations, such as convolution. I noticed in your log that you're training on the MNIST dataset. The reference MNIST model provided with TensorFlow is built around two fully connected layers and a softmax, so TensorFlow won't attempt to call cuDNN when training this model. That would explain why both runs look the same.

    I'm not sure whether TensorFlow automatically falls back to a slower convolution algorithm when cuDNN isn't available. If it doesn't, you can always disable the use of cuDNN by setting the TF_USE_CUDNN environment variable to 0 before running TensorFlow.
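
As a rough illustration of the environment-variable suggestion, here is a minimal Python sketch. It only prepares an environment in which TF_USE_CUDNN is set to "0". The helper name `env_with_cudnn_disabled` is made up for this example, and whether a given TensorFlow build actually honors the variable is taken from the answer above rather than verified here.

```python
import os


def env_with_cudnn_disabled(base_env=None):
    """Return a copy of an environment mapping with TF_USE_CUDNN set to "0".

    base_env defaults to the current process environment. The input mapping
    is copied, so the caller's mapping is left unmodified.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Assumption from the answer above: TensorFlow reads this variable
    # to decide whether to use cuDNN.
    env["TF_USE_CUDNN"] = "0"
    return env
```

The returned mapping could be passed as `env=` to `subprocess.call` when launching a training script in a child process. Alternatively, assigning `os.environ["TF_USE_CUDNN"] = "0"` at the very top of the script, before `import tensorflow`, is the conservative in-process variant, since it is not clear at which point the variable is read.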