Tags: tensorflow, cudnn

Tensorflow CUDA fails with error "failed to enqueue convolution on stream: CUDNN_STATUS_EXECUTION_FAILED"


Here is some of my console output. I am unsure what the actual problem is. When this is displayed, I get a Windows prompt stating that Python.exe has stopped working, with the cause given as ucrtbase.dll, but I've tried updating that and it still happens, so I think that is a result of the real problem. I am also notified by a taskbar message that my Nvidia kernel driver crashed but recovered.

2017-11-04 17:48:17.363024: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-11-04 17:48:17.375024: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-04 17:48:19.995174: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:955] Found device 0 with properties: 
name: Quadro K1100M
major: 3 minor: 0 memoryClockRate (GHz) 0.7055
pciBusID 0000:01:00.0
Total memory: 2.00GiB
Free memory: 1.93GiB
2017-11-04 17:48:19.995174: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:976] DMA: 0 
2017-11-04 17:48:19.995174: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:986] 0:   Y 
2017-11-04 17:48:20.018175: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Quadro K1100M, pci bus id: 0000:01:00.0)
2017-11-04 17:49:35.796510: W C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1.93GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.
2017-11-04 17:49:41.811854: E C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_driver.cc:1068] failed to synchronize the stop event: CUDA_ERROR_UNKNOWN
2017-11-04 17:49:41.811854: E C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_timer.cc:54] Internal: error destroying CUDA event in context 0000000026CFBE70: CUDA_ERROR_UNKNOWN
2017-11-04 17:49:41.811854: E C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_timer.cc:59] Internal: error destroying CUDA event in context 0000000026CFBE70: CUDA_ERROR_UNKNOWN
2017-11-04 17:49:41.811854: F C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\stream_executor\cuda\cuda_dnn.cc:2045] failed to enqueue convolution on stream: CUDNN_STATUS_EXECUTION_FAILED

Solution

  • If you're still looking for an answer, try reducing the batch size. I'm not entirely sure what causes the error (there's no explanation on GitHub either), but reducing the batch size fixed it for me; see the sketch below.
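
Not an explanation of the root cause, but a minimal sketch of the two knobs that usually help on a 2 GiB card like the K1100M: a smaller batch size and on-demand GPU memory growth. The toy model and data below are made up for illustration; only the batch_size value and the allow_growth option matter, and it is written against the TF 1.x API the question is using.

    import numpy as np
    import tensorflow as tf

    # Hypothetical knob: halve this until the cuDNN error stops appearing.
    batch_size = 8

    # Toy convolution, stand-in for whatever model triggered the failure.
    x = tf.placeholder(tf.float32, [None, 64, 64, 3])
    conv = tf.layers.conv2d(x, filters=32, kernel_size=3, activation=tf.nn.relu)
    out = tf.reduce_mean(conv)

    # Let TensorFlow grow GPU memory on demand instead of grabbing nearly all
    # of it up front (the log shows the BFC allocator failing to get 1.93GiB).
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True

    with tf.Session(config=config) as sess:
        sess.run(tf.global_variables_initializer())
        data = np.random.rand(batch_size, 64, 64, 3).astype(np.float32)
        print(sess.run(out, feed_dict={x: data}))

If the error only disappears with a very small batch, the model (or the convolution workspace cuDNN requests for it) is likely just too large for 2 GiB of GPU memory.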