Tags: tensorflow, google-cloud-vertex-ai, cudnn

Upgrading cuDNN version in Vertex AI Notebook [Kernel Restarting Problem]


Problem: The cuDNN version is incompatible with TensorFlow and CUDA. The kernel dies and I am unable to start training in Vertex AI.

Current versions:

import tensorflow as tf
from tensorflow.python.platform import build_info as build
print(f"tensorflow version: {tf.__version__}")
print(f"Cuda Version: {build.build_info['cuda_version']}")
print(f"Cudnn version: {build.build_info['cudnn_version']}")
Output:

tensorflow version: 2.10.0
Cuda Version: 11.2
Cudnn version: 8

According to the information available here (shown in the attached screenshot), the cuDNN version must be 8.1.
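A side note on the check above: `build_info['cudnn_version']` reports the cuDNN version TensorFlow was built against, and the `8` printed here looks like the major version only, so it may not by itself contradict the 8.1 requirement. As a minimal sketch, the installed runtime version could be read through cuDNN's `cudnnGetVersion()` function. This assumes a Linux image where the library is loadable as `libcudnn.so.8` (an assumption, not something stated in the question), and the helper names are hypothetical.

```python
import ctypes


def decode_cudnn_version(raw: int) -> tuple:
    """Decode a cuDNN 8.x version integer (major*1000 + minor*100 + patch)."""
    major, rest = divmod(raw, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def installed_cudnn_version(lib_name: str = "libcudnn.so.8"):
    """Return (major, minor, patch) of the loadable cuDNN library, or None.

    The default library name is an assumption about the notebook image.
    """
    try:
        lib = ctypes.CDLL(lib_name)
    except OSError:
        # Library not found on the loader path (or not a GPU image).
        return None
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    return decode_cudnn_version(lib.cudnnGetVersion())


print("Installed cuDNN:", installed_cudnn_version())
```

If this prints a tuple starting with `(8, 1, ...)` or a later 8.x release, the runtime library is probably not the cause of the mismatch. The decoding applies to cuDNN 8.x and would need adjusting for other major versions.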


A similar question about upgrading cuDNN in Google Colab has been asked here, but it does not solve my issue. All the other online sources I found apply to Anaconda environments only.

How can I upgrade cuDNN in my case?

Thank you.


Solution

  • I tried several combinations of TensorFlow, CUDA, and cuDNN versions in Google Colab, and the following combination worked [OS: Ubuntu 20.04]:

    tensorflow version: 2.9.2
    Cuda Version: 11.2
    Cudnn version: 8
    

    Therefore, I downgraded TensorFlow in Vertex AI from 2.10.0 to 2.9.2, and it worked (this solved only the incompatibility issue). I was still searching for a solution to the kernel restarting.

    UPDATE:

    The kernel restarting problem was fixed after I changed the kernel from Tensorflow 2 (Local) to Python (Local) in the Vertex AI notebook, as shown in the attached image [the kernel-changing option is at the top right, near the bug symbol].

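The downgrade described above can be sketched roughly as follows. This is only an illustration: the helper names are hypothetical, and it assumes the package should be installed into the interpreter that backs the currently selected notebook kernel, which is why the command is built from `sys.executable` rather than a bare `pip`.

```python
import importlib.metadata
import sys

TARGET_VERSION = "2.9.2"  # the version reported as working in this answer


def pip_install_command(version: str = TARGET_VERSION) -> list:
    """Build a pip command that targets the current kernel's interpreter."""
    return [sys.executable, "-m", "pip", "install", f"tensorflow=={version}"]


def installed_version(package: str = "tensorflow"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return None


print("Install with:", " ".join(pip_install_command()))
print("Currently installed:", installed_version())
```

The command list can be passed to `subprocess.check_call` from a notebook cell. After the install and a kernel restart, `installed_version()` should report 2.9.2, and `tf.config.list_physical_devices('GPU')` can be used to confirm that the GPU is still visible from the Python (Local) kernel.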