I am trying to run the following Hugging Face Transformers tutorial on GCP's AI Platform Notebook with 32 vCPUs, 208 GB RAM, and 2 NVIDIA Tesla T4s.
However, when I run this part:
model = DistillBERTClass()
model.to(device)
I get the following AssertionError:
AssertionError: The NVIDIA driver on your system is too old (found version 10010).
Please update your GPU driver by downloading and installing a new
version from the URL: http://www.nvidia.com/Download/index.aspx
Alternatively, go to: https://pytorch.org to install
a PyTorch version that has been compiled with your version
of the CUDA driver.
However, this is the output of !nvidia-smi:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.01    Driver Version: 418.87.01    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   38C    P0    22W /  70W |     10MiB / 15079MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla T4            Off  | 00000000:00:05.0 Off |                    0 |
| N/A   39C    P8    10W /  70W |     10MiB / 15079MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
The NVIDIA driver version is compatible with the latest PyTorch version, which I am using. Has anyone else run into this error, and is there a way around it?
You can try a newer NVIDIA driver version. We support the latest CUDA 11 driver, so you can create an instance from a CUDA 11 image and then install PyTorch on top of it:
gcloud beta notebooks instances create cuda11 \
--vm-image-project=deeplearning-platform-release \
--vm-image-family=common-cu110-notebooks-debian-9 \
--machine-type=n1-standard-1 \
--location=us-west1-a \
--format=json
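For context on the error in the question: the 10010 in the message is CUDA's integer version encoding (1000 * major + 10 * minor), so it corresponds to CUDA 10.1, which matches the nvidia-smi output. The sketch below decodes that number and compares it with the CUDA version a PyTorch build was compiled against. The helper names are hypothetical, the "10.2" value is only an assumed example of a build's CUDA version, and the plain "driver >= build" comparison is a simplification of NVIDIA's actual compatibility rules.

```python
def decode_cuda_version(raw):
    """Decode CUDA's integer version (1000 * major + 10 * minor) into (major, minor)."""
    return raw // 1000, (raw % 1000) // 10


def driver_supports_build(driver_raw, build_cuda):
    """Rough check (a simplification): the driver's CUDA version should be at
    least the CUDA version the framework build was compiled against."""
    major, minor = (int(part) for part in build_cuda.split(".")[:2])
    return decode_cuda_version(driver_raw) >= (major, minor)


# 10010 is the value reported in the AssertionError from the question.
print(decode_cuda_version(10010))            # (10, 1)
# "10.2" is an assumed build version, used here only for illustration.
print(driver_supports_build(10010, "10.2"))  # False
print(driver_supports_build(10010, "10.1"))  # True
```

On the notebook itself, torch.version.cuda reports the CUDA version the installed PyTorch was built against (it is None for CPU-only builds), so that string can be passed as the second argument.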
Image family: