Tags: docker, gpu, minikube, nvidia-docker

Cannot use GPU on Minikube with Docker driver


Goal:

I'm trying to use Nvidia GPU capabilities on a Minikube cluster that uses the default Docker driver.

Problem:

I'm able to use nvidia-docker with the default Docker context, but after pointing my shell at Minikube's Docker daemon with minikube docker-env, I get the following error:

$ docker run --gpus all nvidia/cuda:10.0-base nvidia-smi
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
ERRO[0000] error waiting for container: context canceled
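
The error is consistent with the request reaching the Docker daemon inside the Minikube node rather than the host daemon, and that daemon most likely lacks the NVIDIA container tooling installed on the host. As a minimal sketch for confirming which daemon the shell is addressing, the helper below inspects the variables that minikube docker-env exports. The function name docker_target is hypothetical, and the check assumes no other Docker context has been selected.

```shell
# Sketch: report which Docker daemon the current shell is addressing.
# docker_target is an illustrative name, not part of Docker or Minikube.
docker_target() {
  if [ -n "${MINIKUBE_ACTIVE_DOCKERD:-}" ]; then
    # Exported by `minikube docker-env`; holds the Minikube profile name.
    printf 'minikube profile: %s\n' "$MINIKUBE_ACTIVE_DOCKERD"
  elif [ -n "${DOCKER_HOST:-}" ]; then
    # Some other remote or non-default daemon.
    printf 'remote daemon: %s\n' "$DOCKER_HOST"
  else
    # Nothing overridden: the client uses its default daemon.
    printf 'default daemon\n'
  fi
}
```

Calling docker_target before and after eval $(minikube docker-env) should show the switch, and eval $(minikube docker-env -u) returns the shell to the host daemon.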

Environment:

  • Ubuntu 18.04
  • Minikube v1.10.0
  • Docker version:
$ docker version
Client: Docker Engine - Community
 Version:           19.03.10
 API version:       1.40
 Go version:        go1.13.10
 Git commit:        9424aeaee9
 Built:             Thu May 28 22:16:49 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          19.03.2
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.9
  Git commit:       6a30dfca03
  Built:            Wed Sep 11 22:45:55 2019
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.3.3-14-g449e9269
  GitCommit:        449e926990f8539fd00844b26c07e2f1e306c760
 runc:
  Version:          1.0.0-rc10
  GitCommit:        
 docker-init:
  Version:          0.18.0
  GitCommit:
  • Nvidia Container Runtime version:
$ nvidia-container-runtime --version
runc version 1.0.0-rc10
commit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
spec: 1.0.1-dev

Additional Info:

The cluster was created with:

minikube start --cpus 3 --memory 8G

The following minikube addons are currently enabled:

$ minikube addons list
|-----------------------------|----------|--------------|
|         ADDON NAME          | PROFILE  |    STATUS    |
|-----------------------------|----------|--------------|
| dashboard                   | minikube | disabled     |
| default-storageclass        | minikube | enabled ✅    |
| efk                         | minikube | disabled     |
| freshpod                    | minikube | disabled     |
| gvisor                      | minikube | disabled     |
| helm-tiller                 | minikube | disabled     |
| ingress                     | minikube | disabled     |
| ingress-dns                 | minikube | disabled     |
| istio                       | minikube | disabled     |
| istio-provisioner           | minikube | disabled     |
| logviewer                   | minikube | disabled     |
| metallb                     | minikube | disabled     |
| metrics-server              | minikube | disabled     |
| nvidia-driver-installer     | minikube | enabled ✅    |
| nvidia-gpu-device-plugin    | minikube | enabled ✅    |
| registry                    | minikube | disabled     |
| registry-aliases            | minikube | disabled     |
| registry-creds              | minikube | disabled     |
| storage-provisioner         | minikube | enabled ✅    |
| storage-provisioner-gluster | minikube | disabled     |
|-----------------------------|----------|--------------|

And this is a working example outside the minikube context:

$ docker run --gpus all nvidia/cuda:10.0-base nvidia-smi
Fri Jun  5 09:23:49 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.59       Driver Version: 440.59       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   51C    P8     6W / 120W |   1293MiB /  6077MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Solution

  • This is a community wiki answer. Feel free to edit and expand it if needed.

    With the Minikube version in the question (v1.10.0), NVIDIA GPUs are not officially supported with the docker driver. This leaves you with two possible options:

    1. Try to use the NVIDIA Container Toolkit and the NVIDIA device plugin. This is a workaround and might not be the best fit for your use case.

    2. Use the kvm2 driver or the none driver. Both are officially supported and documented for GPU use.
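
    Option 2 can be sketched roughly as follows, based on the commands in Minikube's GPU tutorial. The function names run, start_kvm2_gpu and start_none_gpu are hypothetical, flags may differ between Minikube versions, and host prerequisites are assumed rather than shown: PCI passthrough (IOMMU) for kvm2, and an NVIDIA driver plus nvidia-docker configured as the default runtime for none.

```shell
#!/usr/bin/env bash
# Sketch only: thin wrappers around the documented GPU setup commands.
# Function names are illustrative and not part of Minikube.

# Set DRY_RUN=1 to print each command instead of executing it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# kvm2 driver: the GPU is passed through to the VM, so both the driver
# installer and the device plugin addons are enabled in the cluster.
start_kvm2_gpu() {
  run minikube start --driver kvm2 --kvm-gpu
  run minikube addons enable nvidia-gpu-device-plugin
  run minikube addons enable nvidia-driver-installer
}

# none driver: Kubernetes runs directly on the host and relies on the
# host's NVIDIA setup, so only the device plugin addon is enabled.
# This driver typically needs root privileges.
start_none_gpu() {
  run minikube start --driver=none
  run minikube addons enable nvidia-gpu-device-plugin
}
```

    Running DRY_RUN=1 start_kvm2_gpu prints the commands without touching the cluster, which is a safe way to review them before deleting and recreating an existing docker-driver cluster.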

    I hope it helps.