I want to use a CUDA container in Docker as a non-root user, but I am running into permission problems. Here's an example Dockerfile:
FROM nvidia/cudagl:11.2.2-runtime-ubuntu18.04
RUN useradd -ms /bin/bash testuser -G video,sudo
USER testuser
ENTRYPOINT "/bin/bash"
Running nvidia-smi gives the following error: Failed to initialize NVML: Insufficient Permissions
In case it's relevant, my application uses VirtualGL and Xvfb to render Chrome with a GPU. Everything works perfectly fine as the root user.
TL;DR: Check the gid of the vglusers group on the host. Create a group with that same gid in the container, and add the user to it.
Investigating this a bit, I looked at the nvidia devices in the container:
root@56cef279b83f:/# cd /dev
root@56cef279b83f:/dev# ls -l | grep nvidia
crw-rw---- 1 root 1005 195, 0 Nov 23 23:13 nvidia0
crw-rw---- 1 root 1005 195, 255 Nov 23 23:13 nvidiactl
crw-rw---- 1 root 1005 195, 254 Nov 23 23:13 nvidia-modeset
crw-rw-rw- 1 root root 506, 0 Nov 23 23:13 nvidia-uvm
crw-rw-rw- 1 root root 506, 1 Nov 23 23:13 nvidia-uvm-tools
The nvidia0, nvidiactl, and nvidia-modeset devices belonged to a group with gid 1005. This was odd, as there was no group in the container with that ID, which is why ls shows the bare number instead of a group name.
I then looked at the devices on the host. As per my VGL setup, they belong to root or to the vglusers group.
(venv) jsim@goliath:/var/log$ cd /dev/
(venv) jsim@goliath:/dev$ ls -l | grep nvidia
crw-rw---- 1 root vglusers 195, 0 Nov 24 10:13 nvidia0
drwxr-xr-x 2 root root 80 Nov 24 10:31 nvidia-caps
crw-rw---- 1 root vglusers 195, 255 Nov 24 10:13 nvidiactl
crw-rw---- 1 root vglusers 195, 254 Nov 24 10:13 nvidia-modeset
crw-rw-rw- 1 root root 506, 0 Nov 24 10:13 nvidia-uvm
crw-rw-rw- 1 root root 506, 1 Nov 24 10:13 nvidia-uvm-tools
As it turns out, vglusers has a gid of 1005!
jsim@goliath:/dev$ cat /etc/group | grep vglusers
vglusers:x:1005:jsim
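The same lookup can be scripted instead of read off by eye. This is only a minimal sketch, assuming GNU stat and getent are available on the host; the helper name gid_from_group_entry is made up for illustration and is not part of any tool mentioned here.

```shell
# Hypothetical helper: print the numeric gid (third colon-separated field)
# of an /etc/group-style entry.
gid_from_group_entry() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Example with the entry shown above.
gid_from_group_entry 'vglusers:x:1005:jsim'   # prints 1005

# On the host: numeric gid that owns the NVIDIA control device (GNU stat).
if [ -e /dev/nvidiactl ]; then
    stat -c '%g' /dev/nvidiactl
fi

# On the host: gid of the vglusers group, if such a group exists.
if entry=$(getent group vglusers); then
    gid_from_group_entry "$entry"
fi
```

If the two numbers printed on the host match the bare gid seen inside the container, the container is simply missing a group with that gid.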
So in my Dockerfile, all I had to do was add the group vglusers with gid 1005 and add my user to this group. Problem solved.
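# Replaces the original useradd line. The gid must match the host's vglusers gid (1005 here).
# vglusers becomes testuser's primary group; video and sudo are added as supplementary groups.
RUN groupadd -g 1005 vglusers && \
    useradd -ms /bin/bash testuser -u 1000 -g 1005 && \
    usermod -a -G video,sudo testuser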
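To confirm the fix from inside the rebuilt container, one option is to compare the gid that owns the device with the gids the current user holds. The following is a rough sketch under the assumption that GNU stat is present in the image; has_gid is a hypothetical helper name, not an existing command.

```shell
# Hypothetical helper: succeed if gid $1 appears in the space-separated list $2.
has_gid() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Inside the container: does the current user hold the gid that owns nvidiactl?
if [ -e /dev/nvidiactl ]; then
    dev_gid=$(stat -c '%g' /dev/nvidiactl)
    if has_gid "$dev_gid" "$(id -G)"; then
        echo "ok: current user is in gid $dev_gid"
    else
        echo "missing: current user is not in gid $dev_gid"
    fi
fi
```

If the check reports the gid as missing, nvidia-smi run as that user would be expected to fail with the same permissions error as before.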