Tags: docker, cuda, nvidia, yocto, nvidia-jetson

Nvidia Jetson Nano with docker


I am running a custom Yocto image on the Nvidia Jetson Nano that has docker-ce (v19.03.2) included. I am able to run docker without problems.

The problem comes when I want to use docker for vision testing. I need access to host-side CUDA and TensorRT, which are provided through the NVIDIA Container Runtime on top of docker-ce. To test on my build, I installed the NVIDIA Container Runtime (v0.9.0 beta) manually (extracted the necessary .deb packages and copied their contents into the rootfs), and it seems to be working fine.
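
The manual install was roughly along these lines (the package file names here are illustrative, not exact):

$ # Unpack each runtime .deb straight into the target rootfs
$ dpkg-deb -x libnvidia-container0_0.9.0*_arm64.deb rootfs/
$ dpkg-deb -x nvidia-container-runtime_*_arm64.deb rootfs/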

When I run docker info I can see that the nvidia runtime is available, and Docker doesn't complain when I start a container with docker run -it --runtime=nvidia <image>.
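
For reference, the runtime is registered with Docker in the standard way via /etc/docker/daemon.json (this is the stock nvidia-container-runtime configuration):

$ cat /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}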

If I run the deviceQuery test OUTSIDE docker, I get the following:

$ /usr/bin/cuda-samples/deviceQuery
...
CUDA Driver Version / Runtime Version          10.0 / 10.0
CUDA Capability Major/Minor version number:    5.3
...
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.0, CUDA Runtime Version = 10.0, NumDevs = 1
Result = PASS

However, when I run deviceQuery IN a docker container to test CUDA availability, it fails:

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
Result = FAIL

Where the Dockerfile is as follows:

FROM nvcr.io/nvidia/l4t-base:r32.2

COPY ./deviceQuery /tmp/deviceQuery

WORKDIR /tmp/

CMD ["./deviceQuery"]

So my questions are:

Why does the deviceQuery test fail inside the container even though the NVIDIA Container Runtime is installed alongside Docker, and how can I fix this issue?

------------EDIT:----------

More on this in this thread on the NVIDIA DevTalk forum.


Solution

  • The .csv files included in the rootfs by the NVIDIA SDK Manager contain the specific lib/dir/sym entries that are needed to pass GPU access through to the container. The files listed in the .csv files are mounted into the container, which is what gives it access to them (the entry format is sketched below). Which specific files are needed depends on what the container requires.

    It is of course very important that the paths listed in the .csv files actually exist on the host; otherwise the mount will fail. On a default Yocto setup these paths are wrong, because they were written for the rootfs layout of the default NVIDIA SDK Manager image, so they need to be corrected (a quick way to find the broken entries is sketched below).

    Once the paths are corrected, GPU acceleration should be accessible in the container, which can be confirmed by running the deviceQuery test again.
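
    The entry format looks roughly like this; the paths below are from the stock L4T layout, so the exact file name and paths on a Yocto image will differ:

    $ cat /etc/nvidia-container-runtime/host-files-for-container.d/l4t.csv
    lib, /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1.1
    sym, /usr/lib/aarch64-linux-gnu/tegra/libcuda.so
    dir, /usr/local/cuda-10.0/targets/aarch64-linux
    dev, /dev/nvhost-ctrl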
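
    And a minimal sketch for finding the broken entries, assuming the .csv files live in the usual /etc/nvidia-container-runtime/host-files-for-container.d/ location:

    # Print every path referenced by the .csv files that is missing on the host
    for csv in /etc/nvidia-container-runtime/host-files-for-container.d/*.csv; do
        while IFS=', ' read -r type path; do
            [ -n "$path" ] || continue            # skip blank lines
            [ -e "$path" ] || echo "missing: $path ($csv)"
        done < "$csv"
    done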