cuda shared-libraries eclipse-cdt ld

Can run executables using CUDA from command-line, but failing to find some .so file when debugging them


I've written a CUDA application which compiles and runs. However, when I try to debug or run it through Eclipse CDT, or through kdbg, I get an error message such as:

/path/to/executable: error while loading shared libraries: libnvToolsExt.so.1: cannot open shared object file: No such file or directory

or a similar message with libcudart.so.10.2 instead.

Why is this happening if the executable runs on its own, and what can I do about it?

Information about my system:

  • A Debian-derived GNU/Linux
  • CUDA 10.2 installed manually (with no distribution-supplied CUDA packages installed)
  • Eclipse CDT version 2018-09 (4.9.0)
  • kdbg version 2.5.5
  • X86_64 machine

Solution

  • A manual installation of the CUDA toolkit (with or without the NVIDIA kernel driver) does not make its libraries "visible" system-wide. A non-CUDA binary (compiler, linker, loader, etc.) will simply not be aware of the installation. Specifically, when you run an executable built against shared libraries, the dynamic loader (ld.so on GNU/Linux, as opposed to the link-time linker GNU ld) must be able to find those libraries. For a given executable, you can list them using readelf (among other methods). A typical example:

    $ readelf -d my_cuda_app | grep 'NEEDED'
     0x0000000000000001 (NEEDED)             Shared library: [libnvToolsExt.so.1]
     0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
     0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
     0x0000000000000001 (NEEDED)             Shared library: [librt.so.1]
     0x0000000000000001 (NEEDED)             Shared library: [libcudart.so.10.2]
     0x0000000000000001 (NEEDED)             Shared library: [libcupti.so.10.2]
     0x0000000000000001 (NEEDED)             Shared library: [libOpenCL.so.1]
     0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
     0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
     0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
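
readelf only lists what the executable asks for. To see which of those the loader fails to resolve in the current environment, ldd can be filtered for "not found" entries. A minimal sketch, assuming ldd is available; the function name missing_libs is made up for illustration:

```shell
#!/bin/sh
# List the shared libraries the dynamic loader cannot resolve for a binary.
# ldd prints one line per dependency; unresolved ones read "<name> => not found".
missing_libs() {
    ldd "$1" 2>/dev/null | awk '/=> not found/ { print $1 }'
}

# Example (my_cuda_app is the binary from the readelf listing):
#   missing_libs ./my_cuda_app
```

Since ldd honors the environment it runs in, calling it as (unset LD_LIBRARY_PATH; missing_libs ./my_cuda_app) approximates what a process started without that variable would see.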
    

    There are (at least) two ways of making a shared library accessible (i.e. adding its directory to the dynamic loader's search path):

    1. Add the library's directory to the LD_LIBRARY_PATH environment variable.
    2. Add the library's directory to /etc/ld.so.conf, or to a file in /etc/ld.so.conf.d if your /etc/ld.so.conf includes configuration files from that subdirectory, and then run ldconfig to rebuild the loader's cache.

    Unfortunately, the manual CUDA installer does not offer to apply the second approach for you, nor does it suggest that you may want to do so yourself.
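
For reference, the first approach typically amounts to a line in a shell startup file. A small sketch, with a hypothetical helper name, that prepends a directory without leaving an empty path entry behind:

```shell
# Prepend a directory to LD_LIBRARY_PATH without leaving an empty entry
# (the loader treats an empty entry as the current working directory).
prepend_ld_library_path() {
    LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
}

# Example, e.g. in ~/.bashrc (adjust the path to your CUDA installation):
#   prepend_ld_library_path /usr/local/cuda-10.2/targets/x86_64-linux/lib
```

A variable set this way only reaches processes started from that shell (or its descendants), which is why it can work on the command line and still be absent elsewhere.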

    You must have chosen the first of these two approaches, which is why you can execute your binary from within a shell session. However, Eclipse CDT and kdbg (and possibly other IDEs and debuggers) are rather strict about how they execute built programs, and appear to scrub LD_LIBRARY_PATH from the executable's environment.
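
One way to check whether the variable actually reaches the IDE or debugger is to read that process's environment from /proc. This is a Linux-specific sketch; the function name proc_env_value and the pgrep pattern in the example are illustrative assumptions:

```shell
# Print the value of an environment variable as seen by a running process,
# read from /proc/<pid>/environ (Linux-specific; entries are NUL-separated).
proc_env_value() {
    pid=$1
    name=$2
    tr '\0' '\n' < "/proc/$pid/environ" | sed -n "s/^$name=//p"
}

# Example: inspect the most recently started gdb process
#   proc_env_value "$(pgrep -n gdb)" LD_LIBRARY_PATH
```

Empty output means the process was started without the variable, so anything it launches will not inherit it either.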

    Instead of, or in addition to, the LD_LIBRARY_PATH addition, create a file in /etc/ld.so.conf.d (named e.g. cuda.conf; the stock include line on Debian-derived systems only matches *.conf files) containing your manual CUDA installation's library directory, e.g.:

    /usr/local/cuda-10.2/targets/x86_64-linux/lib
    

    After you run ldconfig as root to refresh the loader's cache, this should allow kdbg and Eclipse CDT to debug your app.
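
The steps above could be scripted roughly as follows. This is a sketch under assumptions: the helper names are made up, the file name cuda.conf assumes the usual "include /etc/ld.so.conf.d/*.conf" line in /etc/ld.so.conf, and ldconfig is assumed to live in /sbin:

```shell
# Write a one-line ld.so configuration fragment naming a library directory.
write_ld_conf() {
    lib_dir=$1
    conf_file=$2
    printf '%s\n' "$lib_dir" > "$conf_file"
}

# Succeed if the loader cache lists the given library name.
lib_in_cache() {
    /sbin/ldconfig -p | grep -qF "$1"
}

# Example, run as root (file name and path are assumptions, adjust to your system):
#   write_ld_conf /usr/local/cuda-10.2/targets/x86_64-linux/lib /etc/ld.so.conf.d/cuda.conf
#   /sbin/ldconfig
#   lib_in_cache libcudart.so.10.2 && echo "libcudart is now visible"
```

Unlike LD_LIBRARY_PATH, the loader cache does not depend on the environment of the launching process, so the check gives the same answer from a terminal and from an IDE.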