
Linker errors when compiling OpenMP `target` directive using Clang 6 (trunk) with nvptx target


I want to run OpenMP 4+ code on my Nvidia GPU using LLVM/Clang. I downloaded and compiled the LLVM, Clang, and OpenMP libraries from trunk, following the instructions at https://clang.llvm.org/get_started.html and https://openmp.llvm.org/. I did not build Compiler-RT or libcxx, but I don't think that makes any difference.

My CMake command was this: cmake -G "Unix Makefiles" ../llvm -DCMAKE_BUILD_TYPE=Release -DOPENMP_ENABLE_LIBOMPTARGET=ON

I wrote a very basic program with a single OpenMP target directive:

int main(void)
{
    #pragma omp target
    {
    }
    return 0;
}

And I compile it with this: /home/user/opt/llvm/bin/clang++ -v main.cpp -fopenmp -lomptarget -fopenmp-targets=nvptx64-nvidia-cuda --cuda-path=/home/user/opt/pgi/linux86-64/2017/cuda/8.0

For the record: I haven't added the compiler to my PATH, but I made sure that LD_LIBRARY_PATH points to the directory containing libomptarget.

Here is the full output of the above command (if this is too much information, the last ~10 lines show the error):

clang version 6.0.0 (trunk 312875)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/user/opt/llvm/bin
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/6
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/6.4.0
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/7.2.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.3
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5.4.1
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.4.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.2.0
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.2.0
Candidate multilib: .;@m64
Selected multilib: .;@m64
Found CUDA installation: /home/user/opt/pgi/linux86-64/2017/cuda/8.0, version 7.0
 "/home/user/opt/llvm/bin/clang-6.0" -cc1 -triple x86_64-unknown-linux-gnu -emit-llvm-bc -emit-llvm-uselists -disable-free -disable-llvm-verifier -discard-value-names -main-file-name main.cpp -mrelocation-model static -mthread-model posix -mdisable-fp-elim -fmath-errno -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu x86-64 -dwarf-column-info -debugger-tuning=gdb -v -resource-dir /home/user/opt/llvm/lib/clang/6.0.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward -internal-isystem /usr/local/include -internal-isystem /home/user/opt/llvm/lib/clang/6.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /usr/local/include -internal-isystem /home/user/opt/llvm/lib/clang/6.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdeprecated-macro -fdebug-compilation-dir /tmp -ferror-limit 19 -fmessage-length 190 -fopenmp -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -o /tmp/main-be2d35.bc -x c++ main.cpp -fopenmp-targets=nvptx64-nvidia-cuda
clang -cc1 version 6.0.0 based upon LLVM 6.0.0svn default target x86_64-unknown-linux-gnu
ignoring nonexistent directory "/include"
ignoring nonexistent directory "/include"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward"
ignoring duplicate directory "/usr/local/include"
ignoring duplicate directory "/home/user/opt/llvm/lib/clang/6.0.0/include"
ignoring duplicate directory "/usr/include/x86_64-linux-gnu"
ignoring duplicate directory "/usr/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0
 /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0
 /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward
 /usr/local/include
 /home/user/opt/llvm/lib/clang/6.0.0/include
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
 "/home/user/opt/llvm/bin/clang-6.0" -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-linux-gnu -S -disable-free -disable-llvm-verifier -discard-value-names -main-file-name main.cpp -mrelocation-model pic -pic-level 2 -mthread-model posix -mdisable-fp-elim -fmath-errno -no-integrated-as -fuse-init-array -mlink-cuda-bitcode /home/user/opt/pgi/linux86-64/2017/cuda/8.0/nvvm/libdevice/libdevice.compute_20.10.bc -target-feature +ptx42 -target-cpu sm_20 -dwarf-column-info -debugger-tuning=gdb -v -resource-dir /home/user/opt/llvm/lib/clang/6.0.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward -internal-isystem /usr/local/include -internal-isystem /home/user/opt/llvm/lib/clang/6.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /usr/local/include -internal-isystem /home/user/opt/llvm/lib/clang/6.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdeprecated-macro -fno-dwarf-directory-asm -fdebug-compilation-dir /tmp -ferror-limit 19 -fmessage-length 190 -fopenmp -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -o 
/tmp/main-7ffbd7.s -x c++ main.cpp -fopenmp-is-device -fopenmp-host-ir-file-path /tmp/main-be2d35.bc
clang -cc1 version 6.0.0 based upon LLVM 6.0.0svn default target x86_64-unknown-linux-gnu
ignoring nonexistent directory "/include"
ignoring nonexistent directory "/include"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward"
ignoring duplicate directory "/usr/local/include"
ignoring duplicate directory "/home/user/opt/llvm/lib/clang/6.0.0/include"
ignoring duplicate directory "/usr/include/x86_64-linux-gnu"
ignoring duplicate directory "/usr/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0
 /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/x86_64-linux-gnu/c++/7.2.0
 /usr/lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/backward
 /usr/local/include
 /home/user/opt/llvm/lib/clang/6.0.0/include
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
"/home/user/opt/pgi/linux86-64/2017/cuda/8.0/bin/ptxas" -m64 -O0 -v --gpu-name sm_20 --output-file /tmp/main-64fc86.cubin /tmp/main-ca9e59.s -c
ptxas info    : 1 bytes gmem, 8 bytes cmem[14]
ptxas info    : Compiling entry function '__omp_offloading_803_18004c0_main_l3' for 'sm_20'
ptxas info    : Function properties for __omp_offloading_803_18004c0_main_l3
0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 5 registers, 32 bytes cmem[0]
ptxas info    : Function properties for __omp_offloading_803_18004c0_main_l3_worker
24 bytes stack frame, 8 bytes spill stores, 8 bytes spill loads
"/home/user/opt/pgi/linux86-64/2017/cuda/8.0/bin/nvlink" -o /tmp/main-f247e3.out -v -arch sm_20 -L/home/user/opt/llvm/lib -lomptarget-nvptx main-64fc86.cubin
nvlink error   : Undefined reference to '__kmpc_kernel_init' in 'main-64fc86.cubin'
nvlink error   : Undefined reference to '__kmpc_kernel_deinit' in 'main-64fc86.cubin'
nvlink error   : Undefined reference to '__kmpc_kernel_parallel' in 'main-64fc86.cubin'
nvlink error   : Undefined reference to '__kmpc_kernel_end_parallel' in 'main-64fc86.cubin'
nvlink info    : 1 bytes gmem, 8 bytes cmem[14]
nvlink info    : Function properties for '__omp_offloading_803_18004c0_main_l3':
nvlink info    : used 18 registers, 24 stack, 0 bytes smem, 32 bytes cmem[0], 0 bytes lmem
clang-6.0: error: fatbinary command failed with exit code 255 (use -v to see invocation)

Any idea which library is supposed to have these __kmpc* symbols? I tried to run:

nm libomptarget.so | grep __kmpc_kernel_parallel

and

nm libomptarget.rtl.cuda.so | grep __kmpc_kernel_parallel

but both commands return nothing.
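
The same check can be run over every library in a directory instead of one file at a time. The following is only a sketch: find_symbol is a hypothetical helper name, and it assumes nm from binutils is installed. It inspects host symbol tables only, so device code embedded in CUDA objects may not show up and may need a CUDA tool such as cuobjdump instead.

```shell
#!/bin/sh
# Hypothetical helper (not part of LLVM or CUDA): print every shared or
# static library in a directory whose host symbol table mentions a symbol.
# Returns 0 if at least one library matched, 1 otherwise.
find_symbol() {
    symbol=$1
    libdir=$2
    found=1
    for lib in "$libdir"/*.so "$libdir"/*.a; do
        # A glob that matched nothing stays literal; skip it.
        [ -e "$lib" ] || continue
        # Look at both the regular and the dynamic symbol table.
        if { nm "$lib"; nm -D "$lib"; } 2>/dev/null | grep -q -- "$symbol"; then
            printf '%s\n' "$lib"
            found=0
        fi
    done
    return "$found"
}
```

For example, find_symbol __kmpc_kernel_parallel /home/user/opt/llvm/lib would scan the directory that nvlink receives through -L in the log above. An empty result suggests that no library there exports the symbol on the host side.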

Finally, if I remove -fopenmp-targets=nvptx64-nvidia-cuda from my compile flags, I get no linker errors. But of course in that case no CUDA code is generated.

Any feedback that may help me to figure out what is going on, where these symbols are supposed to be located, and why they are not there, is more than welcome.


Solution

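  • Unfortunately, OpenMP target offloading support for the nvptx device has not been upstreamed yet. The nvlink step in your log tries to link -lomptarget-nvptx, the device-side runtime, which is presumably where the __kmpc_kernel_* symbols are expected to come from. A more recent branch with nvptx support is available on GitHub, with build instructions on the wiki: https://github.com/clang-ykt/clang/wiki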