c++ cuda cross-compiling clang++ ptx

How to pass compiler flags to nvcc from clang


I am trying to compile CUDA with clang, but the code I am trying to compile depends on a specific nvcc flag (-default-stream per-thread). How can I tell clang to pass the flag to nvcc?

For example, I can compile with nvcc and everything works fine:

nvcc -default-stream per-thread *.cu -o app

But when I compile with clang, the program does not behave correctly because I cannot pass the default-stream flag:

clang++ --cuda-gpu-arch=sm_35 -L/usr/local/cuda/lib64 *.cu -o app -lcudart_static -ldl -lrt -pthread

How do I get clang to pass flags to nvcc?


Solution

  • It looks like it may not be possible.

    Behind the scenes, nvcc calls a host compiler (gcc or clang) with some custom generated flags, and then calls ptxas and a few other tools to create the binary.

    e.g.

    nvcc -default-stream per-thread foo.cu
    # Behind the scenes (simplified; the first flag name is only illustrative)
    gcc -custom-nvcc-generated-flag -DCUDA_API_PER_THREAD_DEFAULT_STREAM=1 foo.cu -o foo.ptx
    ptxas foo.ptx -o foo.cubin
    

    When compiling CUDA with clang, clang compiles directly to PTX and then calls ptxas:

    clang++ foo.cu -o app -lcudart_static -ldl -lrt -pthread
    # Behind the scenes (simplified)
    clang++ -triple nvptx64-nvidia-cuda foo.cu -o foo.ptx
    ptxas foo.ptx -o foo.cubin
    

    clang never actually calls nvcc. It just targets PTX and calls the PTX assembler (ptxas).

    Unless you know which flags nvcc generates for a given option and manually pass the equivalent flags to clang, I'm not sure there is a way to pass an nvcc flag through clang.
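
For this particular flag, the manual route might look like the sketch below. It assumes the only effect of -default-stream per-thread that the code relies on is the CUDA_API_PER_THREAD_DEFAULT_STREAM define shown in the nvcc example, which the CUDA documentation describes as an alternative way to enable per-thread default streams when it is defined before the CUDA headers are included. I have not verified the result with clang, and whether kernel launches emitted by clang pick up the per-thread stream may depend on the clang and CUDA versions. The variable names EXTRA_DEFS and CLANG_CMD are hypothetical, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: build the clang++ command from the question with the define that
# the nvcc example above shows for -default-stream per-thread.
# EXTRA_DEFS and CLANG_CMD are hypothetical names used only in this example.

# Assumption: this define is the only effect of the nvcc flag that matters here.
EXTRA_DEFS="-DCUDA_API_PER_THREAD_DEFAULT_STREAM=1"

# Same invocation as in the question, with the define added.
# Inside double quotes *.cu is not expanded, so it prints as typed.
CLANG_CMD="clang++ --cuda-gpu-arch=sm_35 ${EXTRA_DEFS} -L/usr/local/cuda/lib64 *.cu -o app -lcudart_static -ldl -lrt -pthread"

# Dry run: print the command instead of executing it.
echo "${CLANG_CMD}"
```

Paste the printed line into a shell to run the build. It is worth checking the program's stream behavior afterwards (for example with a profiler) before relying on it.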