How to pass compiler flags to nvcc from clang

I am trying to compile CUDA with clang, but the code I am trying to compile depends on a specific nvcc flag (-default-stream per-thread). How can I tell clang to pass the flag to nvcc?

For example, I can compile with nvcc and everything works fine:

nvcc -default-stream per-thread *.cu -o app

But when I compile with clang, the program does not behave correctly because I cannot pass the default-stream flag:

clang++ --cuda-gpu-arch=sm_35 -L/usr/local/cuda/lib64 *.cu -o app -lcudart_static -ldl -lrt -pthread

How do I get clang to pass flags to nvcc?

Solution

It looks like it may not be possible.

Behind the scenes, nvcc calls a host compiler (clang or gcc) with some custom generated flags, then calls ptxas and a few other tools to create the binary.

e.g.

nvcc -default-stream per-thread foo.cu
# Behind the scenes
gcc -custom-nvcc-generated-flag -DCUDA_API_PER_THREAD_DEFAULT_STREAM=1 foo.cu -o foo.ptx
ptxas foo.ptx -o foo.cubin

When compiling CUDA with clang, clang compiles directly to PTX and then calls ptxas:

clang++ foo.cu -o app -lcudart_static -ldl -lrt -pthread
# Behind the scenes
clang++ -triple nvptx64-nvidia-cuda foo.cu -o foo.ptx
ptxas foo.ptx -o foo.cubin

clang never actually calls nvcc. It just targets ptx and calls the ptx assembler.
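
One way to check this yourself is to ask each driver to print the sub-commands it would run without executing them. The sketch below assumes both toolchains are installed and uses the placeholder file foo.cu from above; `-###` is clang's dry-run option and `--dryrun` is nvcc's. The exact output depends on your clang and CUDA versions, so treat it as a way to inspect your setup rather than a fixed result.

```shell
# Ask each driver to print the sub-commands it would run, without running them.
# foo.cu is the placeholder file name from the examples above.
CLANG_DRYRUN="-###"       # clang driver option: print commands, do not execute
NVCC_DRYRUN="--dryrun"    # nvcc option: list compilation sub-commands, do not execute

# Each call is skipped if the tool is not installed on this machine.
if command -v clang++ >/dev/null 2>&1; then
    clang++ $CLANG_DRYRUN --cuda-gpu-arch=sm_35 foo.cu -o app 2>&1 || true
fi
if command -v nvcc >/dev/null 2>&1; then
    nvcc $NVCC_DRYRUN -default-stream per-thread foo.cu -o app 2>&1 || true
fi
```

Comparing the two listings should show which defines and options nvcc adds for `-default-stream per-thread`, and that the clang listing contains no nvcc invocation.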

Unless you know what custom backend flags will be produced by nvcc and manually include them when calling clang, I'm not sure you can automatically pass an nvcc flag from clang.
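
For this particular flag, a manual workaround might look like the sketch below. It assumes that the only effect of `-default-stream per-thread` that matters is the `CUDA_API_PER_THREAD_DEFAULT_STREAM` define shown in the nvcc expansion above; I have not verified that nvcc does nothing else for this flag, so check the behaviour of your program after building.

```shell
# Sketch: emulate nvcc's "-default-stream per-thread" when compiling with clang.
# Assumption: defining this macro is the only effect of the nvcc flag that matters.
STREAM_FLAG="-DCUDA_API_PER_THREAD_DEFAULT_STREAM=1"

# Same clang command as in the question, with the define added.
CMD="clang++ $STREAM_FLAG --cuda-gpu-arch=sm_35 -L/usr/local/cuda/lib64 *.cu -o app -lcudart_static -ldl -lrt -pthread"

# Print the command; run it with: eval "$CMD"
echo "$CMD"
```

Passing the macro on the command line is probably safer than a `#define` in the source, because as far as I know clang pulls in the CUDA runtime headers before the first line of the .cu file, so a define inside the file would come too late.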