I try to use nvcc
to build following multi-thread program which is built by "gcc -pthread a.c
" before:
$ cat a.c
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
void *myThreadFun(void *vargp)
{
printf("myThreadFun \n");
return NULL;
}
int main()
{
pthread_t tid;
printf("Before Thread\n");
pthread_create(&tid, NULL, myThreadFun, NULL);
pthread_join(tid, NULL);
printf("After Thread\n");
exit(0);
}
Execute "nvcc -pthread a.c
":
$ nvcc -pthread a.c
nvcc fatal : Unknown option 'pthread'
This topic said nvcc
supports building multi-thread program without using -pthread
option. And my test also seems right:
$ nvcc a.c
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
$ ldd a.out
linux-vdso.so.1 (0x00007ffcff79e000)
librt.so.1 => /usr/lib/librt.so.1 (0x00007fd4f5a43000)
libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fd4f5825000)
libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fd4f5621000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007fd4f5299000)
libm.so.6 => /usr/lib/libm.so.6 (0x00007fd4f4f86000)
libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007fd4f4d6f000)
libc.so.6 => /usr/lib/libc.so.6 (0x00007fd4f49cb000)
/lib64/ld-linux-x86-64.so.2 (0x00007fd4f5c4b000)
But I can't find the proof from nvcc
official document. Could anyone help to affirm it?
No nvcc doesn't support a pthread option. In fact, it knows nothing about pthreads. The pthread dependency is coming from dependencies in the CUDA runtime libraries. It has nothing to do with what is in your code. nvcc doesn't even compile that code, it is passed to your host compiler. nvcc is a compiler driver. It just steers compilation using other compilers. In this case the host C++ compiler.
You can see what actually happens like this:
$ nvcc -arch=sm_52 -v pthread_confusion.c
#$ _SPACE_=
#$ _CUDART_=cudart
#$ _HERE_=/opt/cuda-8.0/bin
#$ _THERE_=/opt/cuda-8.0/bin
#$ _TARGET_SIZE_=
#$ _TARGET_DIR_=
#$ _TARGET_SIZE_=64
#$ TOP=/opt/cuda-8.0/bin/..
#$ NVVMIR_LIBRARY_DIR=/opt/cuda-8.0/bin/../nvvm/libdevice
#$ LD_LIBRARY_PATH=/opt/cuda-8.0/bin/../lib:/opt/cuda-8.0/lib64:/usr/lib/nx/X11/Xinerama:/usr/lib/nx/X11
#$ PATH=/opt/cuda-8.0/bin/../open64/bin:/opt/cuda-8.0/bin/../nvvm/bin:/opt/cuda-8.0/bin:/usr/local/bin:/usr/bin:/bin:/opt/cuda-8.0/bin
#$ INCLUDES="-I/opt/cuda-8.0/bin/..//include"
#$ LIBRARIES= "-L/opt/cuda-8.0/bin/..//lib64/stubs" "-L/opt/cuda-8.0/bin/..//lib64"
#$ CUDAFE_FLAGS=
#$ PTXAS_FLAGS=
#$ gcc -c -x c -D__NVCC__ "-I/opt/cuda-8.0/bin/..//include" -D"__CUDACC_VER__=80044" -D"__CUDACC_VER_BUILD__=44" -D"__CUDACC_VER_MINOR__=0" -D"__CUDACC_VER_MAJOR__=8" -m64 -o "/tmp/tmpxft_000069ca_00000000-4_pthread_confusion.o" "pthread_confusion.c"
#$ nvlink --arch=sm_52 --register-link-binaries="/tmp/tmpxft_000069ca_00000000-2_a_dlink.reg.c" -m64 "-L/opt/cuda-8.0/bin/..//lib64/stubs" "-L/opt/cuda-8.0/bin/..//lib64" -cpu-arch=X86_64 "/tmp/tmpxft_000069ca_00000000-4_pthread_confusion.o" -lcudadevrt -o "/tmp/tmpxft_000069ca_00000000-5_a_dlink.sm_52.cubin"
#$ fatbinary --create="/tmp/tmpxft_000069ca_00000000-3_a_dlink.fatbin" -64 -link "--image=profile=sm_52,file=/tmp/tmpxft_000069ca_00000000-5_a_dlink.sm_52.cubin" --embedded-fatbin="/tmp/tmpxft_000069ca_00000000-3_a_dlink.fatbin.c"
#$ rm /tmp/tmpxft_000069ca_00000000-3_a_dlink.fatbin
#$ gcc -c -x c++ -DFATBINFILE="\"/tmp/tmpxft_000069ca_00000000-3_a_dlink.fatbin.c\"" -DREGISTERLINKBINARYFILE="\"/tmp/tmpxft_000069ca_00000000-2_a_dlink.reg.c\"" -I. "-I/opt/cuda-8.0/bin/..//include" -D"__CUDACC_VER__=80044" -D"__CUDACC_VER_BUILD__=44" -D"__CUDACC_VER_MINOR__=0" -D"__CUDACC_VER_MAJOR__=8" -m64 -o "/tmp/tmpxft_000069ca_00000000-6_a_dlink.o" "/opt/cuda-8.0/bin/crt/link.stub"
#$ g++ -m64 -o "a.out" -Wl,--start-group "/tmp/tmpxft_000069ca_00000000-6_a_dlink.o" "/tmp/tmpxft_000069ca_00000000-4_pthread_confusion.o" "-L/opt/cuda-8.0/bin/..//lib64/stubs" "-L/opt/cuda-8.0/bin/..//lib64" -lcudadevrt -lcudart_static -lrt -lpthread -ldl -Wl,--end-group
Here, the boilerplate linking phase includes both the CUDA runtime library and the pthreads library.
If you want to be explicit with the host compiler, pass -Xcompiler="-pthread"