Does nvcc support the "-pthread" option internally?


I am trying to use nvcc to build the following multi-threaded program, which I previously built with "gcc -pthread a.c":

$ cat a.c
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void *myThreadFun(void *vargp)
{
    printf("myThreadFun \n");
    return NULL;
}

int main()
{
    pthread_t tid;
    printf("Before Thread\n");
    pthread_create(&tid, NULL, myThreadFun, NULL);
    pthread_join(tid, NULL);
    printf("After Thread\n");
    exit(0);
}

Running "nvcc -pthread a.c" fails:

$ nvcc -pthread a.c
nvcc fatal   : Unknown option 'pthread'

This topic says nvcc can build multi-threaded programs without the -pthread option, and my test seems to confirm it:

$ nvcc a.c
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
$ ldd a.out
    linux-vdso.so.1 (0x00007ffcff79e000)
    librt.so.1 => /usr/lib/librt.so.1 (0x00007fd4f5a43000)
    libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fd4f5825000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fd4f5621000)
    libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007fd4f5299000)
    libm.so.6 => /usr/lib/libm.so.6 (0x00007fd4f4f86000)
    libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007fd4f4d6f000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007fd4f49cb000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fd4f5c4b000)

But I can't find confirmation of this in the official nvcc documentation. Could anyone confirm it?


Solution

  • No, nvcc doesn't support a -pthread option. In fact, it knows nothing about pthreads. The libpthread dependency you see in the ldd output comes from the dependencies of the CUDA runtime libraries; it has nothing to do with what is in your code. nvcc doesn't even compile that code itself: it passes it to your host compiler. nvcc is a compiler driver that steers compilation using other compilers, in this case gcc to compile the file and g++ to link it.

    You can see what actually happens by adding -v, which prints each command nvcc runs:

    $ nvcc -arch=sm_52 -v pthread_confusion.c 
    #$ _SPACE_= 
    #$ _CUDART_=cudart
    #$ _HERE_=/opt/cuda-8.0/bin
    #$ _THERE_=/opt/cuda-8.0/bin
    #$ _TARGET_SIZE_=
    #$ _TARGET_DIR_=
    #$ _TARGET_SIZE_=64
    #$ TOP=/opt/cuda-8.0/bin/..
    #$ NVVMIR_LIBRARY_DIR=/opt/cuda-8.0/bin/../nvvm/libdevice
    #$ LD_LIBRARY_PATH=/opt/cuda-8.0/bin/../lib:/opt/cuda-8.0/lib64:/usr/lib/nx/X11/Xinerama:/usr/lib/nx/X11
    #$ PATH=/opt/cuda-8.0/bin/../open64/bin:/opt/cuda-8.0/bin/../nvvm/bin:/opt/cuda-8.0/bin:/usr/local/bin:/usr/bin:/bin:/opt/cuda-8.0/bin
    #$ INCLUDES="-I/opt/cuda-8.0/bin/..//include"  
    #$ LIBRARIES=  "-L/opt/cuda-8.0/bin/..//lib64/stubs" "-L/opt/cuda-8.0/bin/..//lib64"
    #$ CUDAFE_FLAGS=
    #$ PTXAS_FLAGS=
    #$ gcc -c -x c -D__NVCC__  "-I/opt/cuda-8.0/bin/..//include"   -D"__CUDACC_VER__=80044" -D"__CUDACC_VER_BUILD__=44" -D"__CUDACC_VER_MINOR__=0" -D"__CUDACC_VER_MAJOR__=8" -m64 -o "/tmp/tmpxft_000069ca_00000000-4_pthread_confusion.o" "pthread_confusion.c" 
    #$ nvlink --arch=sm_52 --register-link-binaries="/tmp/tmpxft_000069ca_00000000-2_a_dlink.reg.c" -m64   "-L/opt/cuda-8.0/bin/..//lib64/stubs" "-L/opt/cuda-8.0/bin/..//lib64" -cpu-arch=X86_64 "/tmp/tmpxft_000069ca_00000000-4_pthread_confusion.o"  -lcudadevrt  -o "/tmp/tmpxft_000069ca_00000000-5_a_dlink.sm_52.cubin"
    #$ fatbinary --create="/tmp/tmpxft_000069ca_00000000-3_a_dlink.fatbin" -64 -link "--image=profile=sm_52,file=/tmp/tmpxft_000069ca_00000000-5_a_dlink.sm_52.cubin" --embedded-fatbin="/tmp/tmpxft_000069ca_00000000-3_a_dlink.fatbin.c" 
    #$ rm /tmp/tmpxft_000069ca_00000000-3_a_dlink.fatbin
    #$ gcc -c -x c++ -DFATBINFILE="\"/tmp/tmpxft_000069ca_00000000-3_a_dlink.fatbin.c\"" -DREGISTERLINKBINARYFILE="\"/tmp/tmpxft_000069ca_00000000-2_a_dlink.reg.c\"" -I. "-I/opt/cuda-8.0/bin/..//include"   -D"__CUDACC_VER__=80044" -D"__CUDACC_VER_BUILD__=44" -D"__CUDACC_VER_MINOR__=0" -D"__CUDACC_VER_MAJOR__=8" -m64 -o "/tmp/tmpxft_000069ca_00000000-6_a_dlink.o" "/opt/cuda-8.0/bin/crt/link.stub" 
    #$ g++ -m64 -o "a.out" -Wl,--start-group "/tmp/tmpxft_000069ca_00000000-6_a_dlink.o" "/tmp/tmpxft_000069ca_00000000-4_pthread_confusion.o"   "-L/opt/cuda-8.0/bin/..//lib64/stubs" "-L/opt/cuda-8.0/bin/..//lib64" -lcudadevrt  -lcudart_static  -lrt -lpthread  -ldl  -Wl,--end-group 
    

    Here, the boilerplate linking phase (the final g++ command) includes both the CUDA runtime library (-lcudart_static) and the pthreads library (-lpthread).

    If you want to be explicit, pass the flag through to the host compiler with -Xcompiler="-pthread".
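
As a minimal sketch of that last suggestion, the script below forwards -pthread to the host compiler through nvcc. It assumes the a.c from the question is in the current directory and that nvcc is on PATH. The HOST_FLAG variable and the fallback message are illustrative additions, not part of any nvcc convention.

```shell
#!/bin/sh
# Sketch: forward -pthread to the host compiler (gcc/g++) via nvcc.
# Assumption: a.c from the question sits in the current directory.
HOST_FLAG="-pthread"   # hypothetical variable name, used only in this sketch

if command -v nvcc >/dev/null 2>&1 && [ -f a.c ]; then
    # -Xcompiler hands the flag to the host compiler unchanged;
    # nvcc itself never interprets -pthread.
    nvcc -Xcompiler "$HOST_FLAG" a.c -o a.out && ./a.out
else
    # Keeps the script harmless on machines without the CUDA toolkit.
    echo "nvcc or a.c not found; nothing built"
fi
```

Adding -v to the nvcc line should show the flag appearing in the generated gcc/g++ commands, which is a quick way to check that it reached the host compiler.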