Trouble compiling/running CUDA code involving dynamic parallelism


I am trying to use dynamic parallelism with CUDA, but I cannot get past the compilation step.

I am working on a GPU with Compute Capability 3.5 and CUDA 7.5.

Depending on the switches I pass to the compile command, I get different error messages. Here is what I have tried so far:

  • Following the documentation, I arrived at one command line that compiles successfully:

    nvcc -arch=compute_35 -rdc=true cudaDynamic.cu -o cudaDynamic.out -lcudadevrt
    

    But when the program is launched, everything fails. Running it under cuda-memcheck, I get the same error message for every call to an API function:

    ========= CUDA-MEMCHECK
    ========= Program hit cudaErrorUnknown (error 30) due to "unknown error" on CUDA API call to ...
    
  • I have also tried this line (taken from the makefile of the CUDA dynamic parallelism samples):

    nvcc -ccbin g++ -I../../common/inc -m64 -dc -gencode arch=compute_35,code=compute_35 -o cudaDynamic.out -c cudaDynamic.cu
    

    But upon execution, I get:

    cudaDynamic.out: Permission denied
    

I would like to understand how to correctly compile CUDA code that uses dynamic parallelism, because all the other command lines I have tried so far have failed.


Solution

  • I fixed the problem by fully reinstalling CUDA.

    I'm now able to compile both the CUDA samples and my own code.
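
For anyone who wants a quick sanity check after a reinstall, here is a minimal sketch of a dynamic-parallelism program, not the code from the question. The file name `cdp_check.cu`, the kernel names, the `CUDA_CHECK` macro, and the `-arch=sm_35` choice are all assumptions made for illustration. The build commands in the header comment also show why the second command line in the question ends in "Permission denied": `-dc` (like `-c`) stops after producing an object file, so `cudaDynamic.out` was an object file rather than an executable, and a separate link step with `-lcudadevrt` is still needed.

```cuda
// cdp_check.cu (hypothetical file name)
//
// One-step build (assumes a Compute Capability 3.5 device):
//   nvcc -arch=sm_35 -rdc=true cdp_check.cu -o cdp_check -lcudadevrt
//
// Two-step build; -dc only produces an object file, so a link step follows:
//   nvcc -arch=sm_35 -dc cdp_check.cu -o cdp_check.o
//   nvcc -arch=sm_35 cdp_check.o -o cdp_check -lcudadevrt

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Illustrative helper: abort with a readable message if a runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Child kernel: each thread writes one value into its parent's slice of out.
__global__ void childKernel(int *out, int parentBlock)
{
    out[parentBlock * blockDim.x + threadIdx.x] = parentBlock * 100 + threadIdx.x;
}

// Parent kernel: one thread per block launches a child grid from the device.
__global__ void parentKernel(int *out, int childThreads)
{
    if (threadIdx.x == 0) {
        childKernel<<<1, childThreads>>>(out, blockIdx.x);
    }
}

int main()
{
    const int parents = 2;
    const int children = 4;
    const int n = parents * children;

    // Dynamic parallelism requires Compute Capability 3.5 or higher.
    cudaDeviceProp prop;
    CUDA_CHECK(cudaGetDeviceProperties(&prop, 0));
    if (prop.major < 3 || (prop.major == 3 && prop.minor < 5)) {
        fprintf(stderr, "Device 0 is sm_%d%d; dynamic parallelism needs sm_35+\n",
                prop.major, prop.minor);
        return EXIT_FAILURE;
    }

    int *d_out = NULL;
    CUDA_CHECK(cudaMalloc(&d_out, n * sizeof(int)));
    CUDA_CHECK(cudaMemset(d_out, 0, n * sizeof(int)));

    parentKernel<<<parents, 1>>>(d_out, children);
    CUDA_CHECK(cudaGetLastError());       // catches launch-time errors
    CUDA_CHECK(cudaDeviceSynchronize());  // parent grid finishes only after its children

    int h_out[n];
    CUDA_CHECK(cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost));
    for (int i = 0; i < n; ++i) {
        printf("out[%d] = %d\n", i, h_out[i]);
    }

    CUDA_CHECK(cudaFree(d_out));
    return 0;
}
```

If this builds and runs cleanly but your own program still reports `cudaErrorUnknown` on every API call, the toolkit and driver installation is a reasonable thing to suspect, which is consistent with the reinstall having fixed it here.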