c++ cuda nvcc

__CUDA__ undefined when running nvcc on a .cu file


I am trying to compile a .cu file with nvcc. It includes a .cuh file that contains a templated device function. The same .cuh file is also included from a .cpp file, so I am trying to prevent the templated device function from being visible on the .cpp side. To do so, I am using the following guard:

foo.cuh

#if defined(__CUDA__) && defined(__CUDA_ARCH__)

template <typename T>
__device__ void foo(){...}
#endif

However, when I try to use the device function from a kernel, it says the function doesn't exist while compiling the .cu. The .cu compiles fine if I remove the __CUDA__ check, but in that case it fails later, when compiling the .cpp file.

Am I missing something in the check?


Solution

  • nvcc does not define a __CUDA__ macro. Unless you define it yourself somewhere, defined(__CUDA__) is always false, so everything inside your #if block is always skipped.

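    Perhaps you want __CUDACC__? nvcc defines it when compiling CUDA source files (.cu), but not when a .cpp file is compiled as ordinary host code.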

    example:

    $ cat test.cuh
    #ifdef __CUDACC__
    __device__ void foo(){};
    #endif
    
    void f();
    
    $ cat main.cpp
    #include <test.cuh>
    
    int main(){
      f();
    }
    $ cat test.cu
    #include <test.cuh>
    __global__ void k(){foo();}
    
    void f(){
    
      k<<<1,1>>>();
      cudaDeviceSynchronize();
    }
    $ nvcc -I. -o test test.cu main.cpp
    $ g++  -I. -c main.cpp
    $ nvcc -I. -o test test.cu main.o
    $ compute-sanitizer ./test
    ========= COMPUTE-SANITIZER
    ========= ERROR SUMMARY: 0 errors
    $
    

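    (The __CUDA_ARCH__ macro is not needed here either. It is defined only during the device compilation pass, so it is suited to selecting code inside a function body rather than to hiding a declaration from the host compiler.)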
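For the templated case in the question, the same guard can be applied to the template itself. The following is only a minimal sketch of how such a shared header might look, not a definitive layout: the include guard FOO_CUH, the macro FOO_HOST_DEVICE, and the helpers twice and host_side_check are hypothetical names introduced for illustration. It assumes the header is included both from .cu files compiled by nvcc and from .cpp files compiled by a plain host compiler.

```cpp
#ifndef FOO_CUH  // hypothetical include guard name
#define FOO_CUH

#include <cassert>

// __CUDACC__ is defined by nvcc while it compiles CUDA source files.
// A plain host compiler (for example g++ on a .cpp file) never defines it,
// so the CUDA-specific qualifiers are reduced to nothing in that case.
#ifdef __CUDACC__
#define FOO_HOST_DEVICE __host__ __device__  // hypothetical helper macro
#else
#define FOO_HOST_DEVICE
#endif

// Visible to every translation unit: callable from host code everywhere,
// and additionally from device code when compiled by nvcc.
template <typename T>
FOO_HOST_DEVICE T twice(T x) { return x + x; }

#ifdef __CUDACC__
// Device-only template: hidden from .cpp translation units, which would
// otherwise fail on the __device__ qualifier.
template <typename T>
__device__ T foo(T x) { return twice(x); }
#endif

// Host-side usage that compiles the same way under g++ and nvcc.
inline void host_side_check() {
  assert(twice(21) == 42);
}

#endif  // FOO_CUH
```

A kernel in a .cu file can then call foo<T>(), while a .cpp file that includes the same header sees only twice and host_side_check.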