CUDA assert - overload on host device, why no warnings/errors?

I'm trying to understand how CUDA assert works under the hood. The assert macro calls the __assert_fail function, of which I can find the following signatures:

/usr/include/assert.h:extern void __assert_fail (const char *__assertion, const char *__file,

/usr/local/cuda-10.2/targets/x86_64-linux/include/crt/common_functions.h:extern __host__ __device__ __cudart_builtin__ void __assert_fail(

I can see they have the same signature, however the CUDA version has __host__ __device__ qualifiers.

Normally it's not possible to overload a function based on __host__ __device__, since they are not part of the function signature (unless compiling CUDA code with Clang instead of NVCC):

extern void foo();
extern __host__ __device__ void foo();

Compiling with NVCC and warnings as errors:

main.cu(4): error: a __host__ function("foo") redeclared with __host__ __device__, hence treated as a __host__ __device__ function

So, why isn't this warning/error showing up with assert? Is there any additional magic to make this work?

Solution

Consider the following example program main.cu which is compiled via nvcc --keep main.cu -o main

#include <cassert>

__global__ void kernel(){
    assert(false);
}

int main(){
    kernel<<<1,1>>>();
    cudaDeviceSynchronize();
    assert(false);
}

Compilation of a CUDA program is performed in several steps as described in the toolkit documentation https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#cuda-compilation-trajectory Compiling with the flag --keep keeps all the intermediate files which can be searched for __assert_fail.

File main.cpp1.ii

# 66 "/usr/include/assert.h" 3 4
extern "C" {

extern void __assert_fail (const char *__assertion, const char *__file,
      unsigned int __line, const char *__function)
     throw () __attribute__ ((__noreturn__));
...
}
# 169 "/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/common_functions.h"
extern "C"
{
# 197 "/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/common_functions.h"
extern __attribute__((host)) __attribute__((device)) __attribute__((cudart_builtin)) void __assert_fail(
  const char *, const char *, unsigned int, const char *)
  
...
}

Note the attribute cudart_builtin in the version injected by nvcc. With this attribute, there is no warning of redeclaration.

For your case, this program produces the warning, as you observed

extern void foo();
extern __host__ __device__ void foo();

int main(){
    return 0;
}

However, the following compiles without warning.

extern void foo();
extern __host__ __device__ __attribute__((cudart_builtin)) 
void foo();

int main(){
    return 0;
}

So, you could use this attribute. However, it may have unknown side effects, and foo is in fact no builtin function of the cuda runtime. You should not use this attribute on your functions.

CUDA 11.2 has been used.

CUDA assert - overload on __host__ __device__, why no warnings/errors?

CUDA assert - overload on host device, why no warnings/errors?