
CUDA: Checking the nvcc "-arch" flag at run time


Is it possible to call different kernels depending on whether the code was compiled with, for example, nvcc -arch=sm_11 or nvcc -arch=sm_20? To be more explicit:

if (FANCY_FLAG == CU_TARGET_COMPUTE_11) {
    // Do some conversions here..
    krnl1<<<GRID_DIM1, BLOCK_DIM1>>>(converted_value1);
} else if (FANCY_FLAG == CU_TARGET_COMPUTE_20) {
    krnl2<<<GRID_DIM2, BLOCK_DIM2>>>(value1);
}

As you can see, I found the CUjit_target_enum in cuda.h, but I could not find out whether nvcc defines any macros that correspond to one of the enum's values.

My motivation is that I don't know whether my device supports double-precision floats. If it does not, I have to convert my data from double to float and therefore run a different kernel (yes, I'd prefer to run the double-precision kernel wherever possible).

I'd also appreciate a completely different approach as long as it does the trick.


Solution

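    1. In device code, check the value of the __CUDA_ARCH__ macro, which nvcc defines while compiling device code (for example, 110 for -arch=sm_11 and 200 for -arch=sm_20).
    2. In host code, check the major and minor fields of the device properties (cudaDeviceProp, filled in by cudaGetDeviceProperties). Double precision requires compute capability 1.3 or higher.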
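
The two checks above could be combined roughly as follows. This is a minimal sketch, not a definitive implementation: the names supports_double, device_supports_double, scale_double, and scale_float are hypothetical and are not part of the question or the CUDA API. The CUDA-specific part is wrapped in __CUDACC__ so that the pure host helper also compiles as plain C++.

```cpp
// Hypothetical helper: double precision needs compute capability >= 1.3.
// Pure host-side logic, so it can be used and tested without a GPU.
inline bool supports_double(int major, int minor) {
    return major > 1 || (major == 1 && minor >= 3);
}

#ifdef __CUDACC__  // Everything below is only seen when compiling with nvcc.
#include <cuda_runtime.h>

// Double-precision kernel. The body is compiled only when the device
// target supports doubles (__CUDA_ARCH__ is e.g. 130 for compute_13).
__global__ void scale_double(double* data, double factor, int n) {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 130
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
#endif
}

// Single-precision fallback for devices without double support.
__global__ void scale_float(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Host-side run-time check: query the device properties and inspect
// the major and minor compute capability fields.
bool device_supports_double(int dev) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) return false;
    return supports_double(prop.major, prop.minor);
}
#endif
```

On the host, one would call device_supports_double(0) and then launch scale_double directly, or convert the data to float first and launch scale_float. Note that the host check only tells you what the device can do; the binary must also have been compiled for an architecture that includes the double-precision code.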