c++ cuda constexpr nvcc

nvcc compiler recognizes static constexpr as undefined in device code


This question is a follow-up to this one.
It is about the nvcc compiler treating a static constexpr class member variable as undefined in device code when the variable is odr-used. I could not find a reason why this should not work.
The error message is:

error: identifier "Tester<int> ::ONE" is undefined in device code

compiled with

nvcc -std=c++11 -ccbin=/usr/bin/g++-4.9 -arch=sm_30 main.cu

The nvcc compiler version is release 8.0, V8.0.26.
A minimal example (a shortened version of the MWE in the previous question, concentrating on this particular issue) is:

#include <iostream>
#include <cstdlib>

#ifdef __CUDACC__
    #define HD __host__ __device__
#else
    #define HD
#endif


HD void doSomething(const int& var) {}

template<typename T> class Tester
{
public:
    static constexpr int ONE = 1;

    HD void test()
    {
        doSomething( ONE );
    }
};
template<typename T> constexpr int Tester<T>::ONE;


int main()
{
    using t = int;

    Tester<t> tester;
    tester.test();

    return EXIT_SUCCESS;
}

The question is not about fixing this particular code. Passing var by value instead of by const reference would do that (at least the compiler no longer complains, although that is still an odr-use, isn't it?).
The question is whether this is a bug in the nvcc compiler or whether there is a good reason why this does not work (I did not find any hints about it on the NVIDIA pages...).


Solution

  • I think this extract from the E.2.13. Const-qualified variables section of CUDA documentation explains that:

    Let 'V' denote a namespace scope variable or a class static member variable that has const qualified type and does not have execution space annotations (e.g., __device__, __constant__, __shared__). V is considered to be a host code variable.

    The value of V may be directly used in device code, if:
    • V has been initialized with a constant expression before the point of use, and
    • it has one of the following types:
      • builtin floating point type except when the Microsoft compiler is used as the host compiler,
      • builtin integral type.

    Device source code cannot contain a reference to V or take the address of V.

    I think that your code violates the last sentence: doSomething takes its parameter as const int&, so calling doSomething( ONE ) in device code binds a reference to ONE.
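
As a minimal sketch of what the quoted rule seems to permit, the following variant only uses the value of ONE in device code and never binds a reference to it. The names byValue, byReference, viaValue and viaLocalCopy are hypothetical and introduced only for illustration; the assumption is that nvcc treats both members as the "value used directly" case, which was not verified against release 8.0.

```cpp
#ifdef __CUDACC__
    #define HD __host__ __device__
#else
    #define HD
#endif

// Hypothetical helper: takes its argument by value, so only the value of ONE is needed.
HD int byValue(int var) { return var + 1; }

// Hypothetical helper: takes a const reference, like doSomething in the question.
HD int byReference(const int& var) { return var + 1; }

template<typename T> class Tester
{
public:
    static constexpr int ONE = 1;

    HD int viaValue() const
    {
        // The value of ONE is read directly; no reference to ONE is formed.
        return byValue(ONE);
    }

    HD int viaLocalCopy() const
    {
        // Copy the value into a local variable first; the reference then
        // binds to the local 'one', not to Tester<T>::ONE.
        const int one = ONE;
        return byReference(one);
    }
};
template<typename T> constexpr int Tester<T>::ONE;
```

On the host, both Tester<int>().viaValue() and Tester<int>().viaLocalCopy() return 2. The local-copy form is the one to reach for when the callee's signature cannot be changed to take its argument by value.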