This question is a follow-up to this one.
It is about the nvcc compiler reporting a static constexpr class variable as undefined in device code when the variable is odr-used. However, I could not find a reason why it should not work.
The error message is:
error: identifier "Tester<int> ::ONE" is undefined in device code
compiled with
nvcc -std=c++11 -ccbin=/usr/bin/g++-4.9 -arch=sm_30 main.cu
The nvcc compiler version is release 8.0, V8.0.26.
A minimal example (a shortened version of the MWE in the previous question, concentrating on this particular issue) is given by
#include <iostream>
#include <cstdlib>

#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

HD void doSomething(const int& var ) {};

template<typename T> class Tester
{
public:
    static constexpr int ONE = 1;

    HD void test()
    {
        doSomething( ONE );
    }
};

template<typename T> constexpr int Tester<T>::ONE;

int main()
{
    using t = int;
    Tester<t> tester;
    tester.test();
    return EXIT_SUCCESS;
}
The question is not about fixing this particular code. Passing var by value instead of by const reference would do that: at least the compiler no longer complains, although it is still an odr-use, isn't it?
The question is whether this is a bug in the nvcc compiler or whether there is some good reason why this does not work (I did not find any hints about that on the NVIDIA pages).
I think this extract from section E.2.13, "Const-qualified variables", of the CUDA documentation explains it:
Let 'V' denote a namespace scope variable or a class static member variable that has const qualified type and does not have execution space annotations (e.g., __device__, __constant__, __shared__). V is considered to be a host code variable.
The value of V may be directly used in device code, if:
- V has been initialized with a constant expression before the point of use, and
- it has one of the following types:
  - builtin floating point type except when the Microsoft compiler is used as the host compiler,
  - builtin integral type.
Device source code cannot contain a reference to V or take the address of V.
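I think your code violates the last sentence: doSomething takes its argument as const int&, so calling doSomething( ONE ) in device code binds a reference to ONE, which is a host code variable under the rule above.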
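To illustrate the distinction, here is a minimal sketch of two uses that, as far as I can tell from the quoted rule, should be acceptable in device code. The names byValue and byReference are hypothetical helpers I made up for this example, and I have only reasoned about this against the documentation, so treat it as an assumption rather than a verified result for your nvcc version.

```cpp
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical helper: takes its argument by value, so only the value of ONE is needed.
HD int byValue(int var) { return var; }

// Hypothetical helper: takes its argument by reference, so it needs an actual object to bind to.
HD int byReference(const int& var) { return var; }

template <typename T>
class Tester
{
public:
    static constexpr int ONE = 1;

    HD int test() const
    {
        // Only the value of ONE is used here, which the quoted rule permits.
        int a = byValue(ONE);

        // Copy the value into a local variable first. The reference then binds
        // to the local copy, not to the host code variable ONE.
        const int local = ONE;
        int b = byReference(local);

        return a + b;
    }
};

// Out-of-class definition, as in the question.
template <typename T>
constexpr int Tester<T>::ONE;
```

Calling Tester<int>().test() returns 2 here. The point is that neither call makes device code refer to ONE itself: the first reads its value, and the second refers to a copy that lives in the calling function.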