Tags: templates, compilation, compiler-errors, cuda, nvcc

Error compiling template function in CUDA using nvcc


I have the following CUDA code:

enum METHOD_E {
    METH_0 = 0,
    METH_1
};

template <enum METHOD_E METH>
inline __device__ int test_func<METH>()
{
    return int(METH);
}

__global__ void test_kernel()
{
    test_func<METH_0>();
}

void test()
{
    test_kernel<<<1, 1>>>();
}

When I compile I get the following error:

>nvcc --cuda test.cu
test.cu
test.cu(7): error: test_func is not a template

test.cu(14): error: identifier "test_func" is undefined

test.cu(14): error: expected an expression

3 errors detected in the compilation of "C:/Users/BLAH45~1/AppData/Local/Temp/tmpxft_00000b60_00000000-6_test.cpp1.ii".

Section D.1.4 of the Programming Guide (4.0, the version of the toolkit I'm using) suggests templates should work, but I can't get them to.

Can anyone suggest a change to this code which makes it compile (without removing the templating!)?


Solution

  • Your test_func definition is wrong:

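    test_func<METH>() should be simply test_func(). A template argument list after the function name is only valid when explicitly specializing a template that has already been declared (and function templates cannot be partially specialized at all). Since no primary test_func template exists at that point, nvcc reports "test_func is not a template", and the later errors in the kernel follow from that.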

    This works for me:

    enum METHOD_E {
        METH_0 = 0,
        METH_1
    };
    
    template < enum METHOD_E METH>
    __device__
    inline
    int test_func ()
    {
        return int(METH);
    }
    
    __global__ void test_kernel()
    {
        test_func<METH_0>();
    }
    
    void test()
    {
        test_kernel<<<1, 1>>>();
    }
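
To make the difference between the two spellings concrete, here is a minimal sketch (not part of the original answer) that keeps the primary template as above and adds an explicit specialization, the one place where writing test_func<METH_1> after the name in a definition is legal. The out parameter, the main function, and the return value 100 are assumptions added purely for illustration, so that the results can be copied back and inspected on the host.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

enum METHOD_E {
    METH_0 = 0,
    METH_1
};

// Primary template: no template argument list after the function name.
template <METHOD_E METH>
__device__ inline int test_func()
{
    return int(METH);
}

// Explicit specialization: <METH_1> after the name is allowed here because
// the primary template is already declared. The value 100 is arbitrary
// (hypothetical), chosen only so the two code paths are distinguishable.
template <>
__device__ inline int test_func<METH_1>()
{
    return 100;
}

// Hypothetical kernel signature: writes both results to device memory.
__global__ void test_kernel(int *out)
{
    out[0] = test_func<METH_0>();  // uses the primary template
    out[1] = test_func<METH_1>();  // uses the explicit specialization
}

int main()
{
    int *d_out = 0;
    int h_out[2] = {-1, -1};

    if (cudaMalloc((void **)&d_out, sizeof(h_out)) != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    test_kernel<<<1, 1>>>(d_out);

    // cudaMemcpy waits for the kernel to finish and reports earlier errors.
    cudaError_t err = cudaMemcpy(h_out, d_out, sizeof(h_out),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    std::printf("%d %d\n", h_out[0], h_out[1]);
    return 0;
}
```

Built with something like nvcc test.cu -o test and run on a machine with a working CUDA device, this should print the primary template's result (0) followed by the specialization's result (100). If you don't need a specialization, the plain definition from the answer above is all that is required.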