c++ templates cuda partial-specialization

Double-templated function instantiation fails

The following code:

template<typename T, MyEnum K> __global__ void myKernel(const T a[]);
template<typename T> __global__ void myKernel<T,SomeValueOfMyEnum>(const T a[]) {
    // implementation
}

triggers the following error message:

error: an explicit template argument list is not allowed on this declaration

Why?

Notes:

I'm pretty sure this isn't CUDA-related, just a C++ issue.
There are a bunch of questions on partial specialization, but I can't figure out if mine is a dupe of any of them.

Solution

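You can't partially specialize a function template, because C++ doesn't define such a thing. Your second declaration keeps T open while fixing K, so the compiler reads myKernel<T,SomeValueOfMyEnum> as an attempted partial specialization and rejects the explicit template argument list. Only class templates can be partially specialized [§14.5.5 / temp.class.spec].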
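To make the distinction concrete, here is a minimal sketch of what is and isn't accepted for a function template. The function name `which` and its `int` return value are hypothetical, chosen only so the selected version is observable; they are not part of the question's code.

```cpp
enum MyEnum { E1, E2 };

// Primary function template (hypothetical stand-in for myKernel).
template<typename T, MyEnum K>
int which(const T*) { return 0; }

// Full (explicit) specialization: every template argument is fixed.
// This is allowed for function templates.
template<>
int which<int, E1>(const int*) { return 1; }

// Partial specialization: T stays open, only K is fixed.
// This is not allowed for function templates; uncommenting it
// is rejected by the compiler.
// template<typename T>
// int which<T, E1>(const T*) { return 2; }
```

With this in place, `which<int, E1>(nullptr)` picks the full specialization, while `which<float, E1>(nullptr)` falls back to the primary template, because there is no way to say "any T with K == E1" at the function level.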

A workaround is to turn the function into a class template with an operator() and partially specialize the class instead. It's a little ugly, but maybe it helps you.

enum MyEnum
{
    E1, E2
};

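// Primary template: used for every K without a specialization.
template<typename T, MyEnum K>
struct MyKernel
{
    void operator()(const T a[])
    {
        // ...
    }
};

// Partial specialization: T stays generic, K is fixed to E1.
template<typename T>
struct MyKernel<T, E1>
{
    void operator()(const T a[])
    {
        // ...
    }
};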

int main()
{
    MyKernel<int, E1>()( ... ); // <--- To call: builds a temporary MyKernel<int, E1> and invokes its operator()
}
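If you'd rather keep function-call syntax at the call site, one common pattern is to hide the class template behind a thin function template that forwards to it. The sketch below is plain C++ under a few assumptions: `MyKernelImpl` and `run` are hypothetical names, the `int` return value exists only to show which version was chosen (the original returns void), and the template parameters of the wrapper are reordered so that K is given explicitly and T is deduced. For the CUDA case, the member would presumably be marked `__device__` and the wrapper `__global__`; that part is not shown or tested here.

```cpp
enum MyEnum { E1, E2 };

// Primary class template: generic implementation.
template<typename T, MyEnum K>
struct MyKernelImpl
{
    static int run(const T a[])
    {
        (void)a;   // placeholder body
        return 0;  // marks "generic version"
    }
};

// Partial specialization: any T, K fixed to E1.
template<typename T>
struct MyKernelImpl<T, E1>
{
    static int run(const T a[])
    {
        (void)a;   // placeholder body
        return 1;  // marks "E1 version"
    }
};

// Thin function template that forwards to the class template.
// K comes first so it can be spelled explicitly while T is deduced.
template<MyEnum K, typename T>
int myKernel(const T a[])
{
    return MyKernelImpl<T, K>::run(a);
}
```

A call then looks like `myKernel<E1>(data)`, and the choice between the generic and the E1 implementation is still made by class template partial specialization.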