
Operator overloading in CUDA


I successfully created an operator+ for two float4 values by writing:

__device__ float4 operator+(float4 a, float4 b) {
 // ...
}

However, if I also want an operator+ for uchar4 and define it the same way, I get the following error: "error: more than one instance of overloaded function "operator+" has "C" linkage".

I get a similar error message when I declare multiple functions with the same name but different arguments. So, two questions:

  • Polymorphism: Is it possible to have multiple functions with the same name and different arguments in CUDA? If so, why do I get this error message?
  • operator+ for float4: it seems that this operator is already provided by "cutil_math.h", but when I include it (#include <cutil_math.h>) the compiler complains that there is no such file or directory. Is there anything particular I should do? Note: I am using PyCUDA, a Python wrapper for CUDA.

Thanks!


Solution

  • Note the "has "C" linkage" in the error. Your code is being compiled with C linkage (PyCUDA does this by default to circumvent symbol mangling issues). Functions with C linkage are not name-mangled, so two overloads would end up with the same symbol name; C++ therefore cannot support multiple definitions of the same function name with C linkage.

    The solution is to compile the code without the automatically generated extern "C" wrapper, and to specify C linkage explicitly only for kernels. Your code would then look something like:

    __device__ float4 operator+(float4 a, float4 b) { ... };
    
    extern "C"
    __global__ void kernel() { };
    

    rather than what PyCUDA emits by default:

    extern "C" 
    {
         __device__ float4 operator+(float4 a, float4 b) { ... };
    
         __global__ void kernel() { };
    }
    

    pycuda.compiler.SourceModule has an option, no_extern_c, which controls whether the just-in-time compilation system wraps the source in extern "C". Passing no_extern_c=True suppresses the wrapper.
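
As a rough sketch of how this could look from the Python side: the source string below defines both overloads with ordinary C++ linkage and marks only the kernel as extern "C". The kernel name add_float4, the helper build_kernel, and the operator bodies are assumptions made up for illustration; they are not part of the original question. Compiling requires PyCUDA and a CUDA-capable device, so the import is kept inside the helper.

```python
# Illustrative sketch only: add_float4, build_kernel and the operator bodies
# are hypothetical names and implementations, not taken from the question.
KERNEL_SOURCE = r"""
// Overloads keep normal C++ linkage, so their names are mangled and can coexist.
__device__ float4 operator+(float4 a, float4 b)
{
    return make_float4(a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w);
}

// Component sums are truncated back to unsigned char (they wrap on overflow).
__device__ uchar4 operator+(uchar4 a, uchar4 b)
{
    return make_uchar4(a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w);
}

// Only the kernel gets C linkage, so it can be looked up by its plain name.
extern "C"
__global__ void add_float4(const float4 *a, const float4 *b, float4 *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = a[i] + b[i];
    }
}
"""


def build_kernel():
    """Compile KERNEL_SOURCE and return the kernel (needs PyCUDA and a GPU)."""
    import pycuda.autoinit  # noqa: F401  creates a context on the default device
    from pycuda.compiler import SourceModule

    # no_extern_c=True stops PyCUDA from wrapping the whole source in extern "C".
    module = SourceModule(KERNEL_SOURCE, no_extern_c=True)
    return module.get_function("add_float4")
```

The function returned by build_kernel() can then be launched in the usual PyCUDA way, with block= and grid= arguments. If the extern "C" on the kernel were left out, its name would be mangled and a lookup by the plain name "add_float4" would be expected to fail.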