c++ cuda nvcc

Template parameter as function specifier and compiler optimization


I have found this very useful post and I'd like to clarify something about compiler optimizations. Let's say we have this function (the same as in the original post):

template<int action>
__global__ void kernel()
{
    switch(action) {
       case 1:
       // First code
       break;

       case 2:
       // Second code
       break;
    }
}

Would the compiler still eliminate the unreachable code if I called the function with a template argument that is unknown at compile time, for example by generating two separate functions? E.g.:

kernel<argv[1][0]>();

Solution

  • Short answer: no.

    Templates are instantiated purely at compile time, so you can't use the values in argv as template arguments: they are not known until the program runs.

    I wonder why you did not just try it and throw that code at a compiler: it would have told you that template arguments must be compile-time constants.

    Update: Since you told us in the comments that it's not primarily about performance but about readability, I'd recommend using switch/case:

    template <char c> void kernel() {
      //...
      switch(c) { /* ... */ }
    }
    
    switch (argv[1][0]) {
      case 'a': 
        kernel<'a'>();
        break;
      case 'b': 
        kernel<'b'>();
        break;
      //...
    }
    

    Since the value you have to base the decision on (i.e. argv[1][0]) is only known at runtime, you have to use a runtime decision mechanism. Of those, switch/case is among the fastest, especially if there are more than two cases but not too many, and especially if there are no gaps between the case values (e.g. 'a', 'b', 'c' rather than 1, 55, 2048). The compiler can then produce very fast jump tables.
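For completeness, here is a minimal host-only sketch of the pattern discussed above, written in plain C++ so it can be compiled without nvcc. The return strings and the names run_fixed, fixed_action and dispatch are made up for illustration. The template kernel stands in for the real `__global__` kernel, which cannot return a value.

```cpp
#include <string>

// Hypothetical host-side stand-in for the templated kernel. Within one
// instantiation the switch condition is a compile-time constant, so the
// optimizer is free to drop the branches that instantiation never reaches.
template <char c>
std::string kernel() {
    switch (c) {
        case 'a': return "first code";
        case 'b': return "second code";
        default:  return "no code";
    }
}

// A constant expression is a valid template argument.
constexpr char fixed_action = 'b';
inline std::string run_fixed() { return kernel<fixed_action>(); }

// A runtime value is not, so it has to be mapped onto the available
// instantiations explicitly. `dispatch` is an illustrative name only.
inline std::string dispatch(char action) {
    switch (action) {
        case 'a': return kernel<'a'>();
        case 'b': return kernel<'b'>();
        default:  return "unsupported action";
    }
}
```

From main, something like dispatch(argv[1][0]) (after checking argc) would pick the instantiation at run time. In CUDA the same pattern should carry over with each case launching the matching instantiation, e.g. kernel<'a'><<<grid, block>>>(), where grid and block are whatever launch configuration the program already uses.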