c++ templates c++11 concurrency c++-amp

Using a user-specified function in a parallel_for_each in C++AMP


I am writing a library and want to let the user define a function (declared as restrict(amp)) and pass it to one of my library functions for use within a concurrency::parallel_for_each loop. For instance:

template <typename T, typename Func>
void Foo( const concurrency::array_view<const T>& avParam, Func f ) 
{
     concurrency::array<T, 1> arrResult( avParam.extent );
     concurrency::parallel_for_each( avParam.extent, [=, &arrResult]( concurrency::index<1> index ) restrict(amp) {
          arrResult[index] = f( avParam[index] );
     } );

     // Do stuff...
}

I would expect this to work provided f is declared as a valid AMP-compatible function: if I replace the function pointer f with a direct call to the function itself in the kernel, everything works as expected. However, using f results in the following error:

Function pointer, function reference, or pointer to member function is not supported.

Is there any way I can get my desired behavior without preventing my user from using functors other than lambdas?


Solution

  • References and pointers (to a compatible type) may be used locally but cannot be captured by a lambda. Function pointers, pointer-to-pointer, and the like are not allowed; neither are static or global variables.

    Classes must meet more rules if you wish to use instances of them. They must have no virtual functions or virtual inheritance. Constructors, destructors, and other non-virtual functions are allowed. The member variables must all be of compatible types, which could of course include instances of other classes as long as those classes meet the same rules.

    The actual code in your amp-compatible function is not running on a CPU and therefore can’t do certain things that you might be used to doing:

    • recursion
    • pointer casting
    • use of virtual functions
    • new or delete
    • RTTI or dynamic casting
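Two of the rules above (no function pointers, no virtual functions) can be approximated with a compile-time check on the callable's type. The sketch below is plain standard C++ and does not use C++ AMP itself; `is_plausible_amp_callable`, `AddTwo`, `VirtualOp`, and `add_two` are hypothetical names invented for illustration. It is only a rough screen: it does not detect virtual inheritance or incompatible member types, and the compiler's own restrict(amp) checks remain the real authority.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper (not part of C++ AMP): a rough compile-time screen for
// callables that could not be used inside a restrict(amp) kernel.
template <typename Func>
struct is_plausible_amp_callable
    : std::integral_constant<bool,
          std::is_class<Func>::value &&        // functor or lambda closure type
          !std::is_polymorphic<Func>::value>   // no virtual functions
{};

// Plain functor: passes the screen.
struct AddTwo {
    int operator()(int v) const { return v + 2; }
};

// Functor with a virtual function: rejected.
struct VirtualOp {
    virtual int operator()(int v) const { return v; }
    virtual ~VirtualOp() = default;
};

// Free function: passing it by value yields a function pointer, which is rejected.
int add_two(int v) { return v + 2; }

static_assert(is_plausible_amp_callable<AddTwo>::value,
              "functors pass");
static_assert(!is_plausible_amp_callable<VirtualOp>::value,
              "classes with virtual functions are rejected");
static_assert(!is_plausible_amp_callable<decltype(&add_two)>::value,
              "function pointers are rejected");
```

A library function templated on the callable type could place a similar `static_assert` at the top of its body to give users a clearer message than the compiler's default diagnostic.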

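    When a free function is passed as f, Func is deduced as a function pointer type, and the kernel lambda cannot capture that. You should write your library in terms of lambdas or functors, as these can be called from within a restrict(amp) kernel. You could do the following: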

    #include <amp.h>
    #include <vector>

    template <typename T, typename Func>
    void Foo(const concurrency::array_view<const T>& avParam, Func f)
    {
        concurrency::array<T, 1> arrResult(avParam.extent);
        concurrency::parallel_for_each(avParam.extent, [=, &arrResult](concurrency::index<1> index) restrict(amp) 
        {
            arrResult[index] = f(avParam[index]);
        });
    
        // Do stuff...
    }
    
    template <typename T>
    class Bar
    {
    public:
        T operator()(const T& v) const restrict(amp)
        {
            return v + 2;
        }
    };
    
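    int main()
    {
        std::vector<int> result(100, 0);
        // array_view lives in the concurrency namespace; its extent argument is an int.
        concurrency::array_view<const int, 1> result_av(static_cast<int>(result.size()), result);

        // Bar<int> is a functor, so the call inside the kernel is resolved at compile time.
        Foo(result_av, Bar<int>());

        return 0;
    }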
    

    One way to think about this is that the functor or lambda equivalent creates a container that the compiler can ensure has no dependencies and the C++ AMP runtime can instantiate on the GPU. This would be much harder to achieve with a function pointer.
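If a user only has a free function, one option is to wrap it in a small functor or a captureless lambda, so that the call target is encoded in the callable's type and no function pointer is stored. The sketch below illustrates the wrapping pattern in plain standard C++: `apply_each` is a hypothetical serial stand-in for the kernel launch in Foo, used so the example builds without <amp.h>, and `plus_two` and `PlusTwoFn` are invented names. Under C++ AMP, the free function, the functor's operator(), and the lambda would each need to be declared restrict(amp).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Foo: a serial loop instead of
// concurrency::parallel_for_each, so the sketch needs no <amp.h>.
template <typename T, typename Func>
std::vector<T> apply_each(const std::vector<T>& input, Func f)
{
    std::vector<T> result(input.size());
    for (std::size_t i = 0; i < input.size(); ++i)
        result[i] = f(input[i]);   // the call target is fixed by the type Func
    return result;
}

// A user's free function. Under C++ AMP it would be declared restrict(amp).
inline int plus_two(int v) { return v + 2; }

// Wrapper functor: plus_two is named directly inside operator(),
// so no function pointer is stored in the object.
struct PlusTwoFn {
    int operator()(int v) const { return plus_two(v); }
};

std::vector<int> run_with_functor()
{
    return apply_each(std::vector<int>(4, 1), PlusTwoFn());
}

std::vector<int> run_with_lambda()
{
    // Captureless lambda wrapper, equivalent to PlusTwoFn.
    return apply_each(std::vector<int>(4, 1), [](int v) { return plus_two(v); });
}
```

Both wrappers are empty class types, which is what lets the kernel lambda capture them by value.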