I am launching a kernel with around 5k blocks. At some point, we need to sort an array within each thread block. If possible, we would like to use a library like Thrust.
From the documentation I understand that how a sort is executed in Thrust depends on the specified execution_policy. However, I don't understand whether I can use execution policies to specify that I would like to use the threads of my current block for sorting. Can someone explain execution policies, or point me to good documentation on them, and tell me whether what I intend to do is feasible?
It turns out that execution policies are basically a bridge design pattern that uses template specialization instead of inheritance to select the appropriate implementation of an algorithm. This exposes a stable interface to the user of the library while avoiding the overhead of, and the need for, virtual functions. Thank you, robert-crovella, for the great video.
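To make the dispatch idea concrete, here is a minimal sketch in plain C++. It is not Thrust's actual source: the names seq_policy, par_policy, my_sort and detail::sort_impl are hypothetical, and the "parallel" backend is only a stand-in that sorts two halves and merges them. It only illustrates how an empty tag type passed as the first argument can select an implementation at compile time, without virtual functions.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical policy tags (NOT Thrust types): empty structs whose only
// purpose is to select an overload at compile time.
struct seq_policy {};
struct par_policy {};

namespace detail {

// Backend selected by seq_policy: a plain std::sort.
template <typename It>
std::string sort_impl(seq_policy, It first, It last) {
    std::sort(first, last);
    return "seq";
}

// Backend selected by par_policy. Stand-in only: sorts the two halves
// separately and merges them; a real backend would run these in parallel.
template <typename It>
std::string sort_impl(par_policy, It first, It last) {
    It mid = first + (last - first) / 2;
    std::sort(first, mid);
    std::sort(mid, last);
    std::inplace_merge(first, mid, last);
    return "par";
}

}  // namespace detail

// Stable user-facing interface: the policy type alone decides which
// sort_impl overload the compiler picks. Returns the backend's name so
// the dispatch can be observed.
template <typename Policy, typename It>
std::string my_sort(Policy policy, It first, It last) {
    return detail::sort_impl(policy, first, last);
}

// Usage: the call sites are identical except for the tag.
inline bool example() {
    std::vector<int> a{3, 1, 2};
    std::vector<int> b{9, 7, 8, 6};
    std::string used_a = my_sort(seq_policy{}, a.begin(), a.end());
    std::string used_b = my_sort(par_policy{}, b.begin(), b.end());
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));
    return used_a == "seq" && used_b == "par";
}
```

In this sketch, supporting a new backend means adding another tag and another sort_impl overload; my_sort and its callers do not change. A block-cooperative sort would, under this pattern, need its own policy type and overload.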
As for actually sorting within a thread block in Thrust, talonmies is right: there simply is no implementation (currently?). I could not find anything in the source code.