I'm building a project which uses both Thrust (the CUDA API) and OpenMP. The main purpose of my program is to present an interface for calculating something, in parallel. To do that I've decided to use the STRATEGY design pattern, which basically means defining a base class with a virtual function, and then other classes that derive from that base class and implement the needed function.
My problem starts here:

1. Can my project have more than one .cu file?

2. Can .cu files contain declarations of classes?
class foo
{
    int m_name;
    void doSomething();
};
3. This one continues 2. I've heard that device kernels cannot be declared inside classes and have to be done like this:
//header file
__DEVICE__ void kernel(int x, int y)
{
    .....
}
class a : foo
{
    void doSomething();
};
//cu file
void a::doSomething()
{
    kernel<<<1,1>>>(......);
}
Is this the right way? 4. Last question: when I use Thrust, must I use .cu files as well?
Thanks, Igal
1-2. Yes, you can have multiple .cu files in your project, and they can contain class declarations. From the CUDA Programming Guide: "The front end of the compiler processes CUDA source files according to C++ syntax rules. Full C++ is supported for the host code. However, only a subset of C++ is fully supported for the device code as described in Appendix D. As a consequence of the use of C++ syntax rules, void pointers (e.g., returned by malloc()) cannot be assigned to non-void pointers without a typecast."
3. You're ALMOST correct. You have to use the __global__ keyword when declaring your kernel:
__global__ void kernel(int x, int y)
{
    .....
}
4. Yes, Thrust code has to go in .cu files and be compiled with nvcc. See the Thrust documentation for details.

In general, you will compile your programs like this:
$ nvcc -c device.cu
$ g++ -c host.cpp -I/usr/local/cuda/include/
$ nvcc device.o host.o
Alternatively, you can use g++ to perform the final linking step:
$ g++ -o tester device.o host.o -L/usr/local/cuda/lib64 -lcudart
On Windows, change the paths after -I and -L. Also, as far as I know, you have to use the cl compiler (MS Visual Studio).
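To make the Thrust part of question 4 concrete, here is a minimal sketch of a Thrust program; it has to live in a .cu file and be compiled with nvcc as shown above (this is an illustrative example, not code from your project):

```cuda
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <iostream>

int main()
{
    // Four elements on the GPU, each initialized to 2.
    thrust::device_vector<int> v(4, 2);

    // Sum them on the device; the result comes back to the host.
    int sum = thrust::reduce(v.begin(), v.end(), 0);

    std::cout << sum << "\n"; // prints 8
    return 0;
}
```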
Note 1: Watch out for x86/x64 compatibility: if you use a 64-bit CUDA Toolkit, use a 64-bit compiler as well. (Also check the -m32 and -m64 options of nvcc.)
Note 2: device.cu contains kernels and a function that invokes the kernel(s). This function has to be annotated with extern "C". It can contain classes (limitations apply). host.cpp contains pure C++ code, with an extern "C" declaration of the function that is in device.cu (NOT the kernel).
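A minimal sketch of that split could look like this (file names and function names are placeholders, chosen for illustration):

```cuda
// device.cu -- compiled with nvcc
__global__ void kernel(int *out)
{
    *out = 42;
}

// Host-callable wrapper: this is what host.cpp links against.
extern "C" void run_kernel(int *host_out)
{
    int *d_out;
    cudaMalloc(&d_out, sizeof(int));
    kernel<<<1, 1>>>(d_out);
    cudaMemcpy(host_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
}

// host.cpp -- compiled with g++, no CUDA headers needed
extern "C" void run_kernel(int *host_out); // declaration only, NOT the kernel

int main()
{
    int result = 0;
    run_kernel(&result);
    return result == 42 ? 0 : 1;
}
```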