I have two questions about the new OpenMP 4.0.
First, I couldn't understand the difference between target and target data. According to the specification, target data creates a new data environment. So what is a data environment? Also, can we liken OpenMP target data to the OpenACC data directive?
The second question is as follows:
extern void init(float*, float*, int);
extern void output(float*, int);
void vec_mult(int N)
{
    int i;
    float p[N], v1[N], v2[N];
    init(v1, v2, N);
    #pragma omp target map(to: v1, v2) map(from: p)
    #pragma omp parallel for
    for (i=0; i<N; i++)
        p[i] = v1[i] * v2[i];
    output(p, N);
}
There is no teams directive in this example. So how should an OpenMP compiler configure the device kernel? In CUDA terms, would the invocation look like "kernel_func<<<1,1>>>"?
If we want to use parallel for inside target without a teams directive, the compiler should generate a kernel that has one block. On the other hand, the compiler has to distribute the iterations across the threads inside that block. For this reason the kernel should have many threads (of course, it is also possible to work with one thread).
For one way to implement these directives, and for more solutions, see this paper: http://rosecompiler.org/ROSE_ResearchPapers/Liao-OpenMP-Accelerator-Model-2013.pdf
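
To make the "one block, many threads" idea concrete, here is a minimal host-side sketch in plain C. It only simulates the work split and is not what any particular compiler generates. The names vec_mult_thread and vec_mult_one_block are hypothetical, and the block-stride loop is just one assumed scheduling choice among several possible ones.

```c
#include <assert.h>

/* Hypothetical stand-in for the code one device thread would run.
   tid is the thread's index inside the single block,
   num_threads is the block size. */
static void vec_mult_thread(int tid, int num_threads, int n,
                            float *p, const float *v1, const float *v2)
{
    /* Block-stride loop: thread tid handles i = tid, tid + num_threads, ... */
    for (int i = tid; i < n; i += num_threads)
        p[i] = v1[i] * v2[i];
}

/* Host-side simulation of a launch with 1 block and num_threads threads,
   roughly kernel_func<<<1, num_threads>>> in CUDA terms (an assumption,
   not a statement about a real compiler). The "threads" run one after
   another here; on a device they would run concurrently. */
void vec_mult_one_block(int num_threads, int n,
                        float *p, const float *v1, const float *v2)
{
    assert(num_threads > 0);
    for (int tid = 0; tid < num_threads; tid++)
        vec_mult_thread(tid, num_threads, n, p, v1, v2);
}
```

Calling vec_mult_one_block(1, N, p, v1, v2) corresponds to the <<<1,1>>> case from the question: it still produces the right result, but a single thread walks all N iterations. With more threads each one touches only every num_threads-th element, which is why a larger block is the more useful configuration.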