For example, I have 2 GPUs and 2 host threads. I can't test this myself because I don't have access to the multi-GPU machine right now. I want the first host thread to work with the first GPU and the second host thread to work with the second GPU. Each host thread makes many cuBLAS calls. So is it possible to select the first GPU from the first host thread and the second GPU from the second host thread with a cudaSetDevice() call?
For example, for the second host thread I will call cudaSetDevice(1), and for the first thread I will call cudaSetDevice(0).
So is it possible to choose the first GPU from the first host thread and the second GPU from the second host thread with a cudaSetDevice() call?
Yes, it's possible. The cudaOpenMP sample code shows this type of usage (excerpt):
....
omp_set_num_threads(num_gpus); // create as many CPU threads as there are CUDA devices
//omp_set_num_threads(2*num_gpus); // create twice as many CPU threads as there are CUDA devices
#pragma omp parallel
{
    unsigned int cpu_thread_id = omp_get_thread_num();
    unsigned int num_cpu_threads = omp_get_num_threads();
    // set and check the CUDA device for this CPU thread
    int gpu_id = -1;
    // --> the key line: each CPU thread selects its own device
    checkCudaErrors(cudaSetDevice(cpu_thread_id % num_gpus)); // "% num_gpus" allows more CPU threads than GPU devices
    ...