c++ cuda thrust

Which one is faster: raw pointers or Thrust vectors?


I am a beginner in CUDA, and I have a simple question that I could not find a clear answer to.

I know that we can allocate an array in device memory using a raw pointer:

int *raw_ptr;
cudaMalloc((void **) &raw_ptr, N * sizeof(int));

We can also use Thrust to define a vector and push_back our items:

thrust::device_vector<int> D;

I need a large amount of memory (around 500M int values) and want to run many kernels on it in parallel. In terms of memory access from kernels, is using raw pointers faster than thrust::device_vector, and if so, when?


Solution

  • The data in thrust::device_vector is ordinary global memory, so there is no difference in access speed.

    Note, however, that the two alternatives you present are not equivalent. cudaMalloc returns uninitialized memory, whereas the elements of a thrust::device_vector are initialized: after allocating, the vector launches a kernel to initialize its elements and then calls cudaDeviceSynchronize. This can slow down the code, so benchmark both variants.
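To make the comparison concrete, here is a minimal sketch of both allocation paths feeding the same kernel. The kernel add_one and the helpers launch_add_one, run_with_raw_pointer, run_with_device_vector, and elapsed_ms are hypothetical names made up for this illustration; they are not part of CUDA or Thrust. The sketch assumes a single GPU, the default stream, and n > 0, and it keeps error handling to a minimum.

```cuda
#include <chrono>
#include <cstddef>
#include <cuda_runtime.h>
#include <thrust/device_vector.h>

// Hypothetical kernel: adds 1 to every element. It only receives a raw
// pointer, so it cannot tell how the memory was allocated.
__global__ void add_one(int *data, size_t n)
{
    size_t i = static_cast<size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Hypothetical launch helper shared by both paths (one thread per element).
inline void launch_add_one(int *data, size_t n)
{
    if (n == 0) return;  // a grid with zero blocks is an invalid launch
    const unsigned threads = 256;
    const unsigned blocks = static_cast<unsigned>((n + threads - 1) / threads);
    add_one<<<blocks, threads>>>(data, n);
}

// Raw-pointer path. cudaMalloc returns uninitialized memory, so it is zeroed
// explicitly here to match what device_vector<int>(n) does.
// Returns element 0 after the kernel has run, or -1 on a CUDA error.
int run_with_raw_pointer(size_t n)
{
    int *raw_ptr = nullptr;
    if (cudaMalloc((void **)&raw_ptr, n * sizeof(int)) != cudaSuccess) return -1;
    cudaMemset(raw_ptr, 0, n * sizeof(int));
    launch_add_one(raw_ptr, n);
    int first = -1;
    // A blocking cudaMemcpy on the default stream waits for the kernel.
    if (cudaMemcpy(&first, raw_ptr, sizeof(int), cudaMemcpyDeviceToHost) != cudaSuccess)
        first = -1;
    cudaFree(raw_ptr);
    return first;
}

// Thrust path. The constructor allocates and value-initializes n ints on the
// device (it throws if the allocation fails). The kernel still gets a plain
// int* obtained with thrust::raw_pointer_cast.
int run_with_device_vector(size_t n)
{
    thrust::device_vector<int> d(n);
    launch_add_one(thrust::raw_pointer_cast(d.data()), n);
    int first = d[0];  // copies a single element back to the host
    return first;      // d frees its device memory when it goes out of scope
}

// Hypothetical wall-clock timer: runs f, waits for the device to go idle,
// and returns the elapsed time in milliseconds.
template <typename F>
double elapsed_ms(F f)
{
    auto t0 = std::chrono::steady_clock::now();
    f();
    cudaDeviceSynchronize();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Compiled with nvcc, a call such as elapsed_ms([] { run_with_raw_pointer(500000000); }) next to the same call for run_with_device_vector gives a rough comparison for the size in the question (500M ints need about 2 GB of free device memory). The first CUDA call in a process also pays for context creation, so it is worth running a small warm-up call before timing and repeating each measurement a few times.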