memory-management cuda thrust

Using Thrust Functions with raw pointers: Controlling the allocation of memory


I have a question about using the Thrust library with CUDA. I am calling a Thrust function, e.g. exclusive_scan, and I want to pass it raw (device) pointers, because I want full control over when the memory is allocated and deallocated.

After the function call, I hand the pointer over to another data structure and free the memory either in that data structure's destructor or in the next function call, when I recompute my (device) pointers. I came across a similar question whose answer recommends wrapping the data in a device_vector. But then the memory is freed as soon as the device_vector goes out of scope, which I do not want. Making the device pointer global is not an option either: I am modifying existing code in which the pointer is used as a buffer, and I would have to rewrite a lot to do that.

Does anyone have a good workaround for this? The only option I see right now is to rewrite the Thrust function myself, using only raw device pointers.

EDIT: I misread; I can wrap the pointer in a device_ptr instead of a device_vector. As a follow-up, though: how could I solve this if using a device_ptr were not an option?


Solution

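  • Plain pointers work fine with Thrust algorithms, as long as Thrust knows which memory space they point to. A bare pointer is treated as a host pointer, so a pointer to device memory must be wrapped in a thrust::device_ptr. The wrapper does not own the memory and never frees it.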

    For data on the device do:

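    #include <thrust/device_ptr.h>
    #include <thrust/generate.h>
    #include <thrust/sort.h>
    ....
    // Generator functor: thrust::generate calls it with no arguments.
    struct DoSomething {
        __device__ int operator()() const { return 1; }
    };
    
    int* IntData;
    cudaMalloc(&IntData, sizeof(int) * count);
    // Wrap the raw pointer so Thrust dispatches to the device backend;
    // the wrapper does not take ownership of the allocation.
    auto dev_data = thrust::device_pointer_cast(IntData);
    thrust::generate(dev_data, dev_data + count, DoSomething());
    thrust::sort(dev_data, dev_data + count);
    ....
    cudaFree(IntData);  // the caller decides when the memory is released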
    

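    For data on the host, use plain malloc/free and pass the raw pointers to the algorithm directly; no device_pointer_cast is needed, because Thrust treats bare pointers as host iterators. raw_pointer_cast goes the other way: it extracts the raw pointer from a device_ptr, for example to hand it to cudaFree or to a kernel. See: thrust: Memory management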
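On the follow-up in the EDIT (what to do without device_ptr): one option is to pass an execution policy as the first argument, which lets the algorithm take raw device pointers directly. The following is only a minimal sketch, assuming int data and a Thrust version that provides thrust::device (1.7 or later). The helper names exclusive_scan_raw and scan_host_copy are made up for illustration and are not part of Thrust.

```cuda
#include <vector>
#include <cuda_runtime.h>
#include <thrust/execution_policy.h>
#include <thrust/scan.h>

// Hypothetical helper (not from the original answer): runs an exclusive scan
// over `count` ints at the device pointer d_in and returns a newly allocated
// device buffer. The caller owns the result and must cudaFree it.
int* exclusive_scan_raw(const int* d_in, int count) {
    int* d_out = nullptr;
    if (cudaMalloc(&d_out, sizeof(int) * count) != cudaSuccess) return nullptr;
    // thrust::device tells Thrust that the raw pointers refer to device memory.
    thrust::exclusive_scan(thrust::device, d_in, d_in + count, d_out);
    return d_out;
}

// Hypothetical round trip from host data, to show who frees what.
std::vector<int> scan_host_copy(const std::vector<int>& h_in) {
    const int count = static_cast<int>(h_in.size());
    std::vector<int> h_out(count);
    if (count == 0) return h_out;

    int* d_in = nullptr;
    cudaMalloc(&d_in, sizeof(int) * count);
    cudaMemcpy(d_in, h_in.data(), sizeof(int) * count, cudaMemcpyHostToDevice);

    int* d_out = exclusive_scan_raw(d_in, count);  // ownership passes to us
    cudaMemcpy(h_out.data(), d_out, sizeof(int) * count, cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_out);  // freed here, not by Thrust
    return h_out;
}
```

Calling scan_host_copy({1, 2, 3, 4}) should give {0, 1, 3, 6}. Thrust never allocates or frees the buffers in this sketch, so the pointer returned by exclusive_scan_raw can be handed to another data structure and released whenever that structure decides.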