Below is pseudo-code for what I want to do.
I already know how to move a tensor to the GPU (.cuda()), but I have no idea how to create a new tensor from a pointer to memory that is already on the GPU.
Is there a method I've missed?
I don't want to copy devPtr back to the host; I just want to create the GPU tensor directly from the pointer.
int main(void) {
    float* devPtr;
    cudaMalloc((void**)&devPtr, sizeof(float)*HOSTDATA_SIZE);
    cudaMemcpy(devPtr, hostData, sizeof(float)*HOSTDATA_SIZE, cudaMemcpyHostToDevice);
    torch::Tensor inA = /* make Tensor with devPtr which is already in GPU */;
    torch::Tensor inB = torch::randn({1, 10, 512, 512}).cuda();
    torch::Tensor out = torch::matmul(inA, inB);
    std::cout << out << std::endl;
    return 0;
}
I think this should work; can you confirm?
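// Pass the sizes inline: an IntArrayRef built from a braced list and stored in a variable would dangle.
auto options = torch::TensorOptions().dtype(torch::kFloat32).device(torch::kCUDA);
auto gpu_tensor = torch::from_blob(dev_ptr, {1, 10, 512, 512}, options);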
Be careful: torch::from_blob does not take ownership of the pointer. If you need gpu_tensor to be independent of the lifetime of dev_ptr, then you need to clone it.