
Allocating device memory for a __global__ function in CUDA


I want to write this program in CUDA.

1. In "main.cpp":

struct Center{
    double * Data;
    int dimension;
};
typedef struct Center Center;

// I allocate a pointer to M Center elements with cudaMalloc, as follows

....
#include "kernel.cu"
....
Center *V_dev;
int M = 100, N = 4;

cudaStatus = cudaMalloc((void**)&V_dev, M*sizeof(Center));
Init<<<1,M>>>(V_dev, M, N); // I always know the dimension N before calling

My "kernel.cu" file is something like this

#include "cuda_runtime.h"
#include"device_launch_parameters.h"
... //other include headers to allow my .cu file to know the Center type definition

__global__ void Init(Center *V, int N, int dimension){
V[threadIdx.x].dimension = dimension;
V[threadIdx.x].Data = (double*)malloc(dimension*sizeof(double));
for(int i=0; i<dimension; i++)
    V[threadIdx.x].Data[i] = 0; //For the value, it can be any kind of operation returning a float that i want to be able put here

} 

I'm using Visual Studio 2008 and CUDA 5.0. When I build my project, I get this error:

error: calling a __host__ function("malloc") from a __global__ function("Init") is not allowed

How can I make this work? (I know that malloc and the other CPU memory allocation functions are normally not allowed for device memory.)


Solution

  • malloc is allowed in device code but you have to be compiling for a cc2.0 or greater target GPU.

    Adjust your VS project settings to remove any GPU device setting like compute_10,sm_10 and replace it with compute_20,sm_20 or higher to match your GPU. (And, to run that code, your GPU needs to be cc2.0 or higher.) A minimal sketch of the resulting pattern is shown below.
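
Here is a minimal, self-contained sketch of the pattern, not the asker's full project: each thread allocates its element's Data buffer with device-side malloc inside the kernel, and a second kernel releases it with device-side free. The file name, the Cleanup kernel, and the optional heap-size call are assumptions added for illustration; compile for a cc2.0+ target, e.g. nvcc -arch=sm_20 init_demo.cu.

#include <cstdio>
#include "cuda_runtime.h"

struct Center {
    double *Data;
    int dimension;
};

__global__ void Init(Center *V, int dimension) {
    // Device-side malloc draws from the device heap (cc2.0+ only).
    V[threadIdx.x].dimension = dimension;
    V[threadIdx.x].Data = (double*)malloc(dimension * sizeof(double));
    if (V[threadIdx.x].Data == NULL) return;   // heap exhausted: always check
    for (int i = 0; i < dimension; i++)
        V[threadIdx.x].Data[i] = 0.0;
}

__global__ void Cleanup(Center *V) {
    // Memory obtained from device-side malloc must be freed with device-side
    // free, not with cudaFree from the host.
    free(V[threadIdx.x].Data);
}

int main() {
    const int M = 100, N = 4;
    Center *V_dev;

    // Optional (assumption): enlarge the device heap if the per-thread
    // allocations are large; the default heap is only a few MB.
    cudaDeviceSetLimit(cudaLimitMallocHeapSize, 16 * 1024 * 1024);

    cudaMalloc((void**)&V_dev, M * sizeof(Center));
    Init<<<1, M>>>(V_dev, N);
    Cleanup<<<1, M>>>(V_dev);
    cudaDeviceSynchronize();
    cudaFree(V_dev);

    printf("last CUDA error: %s\n", cudaGetErrorString(cudaGetLastError()));
    return 0;
}

Note that the Data pointers set inside the kernel point to device heap memory: they can only be dereferenced in device code, so the host cannot read V_dev[i].Data directly even after copying the Center structs back.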