CUDA: Copy 1D Array From CPU to GPU

I am new to CUDA and am trying to copy an array from host to device with cudaMemcpy(...). However, the data that arrives on the GPU looks completely different (cost is entirely wrong; G is wrong after index 5).

My data is a malloc'ed C array of, for example, 25 ints (MAX = 5). I tried to copy it as follows:

Declaration:

int *cost, *G;
int *dev_cost, *dev_G;

Allocation:

cost = (int*)malloc(MAX* MAX * sizeof(int));
G = (int*)malloc(MAX* MAX* sizeof(int));
cudaMalloc((void**)&dev_cost, MAX*MAX);
cudaMalloc((void**)&dev_G, MAX*MAX);

Data transfer:

cudaMemcpy(dev_cost, cost, MAX*MAX, cudaMemcpyHostToDevice);
cudaMemcpy(dev_G, G, MAX*MAX, cudaMemcpyHostToDevice);

Kernel launch:

assignCost<<<1,MAX*MAX>>>(dev_G,dev_cost);

Kernel Function:

__global__ void assignCost(int *G, int *cost)
{
    int tid = threadIdx.x + blockDim.x*blockIdx.x;

    printf("cost[%d]: %d G[%d] = %d\n", tid, cost[tid], tid, G[tid]);
    if(tid<MAX*MAX)
    {
        if (G[tid] == 0)
            cost[tid] = INT_MAX;
        else
            cost[tid] = G[tid];
    }
}

Is there anything wrong with my approach? If so, how should I fix it?

Solution

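cudaMalloc((void**)&dev_cost, MAX*MAX*sizeof(int));
cudaMalloc((void**)&dev_G, MAX*MAX*sizeof(int));
cudaMemcpy(dev_cost, cost, MAX*MAX*sizeof(int), cudaMemcpyHostToDevice);
cudaMemcpy(dev_G, G, MAX*MAX*sizeof(int), cudaMemcpyHostToDevice);

The third argument of cudaMemcpy is the number of bytes to copy, not the number of elements. You have MAX*MAX integers, each sizeof(int) bytes, so replace MAX*MAX with MAX*MAX*sizeof(int). The size argument of cudaMalloc is also in bytes, so the two allocations need the same fix. With MAX = 5 and a 4-byte int, the original calls copied only 25 bytes, which covers elements 0 to 5 and one byte of element 6; that matches G being wrong after index 5.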
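
For reference, here is a minimal sketch of the whole round trip with every size expressed in bytes. It is not the asker's full program: the intBytes helper, the CUDA_CHECK macro, the extra kernel parameter n, and the placeholder input values are all assumptions added for illustration. Checking the return value of each runtime call would also have surfaced problems like this one sooner.

```cuda
#include <climits>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define MAX 5

// Hypothetical helper (not in the original post): bytes occupied by n ints.
static size_t intBytes(size_t n) { return n * sizeof(int); }

// Hypothetical macro (not in the original post): abort if a CUDA call fails.
#define CUDA_CHECK(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                    cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                    \
        }                                                          \
    } while (0)

// Same logic as the kernel in the question, with the element count passed
// in (an assumption) and the bounds check done before any memory access.
__global__ void assignCost(const int *G, int *cost, int n)
{
    int tid = threadIdx.x + blockDim.x * blockIdx.x;
    if (tid < n)
        cost[tid] = (G[tid] == 0) ? INT_MAX : G[tid];
}

int main(void)
{
    const int n = MAX * MAX;
    const size_t bytes = intBytes(n);  // element count times sizeof(int)

    int *G = (int *)malloc(bytes);
    int *cost = (int *)malloc(bytes);
    if (!G || !cost) return EXIT_FAILURE;
    for (int i = 0; i < n; ++i) G[i] = i % 3;  // placeholder input data

    int *dev_G = NULL, *dev_cost = NULL;
    CUDA_CHECK(cudaMalloc((void **)&dev_G, bytes));     // size in bytes
    CUDA_CHECK(cudaMalloc((void **)&dev_cost, bytes));  // size in bytes

    // Only G is an input; cost is written by the kernel and copied back.
    CUDA_CHECK(cudaMemcpy(dev_G, G, bytes, cudaMemcpyHostToDevice));

    assignCost<<<1, n>>>(dev_G, dev_cost, n);
    CUDA_CHECK(cudaGetLastError());  // reports launch errors

    CUDA_CHECK(cudaMemcpy(cost, dev_cost, bytes, cudaMemcpyDeviceToHost));

    for (int i = 0; i < n; ++i)
        printf("cost[%d] = %d  G[%d] = %d\n", i, cost[i], i, G[i]);

    cudaFree(dev_G);
    cudaFree(dev_cost);
    free(G);
    free(cost);
    return 0;
}
```

Computing the byte count once and reusing it for malloc, cudaMalloc, and cudaMemcpy keeps the sizes from drifting apart, which is the mistake in the question.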