Tags: cuda, dynamic-memory-allocation, access-violation, pointer-to-pointer

CUDA: pointer to pointer memory access


I can't figure out what is causing this issue. I get an "access violation writing location" error on the last line. Am I allocating the memory incorrectly?

    typedef struct {
        doubleXYZW cen_sum; // struct with 4 doubles
        double STS[6];
        XYZW *Points;       // struct with 4 floats
    } BUNDLE;

    BUNDLE *cpu_data = NULL;
    size_t bundle_size = NUM_POINTS * sizeof(XYZW) + sizeof(doubleXYZW) + 6 * sizeof(double);
    HANDLE_ERROR(cudaMallocHost((BUNDLE**)&cpu_data, bundle_size));
    // error in the next line
    cpu_data->Points[0].x = 0; // x is the first element in the XYZW struct

Solution

  • You have 2 allocations that must be done, and you are only performing one of them.

    You are allocating some storage for the cpu_data pointer, but you have not allocated any storage for the Points pointer. Therefore when you dereference Points:

    cpu_data->Points[0].x = 0;
             ^      ^
             |      this dereferences the Points pointer (NOT allocated!)
             |
            this dereferences the cpu_data pointer (allocated)
    

    you are dereferencing a pointer that was never set to point at allocated storage. Nothing has assigned Points a valid address, so writing through it causes the access violation.

    You have (at least) two options to fix it:

    1. After you have allocated space for cpu_data, you can perform another cudaMallocHost allocation for cpu_data->Points.
    2. If you know the size of the Points array at compile time (it seems like you do: NUM_POINTS), you can make it a fixed-size array inside the struct:

      typedef struct {
          doubleXYZW cen_sum;      // struct with 4 doubles
          double STS[6];
          XYZW Points[NUM_POINTS]; // struct with 4 floats
      } BUNDLE;
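
The first option (two allocations) can be sketched roughly as follows. This is only an illustration under stated assumptions: plain malloc stands in for cudaMallocHost so the code runs without a GPU, the XYZW and doubleXYZW layouts are guessed from the comments in the question, NUM_POINTS is given an arbitrary value, and bundle_create / bundle_destroy are hypothetical helper names.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_POINTS 1024 /* assumed value; the question does not state one */

/* Assumed layouts, inferred from the comments in the question. */
typedef struct { float x, y, z, w; } XYZW;
typedef struct { double x, y, z, w; } doubleXYZW;

typedef struct {
    doubleXYZW cen_sum;
    double STS[6];
    XYZW *Points; /* only a pointer: the array needs its own allocation */
} BUNDLE;

/* Hypothetical helper: allocates the struct, then the array it points to.
   With pinned memory, each malloc would become a cudaMallocHost call. */
BUNDLE *bundle_create(size_t num_points)
{
    assert(num_points > 0);

    BUNDLE *b = malloc(sizeof *b); /* allocation 1: the struct itself */
    if (b == NULL)
        return NULL;

    /* allocation 2: storage that Points will refer to */
    b->Points = malloc(num_points * sizeof *b->Points);
    if (b->Points == NULL) {
        free(b);
        return NULL;
    }
    return b;
}

/* Hypothetical helper: releases both allocations, inner one first. */
void bundle_destroy(BUNDLE *b)
{
    if (b == NULL)
        return;
    free(b->Points);
    free(b);
}
```

After bundle_create(NUM_POINTS) succeeds, a write such as b->Points[0].x = 0 lands in the second allocation instead of going through an unset pointer.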
      

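    Note that your bundle_size calculation already includes room for NUM_POINTS points, which matches the second method. If you go with the first method, that calculation is incorrect: the struct then holds only a pointer, and the points need their own allocation of NUM_POINTS * sizeof(XYZW). With either method, it's easier to compute the size of the struct as sizeof(BUNDLE), which also accounts for any padding the compiler adds.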
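
For the second option, a single allocation of sizeof(BUNDLE) is enough. The sketch below makes the same assumptions as before (malloc in place of cudaMallocHost, guessed XYZW and doubleXYZW layouts, an arbitrary NUM_POINTS), and bundle_manual_size / bundle_alloc are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_POINTS 1024 /* assumed value; the question does not state one */

/* Assumed layouts, inferred from the comments in the question. */
typedef struct { float x, y, z, w; } XYZW;
typedef struct { double x, y, z, w; } doubleXYZW;

typedef struct {
    doubleXYZW cen_sum;
    double STS[6];
    XYZW Points[NUM_POINTS]; /* array lives inside the struct */
} BUNDLE;

/* The hand-written sum from the question, kept for comparison. */
size_t bundle_manual_size(void)
{
    return NUM_POINTS * sizeof(XYZW) + sizeof(doubleXYZW) + 6 * sizeof(double);
}

/* Hypothetical helper: one allocation covers every member, Points included.
   With pinned memory this would be a single cudaMallocHost call. */
BUNDLE *bundle_alloc(void)
{
    /* sizeof(BUNDLE) can only be larger than the manual sum (padding). */
    assert(sizeof(BUNDLE) >= bundle_manual_size());
    return malloc(sizeof(BUNDLE));
}
```

Letting the compiler compute the size means the allocation stays correct if members are later added to or removed from BUNDLE.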

    To be clear, there is nothing CUDA-specific here (the error would be present e.g. if you used malloc instead of cudaMallocHost). The problem is rooted in basic C understanding, not CUDA.