I can't figure out what is causing the issue. I get an "access violation writing location" error on the last line. Am I not allocating the memory correctly?
typedef struct {
    doubleXYZW cen_sum; // struct with 4 doubles
    double STS[6];
    XYZW *Points;       // struct with 4 floats
} BUNDLE;
BUNDLE *cpu_data = NULL;
size_t bundle_size = NUM_POINTS * sizeof(XYZW) + sizeof(doubleXYZW) + 6*sizeof(double);
HANDLE_ERROR(cudaMallocHost((BUNDLE**)&cpu_data, bundle_size));
//error in the next line
cpu_data->Points[0].x = 0; //x is the first element in the XYZW struct
You have 2 allocations that must be done, and you are only performing one of them.
You are allocating some storage for the cpu_data pointer, but you have not allocated any storage for the Points pointer. Therefore when you dereference Points:
cpu_data->Points[0].x = 0;
^         ^
|         this dereferences the Points pointer (NOT allocated!)
|
this dereferences the cpu_data pointer (allocated)
you are dereferencing a pointer that has never been set to point at allocated storage, so it is invalid. Writing through it produces the access violation you are seeing.
You have (at least) two options to fix it:
1. After allocating cpu_data, you can perform another cudaMallocHost allocation for cpu_data->Points.
2. If you know the size of the Points array (it seems like you do: NUM_POINTS), then you could just statically allocate it inside the struct:
typedef struct {
    doubleXYZW cen_sum;      // struct with 4 doubles
    double STS[6];
    XYZW Points[NUM_POINTS]; // struct with 4 floats
} BUNDLE;
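For the first option (a second allocation for Points), a minimal sketch might look like the following. It uses plain malloc/free as a stand-in for cudaMallocHost/cudaFreeHost so that it runs without a GPU; the pattern of two allocations is the same. The XYZW and doubleXYZW layouts, the NUM_POINTS value, and the helper names bundle_create and bundle_destroy are assumptions for illustration and are not from the question.

```c
#include <stdlib.h>

#define NUM_POINTS 1024 /* assumed value; the question does not state it */

typedef struct { float x, y, z, w; } XYZW;        /* assumed layout */
typedef struct { double x, y, z, w; } doubleXYZW; /* assumed layout */

typedef struct {
    doubleXYZW cen_sum;
    double STS[6];
    XYZW *Points; /* pointer member: needs its own allocation */
} BUNDLE;

/* Hypothetical helper: allocates the struct, then the array it points to.
   The casts keep the code valid when compiled as C++ (e.g. by nvcc). */
BUNDLE *bundle_create(size_t num_points)
{
    /* Allocation 1: storage for the struct itself. */
    BUNDLE *b = (BUNDLE *)malloc(sizeof(BUNDLE));
    if (b == NULL)
        return NULL;

    /* Allocation 2: storage for the points that b->Points refers to. */
    b->Points = (XYZW *)malloc(num_points * sizeof(XYZW));
    if (b->Points == NULL) {
        free(b);
        return NULL;
    }
    return b;
}

/* Hypothetical helper: releases both allocations, inner one first. */
void bundle_destroy(BUNDLE *b)
{
    if (b == NULL)
        return;
    free(b->Points);
    free(b);
}
```

With this in place, `BUNDLE *cpu_data = bundle_create(NUM_POINTS);` followed by `cpu_data->Points[0].x = 0;` writes to valid storage, provided the returned pointer is checked for NULL first.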
Note that your bundle_size calculation is crafted in such a way that the 2nd method is suggested. If you go with the first method, your bundle_size calculation is incorrect. In any event, with either method, it's easier just to compute bundle_size as sizeof(BUNDLE).
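As a rough sketch of the second method combined with sizeof(BUNDLE): once the array is embedded in the struct, a single allocation of sizeof(BUNDLE) bytes covers everything, including any padding the compiler inserts between members. Again malloc stands in for cudaMallocHost so the example runs anywhere; the struct layouts, the NUM_POINTS value, and the helper name bundle_alloc are assumptions.

```c
#include <stdlib.h>

#define NUM_POINTS 1024 /* assumed value; the question does not state it */

typedef struct { float x, y, z, w; } XYZW;        /* assumed layout */
typedef struct { double x, y, z, w; } doubleXYZW; /* assumed layout */

typedef struct {
    doubleXYZW cen_sum;
    double STS[6];
    XYZW Points[NUM_POINTS]; /* array lives inside the struct */
} BUNDLE;

/* Hypothetical helper: one allocation is enough, because Points is part
   of the struct. sizeof(BUNDLE) already includes the array and padding.
   The cast keeps the code valid when compiled as C++ (e.g. by nvcc). */
BUNDLE *bundle_alloc(void)
{
    return (BUNDLE *)malloc(sizeof(BUNDLE));
}
```

With cudaMallocHost the call would take the same size argument, sizeof(BUNDLE), and the memory would be released with cudaFreeHost instead of free.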
To be clear, there is nothing CUDA-specific here (the error would be present e.g. if you used malloc instead of cudaMallocHost). The problem is rooted in basic C understanding, not CUDA.