Consider the following host code:
uint64_t * SomeDevPtr =...
/* SomeDevPtr points to a device memory address allocated by cudaMalloc(). */
uint32_t * SomeDevIntPtr = reinterpret_cast<uint32_t *>(SomeDevPtr);
Because cudaMalloc automatically fulfills certain alignment requirements (I think it aligns to a 128-byte memory boundary), I think SomeDevIntPtr and SomeDevPtr should both start at exactly the same physical memory address in the GPU's global memory. Am I correct on this?
I just want to make sure, since some of the functions I wrote depend on it.
A reinterpret_cast from one pointer type to another pointer type does not (i.e. should not) change the underlying numerical value (bit-pattern representation) of the pointer.
Therefore whatever alignment conditions exist will not be affected by that kind of cast.
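As a rough illustration, here is a host-only sketch that compares the numerical value of a pointer before and after the cast. It does not call cudaMalloc; the buffer `fake_device_buffer`, its 128-byte alignment, and the helper names are assumptions made up for this example, standing in for a real device allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for memory returned by cudaMalloc. The 128-byte
// alignment is an assumption chosen only for this illustration.
alignas(128) unsigned char fake_device_buffer[256];

// True when ptr's numerical value is a multiple of `alignment` bytes.
bool is_aligned(const void* ptr, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// True when the uint32_t* view holds the same numerical address as the
// uint64_t* it was cast from. Nothing is dereferenced here.
bool cast_preserves_address() {
    std::uint64_t* some_ptr =
        reinterpret_cast<std::uint64_t*>(fake_device_buffer);
    std::uint32_t* some_int_ptr = reinterpret_cast<std::uint32_t*>(some_ptr);
    return reinterpret_cast<std::uintptr_t>(some_ptr) ==
           reinterpret_cast<std::uintptr_t>(some_int_ptr);
}

// True when both views of the buffer still satisfy the original alignment.
bool cast_preserves_alignment() {
    std::uint64_t* some_ptr =
        reinterpret_cast<std::uint64_t*>(fake_device_buffer);
    std::uint32_t* some_int_ptr = reinterpret_cast<std::uint32_t*>(some_ptr);
    return is_aligned(some_ptr, 128) && is_aligned(some_int_ptr, 128);
}
```

The same comparison should work on a pointer obtained from cudaMalloc, since only the pointer value is inspected on the host and the memory is never accessed.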
It's possible, of course, to cast a properly aligned pointer to a type for which it is no longer properly aligned. For example, a properly aligned float pointer that is not at an evenly-divisible-by-4 float offset (index) from a 16-byte-aligned base cannot be properly cast to a float4 pointer for CUDA device usage. Some CUDA pointers need to be naturally aligned.
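A minimal host-only sketch of that check is shown here. `Float4Like` is a hypothetical stand-in for CUDA's float4 (which has 16-byte alignment), and `data`, `is_aligned_for`, and `can_view_as_float4` are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for CUDA's float4: four floats, 16-byte aligned.
struct alignas(16) Float4Like {
    float x, y, z, w;
};

// A 16-byte-aligned array of floats, standing in for a device allocation.
alignas(16) float data[8];

// True when ptr's numerical value is a multiple of `alignment` bytes.
bool is_aligned_for(const void* ptr, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// True when a float pointer could be reinterpreted as a Float4Like pointer
// without violating the 16-byte alignment requirement.
bool can_view_as_float4(const float* p) {
    return is_aligned_for(p, alignof(Float4Like));
}
```

With this setup, `data` and `data + 4` pass the check, while `data + 1`, `data + 2`, and `data + 3` do not, because they sit 4, 8, and 12 bytes past a 16-byte boundary.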
You may also be interested in this question.