Tags: c++, cuda, nvidia, reinterpret-cast

Behavior of reinterpret_cast of CUDA pointers?


Considering the following host function:

uint64_t * SomeDevPtr = ...;
/* SomeDevPtr points to a device memory address allocated by cudaMalloc() */

uint32_t * SomeDevIntPtr = reinterpret_cast<uint32_t *>(SomeDevPtr);

Because cudaMalloc automatically fulfills certain alignment requirements (I think it aligns allocations to some 128-byte memory boundary), I believe both SomeDevIntPtr and SomeDevPtr should start at exactly the same physical address in the GPU's global memory. Am I correct about this?

I just want to make sure, since some of the functions I wrote depend on it.
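
For concreteness, a minimal self-contained sketch of what I am doing might look like this (the element count N and the fillWords kernel are just placeholders, not my real code, and error checking is omitted for brevity):

#include <cstdint>
#include <cuda_runtime.h>

// Placeholder kernel: treats the allocation as an array of 32-bit words.
__global__ void fillWords(uint32_t *words, unsigned int count)
{
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count)
        words[i] = i;
}

int main()
{
    const unsigned int N = 1024;   // hypothetical element count
    uint64_t *SomeDevPtr = nullptr;

    // cudaMalloc returns device memory aligned to at least 256 bytes.
    cudaMalloc(reinterpret_cast<void **>(&SomeDevPtr), N * sizeof(uint64_t));

    // The same allocation, viewed as 2*N 32-bit words.
    uint32_t *SomeDevIntPtr = reinterpret_cast<uint32_t *>(SomeDevPtr);

    fillWords<<<(2 * N + 255) / 256, 256>>>(SomeDevIntPtr, 2 * N);
    cudaDeviceSynchronize();

    cudaFree(SomeDevPtr);
    return 0;
}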


Solution

  • A reinterpret_cast from one pointer type to another does not (i.e., should not) change the underlying numerical value (bit-pattern representation) of the pointer.

    Therefore whatever alignment conditions exist will not be affected by that kind of cast (the first sketch below makes this check explicit).

    It is possible, of course, to cast a properly aligned pointer to a type for which it no longer has proper alignment. For example, a properly aligned float pointer that does not sit at a float offset (index) evenly divisible by 4 cannot legitimately be cast to a float4 pointer for CUDA device use, because such accesses must be naturally aligned (see the second sketch below).

    You may also be interested in this question.
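
To see the first point concretely, here is a small host-side sketch (reusing the names from the question; the allocation size is arbitrary) that checks the numeric value, and therefore the alignment, before and after the cast:

#include <cassert>
#include <cstdint>
#include <cuda_runtime.h>

int main()
{
    uint64_t *SomeDevPtr = nullptr;
    cudaMalloc(reinterpret_cast<void **>(&SomeDevPtr), 1024 * sizeof(uint64_t));

    uint32_t *SomeDevIntPtr = reinterpret_cast<uint32_t *>(SomeDevPtr);

    // The cast does not alter the bit pattern: both pointers hold the
    // same numeric address ...
    uintptr_t before = reinterpret_cast<uintptr_t>(SomeDevPtr);
    uintptr_t after  = reinterpret_cast<uintptr_t>(SomeDevIntPtr);
    assert(before == after);

    // ... so whatever alignment the allocation already had is preserved.
    // cudaMalloc allocations are aligned to at least 256 bytes, which more
    // than satisfies the 4-byte natural alignment of uint32_t.
    assert(after % alignof(uint32_t) == 0);

    cudaFree(SomeDevPtr);
    return 0;
}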
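
And a sketch of the pitfall described above, under the same assumptions: a float pointer offset by one element is still fine for float loads, but reinterpreting it as a float4 pointer yields an address that is not 16-byte aligned, and dereferencing that in device code fails:

#include <cuda_runtime.h>

__global__ void loadVec(const float4 *v, float4 *out)
{
    // A float4 load must be naturally aligned (16 bytes); if v is not,
    // this access triggers a misaligned-address error on the device.
    *out = *v;
}

int main()
{
    float *buf = nullptr;
    float4 *result = nullptr;
    cudaMalloc(reinterpret_cast<void **>(&buf), 64 * sizeof(float));
    cudaMalloc(reinterpret_cast<void **>(&result), sizeof(float4));

    // buf + 4 is at a float index divisible by 4, i.e. 16-byte aligned:
    // safe to view as float4.
    const float4 *ok = reinterpret_cast<const float4 *>(buf + 4);

    // buf + 1 is only 4-byte aligned: NOT valid to dereference as float4.
    const float4 *bad = reinterpret_cast<const float4 *>(buf + 1);
    (void)bad;

    loadVec<<<1, 1>>>(ok, result);
    // loadVec<<<1, 1>>>(bad, result);  // would fail with "misaligned address"
    cudaDeviceSynchronize();

    cudaFree(buf);
    cudaFree(result);
    return 0;
}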