Can a byte pointer ever be safely passed to vld2q_u16
?
I'm mostly concerned about static analyzer complaints.
uint16x8x2_t load_interleaved_shorts (const uint8_t* const ptr) {
uint16_t* p16 = (uint16_t*)ptr; // possible undefined behavior ?
return vld2q_u16(p16);
}
In my instance: The pointer is always aligned to a 16 byte boundary. The compiler doesn't known the alignment of the pointer. The code must be portable and strictly follow the C90 standard.
Assumptions:
Replacing vld2q_u16
with vld1q_u8 / vuzpq_u8
would hurt performance.
The probability of the compiler optimizing a scalar pattern into a vld2q_u16
is small.
Edit: suppressed some warnings by casting to a void pointer.
vld2q_u16((const uint16_t*)(const void*)src)
The code must be portable and strictly follow the C90 standard.
... plus everything implied by the presence of ARM NEON intrinsics! (Although that may not help a static analyzer). (Related: Is `reinterpret_cast`ing between hardware SIMD vector pointer and the corresponding type an undefined behavior? discusses that for x86, but your case is a bit different; your pointers are aligned).
In C, it's safe to cast between pointer types (without dereferencing) as long as you never create a pointer with insufficient alignment for its type. You don't need a compile-time-visible guarantee of alignment, you just need to not ever actually create a uint16_t*
that doesn't have alignof(uint16_t)
alignment.
(This makes it unlikely for a static analyzer to complain even if that wasn't the case, unless it could see something like (uint16_t*)(1 + (char*)&something_aligned)
where you take an aligned address and offset it by an odd number, which would be guaranteed to produce a misaligned address.)
And in practice, compilers targeting byte-addressable machines do more or less define the behaviour even for creating misaligned pointers. (For example, Intel intrinsics for unaligned loads depend on creating an unaligned __m128i*
.) As long as you don't deref them, which is unsafe even in practice on targets that allow unaligned loads; see my answer on this Q&A for an example and the blog links that cover other examples.
So you're 100% fine: your code never creates a misaligned uint16_t*
, and doesn't directly dereference it.
If ARM has unaligned-load intrinsics, it would even be safe to form a misaligned uint16_t*
and pass it to the function; the existence/design of the intrinsics API implies that it's safe to use it that way.
Other things that are undefined behaviour but which you aren't doing:
It's technically UB to form a pointer that isn't pointing inside an object, or one-past-end, but in practice mainstream implementations allow that as well.
It's strict-aliasing UB to dereference a uint16_t*
that doesn't point to uint16_t
objects. But any dereferencing only happens inside intrinsic "functions", so you don't have to worry about the strict-aliasing rule. (Which may pointer-cast to some special type and deref, or may pass the pointer on to a __builtin_arm_whatever()
compiler built-in.)
I assume that ARM load/store intrinsics are defined similar to memcpy
, being able to read/write the bytes of any object. So e.g. you could vld2q_u16
on an array of int
, double
, or char
. Intel intrinsics are defined that way (e.g. GCC/clang use __attribute__((may_alias))
.) If not, it wouldn't be safe.
And BTW, the char*
-can-alias-anything rule only works one way. Yes it's safe to point a char*
at a uint16_t
, but if you have an actual array of char buf[100]
, those objects are definitely char objects, and it's UB to access them through a uint16_t*
. However, if you only have char*
, and only one other pointer-type other than char*
is used, then you can look at the memory as having whatever the other type is, and every char*
access aliasing that.