Consider a C99 program that reads from a read-only binary blob linked into the program's binary through a linker file. The program knows where the blob starts in memory, but its layout is not known at compile time. The blob consists of unsigned 32-bit and 64-bit integers. We took care to make sure that their endianness matches the (data) endianness of the target platform. We also took care to place the blob in memory such that it is 4-byte aligned.
Requirements:
(performance) We want to read both 32-bit and 64-bit integers with the minimum number of instructions each platform allows (e.g. a single load instruction where applicable).
(portability) This program must run on ARM, x86_64 and MIPS architectures. Also, some of these architectures have a 32-bit system bus, others a 64-bit one.
The program should also not depend on compiler options such as -fno-strict-aliasing and similar.
Seemingly, this could be done with type punning. We know where in memory the value we want to read is located, and we can cast the pointer from its original type (unsigned char*) to uint32_t* or uint64_t*.
But C99's strict aliasing rules confuse me.
There will be no aliasing, of that we can be sure: we would not be punning the same memory location to two different types that are not unsigned char. The layout of the binary blob does not allow this.
Question:
Is casting a const uint8_t* to const uint32_t* or const uint64_t* well-defined in C99, as long as we are sure we do not alias the same pointers to both const uint32_t* and const uint64_t*?
The strict aliasing rules are effectively (pun intended (the 2nd pun intended too)) 6.5p6 and 6.5p7.
If you read through a declared char buffer, e.g.:
char buf[4096];
//...
read(fd, buf, sizeof(buf));
//...
and want to do *(uint32_t*)(buf+position), then you're definitely violating 6.5p7:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
- a type compatible with the effective type of the object,
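For contrast, the list in 6.5p7 also includes "a character type", so the opposite direction is always allowed: any object may be inspected byte by byte through an unsigned char pointer. A minimal sketch of that permitted direction follows; the helper names byte_at and sum_bytes32 are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns byte i of the object representation of *obj.
 * Reading through unsigned char* is permitted by the character-type entry
 * of 6.5p7, whatever the effective type of the object is. */
static unsigned char byte_at(void const *obj, size_t i)
{
    unsigned char const *bytes = obj;
    return bytes[i];
}

/* Sums the bytes of a uint32_t; the result does not depend on endianness. */
static unsigned sum_bytes32(uint32_t const *v)
{
    unsigned sum = 0;
    for (size_t i = 0; i < sizeof *v; ++i)
        sum += byte_at(v, i);
    return sum;
}
```

The asymmetry is the point: a uint32_t may be read as chars, but a declared char array may not be read as a uint32_t.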
If you mmap or malloc the buffer (make the memory dynamically typed), then it's more complicated, but the standard-compliant way of reading such a uint32_t, through memcpy, works in either case and typically carries no performance penalty because optimizing compilers recognize memcpy calls and treat them specially.
Example:
#include <stdint.h>
#include <string.h>

uint32_t get32_noalias(void const *P)
{
    return *(uint32_t const*)(P);
}

static inline uint32_t get32_inl(void const *P)
{
    uint32_t const *p32 = P;
    //^optional (might not affect codegen);
    //asserts that P is well-aligned for uint32_t
    uint32_t x;
    memcpy(&x, p32, sizeof(x));
    return x;
}

//should generate the same code as get32_noalias
//but without violating 6.5p7 when P points to a char[] buffer
uint32_t get32(void const *P)
{
    return get32_inl(P);
}
https://gcc.godbolt.org/z/sGf4rf
Generated assembly on x86-64:
get32_noalias: # @get32_noalias
movl (%rdi), %eax
retq
get32: # @get32
movl (%rdi), %eax
retq
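The question also asks about 64-bit values in a blob that is only guaranteed to be 4-byte aligned. A sketch of the same memcpy pattern for uint64_t follows; get64 is a name chosen here for illustration and is not part of the example above. memcpy places no alignment requirement on its source, so the read stays well-defined when the value sits on a 4-byte boundary that is not 8-byte aligned. A cast to uint64_t const* could already be undefined there (C99 6.3.2.3p7) if the pointer is not suitably aligned for uint64_t.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative counterpart to get32 for 64-bit values.
 * P stays a void const*, so no uint64_t alignment is ever asserted;
 * the blob in the question is only guaranteed to be 4-byte aligned. */
static inline uint64_t get64(void const *P)
{
    uint64_t x;
    memcpy(&x, P, sizeof x);
    return x;
}
```

Whether this becomes a single 64-bit load or two 32-bit loads is up to the compiler and the target, so it is worth checking the generated assembly on each platform.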
While *(uint32_t*)p probably won't blow up in your case in practice (if you only do read-only accesses, or read-only accesses intertwined with char-based writes like those done by the read syscall, then it "practically" shouldn't blow up), I don't see a reason to avoid the fully standard-compliant memcpy-based solution.
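Applied to the blob from the question, the memcpy approach could look like the sketch below. The layout used here (a 32-bit count followed by that many 64-bit values) and the names blob_get32, blob_get64 and blob_sum are assumptions made up for illustration; the real blob's layout is not given in the question.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a native-endian uint32_t at byte offset off within the blob. */
static inline uint32_t blob_get32(unsigned char const *blob, size_t off)
{
    uint32_t x;
    memcpy(&x, blob + off, sizeof x);
    return x;
}

/* Read a native-endian uint64_t at byte offset off within the blob. */
static inline uint64_t blob_get64(unsigned char const *blob, size_t off)
{
    uint64_t x;
    memcpy(&x, blob + off, sizeof x);
    return x;
}

/* Assumed layout (hypothetical): uint32_t count, then count uint64_t values.
 * Returns the sum of the values, wrapping modulo 2^64. */
static uint64_t blob_sum(unsigned char const *blob)
{
    size_t off = 0;
    uint32_t count = blob_get32(blob, off);
    off += sizeof(uint32_t);

    uint64_t sum = 0;
    for (uint32_t i = 0; i < count; ++i) {
        sum += blob_get64(blob, off);
        off += sizeof(uint64_t);
    }
    return sum;
}
```

In this assumed layout the first 64-bit value starts at offset 4, so it is 4-byte aligned but not necessarily 8-byte aligned, which is exactly the case where going through memcpy rather than a uint64_t pointer matters.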