
C Avoiding Alignment Issues


Could someone please explain what is really wrong with the following example, especially the part about "which might result in the 32-bit unsigned long being loaded from an address that is not a multiple of four":

"The compiler generally prevents alignment issues by naturally aligning all data types. In fact, alignment issues are normally not major concerns of the kernel developers; the gcc folks have to worry about them. Issues arise, however, when the programmer plays too closely with pointers and accesses data outside the environment anticipated by the compiler.

Accessing an aligned address with a recast pointer of a larger-aligned address causes an alignment issue (whatever that might mean for a particular architecture). That is, this is bad news:

char dog[10];
char *p = &dog[1];
unsigned long l = *(unsigned long *)p;

This example treats the pointer to a char as a pointer to an unsigned long, which might result in the 32-bit unsigned long being loaded from an address that is not a multiple of four.

If you are thinking, "When in the world would I do this?" you are probably right. Nevertheless, it has come up, and it will again, so be careful. The real-world examples might not be so obvious."

Though I don't really understand the problem, can it be solved by using the following code and if so, why?

char * dog = (char *)malloc(10 * sizeof(char));
char *p = dog +1;
unsigned long l = *(unsigned long*)p;

Solution

  • Your proposed solution is pretty much the same as the quoted one, so it suffers from the same problem.

    Misalignment problem

    When you reserve memory, it comes with the alignment its type requires, whether it is an automatic variable (char dog[10]) or a block returned by malloc (which is suitably aligned for any fundamental type).

    When you fool the compiler with pointer arithmetic tricks like the one you are doing (dog + 1 followed by a cast to unsigned long *), it can no longer guarantee that the access will be correctly aligned.

    Why is this problematic?

    Because, depending on the hardware architecture you are using, the compiler may emit instructions that require 2- or 4-byte alignment. For instance, ARM has several instructions that require data to be 2-byte aligned (that is, its address has to be even). Thus, your code built for an ARM processor would likely trigger an alignment fault at run time.
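
If you want to see the misalignment for yourself, here is a minimal sketch of an alignment check. The helper name is_aligned_for_ulong is made up for illustration, and converting a pointer to uintptr_t and taking the remainder is implementation-defined, although it behaves as expected on mainstream platforms:

```c
#include <stdalign.h>  /* alignof, alignas (C11) */
#include <stdint.h>    /* uintptr_t */

/* Hypothetical helper (not part of any library): returns 1 if p is
 * suitably aligned for an unsigned long on this platform, 0 otherwise. */
static int is_aligned_for_ulong(const void *p)
{
    return ((uintptr_t)p % alignof(unsigned long)) == 0;
}
```

On a platform where unsigned long needs more than 1-byte alignment, calling it on a pointer returned by malloc should give 1, while calling it on that pointer plus one (the dog + 1 from the question) should give 0.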

    How would you solve your problem then?

    Usually, with a memcpy:

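    #include <stdlib.h>
    #include <string.h>

    char *dog = malloc(10 * sizeof(char));
    /* ... fill dog with data before reading from it ... */
    char *p = dog;
    unsigned long l;

    /* Copy sizeof(l) bytes starting at the unaligned address p + 1 into l. */
    memcpy(&l, p + 1, sizeof(l));
    /* You can use l safely now. */

    /* Copy l back to the array: */
    memcpy(p + 1, &l, sizeof(l));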
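
If you need this in more than one place, the memcpy idea above could be wrapped in a pair of small helpers. This is only a sketch: the names load_ulong and store_ulong are invented here, and the caller is assumed to guarantee that at least sizeof(unsigned long) bytes are available at the given address.

```c
#include <string.h>  /* memcpy */

/* Hypothetical helper: read an unsigned long from any byte address,
 * aligned or not, by copying its bytes into a properly aligned local. */
static unsigned long load_ulong(const void *src)
{
    unsigned long value;
    memcpy(&value, src, sizeof value);
    return value;
}

/* Hypothetical helper: write an unsigned long to any byte address. */
static void store_ulong(void *dst, unsigned long value)
{
    memcpy(dst, &value, sizeof value);
}
```

With these, the original snippet would read something like unsigned long l = load_ulong(dog + 1);. The pointer is never cast to unsigned long *, so no misaligned load is requested, and compilers commonly turn a small fixed-size memcpy like this into whatever load sequence is safe for the target.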