Tags: c, clang, undefined-behavior, memcpy, ubsan

Load of misaligned address and UBsan finding


This question is not about the definition of unaligned data accesses, but about why memcpy silences the UBSan findings whereas type casting does not, despite both generating the same assembly code.

I have some example code to parse a protocol that sends a byte array segmented into groups of six bytes.

void f(u8 *ba) {
    // I know this array's length is a multiple of 6
    u8 *p = ba;
    u32 a = *(u32 *)p;
    printf("a = %d\n", a);
    p += 4;
    u16 b = *(u16 *)p;
    printf("b = %d\n", b);

    p += 2;
    a = *(u32 *)p;
    printf("a = %d\n", a);
    p += 4;
    b = *(u16 *)p;
    printf("b = %d\n", b);
}

After incrementing my pointer by 6 and doing another 32-bit read, UBSan reports an error about a misaligned load. I can suppress this error by using memcpy instead of type-punning, but I don't have a good understanding of why. To be clear, here is the same routine without UBSan errors:

void f(u8 *ba) {
    // I know this array's length is a multiple of 6
    u8 *p = ba;
    u32 a;
    u16 b;  // was missing: b must be declared before memcpy writes into it
    memcpy(&a, p, 4);
    printf("a = %d\n", a);
    p += 4;
    memcpy(&b, p, 2);
    printf("b = %d\n", b);

    p += 2;
    memcpy(&a, p, 4);
    printf("a = %d\n", a);
    p += 4;
    memcpy(&b, p, 2);
    printf("b = %d\n", b);
}

Both routines compile to identical assembly code (using movl for the 32-bit read and movzwl for the 16-bit read), so why is one undefined behaviour when the other is not? Does memcpy have some special properties that guarantee something?

I don't want to use memcpy here because I can't rely on compilers doing a good enough job optimising it.


Solution

  • UBSan is used to detect code that is not strictly conforming, i.e. code that depends on undefined behaviour and is therefore not guaranteed to work.

    Actually the C standard says that the behaviour is undefined as soon as you cast a pointer to a type for which the address is not suitably aligned. C11 (draft, n1570) 6.3.2.3p7:

    A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined.

    I.e.

    u8 *p = ba;
    u32 *a = (u32 *)p; // undefined behaviour if misaligned. No dereference required
    

    The presence of this cast allows the compiler to assume that ba is aligned to a 4-byte boundary (on a platform where u32 is required to be aligned that way, as it is with many compilers on x86), after which it can generate code that relies on that alignment.

    Even on x86 there are instructions that fail spectacularly on misaligned operands (the aligned SSE moves such as movaps and movdqa, for example, raise a fault), so innocent-looking code can be compiled into machine code that aborts at runtime. UBSan is supposed to catch this in code that otherwise looks sane and behaves "as expected" when you run it, but that could fail when compiled with another set of options or a different optimization level.

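    memcpy, by contrast, is defined for source and destination addresses of any alignment, so the second routine contains no undefined behaviour for UBSan to report. The compiler can, and often will, compile such a memcpy to a single load instruction, but only because it knows that an unaligned access works and performs well enough on the target platform. Identical assembly for one compiler, target and set of options therefore does not make the two sources equivalent.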

    Lastly:

    I don't want to use memcpy here because I can't rely on compilers doing a good enough job optimising it.

    What you're saying here is: "I want my code to work reliably only whenever compiled by garbage or two-decades-old compilers that generate slow code. Definitely not when compiled with the ones that could optimize it to run fast."
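
The memcpy approach discussed above can be wrapped in small helpers so that the call sites stay as short as the casts were. This is only a minimal sketch under some assumptions: u8, u16 and u32 from the question are taken to be the <stdint.h> fixed-width types, values are read in host byte order just as the original casts did, and the names aligned_for_u32, load_u32, load_u16, struct record and parse_records are hypothetical, not from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if p is suitably aligned for a uint32_t. Only when this holds
   is the conversion (uint32_t *)p free of the alignment problem. */
static int aligned_for_u32(const void *p) {
    return ((uintptr_t)p % _Alignof(uint32_t)) == 0;
}

/* Read a 32-bit value (host byte order) from an address of any alignment. */
static uint32_t load_u32(const uint8_t *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Read a 16-bit value (host byte order) from an address of any alignment. */
static uint16_t load_u16(const uint8_t *p) {
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* One 6-byte group from the question: a 32-bit field, then a 16-bit field. */
struct record {
    uint32_t a;
    uint16_t b;
};

/* Decode len / 6 groups from ba into out; len must be a multiple of 6. */
static void parse_records(const uint8_t *ba, size_t len, struct record *out) {
    assert(len % 6 == 0);
    for (size_t i = 0; i < len / 6; i++) {
        out[i].a = load_u32(ba + 6 * i);     /* offsets 0, 6, 12, ... */
        out[i].b = load_u16(ba + 6 * i + 4); /* offsets 4, 10, 16, ... */
    }
}
```

With a buffer that starts on a 4-byte boundary, aligned_for_u32(ba + 6) is false on platforms where uint32_t needs 4-byte alignment, which is exactly the second read that UBSan flagged in the cast version. The helpers never form a misaligned u32 pointer, so there is nothing to flag.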