Tags: c, gcc, clang, compiler-optimization, strict-aliasing

Strict aliasing violation: Why do gcc and clang generate different output?


When a type cast violates the strict aliasing rule in C and C++, a compiler may optimize in such a way that a wrong constant value is propagated, or that an unaligned access is allowed, which can result in performance degradation or bus errors.

I wrote a simple example to see how the compiler optimizes the constant when I violate the strict aliasing rule in GCC and Clang.

Here is the code and the instructions that I got.

#include <stdio.h>
#include <stdlib.h>

int
foo () //different result in C and C++
{
    int x = 1;
    long *fp = (long *)&x;
    *fp = 1234L;

    return x;
}

//int and long are not compatible 
//Wrong constant propagation as a result of strict aliasing violation
long
bar(int *ip, long *lp)
{
    *lp = 20L;
    *ip = 10;

    return *lp;
}

//char is always compatible with others
//constant is not propagated and memory is read
char
car(char *cp, long *lp)
{
    *cp = 'a';
    *lp = 10L;
    return *cp;
}

When I compile the code with GCC 8.2 using the -std=c11 -O3 options, I get:

foo:
  movl $1234, %eax
  ret
bar:
  movq $20, (%rsi)
  movl $20, %eax
  movl $10, (%rdi)
  ret
car:
  movb $97, (%rdi)
  movq $10, (%rsi)
  movzbl (%rdi), %eax
  ret

When I compile the code with clang 7.0 using the -std=c11 -O3 options, I get:

foo: # @foo
  movl $1, %eax
  retq
bar: # @bar
  movq $20, (%rsi)
  movl $10, (%rdi)
  movl $20, %eax
  retq
car: # @car
  movb $97, (%rdi)
  movq $10, (%rsi)
  movb (%rdi), %al
  retq

The bar and car functions generate almost the same instruction sequences in both compilers, and the return values are the same in both cases: bar violates the rule, so the constant is propagated; car doesn't violate it, so the correct value is read from memory.

However, the foo function, which violates the strict aliasing rule, generates different output in GCC and Clang: gcc propagates the value that was actually stored in memory (but without a memory reference), and clang propagates a wrong value. It seems that both compilers apply constant propagation as an optimization, so why do the two compilers generate different results? Does this mean that GCC automatically detects the strict aliasing violation in foo and propagates the correct value?

Why do they show different instruction streams and results?


Solution

  • Why can we say that bar doesn't violate the strict aliasing rule?

    If the code that calls bar does not violate strict aliasing, bar will not violate strict aliasing either.

    Let me give an example.

    Suppose we call bar like this:

    int x;
    long y;
    bar(&x, &y);
    

    Strict aliasing requires that two pointers of different, incompatible types do not refer to the same memory. &x and &y have different types, and they refer to different memory, so this call does not violate strict aliasing.

    On the other hand, let's say we call it like this:

    long y;
    bar((int *) &y, &y);
    

    Now we've violated strict aliasing. However, the violation is the caller's fault.
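
To make the distinction concrete, here is a minimal sketch that wraps the calls discussed above. bar and car are copied from the question; the helper names call_bar_distinct, call_car_aliased, and run_examples are hypothetical and exist only for illustration. The violating call is left as a comment, because executing it would be undefined behavior.

```c
#include <assert.h>
#include <stdio.h>

/* bar and car exactly as defined in the question. */
long bar(int *ip, long *lp)
{
    *lp = 20L;
    *ip = 10;
    return *lp;
}

char car(char *cp, long *lp)
{
    *cp = 'a';
    *lp = 10L;
    return *cp;
}

/* Hypothetical helper: the two pointers refer to distinct objects,
 * so the compiler's assumption that *ip and *lp do not overlap holds. */
long call_bar_distinct(void)
{
    int x;
    long y;
    return bar(&x, &y);          /* well-defined, returns 20 */
}

/* Hypothetical helper: a char pointer is allowed to alias any object,
 * so passing the same storage twice is fine here. The result is the
 * first byte of y after 10L was stored, which depends on endianness. */
char call_car_aliased(void)
{
    long y;
    return car((char *)&y, &y);  /* well-defined */
}

/* Hypothetical driver, meant to be called from main. */
void run_examples(void)
{
    /* The caller-side violation from the answer, NOT executed:
     *     long y;
     *     bar((int *)&y, &y);   // undefined behavior
     */
    long b = call_bar_distinct();
    assert(b == 20L);
    printf("bar with distinct objects: %ld\n", b);
    printf("car with aliased char pointer: %d\n", (int)call_car_aliased());
}
```

Calling run_examples() from main should print 20 for bar at any optimization level, since nothing in that call contradicts what the optimizer assumes. The value printed for car depends on the byte order of the target, which is why the compiler has to reload it from memory instead of propagating 'a'.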